Blog
- 10-07-25 LLM World Models
- 10-04-25 October Paper Reading
- 09-23-25 hear the wind sing
- 09-22-25 choice
- 09-21-25 virtual cell (notes)
- 09-07-25 September Paper Reading
- 08-29-25 Late August Paper Reading
- 08-28-25 Do LLMs have good music taste?
- 08-24-25 Review: Killing Commendatore
- 08-20-25 Notes on Gravity and Grace
- 08-16-25 Mid August Paper Reading
- 08-09-25 How FlashAttention Works
- 08-01-25 Early August Paper Reading
- 07-31-25 Optimizers
- 07-23-25 Late July Paper Reading
- 07-15-25 Mid July Paper Reading
- 07-03-25 Early July Paper Reading
- 07-02-25 DeepSeek's High Level Software Magic
- 06-15-25 Mid June Paper Reading
- 06-13-25 Transformer in PyTorch
- 06-11-25 A First Look at Mechanistic Interpretability
- 06-10-25 Decoder-Only Architectures, KV Cache, and MLA
- 06-09-25 Notes on Latent Attention
- 06-08-25 Understanding Attention
- 06-03-25 Early June Paper Reading