# LLM from scratch

## Resources

- [Build a Large Language Model](https://www.manning.com/books/build-a-large-language-model-from-scratch)
- [Writing an LLM from scratch, part 28](https://www.gilesthomas.com/2025/12/llm-from-scratch-28-training-a-base-model-from-scratch)
- [nanochat](https://github.com/karpathy/nanochat)

## TODO

- chat CLI; evaluate after each epoch
- better architecture (read nanochat)
- count tokens
- download more data (code, full FineWeb)
- better training progress bar
- Notes
  - `TrainTestIterator` (rough sketch after this list)
    - total length
    - deterministic shuffle
    - prepare in parallel
    - refactor `new()` into a builder
  - small texts (`<|bos|>`?)
- Training
  - multi-device training
  - model parameters in a file
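The `TrainTestIterator` items above (total length, deterministic shuffle, builder-style construction) might end up looking something like the Rust sketch below. Everything here is an assumption made for illustration: the struct names, fields, the xorshift PRNG, and the split-by-fraction logic are not taken from the actual codebase.

```rust
// Hypothetical sketch of a builder-style TrainTestIterator with a
// deterministic, seed-driven shuffle. Names and fields are assumptions.

struct TrainTestIteratorBuilder {
    paths: Vec<String>,
    test_fraction: f64,
    seed: u64,
}

impl TrainTestIteratorBuilder {
    fn new() -> Self {
        Self { paths: Vec::new(), test_fraction: 0.1, seed: 0 }
    }

    fn paths(mut self, paths: Vec<String>) -> Self {
        self.paths = paths;
        self
    }

    fn test_fraction(mut self, f: f64) -> Self {
        self.test_fraction = f;
        self
    }

    fn seed(mut self, seed: u64) -> Self {
        self.seed = seed;
        self
    }

    fn build(self) -> TrainTestIterator {
        // Shuffle indices deterministically, then split off the test set.
        let mut order: Vec<usize> = (0..self.paths.len()).collect();
        deterministic_shuffle(&mut order, self.seed);
        let n_test = (order.len() as f64 * self.test_fraction) as usize;
        TrainTestIterator {
            paths: self.paths,
            test_indices: order[..n_test].to_vec(),
            train_indices: order[n_test..].to_vec(),
        }
    }
}

struct TrainTestIterator {
    paths: Vec<String>,
    train_indices: Vec<usize>,
    test_indices: Vec<usize>,
}

impl TrainTestIterator {
    /// Total number of items across both splits.
    fn total_len(&self) -> usize {
        self.train_indices.len() + self.test_indices.len()
    }
}

/// Fisher-Yates shuffle driven by a small xorshift PRNG, so the order
/// depends only on the seed and not on global RNG state.
fn deterministic_shuffle(items: &mut [usize], seed: u64) {
    // `| 1` guards against an all-zero state, which xorshift never leaves.
    let mut state = seed.wrapping_add(0x9E37_79B9_7F4A_7C15) | 1;
    let mut next = || {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        state
    };
    for i in (1..items.len()).rev() {
        let j = (next() % (i as u64 + 1)) as usize;
        items.swap(i, j);
    }
}

fn main() {
    let it = TrainTestIteratorBuilder::new()
        .paths((0..10).map(|i| format!("shard-{i}.bin")).collect())
        .test_fraction(0.2)
        .seed(42)
        .build();
    println!(
        "{} shards: {} train, {} test",
        it.paths.len(),
        it.train_indices.len(),
        it.test_indices.len()
    );
    assert_eq!(it.total_len(), it.paths.len());
}
```

Driving the shuffle from an explicit seed rather than a global RNG keeps the train/test split reproducible across runs and machines, which also makes per-epoch evaluation numbers comparable.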