Refactor
README.md
# LLM from scratch

## Resources
- [Build a Large Language Model (From Scratch)](https://www.manning.com/books/build-a-large-language-model-from-scratch)
- [Writing an LLM from scratch, part 28](https://www.gilesthomas.com/2025/12/llm-from-scratch-28-training-a-base-model-from-scratch)
- [nanochat](https://github.com/karpathy/nanochat)

## TODO:
- chat CLI, evaluate each epoch
- better arch (read nanochat)
- count tokens
- download more data (code, full FineWeb)
- better train progress bar
- Notes

- TrainTestIterator
  - total length
  - deterministic shuffle
  - prepare in parallel
  - refactor `new()` into builder
  - small texts (`<|bos|>`?)

- Training
  - multi-device training
  - model parameters in file
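
For the "deterministic shuffle" item under TrainTestIterator, one possible approach is a seeded Fisher-Yates shuffle over sample indices, so that the same seed always yields the same train/test split. The sketch below is only an illustration: it assumes the project is written in Rust (suggested by `new()` and the builder note, but not confirmed here), and `SplitMix64`, `shuffled_indices`, `train_test_split`, and `test_fraction` are hypothetical names, not taken from the repository.

```rust
/// Small dependency-free PRNG (SplitMix64). Hypothetical helper for this
/// sketch; fully determined by its seed, not cryptographically secure.
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn new(seed: u64) -> Self {
        Self { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Returns the indices `0..len` in an order that depends only on `seed`.
/// Fisher-Yates shuffle; the modulo step introduces a very slight bias,
/// which is usually acceptable for shuffling training data.
fn shuffled_indices(len: usize, seed: u64) -> Vec<usize> {
    let mut indices: Vec<usize> = (0..len).collect();
    let mut rng = SplitMix64::new(seed);
    for i in (1..len).rev() {
        let j = (rng.next_u64() % (i as u64 + 1)) as usize;
        indices.swap(i, j);
    }
    indices
}

/// Splits the shuffled indices into `(train, test)`.
/// `test_fraction` is an assumed parameter, e.g. 0.1 for a 90/10 split.
fn train_test_split(len: usize, seed: u64, test_fraction: f64) -> (Vec<usize>, Vec<usize>) {
    let mut train = shuffled_indices(len, seed);
    let test_len = (((len as f64) * test_fraction).round() as usize).min(len);
    // Everything from `len - test_len` onward becomes the test set.
    let test = train.split_off(len - test_len);
    (train, test)
}
```

Calling `train_test_split(n, seed, 0.1)` twice with the same `n` and `seed` returns identical splits, which also makes the "total length" of each split known up front (`train.len()` and `test.len()`). Shuffling indices rather than the texts themselves keeps the shuffle cheap and leaves room for preparing samples in parallel later.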