LLM from scratch
Resources
TODO:
- chat cli, evaluate each epoch
- better arch (read nanochat)
- count tokens
- download more data (code, full fineweb)
- better train progress bar
- Notes
- TrainTestIterator
  - total length
  - deterministic shuffle
  - prepare in parallel
  - refactor new() into builder
  - small texts (<|bos|>?)
- Training
  - multi-device training
  - model parameters in file
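
A rough sketch of what the "deterministic shuffle" and "total length" items under TrainTestIterator could look like. This is not the code in this repo: `SplitMix64`, `shuffled_indices`, and `train_test_split` are hypothetical names, and the generator is a small hand-rolled one so the sketch needs no external crates. The idea is that a fixed seed always gives the same order, and the split sizes are known up front.

```rust
/// SplitMix64: a tiny seedable pseudo-random generator (not cryptographic).
/// Assumption: any seedable RNG would do here; this one avoids dependencies.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Returns the indices 0..len in an order fully determined by `seed`.
fn shuffled_indices(len: usize, seed: u64) -> Vec<usize> {
    let mut rng = SplitMix64(seed);
    let mut idx: Vec<usize> = (0..len).collect();
    // Fisher-Yates: swap each position with a random position at or before it.
    // The modulo introduces a negligible bias, which is fine for a sketch.
    for i in (1..len).rev() {
        let j = (rng.next_u64() % (i as u64 + 1)) as usize;
        idx.swap(i, j);
    }
    idx
}

/// Splits the shuffled indices into (train, test).
/// `test_fraction` is expected to be in 0.0..=1.0.
fn train_test_split(len: usize, seed: u64, test_fraction: f64) -> (Vec<usize>, Vec<usize>) {
    let mut idx = shuffled_indices(len, seed);
    let test_len = (((len as f64) * test_fraction).round() as usize).min(len);
    // `split_off` leaves the first `len - test_len` indices in `idx` (train)
    // and returns the rest (test), so both lengths are known without iterating.
    let test = idx.split_off(len - test_len);
    (idx, test)
}
```

Shuffling indices rather than the texts themselves keeps the data in place, and storing the seed next to a checkpoint would be enough to reproduce the same train/test split later.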
Languages: Rust 97.8%, Nix 2.2%