Trionyx-2-2B
I trained another 2B-parameter LLM from scratch today. I'm running SFT now, but I'm excited that the base eval shows a noticeable improvement over the last run on a number of key benchmarks.
Here is a quick comparison:
| Task | Trionyx 2B | Trionyx-2 2B | Change |
|---|---|---|---|
| ARC Easy | 0.354 | 0.610 | +0.256 |
| ARC Challenge | 0.046 | 0.420 | +0.374 |
| COPA | 0.240 | 0.610 | +0.370 |
| CommonsenseQA | 0.079 | 0.380 | +0.301 |
| PIQA | 0.296 | 0.630 | +0.334 |
| SQuAD | 0.235 | 0.485 | +0.250 |
| CoQA | 0.245 | 0.365 | +0.120 |
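
For anyone who wants to double-check the Change column, here is a minimal sketch that recomputes it as the new score minus the old score. The scores are copied from the table above; the names `scores` and `score_changes` are hypothetical and only used for illustration, not part of any eval harness.

```python
# Scores copied from the comparison table: (Trionyx 2B, Trionyx-2 2B).
# The dict and function names are illustrative assumptions.
scores = {
    "ARC Easy": (0.354, 0.610),
    "ARC Challenge": (0.046, 0.420),
    "COPA": (0.240, 0.610),
    "CommonsenseQA": (0.079, 0.380),
    "PIQA": (0.296, 0.630),
    "SQuAD": (0.235, 0.485),
    "CoQA": (0.245, 0.365),
}


def score_changes(scores):
    """Return {task: new - old}, rounded to 3 decimals like the table."""
    return {task: round(new - old, 3) for task, (old, new) in scores.items()}


changes = score_changes(scores)
for task, delta in changes.items():
    # Print each task with a signed, 3-decimal delta.
    print(f"{task:<15} {delta:+.3f}")
```
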
Pretty pleased with this!