VibeThinker: 3B param model that beats Opus 4.5 on reasoning with novel SFT+GRPO
- Posted 16 hours ago by timhigins
- 337 points
40 comments
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..
Loading..