Sparser, Faster, Lighter Transformer Language Models
Posted 3 hours ago by matt_d
2 points
https://arxiv.org/abs/2603.23198
0 comments