Sparser, Faster, Lighter Transformer Language Models

  • Posted 3 hours ago by matt_d
  • 2 points
https://arxiv.org/abs/2603.23198
