JetSpec Enables Up to 9.64x Lossless LLM Inference Speedup with Up to 1000 TPS

  • Posted 3 hours ago by snyhlxde
  • 4 points
https://haoailab.com/blogs/parallel-tree-decoding/

1 comment
