
JetSpec Enables Up to 9.64x Lossless LLM Inference Speedup with Up to 1000 TPS

Posted 3 hours ago by snyhlxde
4 points

https://haoailab.com/blogs/parallel-tree-decoding/

1 comment
