JetSpec Enables Up to 9.64x Lossless LLM Inference Speedup with Up to 1000TPS
Posted 3 hours ago by snyhlxde
4 points
https://haoailab.com/blogs/parallel-tree-decoding/
1 comment