DSpark: Speculative decoding accelerates LLM inference [pdf]

  • Posted 8 hours ago by aurenvale
  • 645 points
https://github.com/deepseek-ai/DeepSpec/blob/main/DSpark_paper.pdf

25 comments
