KVarN: Native vLLM backend for KV-cache quantization by Huawei

  • Posted 3 hours ago by theanonymousone
  • 66 points
https://github.com/huawei-csl/KVarN

3 comments
