KVarN: Native vLLM backend for KV-cache quantization by Huawei
Posted 3 hours ago by theanonymousone, 66 points
https://github.com/huawei-csl/KVarN
3 comments