SuperInfer: SLO-Aware Rotary Scheduling and Memory Management for LLM Inference
Posted 2 hours ago by matt_d, 2 points
https://supercomputing-system-ai-lab.github.io/projects/superinfer/
0 comments