Accelerating LLM Inference on AMD GPUs with Low-Latency GEMMs
Posted 3 hours ago by matt_d
2 points
https://rocm.blogs.amd.com/software-tools-optimization/accelerating-llm-inference-on-amd-gpus-with-low-latency-gemms/README.html
0 comments