Accelerating LLM Inference on AMD GPUs with Low-Latency GEMMs

  • Posted 3 hours ago by matt_d
  • 2 points
https://rocm.blogs.amd.com/software-tools-optimization/accelerating-llm-inference-on-amd-gpus-with-low-latency-gemms/README.html