Real-world GPT-OSS-20B benchmark on L4, L40S and H100 (latency, tokens/sec)

Posted 9 hours ago by dotnot
1 point

https://devforth.io/insights/self-hosted-gpt-real-response-time-token-throughput-and-cost-on-l4-l40s-and-h100-for-gpt-oss-20b/

0 comments