Real-world GPT-OSS-20B benchmark on L4, L40S and H100 (latency, tokens/sec)
Posted 9 hours ago by dotnot (1 point)
https://devforth.io/insights/self-hosted-gpt-real-response-time-token-throughput-and-cost-on-l4-l40s-and-h100-for-gpt-oss-20b/