Results against FastAPI (100 concurrent clients; this framework's number first, FastAPI's second):
- JSON: 8,400 req/s vs 4,500 req/s (+87%)
- CPU-bound: 1,425 req/s vs 266 req/s (+435%)
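The CPU-bound result is the interesting one. Async can't parallelize CPU work - an event loop runs its handlers on a single thread, so one CPU-heavy handler stalls every other request on that loop. With free-threaded Python, adding more threads actually raises throughput, although the gain flattens out past 8 threads:
- 4 threads: 608 req/s (1.0x, baseline)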
- 8 threads: 1,172 req/s (1.9x)
- 16 threads: 1,297 req/s (2.1x)
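
To try the scaling effect locally, here is a minimal sketch that times the same pure-Python CPU work across different thread counts. It is not the framework's code or the benchmark behind the numbers above; `cpu_task`, `run_batch`, and `gil_enabled` are hypothetical names, and the workload size is arbitrary. On a regular (GIL) build the timings should stay roughly flat as threads are added, while on a free-threaded build they should drop.

```python
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def cpu_task(n: int) -> int:
    """Pure-Python CPU work: sum of squares below n (no I/O, no C extension)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_batch(workers: int, jobs: int = 16, n: int = 100_000) -> tuple[list[int], float]:
    """Run `jobs` copies of cpu_task on a pool of `workers` threads.

    Returns the results and the elapsed wall-clock time in seconds.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(cpu_task, [n] * jobs))
    return results, time.perf_counter() - start


def gil_enabled() -> bool:
    """Report whether the GIL is active in this interpreter.

    sys._is_gil_enabled() exists on CPython 3.13+; on older versions the
    attribute is missing and the GIL is always on.
    """
    check = getattr(sys, "_is_gil_enabled", None)
    return True if check is None else check()


if __name__ == "__main__":
    print(f"GIL enabled: {gil_enabled()}")
    for workers in (1, 4, 8, 16):
        _, elapsed = run_batch(workers)
        print(f"{workers:>2} threads: {elapsed:.3f}s")
```

Absolute timings depend on the machine and core count, so treat the output as a relative comparison only.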
The framework is ~500 lines across 5 files. Key implementation choices:
- ThreadPoolExecutor for workers
- HTTP/1.1 keep-alive connections
- Radix tree router, so matching cost depends on path length rather than on the number of registered routes
- Pydantic for validation
- Optional orjson for faster serialization
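
To illustrate the router bullet, here is a rough sketch of tree-based route matching. It is an assumption-laden simplification, not the framework's actual router: it uses a trie keyed on whole path segments rather than a true radix tree (which would also compress chains of single-child nodes), and the `Router` class, its `add`/`match` methods, and the `{name}` parameter syntax are all hypothetical. The point it shows is that lookup walks one node per path segment, regardless of how many routes are registered.

```python
from typing import Callable, Optional


class _Node:
    """One path segment in the routing trie."""

    def __init__(self) -> None:
        self.static: dict[str, "_Node"] = {}   # literal segment -> child
        self.param: Optional["_Node"] = None   # child for a "{name}" segment
        self.param_name: str = ""              # name captured by that child
        self.handler: Optional[Callable] = None


class Router:
    def __init__(self) -> None:
        self._root = _Node()

    def add(self, path: str, handler: Callable) -> None:
        """Register a handler for a path such as "/users/{id}"."""
        node = self._root
        for seg in filter(None, path.split("/")):
            if seg.startswith("{") and seg.endswith("}"):
                # Simplification: the first parameter name registered at a
                # given position wins for all routes sharing that prefix.
                if node.param is None:
                    node.param = _Node()
                    node.param_name = seg[1:-1]
                node = node.param
            else:
                node = node.static.setdefault(seg, _Node())
        node.handler = handler

    def match(self, path: str) -> Optional[tuple[Callable, dict[str, str]]]:
        """Return (handler, params) for a path, or None if nothing matches."""
        node = self._root
        params: dict[str, str] = {}
        for seg in filter(None, path.split("/")):
            child = node.static.get(seg)       # literal segments take priority
            if child is None:
                if node.param is None:
                    return None
                params[node.param_name] = seg
                child = node.param
            node = child                       # no backtracking in this sketch
        if node.handler is None:
            return None
        return node.handler, params
```

Usage would look like `router.add("/users/{id}", get_user)` followed by `router.match("/users/42")`, which returns the handler together with `{"id": "42"}`. Each lookup costs one dictionary access per segment, so the work grows with the length of the path, not with the size of the route table.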
This is experimental and not production-ready, but it's an interesting data point for what's possible when Python drops the GIL.