Stack: FastAPI backend, Clerk for auth, and an Elo-style rating system for ranked matches. The obvious objection is LLM-as-judge reliability. I don’t think it’s perfect, but I’ve found it’s decent at scoring argument structure and rebuttals rather than just rewarding confident-sounding text. Curious what people think breaks it. I’m sure there are ways to game the judging that I haven’t found yet.
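
For anyone unfamiliar with Elo, here is a minimal sketch of a textbook Elo update for a 1v1 match. It is only an illustration of what "Elo-style" usually means. The function names, the K-factor of 32, and the 400-point scale are assumptions for the example, not necessarily what the site uses.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability-like expected score for player A against player B."""
    # 400 is the conventional Elo scale (assumed here for illustration).
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(
    rating_a: float,
    rating_b: float,
    score_a: float,
    k: float = 32.0,  # assumed K-factor; real systems tune this
) -> tuple[float, float]:
    """Return new (rating_a, rating_b) after one match.

    score_a is 1.0 if A won, 0.0 if A lost, 0.5 for a draw.
    """
    exp_a = expected_score(rating_a, rating_b)
    exp_b = 1.0 - exp_a
    score_b = 1.0 - score_a
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * (score_b - exp_b)
    return new_a, new_b


if __name__ == "__main__":
    # Two equally rated debaters; the judge awards the win to A.
    print(update_ratings(1500.0, 1500.0, 1.0))
```

With equal ratings the expected score is 0.5 for each side, so the winner gains half of K and the loser drops by the same amount. An upset against a higher-rated opponent moves both ratings further.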
Show HN: Debategle – ranked 1v1 debates judged by an LLM
- Posted 2 hours ago by sawsymikey
- 3 points