Benchmarking LLM-as-a-Judge for Long-Form Output Evaluation

Posted 3 hours ago by berlianta
1 point

https://arxiv.org/abs/2606.01629

0 comments