Vakra: Reasoning, Tool Use, and Failure Modes of Agents
Posted 3 hours ago by gmays | 2 points
https://huggingface.co/blog/ibm-research/vakra-benchmark-analysis
0 comments