DeepSWE: Measuring frontier coding agents on original, long-horizon SWE tasks
Posted 2 hours ago by WarmWash
2 points
https://deepswe.datacurve.ai/
0 comments