Benchmark

MP-Bench Leaderboard

Multi-perspective failure attribution in multi-agent systems.
Ranked by nDCG@5 (Exp) on manual and automatic annotation splits.
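
As a rough illustration of the ranking metric, the sketch below computes nDCG@5 for one ranked list of graded relevance labels. It assumes that "(Exp)" refers to the exponential gain formulation (gain = 2^rel - 1), which this page does not confirm, and that the ideal ordering is taken from the same list. The function names `dcg_at_k` and `ndcg_at_k` are hypothetical and are not taken from the MP-Bench codebase.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int = 5) -> float:
    """Discounted cumulative gain over the top-k items.

    Assumption: exponential gain (2^rel - 1) with a log2 position discount.
    """
    return sum(
        (2.0 ** rel - 1.0) / math.log2(position + 2)  # position is 0-based
        for position, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances: Sequence[float], k: int = 5) -> float:
    """DCG@k normalized by the DCG@k of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0.0:
        return 0.0  # no relevant items, so the score is defined as 0 here
    return dcg_at_k(relevances, k) / ideal


if __name__ == "__main__":
    # Hypothetical relevance labels, listed in the order a model ranked them.
    print(ndcg_at_k([3, 2, 1, 0, 0]))  # already in ideal order
    print(ndcg_at_k([0, 1, 2, 3]))     # worst-case order for these labels
```

A leaderboard score would then presumably be an average of this value over all evaluated cases in a split, though the aggregation used by MP-Bench is not stated here.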

Leaderboard columns: #, Model, nDCG@5, Score bar, Date, Links


Have a new model to evaluate? Follow the submission guide and open a PR. Results are computed automatically.

Submit your model →