
Contribute Patterns & Research

Help build the definitive reference for agent evaluation and governance. All submissions are reviewed before publishing.

Submission Form

Share a failure mode you've observed in agent systems

Link to a paper, blog post, repo, or other reference

Get notified when your submission is published