
Consensus Evaluation


Quick Definition

An evaluation pattern where multiple judges (human or AI) must agree before a result is accepted.

Because a result is accepted only when enough evaluators reach the same verdict, consensus evaluation limits the influence of any single judge's bias or error.

Variants

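  • Majority vote: The verdict backed by more than half of the judges is accepted
  • Unanimous: All judges must agree; any dissent means the result is not accepted
  • Weighted: Each judge's vote is scaled by a weight, so some judges count for more than others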
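
The three variants can be sketched as small aggregation functions. This is a minimal illustration, not a reference implementation: the function names, the string verdict labels, and the rule that the weighted variant needs more than half of the total weight are all assumptions made for the example.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_vote(verdicts: Sequence[str]) -> Optional[str]:
    """Return the verdict held by more than half the judges, else None."""
    if not verdicts:
        return None
    label, count = Counter(verdicts).most_common(1)[0]
    return label if count > len(verdicts) / 2 else None


def unanimous(verdicts: Sequence[str]) -> Optional[str]:
    """Return the verdict only if every judge gave the same one, else None."""
    if not verdicts:
        return None
    first = verdicts[0]
    return first if all(v == first for v in verdicts) else None


def weighted_vote(verdicts: Sequence[str], weights: Sequence[float]) -> Optional[str]:
    """Return the verdict holding more than half of the total weight, else None.

    The "more than half of total weight" threshold is an assumption for this sketch.
    """
    if len(verdicts) != len(weights):
        raise ValueError("each verdict needs exactly one weight")
    if not verdicts:
        return None
    totals: dict[str, float] = {}
    for verdict, weight in zip(verdicts, weights):
        totals[verdict] = totals.get(verdict, 0.0) + weight
    label = max(totals, key=lambda k: totals[k])
    return label if totals[label] > sum(weights) / 2 else None


# Hypothetical example: three judges, the third trusted three times as much.
verdicts = ["pass", "pass", "fail"]
majority_result = majority_vote(verdicts)
unanimous_result = unanimous(verdicts)
weighted_result = weighted_vote(verdicts, [1.0, 1.0, 3.0])
```

Returning None when no consensus is reached leaves the caller free to decide what happens next, such as escalating the item to an additional judge or marking it unresolved.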

Trade-offs

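  • More robust than single-judge evaluation, since one judge's error or bias cannot decide the result alone
  • Higher cost: every item is evaluated multiple times
  • Does not protect against biases the judges share: if all judges err in the same direction, they can agree on a wrong result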