One Wrong Fact Can Throw Off an AI Team — Here’s How to Limit the Damage

At a Glance

False context added to an agent’s prompt can cut task accuracy substantially, but group debate often reduces the damage when the group contains enough uninformed (honest) agents and uses a robust decision protocol such as consensus.

Core Insights

Adding relevant but false context to an agent’s prompt meaningfully lowers performance on knowledge and ethical judgment tasks, while having a smaller effect on commonsense reasoning. When multiple agents debate, the group can recover correct answers if enough uninformed agents counter the misinformed ones; the exact outcome depends on how many misinformed agents are present and on how the group decides (voting versus consensus). Voting can give higher absolute accuracy in clean settings but is more vulnerable to peer pressure from misinformed agents; consensus is more stable under misinformation.

Data Highlights

1. Injecting relevant misinformation dropped Llama-3.3 accuracy on Complex Web Questions from 0.49 to 0.36, a 26.75% relative decrease.

2. Relevant misinformation reduced Llama-3.3 accuracy on the Ethics benchmark by 25.71%.

3. Providing irrelevant but true context improved GLM-4.7 performance on Complex Web Questions by 27.5%, showing the harm is not just due to longer prompts.
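The relative changes quoted above follow the usual (before − after) / before formula. A minimal sketch of that arithmetic is below; the function name `relative_decrease` is hypothetical and not from the source. Applied to the rounded accuracies 0.49 and 0.36 it gives roughly 26.5%, slightly below the reported 26.75%. The gap presumably comes from the accuracies being rounded to two decimals, though the source does not say so.

```python
def relative_decrease(before: float, after: float) -> float:
    """Return the relative decrease from `before` to `after`, in percent."""
    if before == 0:
        raise ValueError("baseline accuracy must be non-zero")
    return (before - after) / before * 100.0


# Rounded accuracies quoted above for Llama-3.3 on Complex Web Questions.
drop = relative_decrease(0.49, 0.36)
print(f"relative decrease: {drop:.2f}%")
```

Reporting the absolute drop (here 0.13 accuracy points) alongside the relative figure makes comparisons across datasets with different baselines easier to read.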

What This Means

Engineers building multi-agent AI systems and technical leads planning deployments should care because a single piece of incorrect context can propagate through agent interactions and change outcomes. Researchers and reliability teams can use these findings to design voting or consensus rules, add cross-checking agents, or add monitoring that limits the spread of misinformation. To optimize task routing and collaboration, teams might explore the Dynamic Task Routing Pattern.

Key Figures

Figure 1: Overall accuracy by dataset and model comparing the single-agent and multi-agent system with and without misinformation in the context.

Figure 2: Single-agent accuracy for misinformed conditions across the three datasets and two models.

Figure 3: Turn-to-turn persistence difference between uninformed and misinformed solutions by misinformation category; negative values indicate stronger retention of misinformed answers. Results are for Llama-3.3.

Figure 4: Multi-agent accuracy on WinoGrande under consensus vs. voting as the number of misinformed agents increases.
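The persistence difference in Figure 3 can be illustrated with a small sketch. The source does not give the exact formula, so the definition here is an assumption: persistence is the fraction of times an agent holding a given answer at one turn still holds it at the next, and the difference is uninformed minus misinformed, which matches the sign convention in the caption. The labels "U" and "M" and the toy debate history are made up for illustration.

```python
from typing import List


def persistence(history: List[List[str]], target: str) -> float:
    """Fraction of turn-to-turn steps where an agent keeps answer `target`.

    history[t][i] is agent i's answer at turn t (assumed layout).
    """
    held = kept = 0
    for prev_turn, next_turn in zip(history, history[1:]):
        for prev_answer, next_answer in zip(prev_turn, next_turn):
            if prev_answer == target:
                held += 1
                kept += int(next_answer == target)
    return kept / held if held else float("nan")


# Toy history: 3 agents over 3 turns; "U" = uninformed solution,
# "M" = misinformed solution (hypothetical labels).
history = [
    ["U", "U", "M"],
    ["U", "M", "M"],
    ["M", "M", "M"],
]
difference = persistence(history, "U") - persistence(history, "M")
print(f"persistence difference: {difference:.3f}")
```

In this toy run the misinformed answer is never abandoned while the uninformed one is dropped two times out of three, so the difference is negative.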

Keep in Mind

Experiments used two open-weight models, so results may not generalize across all model families or very large commercial models. Misinformation was machine-generated, which enables scale and control but may differ from human-written or adversarially optimized misinformation. The debate setups were fixed (turn counts and two decision rules), so real-world systems with memory, tools, or dynamic roles might behave differently. For structured reasoning approaches within debates, researchers may consider the Tree of Thoughts Pattern.

Deep Dive

Researchers gave agents deliberately false context (the MINT dataset) and measured how that context affected single agents and groups of agents working through a debate protocol. They tested three task types: a commonsense reasoning benchmark, a multi-step factual question set, and an everyday ethics judgment set. Misinformation was either directly relevant to the task or irrelevant; irrelevant true context served as a control to check whether simply adding tokens mattered.

Results show that relevant misinformation hits knowledge-heavy and ethical tasks hardest: for example, one model’s accuracy on complex factual questions fell by about 27% in relative terms when given misleading context. Multi-agent debate reduced the damage in many settings, provided there were enough uninformed agents to counter the misinformed ones; majority pressure kicks in when groups include three or more opposing agents.

Decision protocol matters: majority voting can yield higher accuracy in clean cases but is more brittle under misinformation, while a consensus protocol (taking the final agent response) was comparatively stable. The work releases the MINT dataset and code to help evaluate agent-to-agent trust and robustness, and recommends designing group composition and decision rules with misinformation risk in mind. The approach aligns with applying a Model Context Protocol (MCP) Pattern to manage context sharing and decision flow.
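To make the two decision rules concrete, here is a minimal sketch. It is not the released code: the function names are hypothetical, the voting rule is assumed to require a strict majority, and the consensus rule is read literally from the description above as taking the final agent’s response. The answers are made up.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_vote(final_answers: Sequence[str]) -> Optional[str]:
    """Return the answer held by more than half the agents, else None."""
    if not final_answers:
        return None
    answer, count = Counter(final_answers).most_common(1)[0]
    return answer if count > len(final_answers) / 2 else None


def final_agent_response(final_answers: Sequence[str]) -> Optional[str]:
    """Consensus as described in the text: take the last agent's response."""
    return final_answers[-1] if final_answers else None


# Hypothetical answers from five agents after the last debate turn.
answers = ["Paris", "Lyon", "Lyon", "Lyon", "Paris"]
print("voting:   ", majority_vote(answers))
print("consensus:", final_agent_response(answers))
```

In this toy case the two rules disagree: three of five agents outvote the rest under voting, while the consensus rule depends only on where the debate ends up.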

Credibility Assessment:

The authors are affiliated with recognized institutions (University of Göttingen, NRC Canada) and have moderate h-indices (11–14), but the venue is arXiv and the citation count is low. Overall: solid but not top-tier.

multi-agent trust, agent-to-agent evaluation, agent reliability, multi-agent orchestration
