Sycophantic agents tell users what they want to hear rather than what is true. This can lead to confirmation of incorrect assumptions, missed errors, and erosion of trust.
Warning Signs
- Agent rarely pushes back on user claims
- Contradictory information glossed over
- Excessive agreement or validation language
Mitigation
- Evaluation against known-wrong inputs
- Reward mechanisms for constructive disagreement
- Diversity of training feedback