Ground truth is the reference standard against which a system's outputs are scored for accuracy. Without reliable ground truth, evaluation falls back on subjective judgment, and results are hard to compare across systems or runs.
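As a minimal sketch of what scoring against ground truth looks like in practice, the snippet below computes exact-match accuracy. The function name `accuracy` and the sample labels are hypothetical and chosen only for illustration. Exact match is just one possible comparison rule.

```python
def accuracy(predictions, ground_truth):
    """Fraction of predictions that exactly match the reference labels."""
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground_truth must have the same length")
    if not ground_truth:
        raise ValueError("cannot compute accuracy on an empty set")
    # Booleans sum as 0/1, so this counts the exact matches.
    correct = sum(p == g for p, g in zip(predictions, ground_truth))
    return correct / len(ground_truth)


# Illustrative data only: three of the four predictions match the reference.
predictions = ["cat", "dog", "bird", "cat"]
ground_truth = ["cat", "dog", "cat", "cat"]
print(accuracy(predictions, ground_truth))  # 0.75
```

The score is only as trustworthy as the `ground_truth` list itself: a wrong reference label silently turns a correct prediction into an apparent error.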
Sources
- Human expert annotations
- Verified factual databases
- Mathematical proofs (for reasoning tasks)
- Real-world outcomes (for predictions)
Challenges
- Expensive to create at scale
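- May itself contain errors (for example, annotator mistakes or outdated database entries)
- Some tasks have no single correct answer (for example, summarization or open-ended questions)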
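Two of these challenges can be partly handled in code. The sketch below assumes a majority vote across annotators, which flags items whose labels may be unreliable, and a multi-reference match, which accepts any of several valid answers. These are common techniques, not the only ones. The names `majority_label` and `multi_reference_accuracy` and the sample data are hypothetical.

```python
from collections import Counter


def majority_label(annotations):
    """Return (label, agreement) for one item.

    agreement is the share of annotators who chose the winning label.
    Ties go to whichever tied label appears first in the list.
    """
    if not annotations:
        raise ValueError("need at least one annotation")
    label, votes = Counter(annotations).most_common(1)[0]
    return label, votes / len(annotations)


def multi_reference_accuracy(predictions, acceptable):
    """Count a prediction as correct if it matches any acceptable reference."""
    if len(predictions) != len(acceptable):
        raise ValueError("predictions and acceptable must have the same length")
    if not predictions:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p in refs for p, refs in zip(predictions, acceptable))
    return correct / len(predictions)


# Illustrative data: two of three annotators agree on "positive".
label, agreement = majority_label(["positive", "positive", "negative"])
print(label)  # positive

# Illustrative data: the second item has two acceptable answers, neither matched.
score = multi_reference_accuracy(
    ["Paris", "NYC"],
    [{"Paris"}, {"New York", "New York City"}],
)
print(score)  # 0.5
```

Items with low agreement are natural candidates for a second review pass. That focuses expensive expert time on the labels most likely to be wrong.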