Adversarial inputs are deliberately crafted to exploit model vulnerabilities; they often appear innocuous to humans while still causing system failures.
Examples
- Typos and character swaps that fool text classifiers (sketched below)
- Semantic-preserving perturbations: rewordings that keep the meaning but change the prediction
- Out-of-distribution triggers the model never saw during training
- Multi-modal attacks, such as adversarial content embedded in an image
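To make the typo example concrete, here is a minimal sketch of a single-character-swap attack: it searches for one adjacent-character swap that flips a classifier's prediction. The keyword-based classifier is a toy stand-in, not a real model or a specific library.

```python
# Minimal sketch of a typo-style attack: search for a single adjacent-character
# swap that flips a classifier's prediction while staying readable to a human.
# The keyword-based classifier below is a toy stand-in for a real model.

NEGATIVE_WORDS = {"terrible", "awful", "broken"}

def toy_classifier(text: str) -> str:
    """Label text 'negative' if any known negative keyword appears."""
    words = text.lower().split()
    return "negative" if any(w in NEGATIVE_WORDS for w in words) else "ok"

def swap_adjacent(word: str, i: int) -> str:
    """Introduce a single typo by swapping characters i and i+1."""
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def typo_attack(text: str) -> str | None:
    """Return a one-typo variant that changes the prediction, if one exists."""
    original = toy_classifier(text)
    words = text.split()
    for wi, word in enumerate(words):
        for ci in range(len(word) - 1):
            perturbed = words.copy()
            perturbed[wi] = swap_adjacent(word, ci)
            candidate = " ".join(perturbed)
            if toy_classifier(candidate) != original:
                return candidate  # still readable to a human, but the label flipped
    return None

print(typo_attack("the product is terrible"))  # e.g. "the product is etrrible"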
Robustness
Systems should be tested against adversarial inputs as well as typical, well-formed cases; the sketch below shows one simple way to quantify this.
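One basic robustness check is to measure how often predictions stay stable when inputs receive small perturbations. The sketch below illustrates this under stated assumptions: `predict`, `add_typo`, and `cases` are hypothetical placeholders for a real model, perturbation strategy, and test set, not part of any specific framework.

```python
# Minimal sketch of a robustness check: the fraction of inputs whose prediction
# is unchanged after a small perturbation. `predict`, `add_typo`, and `cases`
# are illustrative placeholders, not a specific model or framework API.
from typing import Callable, Iterable

def robustness_rate(
    predict: Callable[[str], str],
    perturb: Callable[[str], str],
    inputs: Iterable[str],
) -> float:
    """Fraction of inputs whose prediction survives the perturbation."""
    items = list(inputs)
    stable = sum(1 for x in items if predict(x) == predict(perturb(x)))
    return stable / len(items) if items else 1.0

# Example usage with a trivial keyword 'model' and a fixed typo perturbation.
def predict(text: str) -> str:
    return "negative" if "terrible" in text.lower() else "ok"

def add_typo(text: str) -> str:
    return text.replace("terrible", "terirble")  # one swapped character pair

cases = ["the product is terrible", "works fine", "truly terrible build"]
print(f"robust on {robustness_rate(predict, add_typo, cases):.0%} of cases")
```

A metric like this can be tracked alongside ordinary accuracy so that robustness regressions show up before deployment.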