Context window limits constrain what information an agent can consider at once. Larger windows enable more complex tasks but increase cost and latency.
Current Limits
- GPT-4: 8K-128K tokens, depending on the model variant
- Claude: 100K-200K tokens, depending on the model version
- Gemini: Up to 1M tokens
Strategies for Limits
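- Summarization: compress older context into a shorter summary that stays in the window
- Retrieval augmentation: store information outside the window and fetch only the pieces relevant to the current step
- Sliding windows: keep the most recent content and drop the oldest as new content arrives
- Hierarchical processing: split the input into chunks, process each chunk separately, then combine the intermediate results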
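
As an illustration of the sliding-window strategy, here is a minimal sketch in Python. It assumes a crude token estimate (whitespace-separated words) in place of a real tokenizer, and the names `approx_tokens`, `sliding_window`, and `budget` are hypothetical, chosen only for this example. It keeps the newest messages that fit a token budget and drops everything older.

```python
from typing import Callable, List


def approx_tokens(text: str) -> int:
    """Crude token estimate: count whitespace-separated words.

    This is an assumption for illustration; a real agent would use the
    tokenizer that matches its model.
    """
    return len(text.split())


def sliding_window(
    messages: List[str],
    budget: int,
    count: Callable[[str], int] = approx_tokens,
) -> List[str]:
    """Return the most recent messages whose estimated size fits the budget."""
    kept: List[str] = []
    used = 0
    # Walk from newest to oldest and stop at the first message that does not fit.
    for msg in reversed(messages):
        cost = count(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    # Restore chronological order before returning.
    kept.reverse()
    return kept


history = [
    "system: you are a helpful agent",
    "user: summarize the report",
    "agent: the report covers three topics",
    "user: expand on the second topic",
]
window = sliding_window(history, budget=12)
print(window)
```

With a budget of 12 estimated tokens, only the last two messages are kept. Note that this simple version also drops the system message once it falls outside the window; a practical variant might pin such messages or replace the dropped portion with a summary, which combines the sliding-window and summarization strategies.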