Agents

Context Window

1 min read

Quick Definition

The maximum amount of text (measured in tokens) that an LLM can process in a single interaction.

Context window limits constrain how much information an agent can consider at once. Larger windows enable more complex tasks but increase cost and latency.
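In practice, an agent must keep its accumulated history under the window limit before each call. A minimal sketch of budget-based trimming, using an illustrative whitespace "tokenizer" in place of a model's real one (all names here are hypothetical, not any library's API):

```python
def count_tokens(text: str) -> int:
    # Placeholder: real agents use the model's own tokenizer.
    return len(text.split())

def fit_to_window(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent messages that fit within the token budget."""
    kept, total = [], 0
    # Walk from newest to oldest, keeping whatever still fits.
    for msg in reversed(messages):
        cost = count_tokens(msg)
        if total + cost > max_tokens:
            break
        kept.append(msg)
        total += cost
    return list(reversed(kept))

history = [
    "system: you are helpful",
    "user: hi",
    "assistant: hello",
    "user: summarize this report",
]
print(fit_to_window(history, 8))
```

Dropping the oldest messages first preserves recency, but discards context unconditionally; the strategies below trade that simplicity for better retention.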

Current Limits

  • GPT-4: 8K-128K tokens
  • Claude: 100K-200K tokens
  • Gemini: Up to 1M tokens

Strategies for Working Within Limits

  • Summarization
  • Retrieval augmentation
  • Sliding windows
  • Hierarchical processing
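Of these, sliding windows are the simplest to illustrate: split a long token sequence into fixed-size chunks that overlap, so each chunk fits the window while context carries across chunk boundaries. A minimal sketch (function name and parameters are illustrative):

```python
def sliding_windows(tokens: list[str], window: int, overlap: int) -> list[list[str]]:
    """Yield overlapping fixed-size chunks of a token sequence."""
    step = window - overlap  # how far each chunk advances
    stop = max(len(tokens) - overlap, 1)
    return [tokens[i:i + window] for i in range(0, stop, step)]

tokens = [f"t{i}" for i in range(10)]
for chunk in sliding_windows(tokens, window=4, overlap=2):
    print(chunk)
```

The overlap lets each chunk see the tail of the previous one, at the cost of processing some tokens more than once.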
agents · limitations · architecture