Evaluation

Throughput

Definition

The number of requests or tasks an agent system can process per unit time.

Throughput determines system capacity and scaling requirements: the higher the throughput of each instance, the more users the same infrastructure can serve.
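
To make the link between throughput and scaling concrete, here is a minimal sketch of a capacity estimate. The function name `instances_needed`, the `headroom` parameter, and the example numbers are illustrative assumptions, not part of any specific system.

```python
import math


def instances_needed(peak_rps: float, per_instance_rps: float,
                     headroom: float = 0.25) -> int:
    """Estimate how many instances are needed to serve a peak load.

    Assumption: each instance sustains `per_instance_rps` and we only
    want to use a (1 - headroom) fraction of it, keeping the rest spare.
    """
    if per_instance_rps <= 0:
        raise ValueError("per_instance_rps must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable_rps = per_instance_rps * (1 - headroom)
    return math.ceil(peak_rps / usable_rps)


# Hypothetical numbers: 100 RPS peak, 8 RPS per instance, 25% headroom.
print(instances_needed(100, 8))   # 17
# Doubling per-instance throughput roughly halves the fleet.
print(instances_needed(100, 16))  # 9
```

Under these assumed numbers, doubling per-instance throughput cuts the estimate from 17 instances to 9, which is the sense in which higher throughput serves more users on the same infrastructure.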

Measurement

  • Requests per second (RPS)
  • Tasks completed per hour
  • Tokens processed per minute
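
The three metrics above can be derived from the same observation window. The sketch below assumes a hypothetical input: a list holding the token count of each request completed during the window. It also treats one request as one task, which may not hold for multi-step agents.

```python
def throughput_metrics(tokens_per_request: list[int],
                       window_seconds: float) -> dict[str, float]:
    """Compute throughput over one measurement window.

    Assumptions: every list entry is one completed request, and each
    request counts as one task.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    completed = len(tokens_per_request)
    total_tokens = sum(tokens_per_request)
    return {
        "requests_per_second": completed / window_seconds,
        "tasks_per_hour": completed / window_seconds * 3600,
        "tokens_per_minute": total_tokens / window_seconds * 60,
    }


# Hypothetical window: 30 requests of 100 tokens each, finished in 60 s.
metrics = throughput_metrics([100] * 30, window_seconds=60)
print(metrics)
```

Measuring over completed work rather than submitted work keeps queued or failed requests from inflating the figures.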

Factors

  • Model size and hardware
  • Batching efficiency
  • Queue management
  • Rate limiting
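
As a rough illustration of how batching and rate limiting interact, the sketch below uses a toy cost model that is an assumption, not a measured result: each batch pays a fixed overhead plus a constant per-item cost, and an optional rate limit caps the outcome. All names and numbers are hypothetical.

```python
from typing import Optional


def effective_rps(batch_size: int,
                  overhead_s: float,
                  per_item_s: float,
                  rate_limit_rps: Optional[float] = None) -> float:
    """Requests per second under a toy batching model.

    Assumed model: one batch takes overhead_s + per_item_s * batch_size
    seconds, so throughput is batch_size divided by that latency.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch_latency_s = overhead_s + per_item_s * batch_size
    rps = batch_size / batch_latency_s
    # A rate limit, if present, is a hard ceiling on throughput.
    if rate_limit_rps is not None:
        rps = min(rps, rate_limit_rps)
    return rps


# Hypothetical costs: 0.5 s fixed overhead, 0.25 s per item.
for size in (1, 2, 6):
    print(size, effective_rps(size, overhead_s=0.5, per_item_s=0.25))
# With a 2.5 RPS rate limit, larger batches stop helping.
print(effective_rps(6, overhead_s=0.5, per_item_s=0.25, rate_limit_rps=2.5))
```

In this model, larger batches spread the fixed overhead across more requests, so throughput rises toward 1 / per_item_s (4 RPS here) but never exceeds it, and a rate limit below that ceiling becomes the binding factor.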