Protocol | Production Ready | MCP

openinference

Name: openinference
Rating: 3.1 (908 reviews)
Author: Arize-ai

OpenTelemetry instrumentation for LLMs and agent observability

Python

Updated Apr 4, 2026

908 stars, 208 forks

Summary

Instrument AI applications with OpenTelemetry to capture request traces, metrics, and logs specific to LLM and agent workflows. Provides reusable instrumentation for popular agent and retrieval frameworks so observability is embedded at the SDK level. Includes semantic attributes and spans tailored to LLM calls, embeddings, and retrievals to make agent interactions queryable and linkable across services. See the Human-in-the-Loop pattern for governance considerations and the Agent Registry Pattern to discover and organize tools.
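To make the idea of LLM-specific spans concrete, here is a minimal, stdlib-only sketch of the span tree a retrieval-augmented call might produce. It does not use the OpenTelemetry SDK or any openinference package: the `Span` class, the `start_span` helper, and the `answer` function are hypothetical stand-ins. The attribute keys (`openinference.span.kind`, `llm.model_name`, `input.value`, `output.value`) are assumed to match the OpenInference semantic conventions and should be checked against the project's specification.

```python
import contextlib
import secrets
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Span:
    """Hypothetical stand-in for an OpenTelemetry span (not the real SDK class)."""
    name: str
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    attributes: Dict[str, object] = field(default_factory=dict)
    start_ns: int = 0
    end_ns: int = 0


FINISHED: List[Span] = []  # stands in for an exporter
_OPEN: List[Span] = []     # currently open spans, innermost last


@contextlib.contextmanager
def start_span(name, attributes=None):
    """Open a child of the innermost open span, or start a new trace root."""
    parent = _OPEN[-1] if _OPEN else None
    span = Span(
        name=name,
        trace_id=parent.trace_id if parent else secrets.token_hex(16),
        span_id=secrets.token_hex(8),
        parent_id=parent.span_id if parent else None,
        attributes=dict(attributes or {}),
        start_ns=time.time_ns(),
    )
    _OPEN.append(span)
    try:
        yield span
    finally:
        _OPEN.pop()
        span.end_ns = time.time_ns()
        FINISHED.append(span)


def answer(question):
    """Toy retrieval-augmented call: a chain span wrapping a retriever span and an LLM span."""
    with start_span("answer", {"openinference.span.kind": "CHAIN", "input.value": question}) as root:
        with start_span("retrieve", {"openinference.span.kind": "RETRIEVER", "input.value": question}) as ret:
            docs = ["doc-1", "doc-2"]  # placeholder for a real vector-store lookup
            ret.attributes["output.value"] = ", ".join(docs)
        with start_span("generate", {"openinference.span.kind": "LLM", "llm.model_name": "example-model"}) as llm:
            reply = f"Answer based on {len(docs)} documents"  # placeholder for a model call
            llm.attributes["output.value"] = reply
        root.attributes["output.value"] = reply
    return reply
```

After calling `answer(...)`, every span in `FINISHED` shares one `trace_id`, and filtering on the span-kind attribute picks out only the LLM or retriever steps. This is the sense in which such spans are queryable and linkable.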

Key Benefits

As agents coordinate and delegate, seeing who did what and why becomes essential for trust and debugging. OpenInference brings standardized telemetry to AI pipelines so you can correlate agent decisions, model behavior, and infrastructure signals. That makes it possible to surface agent failure modes, establish an agent track record, and feed continuous evaluation and governance pipelines. For resilience considerations, see Cascading Reliability Failures.

Ideal For

Teams instrumenting multi-agent or LLM-based systems that need structured traces, metrics, and logs for monitoring and post-hoc evaluation. Consider adopting the Agent Service Mesh Pattern to manage complex tool interactions and security.

Real-World Examples

Trace and correlate LLM calls across microservices to diagnose agent failures
Emit standardized metrics and spans for agent delegation and decision points
Capture retrieval and embedding events for debugging retrieval-augmented agents
Feed structured telemetry into continuous agent evaluation and governance pipelines
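The first example above, correlating LLM calls across microservices, depends on each service passing its trace context to the next. The sketch below assumes the W3C Trace Context `traceparent` header format (`version-traceid-parentid-flags`) and uses only the standard library. The helper names are hypothetical, and a real deployment would normally rely on the OpenTelemetry propagators rather than hand-rolled parsing.

```python
import re
import secrets

# Version 00 header: 32 hex chars of trace id, 16 of parent span id, 2 of flags.
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(sampled=True):
    """Start a new trace, as the first service in a request chain would."""
    flags = "01" if sampled else "00"
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-{flags}"


def parse_traceparent(header):
    """Return the header's fields as a dict, or None if it is malformed."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # All-zero ids are defined as invalid by the Trace Context format.
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return {"trace_id": trace_id, "parent_id": parent_id, "flags": flags}


def child_traceparent(incoming):
    """Build the header for an outgoing call: same trace id, fresh span id."""
    ctx = parse_traceparent(incoming) if incoming else None
    if ctx is None:
        return new_traceparent()  # no usable context, so start a new trace
    return f"00-{ctx['trace_id']}-{secrets.token_hex(8)}-{ctx['flags']}"
```

A service that receives `traceparent` on an inbound request would call `child_traceparent` before invoking a downstream model gateway or tool, so spans from both services share one trace id and can be joined in the tracing backend.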
