MARTI
by TsinghuaC3I
RL-driven framework for training and running LLM multi-agent systems
Overview
MARTI implements a framework for training and running LLM-based multi-agent systems with reinforcement learning and coordinated inference. It combines multi-agent interaction loops with policy optimization so agents learn to collaborate, delegate, and improve across episodes of simulated tasks. Notable features include support for heterogeneous LLM backends and reinforced training workflows tailored to emergent multi-agent behaviors, aligning with the Orchestrator-Worker Pattern and the Reflection Pattern.
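To make the episode-and-reward loop concrete, here is a minimal sketch of one simulated episode: an orchestrator delegates subtasks to workers, merges their answers, and records a scalar reward that a policy optimizer would consume. Every name here (`Agent`, `run_episode`, the `echo` backend) is an illustrative stand-in, not MARTI's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Agent:
    name: str
    backend: Callable[[str], str]            # any LLM backend: prompt -> completion
    track_record: List[float] = field(default_factory=list)

def run_episode(orchestrator: Agent, workers: List[Agent],
                task: str, reward_fn: Callable[[str], float]) -> float:
    """One simulated episode: delegate subtasks, merge answers, score the team."""
    plan = orchestrator.backend(f"Split into {len(workers)} subtasks: {task}")
    subtasks = plan.splitlines() or [task]   # fall back to the raw task
    answers = [w.backend(s) for w, s in zip(workers, subtasks)]
    merged = orchestrator.backend("Combine: " + " | ".join(answers))
    reward = reward_fn(merged)
    for agent in (orchestrator, *workers):
        agent.track_record.append(reward)    # reward signal for policy optimization
    return reward

# Dummy backend so the sketch runs without a real model.
def echo(prompt: str) -> str:
    return prompt.upper()

team = [Agent("orchestrator", echo), Agent("w1", echo), Agent("w2", echo)]
r = run_episode(team[0], team[1:], "summarize the report",
                reward_fn=lambda ans: float("SUMMARIZE" in ans))
print(f"episode reward: {r:.1f}")            # episode reward: 1.0
```

In a real training run, the rewards accumulated in `track_record` would feed a policy-gradient update (e.g. PPO) over each agent's backend rather than simply being logged.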
The Value Proposition
When to Use
Best suited to researchers and engineers experimenting with multi-agent reinforcement learning who want to study agent cooperation, delegation, and failure modes on simulated benchmarks. Consider applying the Dynamic Task Routing Pattern to design robust task flows.
Use Cases
- When you need to train agent policies that learn to delegate and coordinate under reward signals
- When you need to simulate multi-agent interaction episodes to reveal failure modes and emergent behaviors
- When you need to compare how different LLM backends affect team-level performance and agent track records (see the sketch after this list)
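As a sketch of the backend-comparison use case, the hypothetical harness below reuses `Agent` and `run_episode` from the earlier sketch to run the same task set against two backends and report team-level mean reward. The two dummy backends stand in for real heterogeneous LLM backends; none of these names come from MARTI itself.

```python
from statistics import mean

def evaluate(backend, tasks, reward_fn):
    """Team-level mean reward for one backend over a fixed task set."""
    team = [Agent("orchestrator", backend), Agent("w1", backend), Agent("w2", backend)]
    return mean(run_episode(team[0], team[1:], t, reward_fn) for t in tasks)

tasks = ["summarize the report", "draft a reply"]
for label, backend in {"upper-echo": str.upper, "lower-echo": str.lower}.items():
    avg = evaluate(backend, tasks, reward_fn=lambda ans: float("REPORT" in ans))
    print(f"{label}: mean team reward {avg:.2f}")   # the two dummies diverge
```

Holding tasks and reward function fixed while swapping only the backend isolates the backend's contribution to team performance, which is the comparison this use case calls for.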