At a Glance

Keeping a single, structured internal memory that’s updated each turn prevents agents from drifting, reduces repeated context, and improves multi-step reliability.

Key Findings

Agents that replay full conversation history or inject retrieved text tend to accumulate noise and repeat early mistakes, causing constraint drift and hallucinations. Replacing transcript replay with a bounded, schema-driven internal state (a single Compressed Cognitive State) stabilizes behavior across long interactions. Separating retrieval (which proposes evidence) from state commitment (which decides what actually persists) keeps the agent focused on decision-critical facts while external evidence grows separately. In live multi-turn tests across operational domains, this approach kept memory footprints small and produced fewer memory-driven errors than transcript replay or retrieval-based agents.

By the Numbers

1. Retrieval was restricted to 3 artifacts per turn to limit retrieval-driven drift.
2. The memory controller commits exactly 1 persistent Compressed Cognitive State instead of appending full transcripts.
3. Evaluation ran across 4 operational domains (IT operations, cybersecurity response, healthcare operations, finance) and showed consistently lower drift and hallucination rates with the compressed state approach.

What This Means

Engineers building agents for multi-step workflows (IT ops, incident response, healthcare operations, finance) stand to benefit most, because in these settings preserving constraints and verified entities matters more than raw context. Technical product leads and reliability engineers should adopt memory governance as a first-class feature to reduce repeated errors and make agent behavior auditable.

Key Figures

Figure 1: Agent architecture incorporating the Agent Cognitive Compressor (ACC). ACC operates as a cognitive memory controller that constructs and commits a bounded Compressed Cognitive State (CCS) via a schema-constrained Cognitive Compressor Model (CCM). CCS is the sole persistent internal state maintained across turns and conditions downstream reasoning, tool use, and action, while ACC remains decoupled from policy execution and environment interaction.
Figure 2: ACC state commitment mechanism for producing the next Compressed Cognitive State $\mathrm{CCS}_{t}$ under the schema constraint $\mathcal{S}_{\mathrm{CCS}}$, using the current interaction $x_{t}$, the previously committed state $\mathrm{CCS}_{t-1}$, and the qualified recalled set $A_{t}^{+}$.
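Read as an update rule, the Figure 2 caption can be written compactly. This is only a notational sketch: the function symbol $f_{\mathrm{CCM}}$ and the reading of the schema constraint as set membership are assumptions made here for illustration; the caption itself names only the three inputs and the constraint.

```latex
\mathrm{CCS}_{t} = f_{\mathrm{CCM}}\left(x_{t},\ \mathrm{CCS}_{t-1},\ A_{t}^{+}\right),
\qquad \mathrm{CCS}_{t} \in \mathcal{S}_{\mathrm{CCS}}
```

Under this reading, the new state depends on the transcript only through $\mathrm{CCS}_{t-1}$, which is consistent with the claim that CCS is the sole persistent internal state.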
Figure 4(a): ReAct multi-turn loop with ACC committing $\mathrm{CCS}_{t}$ prior to REASON, and a multi-turn return from ACT to ACC.
Figure 5(a): Per-turn memory tokens across domains.


Yes, But...

Results come from an agent-judge-driven live evaluation (automated judges), so targeted human audits are still needed to validate real-world impact. The compressed state relies on a schema that must be designed per task; poorly chosen schemas can omit important details. Limiting retrieval to a small set of artifacts reduces noise but risks missing rare, relevant evidence unless retrieval and schema tuning are well matched.

The Details

Long multi-turn workflows fail less because agents lack knowledge and more because their memory is uncontrolled. Appending full transcripts grows context roughly linearly with interaction length, amplifying noise and making early mistakes persistent. Retrieval-based approaches bound prompt size but can surface stale or irrelevant artifacts that perturb goals and constraints. The core idea is to replace accumulated text with a bounded, structured internal state that captures only decision-critical variables (goals, constraints, confirmed entities, and progress).

The proposed Agent Cognitive Compressor (ACC) sits between transient interactions, an external artifact store, and the reasoning engine. At each turn it recalls a small set of candidate artifacts, then uses a schema-constrained compressor model to commit a single Compressed Cognitive State (CCS). The design separates artifact recall from state commitment so that only validated, schema-compliant facts persist. The CCS then conditions subsequent reasoning and tool use.

In evaluation, a judge-driven live framework compared three agents (transcript replay, retrieval-based, and the compressor-enabled agent) across four operational domains. The compressor agent kept its memory footprint bounded, preserved constraints better, and produced fewer memory-driven hallucinations and less drift, suggesting that memory governance is a practical path to more reliable multi-turn agents.

Implications: make memory a first-class engineering concern by designing small, auditable state schemas and using a lightweight compressor model to update state each turn. Next steps include human audits, learned or adaptive schemas, specialized small compressor models to reduce cost, and exploring how state synchronization works across multiple agents.
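The recall-then-commit loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the `CCS` field names mirror the four decision-critical groups named in the text, but the per-field bound `MAX_ITEMS`, the word-overlap ranking in `recall`, and the prefix-based `propose` function (a rule-based stand-in for the compressor model) are all hypothetical.

```python
from dataclasses import dataclass, fields, replace

MAX_RECALL = 3  # per-turn retrieval cap reported in the summary
MAX_ITEMS = 8   # hypothetical per-field bound that keeps the state small


@dataclass(frozen=True)
class CCS:
    """Hypothetical schema: goals, constraints, confirmed entities, progress."""
    goals: tuple[str, ...] = ()
    constraints: tuple[str, ...] = ()
    entities: tuple[str, ...] = ()
    progress: tuple[str, ...] = ()


SCHEMA_FIELDS = {f.name for f in fields(CCS)}
# Assumed tagging convention for this sketch: "tag: value" lines.
TAGS = {"goal": "goals", "constraint": "constraints",
        "entity": "entities", "done": "progress"}


def recall(store: list[str], query: str, k: int = MAX_RECALL) -> list[str]:
    """Propose at most k artifacts, ranked by naive word overlap (an assumption)."""
    words = set(query.lower().split())
    scored = [(len(words & set(a.lower().split())), a) for a in store]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
    return [a for _, a in ranked[:k]]


def propose(texts: list[str]) -> dict[str, list[str]]:
    """Rule-based stand-in for the compressor model: keep only tagged lines."""
    proposals: dict[str, list[str]] = {}
    for text in texts:
        tag, sep, value = text.partition(":")
        name = TAGS.get(tag.strip().lower())
        if sep and name:
            proposals.setdefault(name, []).append(value.strip())
    return proposals


def commit(prev: CCS, proposals: dict[str, list[str]]) -> CCS:
    """Commit one new state: drop off-schema fields, deduplicate, bound each field."""
    updates = {}
    for name, values in proposals.items():
        if name not in SCHEMA_FIELDS:
            continue  # anything outside the schema never persists
        merged = list(dict.fromkeys([*getattr(prev, name), *values]))
        updates[name] = tuple(merged[-MAX_ITEMS:])  # keep the most recent items
    return replace(prev, **updates)


def acc_step(turn: str, prev: CCS, store: list[str]) -> CCS:
    """One turn: recall proposes evidence, commit decides what persists."""
    return commit(prev, propose([turn, *recall(store, turn)]))
```

Calling `acc_step` once per turn and passing the returned state into the next call keeps exactly one persistent state object, while the artifact store can grow independently. In a real system the `propose` step would be a model call whose output is validated against the schema before `commit`.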
Credibility Assessment:

Single author with modest h-index (7) and no strong affiliation or venue — limited credibility.