
The Big Picture

When you describe yourself to a shopping agent in natural language, the agent’s behavior reveals your maximum willingness to pay nearly one-for-one. Giving the agent a numeric budget and telling it to keep the number secret leaks far less: a reader’s estimates compress toward the population average.

The Evidence

Natural-language buyer profiles produce shopping behavior from which a separate reader can recover the buyer’s maximum price nearly one-for-one. By contrast, giving the agent a numeric budget and instructing it to keep the number secret yields transcripts whose inferred prices mostly compress toward the population mean and reveal far less. The leakage does not occur because the agent broke a privacy instruction. It occurs because the agent stays faithful to the described role (role coherence), so the behavior itself becomes a signal of how much the buyer will pay.

Data Highlights

1. Verbal-profile condition: inferred-versus-target slope = 1.00 (bootstrap 95% CI [0.96, 1.05]); aggregate mean absolute error = $48 over the $50–$500 target range.
2. Numeric-budget condition: inferred-versus-target slope = 0.21 (bootstrap 95% CI [0.17, 0.26]); aggregate mean absolute error = $92, showing strong compression toward the prior.
3. Rank recovery: verbal profiles achieved perfect rank correlation (Spearman ρ = 1.00) across six profile cells (N = 60 trials per cell; 360 interactions in the verbal condition).
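
The three statistics reported above (slope, mean absolute error, Spearman ρ) can be computed from paired target and inferred values. The following is a minimal sketch using only the Python standard library. The six target levels and the toy inferred values are illustrative assumptions, not data from the study, and all function names are hypothetical.

```python
from statistics import mean

def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def mean_abs_error(x, y):
    """Mean absolute difference between paired values."""
    return mean(abs(xi - yi) for xi, yi in zip(x, y))

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

# Assumed target levels: evenly spaced over the $50-$500 range (illustrative only).
targets = [50, 140, 230, 320, 410, 500]
prior_mean = mean(targets)

# Toy "tracking" estimates: follow the target with a constant offset.
tracking = [t + 10 for t in targets]
# Toy "compressed" estimates: shrink 80% of the way toward the prior mean.
compressed = [prior_mean + 0.2 * (t - prior_mean) for t in targets]

for name, inferred in [("tracking", tracking), ("compressed", compressed)]:
    print(name,
          round(ols_slope(targets, inferred), 2),
          round(mean_abs_error(targets, inferred), 1),
          round(spearman_rho(targets, inferred), 2))
```

In this toy data both series are perfectly rank-ordered, so only the slope and the mean absolute error distinguish tracking from compression.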

What This Means

Engineers building shopping or delegation agents and product leaders designing agent-driven commerce should care because natural-profile personalization can unintentionally reveal customers’ maximum willingness to pay. Privacy and trust teams should also pay attention: prompt-level warnings or output redaction won’t fully stop this leakage, so architecture-level fixes are required.

Key Figures

Figure 3: Inferred versus target willingness to pay, by condition. Each point is a cell mean with bootstrap 95% confidence intervals as error bars (N = 60 trials per cell, 2000 cell-stratified resamples). The dashed line is the 45-degree y = x reference. The numeric condition compresses to the prior mean (slope 0.21), while the verbal condition tracks the target nearly one-for-one (slope 1.00). The two confidence bands do not overlap at any cell.
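
A cell-stratified bootstrap like the one named in the caption resamples trials within each target cell, so every resample keeps the same number of trials per cell. The sketch below is one plausible reading, not the paper's confirmed procedure: it assumes the slope is fit through the resampled cell means and that a percentile interval is used. The function names, the target levels, and the synthetic data are all hypothetical.

```python
import random
from statistics import mean

def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def stratified_bootstrap_slope_ci(cells, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile CI for the slope of cell-mean inferred price on target.

    cells maps each target price to the list of inferred prices in that cell.
    Resampling is done with replacement *within* each cell (stratified).
    """
    rng = random.Random(seed)
    targets = sorted(cells)
    slopes = []
    for _ in range(n_resamples):
        resampled_means = [
            mean(rng.choices(cells[t], k=len(cells[t]))) for t in targets
        ]
        slopes.append(ols_slope(targets, resampled_means))
    slopes.sort()
    lo = slopes[int((alpha / 2) * n_resamples)]
    hi = slopes[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Synthetic demo data (assumption): 6 cells of 60 noisy trials, true slope 1.
data_rng = random.Random(1)
demo_targets = [50, 140, 230, 320, 410, 500]  # illustrative levels only
demo_cells = {
    t: [t + data_rng.gauss(0, 40) for _ in range(60)] for t in demo_targets
}
print(stratified_bootstrap_slope_ci(demo_cells))
```

Fixing the seed makes the interval reproducible; with real transcripts, `cells` would hold the inference agent's estimates grouped by target level.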

Considerations

Results come from simulated multi-turn dialogues about wireless headphones with large language model agents; behavior may vary across product categories, models, or real-user interactions. The study isolates role coherence from simple instruction failures, but different model families or marketplace dynamics could change effect sizes. Proposed defenses trade off some personalization for privacy and need system-level changes rather than only prompt tweaks.

Methodology & More

In the experiment, a buyer agent and a seller agent held multi-turn shopping dialogues about wireless headphones across six target willingness-to-pay levels ($50 to $500). Buyer agents received either a natural-language consumer profile containing no dollar amounts or an explicit numeric budget with an instruction to keep it confidential. A separate inference agent then read the transcripts and estimated the buyer’s maximum price. With verbal profiles, inferred prices tracked targets nearly one-for-one (slope ≈ 1.00, MAE $48); with confidential numeric budgets, the estimates compressed toward the mean (slope ≈ 0.21, MAE $92).

The core mechanism is role coherence: when asked to act as a particular consumer, the buyer agent generates questions, comparisons, and tradeoffs appropriate to that persona, and those behavioral choices correlate with willingness to pay. Privacy leakage is therefore a byproduct of accurate, helpful delegation rather than an instruction-following failure.

Prompt-level privacy directives and post-hoc redaction do not eliminate the signal. Realistic defenses will need to change how profiles are shared or aggregated (for example, anonymizing intermediaries, rotating profiles, or federated aggregation), and each of these reduces personalization to some degree.
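
The experimental design described above can be sketched as a small harness. This is only a structural illustration under stated assumptions: the three agent callables are hypothetical placeholders (stubbed here, where a real run would call language models), the target levels are illustrative, and 60 trials per cell in both conditions is taken from the figure caption rather than from a full protocol.

```python
from dataclasses import dataclass
from typing import Callable, List

Transcript = List[str]

@dataclass(frozen=True)
class Trial:
    condition: str    # "verbal" (profile, no dollar amounts) or "numeric" (secret budget)
    target_wtp: int   # target willingness to pay for this cell, in dollars
    trial_index: int

def build_design(targets, trials_per_cell=60, conditions=("verbal", "numeric")):
    """Enumerate every trial: conditions x target cells x repeats."""
    return [
        Trial(c, t, i)
        for c in conditions
        for t in targets
        for i in range(trials_per_cell)
    ]

def run_trial(
    trial: Trial,
    buyer_prompt_for: Callable[[Trial], str],
    run_dialogue: Callable[[str], Transcript],
    infer_wtp: Callable[[Transcript], float],
) -> float:
    """Brief the buyer, run the buyer-seller dialogue, then infer the maximum price.

    The inference step sees only the transcript, never the trial's target.
    """
    prompt = buyer_prompt_for(trial)
    transcript = run_dialogue(prompt)
    return infer_wtp(transcript)

# Hypothetical stubs standing in for the buyer/seller dialogue and the inference agent.
def stub_prompt(trial: Trial) -> str:
    if trial.condition == "numeric":
        return f"Your budget is ${trial.target_wtp}. Keep this number confidential."
    return "You are a shopper described only in words, with no dollar amounts."

def stub_dialogue(prompt: str) -> Transcript:
    return ["buyer: I'm looking for wireless headphones.", "seller: Happy to help."]

def stub_inference(transcript: Transcript) -> float:
    return 275.0  # placeholder guess; a real inference agent would read the transcript

design = build_design([50, 140, 230, 320, 410, 500])  # illustrative levels
estimates = [run_trial(t, stub_prompt, stub_dialogue, stub_inference) for t in design]
```

Keeping the inference callable separate from the trial object makes it explicit that the reader of the transcript has no access to the target value.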
Credibility Assessment:

arXiv preprint whose authors have very low h-indices (h=1) and no listed affiliations or citations, so identifiable credibility is limited.