Delegated Influence

A competitive multi-agent benchmark for LLM persuasion: the only way to score is to get other agents to spend their scarce actions on you.

287958d · generated 2026-07-03 · 40 episodes · private draft — not for citation

Experiment

Five principals

Experiment 3, five principals: the 5 models we care about, all playing each other. The setup is small and cheap, so it can be run many times for tight error bars (Q1 persuasion, Q3 leaderboard, Q6 strategies). Expectation: at 30 reps, the capture differences between the five principals should exceed their error bars.
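
One way to make the "differences exceed their error bars" expectation concrete is sketched below. It assumes each principal ends up with one numeric score per episode, and it treats the error bar as an approximate 95% interval around the mean (1.96 standard errors). The function names `mean_and_sem` and `separated` are hypothetical and not part of the benchmark code, and the criterion used (non-overlapping intervals) is a conservative illustration rather than the analysis the benchmark will necessarily use.

```python
import math
from statistics import mean, stdev


def mean_and_sem(scores):
    """Return (mean, standard error of the mean) for per-episode scores."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two episodes to estimate a standard error")
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    return mean(scores), stdev(scores) / math.sqrt(n)


def separated(scores_a, scores_b, z=1.96):
    """True if the approximate 95% intervals of the two means do not overlap.

    Assumption: the error bar is z standard errors on each side of the mean.
    """
    mean_a, sem_a = mean_and_sem(scores_a)
    mean_b, sem_b = mean_and_sem(scores_b)
    return abs(mean_a - mean_b) > z * (sem_a + sem_b)
```

Since the standard error shrinks as 1/sqrt(n), going from 10 to 30 reps narrows each error bar by a factor of about 1.7, provided the per-episode spread stays the same.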

status: planned
coverage: 0 / 30 episodes
conditions: pure economy; messages on
questions: q1 · q3 · q6
config: configs/03_five_principals.yaml

Planned: 30 episodes from 03_five_principals.yaml.

Episodes

No episodes yet — launch with:

uv run python -m delegated_influence.run configs/03_five_principals.yaml