Question 8
Does restricted communication make the task more strategically interesting?
Open messaging is where the strategy lives: more relay chains (2.6 vs 0.3 per episode, both near chance), bigger coalitions (2.8 vs 1.8 of 5 agents), and more concentrated scores (top scorer takes 44% vs 30% of pulls). Third-party credit runs the other way (0.07% vs 4.1% of pulls). All figures are open messaging vs ring, and the two pools are uncontrolled, so none of this isolates a topology effect.
Four measures of how rich play is, each as a pair of bars: everyone able to message everyone (oxblood) vs messages passing only around a ring (grey); dashed marks on the chains pair show the shuffled-order chance level; error bars are 95% CIs where they exist.
Evidence links
Every mark is backed by a transcript: marks deep-link to the episode behind them, so clicking a mark opens the transcript reader at that episode’s event. The same episodes are listed in the table below.
Reading
"Interesting" operationalized as a four-way richness panel: does restricting who can talk change what strategies exist — relay chains, sustained coalitions, brokered pulls, score concentration — not just who wins?
Both topologies have measured episodes (complete n = 28, ring n = 12).
Chains: 2.57 per episode on the complete graph (95% CI [1.36, 4.18]) against a shuffled-order null of 2.96, and 0.33 (CI [0.00, 0.67]) against a null of 0.28 on the ring. In both pools the observed rate sits at its null.
Coalitions: 2.8 of 5 seats sit in a sustained mutual-pulling pair on the complete graph vs 1.8 on the ring (CIs [2.32, 3.25] and [1.33, 2.33]). The separation is suggestive, but the intervals still overlap, barely (by 0.01).
Brokered pulls: rare everywhere, yet rarer with full talk. 0.07% of directed pulls carried verified third-party credit on the complete graph vs 4.13% on the ring (no CI; rare events).
Score concentration: the top scorer took 44.3% of all pulls awarded on the complete graph vs 29.5% on the ring (CIs [38.3%, 50.4%] and [27.4%, 32.1%]); those intervals do not overlap.
Net capture: pooled net capture stays zero by construction in each pool (95% CI complete [-0.013, +0.013]; ring [-0.016, +0.016]).
Read together: full talk concentrates play (more coalition seats, a bigger winner's share), while the ring pushes credit through brokers. But the two pools differ in more than topology (personas, credit settings, model mixes), so none of this is read as a topology effect. Next: attack_ring vs attack_complete (375 episodes each), the controlled contrast.
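The interval comparisons in the reading can be checked mechanically. The sketch below is illustrative only: the helper names (`intervals_overlap`, `overlap_width`, `contains`) are hypothetical and not part of the analysis pipeline, and the numbers are copied from the statistics on this page.

```python
from typing import Tuple

Interval = Tuple[float, float]


def intervals_overlap(a: Interval, b: Interval) -> bool:
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def overlap_width(a: Interval, b: Interval) -> float:
    """Length of the shared region, or 0.0 when the intervals are disjoint."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def contains(interval: Interval, x: float) -> bool:
    """True if x lies inside the closed interval."""
    return interval[0] <= x <= interval[1]


# 95% CIs as (complete, ring), copied from the panel statistics.
coalition = ((2.32, 3.25), (1.33, 2.33))
top1_share = ((0.383, 0.504), (0.274, 0.321))

# Observed chain CIs and their shuffled-order null values.
chains_ci = {"complete": (1.36, 4.18), "ring": (0.0, 0.667)}
chain_null = {"complete": 2.96, "ring": 0.279}

print("coalition CIs overlap:", intervals_overlap(*coalition),
      "width:", round(overlap_width(*coalition), 2))
print("top-1 share CIs overlap:", intervals_overlap(*top1_share))
for topology, ci in chains_ci.items():
    print(topology, "null inside chain CI:", contains(ci, chain_null[topology]))
```

Overlap of two 95% intervals is a conservative heuristic rather than a formal test of the difference, so it supports wording such as "suggestive" but does not replace the controlled contrast.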
Statistics
| statistic | value |
|---|---|
| measured | yes |
| needs | — |
| by_topology.complete.n_episodes | 28 |
| by_topology.complete.net_capture.value | -2.48e-19 |
| by_topology.complete.net_capture.ci | [-0.0129, 0.0129] |
| by_topology.complete.net_capture.n | 560 |
| by_topology.complete.cascades.value | 2.57 |
| by_topology.complete.cascades.ci | [1.36, 4.18] |
| by_topology.complete.cascades.n | 28 |
| by_topology.ring.n_episodes | 12 |
| by_topology.ring.net_capture.value | -4.39e-17 |
| by_topology.ring.net_capture.ci | [-0.0161, 0.016] |
| by_topology.ring.net_capture.n | 240 |
| by_topology.ring.cascades.value | 0.333 |
| by_topology.ring.cascades.ci | [0, 0.667] |
| by_topology.ring.cascades.n | 12 |
| panel.complete.n_episodes | 28 |
| panel.complete.chains.value | 2.57 |
| panel.complete.chains.ci | [1.36, 4.18] |
| panel.complete.chains.n | 28 |
| panel.complete.chain_null.value | 2.96 |
| panel.complete.chain_null.ci | — |
| panel.complete.chain_null.n | 28 |
| panel.complete.coalition_agents.value | 2.79 |
| panel.complete.coalition_agents.ci | [2.32, 3.25] |
| panel.complete.coalition_agents.n | 28 |
| panel.complete.brokered_pull_rate.value | 0.000714 |
| panel.complete.brokered_pull_rate.ci | — |
| panel.complete.brokered_pull_rate.n | 28 |
| panel.complete.top1_share.value | 0.443 |
| panel.complete.top1_share.ci | [0.383, 0.504] |
| panel.complete.top1_share.n | 28 |
| panel.ring.n_episodes | 12 |
| panel.ring.chains.value | 0.333 |
| panel.ring.chains.ci | [0, 0.667] |
| panel.ring.chains.n | 12 |
| panel.ring.chain_null.value | 0.279 |
| panel.ring.chain_null.ci | — |
| panel.ring.chain_null.n | 12 |
| panel.ring.coalition_agents.value | 1.83 |
| panel.ring.coalition_agents.ci | [1.33, 2.33] |
| panel.ring.coalition_agents.n | 12 |
| panel.ring.brokered_pull_rate.value | 0.0413 |
| panel.ring.brokered_pull_rate.ci | — |
| panel.ring.brokered_pull_rate.n | 12 |
| panel.ring.top1_share.value | 0.295 |
| panel.ring.top1_share.ci | [0.274, 0.321] |
| panel.ring.top1_share.n | 12 |
summary.questions.q8 rendered verbatim; missing values shown as —.
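For readers who want to work with these statistics programmatically, the dotted keys can be folded back into a nested mapping. This is a minimal sketch under assumptions: the function name `unflatten` and the choice to map the missing-value marker to `None` are illustrative, and the real structure of `summary.questions.q8` may differ. Values are kept as the strings shown on the page.

```python
from typing import Any, Dict, Iterable, Optional, Tuple

MISSING = "—"  # marker this page uses for missing values


def unflatten(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Any]:
    """Fold dotted keys such as 'panel.ring.chains.ci' into nested dicts."""
    root: Dict[str, Any] = {}
    for key, raw in pairs:
        *parents, leaf = key.split(".")
        node = root
        for part in parents:
            # Create intermediate levels on demand.
            node = node.setdefault(part, {})
        value: Optional[str] = None if raw == MISSING else raw
        node[leaf] = value
    return root


# A few rows copied from the statistics above.
sample = [
    ("measured", "yes"),
    ("needs", "—"),
    ("panel.ring.chains.value", "0.333"),
    ("panel.ring.chains.ci", "[0, 0.667]"),
    ("panel.ring.chain_null.value", "0.279"),
    ("panel.ring.chain_null.ci", "—"),
]

nested = unflatten(sample)
print(nested["panel"]["ring"]["chains"])
print(nested["panel"]["ring"]["chain_null"])
```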
Episodes
| episode | condition | focal model | capture (by focal) | cascades | gini |
|---|---|---|---|---|---|
| calibration--gemma-3-4b_r0 | complete/pure/msg-on | gemma-3-4b-it | 0.268 | 1 | 0.59 |
| calibration--gemma-3-4b_r1 | complete/pure/msg-on | gemma-3-4b-it | 0.154 | 0 | 0.494 |
| calibration--gemma-3-4b_r2 | complete/pure/msg-on | gemma-3-4b-it | 0.142 | 3 | 0.464 |
| calibration--gemma-3-4b_r3 | complete/pure/msg-on | gemma-3-4b-it | 0.0489 | 1 | 0.416 |
| calibration--gemma-3-4b_r4 | complete/pure/msg-on | gemma-3-4b-it | 0.00401 | 2 | 0.333 |
| calibration--gpt-5.4-mini_r0 | complete/pure/msg-on | gpt-5.4-mini | 0.0398 | 3 | 0.162 |
| calibration--gpt-5.4-mini_r1 | complete/pure/msg-on | gpt-5.4-mini | -0.0193 | 0 | 0.159 |
| calibration--gpt-5.4-mini_r2 | complete/pure/msg-on | gpt-5.4-mini | 0.0168 | 0 | 0.187 |
| calibration--gpt-5.4-mini_r3 | complete/pure/msg-on | gpt-5.4-mini | 0.0146 | 2 | 0.157 |
| calibration--gpt-5.4-mini_r4 | complete/pure/msg-on | gpt-5.4-mini | -0.00468 | 0 | 0.198 |
| calibration--opus-4.8_r0 | complete/pure/msg-on | opus-4.8 | 0.0931 | 1 | 0.426 |
| calibration--opus-4.8_r1 | complete/pure/msg-on | opus-4.8 | 0.0814 | 3 | 0.309 |
| calibration--opus-4.8_r2 | complete/pure/msg-on | opus-4.8 | 0.0407 | 0 | 0.329 |
| calibration--opus-4.8_r3 | complete/pure/msg-on | opus-4.8 | 0.118 | 0 | 0.327 |
| calibration--opus-4.8_r4 | complete/pure/msg-on | opus-4.8 | 0.18 | 0 | 0.473 |
| calibration--qwen3-235b-thinking_r0 | complete/pure/msg-on | qwen3-235b-thinking | 0.0378 | 0 | 0.347 |
| calibration--qwen3-235b-thinking_r1 | complete/pure/msg-on | qwen3-235b-thinking | 0.0392 | 2 | 0.366 |
| calibration--qwen3-235b-thinking_r2 | complete/pure/msg-on | qwen3-235b-thinking | -0.0135 | 2 | 0.307 |
| calibration--qwen3-235b-thinking_r3 | complete/pure/msg-on | qwen3-235b-thinking | 0.163 | 0 | 0.495 |
| calibration--qwen3-235b-thinking_r4 | complete/pure/msg-on | qwen3-235b-thinking | 0.00106 | 2 | 0.375 |
| calibration--sonnet-4.6_r0 | complete/pure/msg-on | sonnet-4.6 | 0.18 | 2 | 0.627 |
| calibration--sonnet-4.6_r1 | complete/pure/msg-on | sonnet-4.6 | 0.187 | 3 | 0.598 |
| calibration--sonnet-4.6_r2 | complete/pure/msg-on | sonnet-4.6 | 0.172 | 3 | 0.585 |
| calibration--sonnet-4.6_r3 | complete/pure/msg-on | sonnet-4.6 | 0.17 | 2 | 0.593 |
| calibration--sonnet-4.6_r4 | complete/pure/msg-on | sonnet-4.6 | 0.129 | 3 | 0.436 |
| credit_smoke--creditsmoke_s41 | complete/pure/msg-on | — | — | 11 | 0.161 |
| credit_smoke--creditsmoke_s42 | complete/pure/msg-on | — | — | 17 | 0.232 |
| credit_smoke--creditsmoke_s43 | complete/pure/msg-on | — | — | 9 | 0.129 |
| credit_smoke_ring--2026-06-27T14-43-44-00-00--ring_s41 | ring/pure/msg-on | — | — | 0 | 0.088 |
| credit_smoke_ring--2026-06-27T14-43-44-00-00--ring_s42 | ring/pure/msg-on | — | — | 1 | 0.164 |
| credit_smoke_ring--2026-06-27T14-43-44-00-00--ring_s43 | ring/pure/msg-on | — | — | 0 | 0.303 |
| credit_smoke_ring--2026-06-28T15-48-13-00-00--ring_s41 | ring/pure/msg-on | — | — | 0 | 0.219 |
| credit_smoke_ring--2026-06-28T15-48-13-00-00--ring_s42 | ring/pure/msg-on | — | — | 0 | 0.148 |
| credit_smoke_ring--2026-06-28T15-48-13-00-00--ring_s43 | ring/pure/msg-on | — | — | 0 | 0.0818 |
| credit_smoke_ring--2026-06-28T15-54-28-00-00--ring_s41 | ring/pure/msg-on | — | — | 1 | 0.158 |
| credit_smoke_ring--2026-06-28T15-54-28-00-00--ring_s42 | ring/pure/msg-on | — | — | 2 | 0.167 |
| credit_smoke_ring--2026-06-28T15-54-28-00-00--ring_s43 | ring/pure/msg-on | — | — | 0 | 0.191 |
| credit_smoke_ring--2026-06-28T15-59-57-00-00--ring_s41 | ring/pure/msg-on | — | — | 0 | 0.19 |
| credit_smoke_ring--2026-06-28T15-59-57-00-00--ring_s42 | ring/pure/msg-on | — | — | 0 | 0.263 |
| credit_smoke_ring--2026-06-28T15-59-57-00-00--ring_s43 | ring/pure/msg-on | — | — | 0 | 0.13 |
All episodes measured so far (40), sorted by condition then id; episode links open the transcript reader.
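The per-topology chain means in the statistics can be reproduced from the cascades column of this table. The sketch below also draws a percentile bootstrap interval; that method is an assumption made for illustration (the page does not state how its 95% CIs are computed), so the interval it prints is not expected to match the reported one exactly. The function name `bootstrap_ci` and its parameters are hypothetical.

```python
import random
from statistics import mean

# Cascades per episode, copied from the table in row order.
cascades = {
    "complete": [1, 0, 3, 1, 2,      # gemma-3-4b
                 3, 0, 0, 2, 0,      # gpt-5.4-mini
                 1, 3, 0, 0, 0,      # opus-4.8
                 0, 2, 2, 0, 2,      # qwen3-235b-thinking
                 2, 3, 3, 2, 3,      # sonnet-4.6
                 11, 17, 9],         # credit_smoke
    "ring": [0, 1, 0, 0, 0, 0, 1, 2, 0, 0, 0, 0],
}


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean (assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(values)
    # Resample episodes with replacement and record each resample's mean.
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_resamples))
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


for topology, counts in cascades.items():
    lo, hi = bootstrap_ci(counts)
    print(f"{topology}: n={len(counts)} mean={mean(counts):.3f} "
          f"bootstrap 95% CI=[{lo:.2f}, {hi:.2f}]")
```

In this table the three credit_smoke episodes account for 37 of the 72 complete-graph cascades, which is worth keeping in mind when reading the pooled mean.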