Q8 · Does restricted communication make the task more strategically interesting?

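Open messaging shows more of the measured strategy signals, though not all of them: relay chains are more frequent (2.6 vs 0.3 per episode, but each sits near its own chance level), sustained coalitions are bigger (2.8 vs 1.8 of 5 agents), and scores are more concentrated (the top scorer takes 44% vs 30% of pulls). Third-party credit runs the other way (0.07% vs 4.1% of pulls). The two pools are uncontrolled, so these figures describe the runs so far, not a topology effect.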

Four measures of how rich play is, each as a pair of bars: everyone able to message everyone (oxblood) vs messages passing only around a ring (grey); dashed marks on the chains pair show the shuffled-order chance level; error bars are 95% CIs where they exist.
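A shuffled-order chance level of the kind marked on the chains bars can be sketched as follows. This is a minimal illustration, not the pipeline's code: the page does not define a relay chain, so `count_relays` is a hypothetical stand-in (a message A->B immediately followed by B->C with C != A), and the event format, `shuffled_null`, and `n_shuffles` are likewise assumptions.

```python
import random

def count_relays(events):
    """Hypothetical chain counter (assumption, not the source's definition):
    count adjacent message pairs A->B then B->C where C is not A."""
    return sum(
        1
        for (s1, r1), (s2, r2) in zip(events, events[1:])
        if r1 == s2 and r2 != s1
    )

def shuffled_null(events, n_shuffles=1000, seed=0):
    """Average relay count over random reorderings of the same messages."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_shuffles):
        shuffled = events[:]       # keep the same messages, change only the order
        rng.shuffle(shuffled)
        total += count_relays(shuffled)
    return total / n_shuffles

# Toy message log as (sender, receiver) pairs; illustrative only.
events = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "D"), ("D", "A")]
observed = count_relays(events)
null = shuffled_null(events)
print(observed, null)
```

Comparing `observed` with `null` separates chains that depend on message order from chains that would appear from message volume alone, which is why a high chain count on a busy graph can still sit at chance.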

n = 28 episodes with open messaging + 12 on the ring. The pools are uncontrolled (personas, credit settings and model mixes differ); the controlled contrast is configs/04_attack_complete.yaml vs configs/06_attack_ring.yaml. Chains are shown beside their shuffled-order chance level (no CI). Third-party credit = share of pulls whose credit claim survives the message-history check (no CI). Sustained coalition = mutual pulling 3 rounds in a row. The top scorer's share counts self-pulls in its denominator. CIs are 95% episode-bootstrap intervals, 2000 trials.

Grouped bars compare the two topologies on four strategy signals: relay chains (with their shuffled-order null beside them), seats in sustained coalitions, pulls carrying verified broker credit, and the top scorer's share of all pulls. The pools differ in more than topology, so this describes the runs so far, not a topology effect.
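The two definitions above (sustained coalition, top scorer's share) can be sketched as below. This is an illustration under stated assumptions, not the pipeline's code: the per-round list of `(puller, target)` pairs is a hypothetical data format, the function names are invented for the example, and excluding self-pulls from the numerator of the share is a guess (the page only says the denominator counts them).

```python
from collections import Counter

def coalition_seats(rounds, min_rounds=3):
    """Count agents in at least one pair that pulled each other
    for `min_rounds` consecutive rounds."""
    streak = {}      # pair -> length of its current run of mutual rounds
    seated = set()
    for pulls in rounds:
        pulls = set(pulls)
        mutual = {frozenset((a, b)) for a, b in pulls
                  if a != b and (b, a) in pulls}
        # Pairs absent this round drop out, which resets their streak.
        streak = {pair: streak.get(pair, 0) + 1 for pair in mutual}
        for pair, run in streak.items():
            if run >= min_rounds:
                seated |= pair
    return len(seated)

def top1_share(rounds):
    """Most-pulled agent's share of all pulls."""
    received = Counter()
    total = 0
    for pulls in rounds:
        for puller, target in pulls:
            total += 1                    # self-pulls count in the denominator
            if puller != target:
                received[target] += 1     # assumption: numerator excludes self-pulls
    return max(received.values(), default=0) / total if total else 0.0

# Toy 5-agent episode, 3 rounds; illustrative only.
rounds = [
    [("A", "B"), ("B", "A"), ("C", "A"), ("D", "D"), ("E", "A")],
    [("A", "B"), ("B", "A"), ("C", "D"), ("D", "C"), ("E", "A")],
    [("A", "B"), ("B", "A"), ("C", "D"), ("D", "C"), ("E", "E")],
]
print(coalition_seats(rounds))  # 2: only A and B are mutual in all 3 rounds
print(top1_share(rounds))       # 0.4: A receives 6 of 15 pulls
```

In the toy episode C and D pull each other for only two rounds, so they do not count as seated under the 3-round rule.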

Evidence links

Every mark is backed by a transcript: marks deep-link to the episode behind them, so clicking a mark opens the transcript reader at that episode’s event. The same episodes are listed in the table below.

Reading

"Interesting" is operationalized as a four-way richness panel: does restricting who can talk change which strategies exist (relay chains, sustained coalitions, brokered pulls, score concentration), not just who wins?

Both topologies have episodes (complete n = 28, ring n = 12).

Chains: 2.57 per episode on the complete graph (95% CI [1.36, 4.18]) against a shuffled-order null of 2.96, and 0.33 (CI [0.00, 0.67]) against a null of 0.28 on the ring. In both topologies the observed count sits at its null.

Coalitions: 2.8 of 5 seats are in a sustained mutual-pulling pair on the complete graph vs 1.8 on the ring (CIs [2.32, 3.25] and [1.33, 2.33]). The separation is suggestive, but the intervals still overlap, barely.

Brokered pulls: rare everywhere, and rarer with full talk. 0.07% of directed pulls carried verified third-party credit on the complete graph vs 4.13% on the ring (no CI; rare events).

Score concentration: the top scorer took 44.3% of all pulls awarded on the complete graph vs 29.5% on the ring (CIs [38.3%, 50.4%] and [27.4%, 32.1%]); those intervals do not overlap.

Net capture: pooled net capture stays zero by construction in each pool (95% CI complete [-0.013, +0.013]; ring [-0.016, +0.016]).

Read together: full talk concentrates play (more coalition seats, a bigger winner's share), while the ring pushes credit through brokers. But the two pools differ in more than topology (personas, credit settings, model mixes), so none of this is read as a topology effect. Next: attack_ring vs attack_complete (375 episodes each), the controlled contrast.
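An episode-bootstrap interval like the one reported for chains can be sketched as below. This is a minimal percentile bootstrap under assumptions: the page states only "95% episode-bootstrap CI, 2000 trials", so the percentile method, the seed, and the function name are illustrative choices. The input is the cascades column for the 28 complete-graph episodes in the Episodes table.

```python
import random
from statistics import mean

def episode_bootstrap_ci(values, n_trials=2000, level=0.95, seed=0):
    """Percentile bootstrap CI for the per-episode mean,
    resampling whole episodes with replacement."""
    rng = random.Random(seed)
    n = len(values)
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_trials))
    tail = (1.0 - level) / 2.0
    lo = means[int(tail * n_trials)]
    hi = means[int((1.0 - tail) * n_trials) - 1]
    return lo, hi

# Cascades per episode, complete graph (25 calibration + 3 credit_smoke episodes).
complete_cascades = [
    1, 0, 3, 1, 2,   # gemma-3-4b
    3, 0, 0, 2, 0,   # gpt-5.4-mini
    1, 3, 0, 0, 0,   # opus-4.8
    0, 2, 2, 0, 2,   # qwen3-235b-thinking
    2, 3, 3, 2, 3,   # sonnet-4.6
    11, 17, 9,       # credit_smoke s41..s43
]
point = mean(complete_cascades)
lo, hi = episode_bootstrap_ci(complete_cascades)
print(f"{point:.2f} [{lo:.2f}, {hi:.2f}]")
```

The point estimate reproduces the reported 2.57; the interval endpoints depend on the random seed and will only approximate the reported [1.36, 4.18]. The three credit_smoke episodes (9 to 17 cascades) account for most of the interval's width, which is one way the pools differ in more than topology.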

Statistics

measured: yes
needs: —
by_topology.complete.n_episodes: 28
by_topology.complete.net_capture.value: -2.48e-19
by_topology.complete.net_capture.ci: [-0.0129, 0.0129]
by_topology.complete.net_capture.n: 560
by_topology.complete.cascades.value: 2.57
by_topology.complete.cascades.ci: [1.36, 4.18]
by_topology.complete.cascades.n: 28
by_topology.ring.n_episodes: 12
by_topology.ring.net_capture.value: -4.39e-17
by_topology.ring.net_capture.ci: [-0.0161, 0.016]
by_topology.ring.net_capture.n: 240
by_topology.ring.cascades.value: 0.333
by_topology.ring.cascades.ci: [0, 0.667]
by_topology.ring.cascades.n: 12
panel.complete.n_episodes: 28
panel.complete.chains.value: 2.57
panel.complete.chains.ci: [1.36, 4.18]
panel.complete.chains.n: 28
panel.complete.chain_null.value: 2.96
panel.complete.chain_null.ci: —
panel.complete.chain_null.n: 28
panel.complete.coalition_agents.value: 2.79
panel.complete.coalition_agents.ci: [2.32, 3.25]
panel.complete.coalition_agents.n: 28
panel.complete.brokered_pull_rate.value: 0.000714
panel.complete.brokered_pull_rate.ci: —
panel.complete.brokered_pull_rate.n: 28
panel.complete.top1_share.value: 0.443
panel.complete.top1_share.ci: [0.383, 0.504]
panel.complete.top1_share.n: 28
panel.ring.n_episodes: 12
panel.ring.chains.value: 0.333
panel.ring.chains.ci: [0, 0.667]
panel.ring.chains.n: 12
panel.ring.chain_null.value: 0.279
panel.ring.chain_null.ci: —
panel.ring.chain_null.n: 12
panel.ring.coalition_agents.value: 1.83
panel.ring.coalition_agents.ci: [1.33, 2.33]
panel.ring.coalition_agents.n: 12
panel.ring.brokered_pull_rate.value: 0.0413
panel.ring.brokered_pull_rate.ci: —
panel.ring.brokered_pull_rate.n: 12
panel.ring.top1_share.value: 0.295
panel.ring.top1_share.ci: [0.274, 0.321]
panel.ring.top1_share.n: 12

summary.questions.q8 rendered verbatim; missing values shown as —.
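The overlap statements in the Reading section can be checked directly from the panel CIs listed above. A small sketch, with the values copied from the statistics block and the dictionary layout chosen for the example (it is not the layout of summary.json):

```python
def intervals_overlap(a, b):
    """True if closed intervals a = (lo, hi) and b = (lo, hi) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]

# 95% CIs from the panel.* entries above.
panel_ci = {
    "coalition_agents": {"complete": (2.32, 3.25), "ring": (1.33, 2.33)},
    "top1_share": {"complete": (0.383, 0.504), "ring": (0.274, 0.321)},
}

for name, ci in panel_ci.items():
    verdict = "overlap" if intervals_overlap(ci["complete"], ci["ring"]) else "no overlap"
    print(f"{name}: {verdict}")
```

Interval overlap is a rough visual heuristic rather than a formal test, and with uncontrolled pools neither outcome should be read as a topology effect.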

Episodes

episode	condition	focal model	capture (by focal)	cascades	gini
calibration--gemma-3-4b_r0	complete/pure/msg-on	gemma-3-4b-it	0.268	1	0.59
calibration--gemma-3-4b_r1	complete/pure/msg-on	gemma-3-4b-it	0.154	0	0.494
calibration--gemma-3-4b_r2	complete/pure/msg-on	gemma-3-4b-it	0.142	3	0.464
calibration--gemma-3-4b_r3	complete/pure/msg-on	gemma-3-4b-it	0.0489	1	0.416
calibration--gemma-3-4b_r4	complete/pure/msg-on	gemma-3-4b-it	0.00401	2	0.333
calibration--gpt-5.4-mini_r0	complete/pure/msg-on	gpt-5.4-mini	0.0398	3	0.162
calibration--gpt-5.4-mini_r1	complete/pure/msg-on	gpt-5.4-mini	-0.0193	0	0.159
calibration--gpt-5.4-mini_r2	complete/pure/msg-on	gpt-5.4-mini	0.0168	0	0.187
calibration--gpt-5.4-mini_r3	complete/pure/msg-on	gpt-5.4-mini	0.0146	2	0.157
calibration--gpt-5.4-mini_r4	complete/pure/msg-on	gpt-5.4-mini	-0.00468	0	0.198
calibration--opus-4.8_r0	complete/pure/msg-on	opus-4.8	0.0931	1	0.426
calibration--opus-4.8_r1	complete/pure/msg-on	opus-4.8	0.0814	3	0.309
calibration--opus-4.8_r2	complete/pure/msg-on	opus-4.8	0.0407	0	0.329
calibration--opus-4.8_r3	complete/pure/msg-on	opus-4.8	0.118	0	0.327
calibration--opus-4.8_r4	complete/pure/msg-on	opus-4.8	0.18	0	0.473
calibration--qwen3-235b-thinking_r0	complete/pure/msg-on	qwen3-235b-thinking	0.0378	0	0.347
calibration--qwen3-235b-thinking_r1	complete/pure/msg-on	qwen3-235b-thinking	0.0392	2	0.366
calibration--qwen3-235b-thinking_r2	complete/pure/msg-on	qwen3-235b-thinking	-0.0135	2	0.307
calibration--qwen3-235b-thinking_r3	complete/pure/msg-on	qwen3-235b-thinking	0.163	0	0.495
calibration--qwen3-235b-thinking_r4	complete/pure/msg-on	qwen3-235b-thinking	0.00106	2	0.375
calibration--sonnet-4.6_r0	complete/pure/msg-on	sonnet-4.6	0.18	2	0.627
calibration--sonnet-4.6_r1	complete/pure/msg-on	sonnet-4.6	0.187	3	0.598
calibration--sonnet-4.6_r2	complete/pure/msg-on	sonnet-4.6	0.172	3	0.585
calibration--sonnet-4.6_r3	complete/pure/msg-on	sonnet-4.6	0.17	2	0.593
calibration--sonnet-4.6_r4	complete/pure/msg-on	sonnet-4.6	0.129	3	0.436
credit_smoke--creditsmoke_s41	complete/pure/msg-on	—	—	11	0.161
credit_smoke--creditsmoke_s42	complete/pure/msg-on	—	—	17	0.232
credit_smoke--creditsmoke_s43	complete/pure/msg-on	—	—	9	0.129
credit_smoke_ring--2026-06-27T14-43-44-00-00--ring_s41	ring/pure/msg-on	—	—	0	0.088
credit_smoke_ring--2026-06-27T14-43-44-00-00--ring_s42	ring/pure/msg-on	—	—	1	0.164
credit_smoke_ring--2026-06-27T14-43-44-00-00--ring_s43	ring/pure/msg-on	—	—	0	0.303
credit_smoke_ring--2026-06-28T15-48-13-00-00--ring_s41	ring/pure/msg-on	—	—	0	0.219
credit_smoke_ring--2026-06-28T15-48-13-00-00--ring_s42	ring/pure/msg-on	—	—	0	0.148
credit_smoke_ring--2026-06-28T15-48-13-00-00--ring_s43	ring/pure/msg-on	—	—	0	0.0818
credit_smoke_ring--2026-06-28T15-54-28-00-00--ring_s41	ring/pure/msg-on	—	—	1	0.158
credit_smoke_ring--2026-06-28T15-54-28-00-00--ring_s42	ring/pure/msg-on	—	—	2	0.167
credit_smoke_ring--2026-06-28T15-54-28-00-00--ring_s43	ring/pure/msg-on	—	—	0	0.191
credit_smoke_ring--2026-06-28T15-59-57-00-00--ring_s41	ring/pure/msg-on	—	—	0	0.19
credit_smoke_ring--2026-06-28T15-59-57-00-00--ring_s42	ring/pure/msg-on	—	—	0	0.263
credit_smoke_ring--2026-06-28T15-59-57-00-00--ring_s43	ring/pure/msg-on	—	—	0	0.13

All episodes measured so far (40), sorted by condition then id; episode links open the transcript reader.

Downloads

q8.svg · summary.json