GT Behaviour · GTEMO Experiment
Eric Guerci
March 22, 2026
| | T (col) | D (col) |
|---|---|---|
| T (row) | 18, 18 ♦ | 0, 22 |
| D (row) | 22, 0 | 8, 8 ★ |
★ = Nash equilibrium (dominant strategy). ♦ = Pareto optimum. D strictly dominates T for both players: the unique Nash equilibrium (D, D) = 8/8€ is Pareto-inferior to mutual cooperation (T, T) = 18/18€. Cheap talk may shift cooperation toward the Pareto-optimal outcome.
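The dominance and Pareto claims can be checked directly from the payoff matrix. The following is a minimal sketch for illustration only; the object names (`payoff_row`, `d_dominates`, `pareto_gain`) are assumptions and do not come from the analysis code.

```r
# Row player's payoffs in euros, indexed [own action, opponent's action].
# The game is symmetric, so the column player's payoffs mirror these.
payoff_row <- matrix(
  c(18,  0,
    22,  8),
  nrow = 2, byrow = TRUE,
  dimnames = list(own = c("T", "D"), opp = c("T", "D"))
)

# D strictly dominates T if it pays more against every opponent action.
d_dominates <- all(payoff_row["D", ] > payoff_row["T", ])

# Per-player gain of mutual cooperation (T, T) over the Nash outcome (D, D).
pareto_gain <- payoff_row["T", "T"] - payoff_row["D", "D"]

d_dominates  # TRUE
pareto_gain  # 10
```

Because 22 > 18 and 8 > 0, D is the better reply to either action, yet each player earns 10€ less at (D, D) than at (T, T).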
Describe the equilibrium outcomes reached by each couple in Part 1 and Part 2, and test whether cheap talk shifted coordination rates using paired McNemar tests.
coord_pd |>
dplyr::select(part, x, n, pct, ci95) |>
gt::gt() |>
gt::tab_header(title = "PD — Coordination rates: Part 1 vs Part 2",
subtitle = "95% Clopper-Pearson CI") |>
gt::cols_label(part = "Phase", x = "n coordinated", n = "N",
pct = "%", ci95 = "95% CI") |>
gt::tab_style(style = gt::cell_text(weight = "bold"),
locations = gt::cells_column_labels())
p_coord_pd
| PD — Coordination rates: Part 1 vs Part 2 | ||||
| 95% Clopper-Pearson CI | ||||
| Phase | n coordinated | N | % | 95% CI |
|---|---|---|---|---|
| Part 1 | 2 | 14 | 14.3% | [1.8%, 42.8%] |
| Part 2 | 6 | 14 | 42.9% | [17.7%, 71.1%] |
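The Clopper-Pearson limits in this table are beta quantiles, which makes them easy to recompute. The sketch below is a hedged illustration: the helper `clopper_pearson()` is a hypothetical name and not part of `code.R`; the counts (2 of 14 and 6 of 14) are taken from the table.

```r
# Exact (Clopper-Pearson) interval for x successes out of n trials.
clopper_pearson <- function(x, n, conf_level = 0.95) {
  alpha <- 1 - conf_level
  lower <- if (x == 0) 0 else stats::qbeta(alpha / 2, x, n - x + 1)
  upper <- if (x == n) 1 else stats::qbeta(1 - alpha / 2, x + 1, n - x)
  c(lower = lower, upper = upper)
}

ci_part1 <- clopper_pearson(2, 14)  # Part 1: 2 of 14 couples coordinated
ci_part2 <- clopper_pearson(6, 14)  # Part 2: 6 of 14 couples coordinated

# stats::binom.test() reports the same interval.
ci_check <- as.numeric(stats::binom.test(2, 14)$conf.int)

round(100 * rbind(`Part 1` = ci_part1, `Part 2` = ci_part2), 1)
```

The two intervals overlap substantially, which is expected with only 14 couples per phase.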
tab_mc_pd |>
gt::gt() |>
gt::tab_header(title = "McNemar tests — PD (couple level)",
subtitle = "Paired Part 1 vs Part 2") |>
gt::cols_label(label = "Test", statistic = "χ²", p_value = "p-value",
note = "Note") |>
gt::fmt_number(columns = c(statistic, p_value), decimals = 4, rows = !is.na(statistic)) |>
gt::tab_style(style = gt::cell_text(weight = "bold"),
locations = gt::cells_body(columns = p_value,
rows = !is.na(p_value) & p_value < 0.05)) |>
gt::tab_style(style = gt::cell_text(weight = "bold"),
locations = gt::cells_column_labels())
| McNemar tests — PD (couple level) | |||
| Paired Part 1 vs Part 2 | |||
| Test | χ² | p-value | Note |
|---|---|---|---|
| PD — coord Part1 vs Part2 | 1.1250 | 0.2888 | OK |
| PD — mutual cooperation Part1 vs Part2 | 0.8000 | 0.3711 | OK |
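Mutual cooperation (T, T) fell from 35.7% of pairs in Part 1 to 14.3% in Part 2. The change is not significant (McNemar p = 0.3711), so there is no evidence that cheap talk systematically shifted mutual cooperation. A significant increase would have indicated that cheap talk lifted mutual cooperation above the Nash equilibrium level.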
tab_cond_coord_pd |>
gt::gt() |>
gt::tab_header(title = "PD — Coordination and cooperation by session gender",
subtitle = "χ² with Monte Carlo simulated p-value (B = 2000), couple level") |>
gt::cols_label(Outcome = "Outcome", Factor = "Factor", Test = "χ²(sim.) test") |>
gt::tab_style(style = gt::cell_text(weight = "bold"),
locations = gt::cells_column_labels()) |>
gt::opt_stylize(style = 1) |>
gt::tab_options(table.font.size = 13)
p_coord_gender_pd
| PD — Coordination and cooperation by session gender | ||
| χ² with Monte Carlo simulated p-value (B = 2000), couple level | ||
| Outcome | Factor | χ²(sim.) test |
|---|---|---|
| Coordination Part 2 | Session gender | χ²(sim.): p = 0.270 ns |
| Coordination Part 1 | Session gender | χ²(sim.): p = 0.468 ns |
| Mutual cooperation Part 2 | Session gender | χ²(sim.): p = 0.464 ns |
| Mutual cooperation Part 1 | Session gender | χ²(sim.): p = 1.000 ns |
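The tests in this table use a χ² statistic with a Monte Carlo p-value, which avoids the large-sample approximation when cell counts are small. A minimal sketch with base R follows. The data frame `couples`, its column names, and the gender labels are hypothetical, invented only to show the call; they are not the experimental data.

```r
# Hypothetical couple-level data: 14 couples with a session-gender label
# and a 0/1 indicator for coordination in Part 2.
couples <- data.frame(
  session_gender = c(rep("F", 5), rep("M", 5), rep("Mixed", 4)),
  coord_part2    = c(1, 0, 1, 0, 0,  1, 1, 0, 1, 0,  0, 1, 0, 0)
)

set.seed(2026)  # the simulated p-value depends on the RNG state
fit <- stats::chisq.test(
  table(couples$session_gender, couples$coord_part2),
  simulate.p.value = TRUE,
  B = 2000        # number of Monte Carlo replicates, as in the table subtitle
)

fit$p.value    # Monte Carlo p-value
fit$parameter  # NA: no degrees of freedom are used when simulating
```

With B = 2000 the smallest attainable p-value is 1/2001, and repeated runs without a fixed seed will give slightly different p-values.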
Describe the marginal distributions of choices and signals (Part 1 choice → signal sent → Part 2 choice), and examine how the opponent’s signal shapes Part 2 behaviour. All proportions use exact 95% Clopper-Pearson CIs.
| PD — Choice and signal distributions | ||||
| 95% Clopper-Pearson CI | ||||
| Choice / Signal | n | N | % | 95% CI |
|---|---|---|---|---|
| Part 1 | ||||
| T | 17 | 28 | 60.7% | [40.6%, 78.5%] |
| D | 11 | 28 | 39.3% | [21.5%, 59.4%] |
| Signal | ||||
| T | 17 | 28 | 60.7% | [40.6%, 78.5%] |
| D | 11 | 28 | 39.3% | [21.5%, 59.4%] |
| Part 2 | ||||
| T | 10 | 28 | 35.7% | [18.6%, 55.9%] |
| D | 18 | 28 | 64.3% | [44.1%, 81.4%] |
Cooperation (T) in Part 1: 60.7%. The most frequent signal was T (60.7%). Cooperation in Part 2: 35.7%. An increase in T from Part 1 to Part 2 would be consistent with cheap talk successfully promoting Pareto-improving coordination despite the dominance of defection; here T instead fell by 25.0 percentage points.
tab_mcnemar_pd |>
gt::gt() |>
gt::tab_header(
title = "McNemar test — PD: cooperated_part1 vs cooperated_part2",
subtitle = "Paired within-individual"
) |>
gt::cols_label(statistic = "χ²", p_value = "p-value", n = "n", note = "Note") |>
gt::fmt_number(columns = c(statistic, p_value), decimals = 4) |>
gt::tab_style(style = gt::cell_text(weight = "bold"),
locations = gt::cells_body(columns = p_value,
rows = !is.na(p_value) & p_value < 0.05)) |>
gt::tab_style(style = gt::cell_text(weight = "bold"),
locations = gt::cells_column_labels())
| McNemar test — PD: cooperated_part1 vs cooperated_part2 | |||
| Paired within-individual | |||
| χ² | p-value | n | Note |
|---|---|---|---|
| 2.7690 | 0.0961 | 28 | OK |
A significant McNemar result (p < 0.05) would indicate that pre-play cheap talk systematically shifted individual cooperation rates, consistent with the hypothesis that signals influence behaviour even in PD, where defection is the dominant strategy. Here p = 0.0961, so the shift is not significant at the 5% level.
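The reported statistic and the marginal counts together pin down the paired 2×2 table, assuming the test was run with `stats::mcnemar.test()` and its default continuity correction (an assumption, since `code.R` is not shown). With 17 T choices in Part 1 and 10 in Part 2, the discordant counts b (T→D) and c (D→T) satisfy b − c = 7 and (|b − c| − 1)² / (b + c) = 2.769, which gives b + c = 13, hence b = 10 and c = 3. The sketch below rebuilds that table; the cell counts are inferred, not read from the data.

```r
# Paired 2x2 table implied by the reported margins and statistic.
paired <- matrix(
  c( 7, 10,   # Part 1 = T: stayed T, switched to D
     3,  8),  # Part 1 = D: switched to T, stayed D
  nrow = 2, byrow = TRUE,
  dimnames = list(part1 = c("T", "D"), part2 = c("T", "D"))
)

res <- stats::mcnemar.test(paired)  # continuity correction is on by default

unname(res$statistic)  # (|10 - 3| - 1)^2 / (10 + 3) = 36 / 13
res$p.value
```

The 13 discordant players agree with the 13 strategy changes reported in the signal honesty and consistency table, and 10 of them moved from T to D.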
Decision sequence. After sending their own signal and before making the Part 2 choice, each player observes the opponent’s signal. The Part 2 decision is taken with a two-dimensional information set: (own signal sent) × (opponent’s signal received). The four possible information sets are: T/T, T/D, D/T, D/D.
In the Prisoner’s Dilemma, defection (D) is the dominant strategy for both players, regardless of what the opponent signals. The near-identical cooperation rates (35.3% when the opponent signals T vs 36.4% when the opponent signals D) are consistent with this theoretical prediction, although with only 28 players the comparison is descriptive. The opponent’s signal cannot credibly promote cooperation because D strictly dominates T at any belief about the opponent’s choice. This means: (1) when the opponent signals T (intending cooperation), most players still defect (64.7%), since D pays more whether or not the signal is believed; and (2) when the opponent signals D, some still cooperate (36.4%), not as a strategic response but likely because of social preferences, altruism, or a misunderstanding of the game. The absence of a signal effect contrasts sharply with Stag Hunt, where mutual cooperation is an equilibrium: there, cheap talk meaningfully shifts behaviour (55.6% when the opponent signals T vs 30.4% when the opponent signals D), because the signal can be credible.
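The two conditional cooperation rates quoted above can be reproduced from a 2×2 table of opponent signal by Part 2 choice. The counts in this sketch are implied by the reported percentages and group sizes (6/17 and 4/11) rather than read from the raw data, and the object names are illustrative.

```r
# Part 2 choice by opponent's signal received (rows: signal, columns: choice).
tab <- matrix(
  c(6, 11,   # opponent signalled T: 6 cooperated, 11 defected
    4,  7),  # opponent signalled D: 4 cooperated, 7 defected
  nrow = 2, byrow = TRUE,
  dimnames = list(opp_signal = c("T", "D"), choice2 = c("T", "D"))
)

# Row-wise proportions give P(choice2 = T | opponent's signal).
rates <- prop.table(tab, margin = 1)[, "T"]

# Difference in cooperation rates (signal T minus signal D), percentage points.
diff_pp <- 100 * unname(rates["T"] - rates["D"])

round(100 * rates, 1)  # T: 35.3, D: 36.4
round(diff_pp, 1)      # -1.1
```

A gap of about one percentage point on groups of 17 and 11 players is far too small to distinguish from no effect.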
The critical test is the (T/T) information set, where both players have promised cooperation. Even here, cooperation in Part 2 may fall well below 100%, consistent with the theoretical prediction that cheap talk is non-binding and strategic distrust persists. When the opponent signals D, cooperation would be expected to fall further, although the marginal rates above (35.3% vs 36.4%) show no such drop.
tab_cond_sig_pd |>
gt::gt() |>
gt::tab_header(title = "PD — Choice and signal distributions by gender and role",
subtitle = "χ² with Monte Carlo simulated p-value (B = 2000)") |>
gt::cols_label(Outcome = "Outcome", Factor = "Factor", Test = "χ²(sim.) test") |>
gt::tab_style(style = gt::cell_text(weight = "bold"),
locations = gt::cells_column_labels()) |>
gt::opt_stylize(style = 1) |>
gt::tab_options(table.font.size = 13)
| PD — Choice and signal distributions by gender and role | ||
| χ² with Monte Carlo simulated p-value (B = 2000) | ||
| Outcome | Factor | χ²(sim.) test |
|---|---|---|
| Part 1 = T | Gender | χ²(sim.): p = 0.695 ns |
| Part 1 = T | Role | χ²(sim.): p = 0.428 ns |
| Signal = T | Gender | χ²(sim.): p = 1.000 ns |
| Signal = T | Role | χ²(sim.): p = 0.422 ns |
| Part 2 = T | Gender | χ²(sim.): p = 0.108 ns |
| Part 2 = T | Role | χ²(sim.): p = 0.698 ns |
Examine whether signals are honest (= same as the action eventually taken in Part 2) and consistent with Part 1 choices. Assess the prevalence of strategy switches between Part 1 and Part 2.
tab_sec2_pd |>
gt::gt() |>
gt::tab_header(title = "PD — Signal honesty and consistency",
subtitle = "95% Clopper-Pearson CI. Each row is a binary indicator (1 = yes, 0 = no).") |>
gt::cols_label(variable = "Measure", n = "n (=1)", N = "N",
pct = "%", ci95 = "95% CI") |>
gt::tab_style(style = gt::cell_text(weight = "bold"),
locations = gt::cells_column_labels()) |>
gt::tab_footnote(
footnote = "1 if the signal sent equals the Part 2 choice (e.g. sent T and chose T in Part 2). Measures whether players followed through on their signal.",
locations = gt::cells_body(columns = variable, rows = 1)
) |>
gt::tab_footnote(
footnote = "1 if the signal sent equals the Part 1 choice (e.g. signalled T and had also chosen T in Part 1). Measures whether the signal reflects past behaviour — independent of Part 2.",
locations = gt::cells_body(columns = variable, rows = 2)
) |>
gt::tab_footnote(
footnote = "1 if the Part 2 choice differs from the Part 1 choice (choice1 \u2260 choice2). Measures switching behaviour across rounds, independent of the signal. N may differ from rows 1\u20132 due to missing values in different variables.",
locations = gt::cells_body(columns = variable, rows = 3)
) |>
gt::tab_options(table.font.size = 13)
| PD — Signal honesty and consistency | ||||
| 95% Clopper-Pearson CI. Each row is a binary indicator (1 = yes, 0 = no). | ||||
| Measure | n (=1) | N | % | 95% CI |
|---|---|---|---|---|
| Signal honest (signal = choice2)¹ | 15 | 28 | 53.6% | [33.9%, 72.5%] |
| Signal consistent with Part 1 (signal = choice1)² | 14 | 28 | 50.0% | [30.6%, 69.4%] |
| Strategy changed Part 1 → Part 2 (choice1 ≠ choice2)³ | 13 | 28 | 46.4% | [27.5%, 66.1%] |
| ¹ 1 if the signal sent equals the Part 2 choice (e.g. sent T and chose T in Part 2). Measures whether players followed through on their signal. | ||||
| ² 1 if the signal sent equals the Part 1 choice (e.g. signalled T and had also chosen T in Part 1). Measures whether the signal reflects past behaviour — independent of Part 2. | ||||
| ³ 1 if the Part 2 choice differs from the Part 1 choice (choice1 ≠ choice2). Measures switching behaviour across rounds, independent of the signal. N may differ from rows 1–2 due to missing values in different variables. | ||||
tab_binom_honest_pd |>
gt::gt() |>
gt::tab_header(title = "Binomial test: P(Honest) vs H₀ = 0.50",
subtitle = "Two-sided test; 95% Clopper-Pearson CI") |>
gt::cols_label(x = "n honest", n = "N", pct = "%", ci95 = "95% CI",
p_value = "p-value") |>
gt::tab_style(style = gt::cell_text(weight = "bold"),
locations = gt::cells_body(columns = p_value,
rows = p_value < 0.05)) |>
gt::tab_style(style = gt::cell_text(weight = "bold"),
locations = gt::cells_column_labels())
| Binomial test: P(Honest) vs H₀ = 0.50 | ||||
| Two-sided test; 95% Clopper-Pearson CI | ||||
| n honest | N | % | 95% CI | p-value |
|---|---|---|---|---|
| 15 | 28 | 53.6% | [33.9%, 72.5%] | 0.8506 |
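The row above can be reproduced with a single call to base R's exact binomial test. This is a sketch using the counts from the table (15 honest signals out of 28); the object name `honest_test` is illustrative.

```r
# Two-sided exact binomial test of P(honest) against H0 = 0.50.
honest_test <- stats::binom.test(x = 15, n = 28, p = 0.5,
                                 alternative = "two.sided")

round(honest_test$p.value, 4)          # 0.8506
round(100 * honest_test$conf.int, 1)   # Clopper-Pearson limits, in percent
```

Under H₀ = 0.50 every count other than exactly 14 is at least as extreme as 15, so the p-value equals 1 − P(X = 14), which is why it is so close to 1.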
In PD, a signal is honest if the player’s Part 2 action matches what they signalled. The aggregate honesty rate must be interpreted with caution: honesty is only behaviourally meaningful for T signals. A player who signals D and then defects is simply playing the dominant strategy, so their “honesty” would have occurred regardless of the signal. By contrast, a player who signals T and then cooperates is making a costly choice (forgoing the defection payoff), so P(choice2 = T | signal = T) is the relevant measure of credible cheap talk. The overall honesty rate conflates these two very different cases and tends to be inflated by the trivially honest D→D cases.
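The conditional rate singled out here, P(choice2 = T | signal = T), is not tabulated, but it follows arithmetically from the reported totals if no observations are missing (N = 28 in every row). The sketch below solves for the two honest cells. The variable names are illustrative, and the resulting counts are implied values, not figures read from the raw data.

```r
# Reported totals
n_signal_T  <- 17   # players who signalled T
n_signal_D  <- 11   # players who signalled D
n_choice2_T <- 10   # players who chose T in Part 2
n_honest    <- 15   # players whose Part 2 choice equals their signal

# Let tt = signalled T and chose T, dd = signalled D and chose D.
#   n_honest    = tt + dd
#   n_choice2_T = tt + (n_signal_D - dd)
# Adding the two equations eliminates dd.
tt <- (n_honest + n_choice2_T - n_signal_D) / 2
dd <- n_honest - tt

honest_given_T <- tt / n_signal_T   # P(choice2 = T | signal = T)
honest_given_D <- dd / n_signal_D   # P(choice2 = D | signal = D)

c(tt = tt, dd = dd)                                   # 7 and 8
round(100 * c(T = honest_given_T, D = honest_given_D), 1)  # 41.2 and 72.7
```

Under these assumptions, fewer than half of the T signals were followed by cooperation, while most D signals were followed by defection, which illustrates how the D→D cases lift the aggregate honesty rate.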
tab_cond_honest_pd |>
gt::gt() |>
gt::tab_header(title = "PD — Signal honesty and strategy change by gender and role",
subtitle = "χ² with Monte Carlo simulated p-value (B = 2000)") |>
gt::cols_label(Outcome = "Outcome", Factor = "Factor", Test = "χ²(sim.) test") |>
gt::tab_style(style = gt::cell_text(weight = "bold"),
locations = gt::cells_column_labels()) |>
gt::opt_stylize(style = 1) |>
gt::tab_options(table.font.size = 13)
| PD — Signal honesty and strategy change by gender and role | ||
| χ² with Monte Carlo simulated p-value (B = 2000) | ||
| Outcome | Factor | χ²(sim.) test |
|---|---|---|
| Signal honest | Gender | χ²(sim.): p = 0.457 ns |
| Signal honest | Role | χ²(sim.): p = 0.456 ns |
| Strategy changed | Gender | χ²(sim.): p = 1.000 ns |
| Strategy changed | Role | χ²(sim.): p = 0.454 ns |
After Part 2, each player answered two incentivised questions about their beliefs:
Belief 1 — First-order belief: “What do you think your opponent chose in Part 2?” (T or D). Scored correct (GT_right_guess1 = 1) if the player’s prediction matched the opponent’s actual Part 2 choice. Bonus: +2€ if correct.
Belief 2 — Second-order belief: “What do you think your opponent believes you chose in Part 2?” (T or D). Scored correct (GT_right_guess2 = 1) if the player correctly identified what the opponent believed about the player’s own choice. Bonus: +2€ if correct.
The belief accuracy score = GT_right_guess1 + GT_right_guess2 ∈ {0, 1, 2}. The belief bonus = score × 2€ ∈ {0€, 2€, 4€}.
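The scoring rule is simple enough to state in code. The sketch below uses a hypothetical four-row data frame; only the column names `GT_right_guess1` and `GT_right_guess2` come from the text, while `belief_score` and `belief_bonus_eur` are assumed names.

```r
# Hypothetical players covering every combination of the two indicators.
beliefs <- data.frame(
  GT_right_guess1 = c(1, 0, 1, 0),  # first-order belief correct?
  GT_right_guess2 = c(1, 1, 0, 0)   # second-order belief correct?
)

# Belief accuracy score in {0, 1, 2} and the bonus of 2 EUR per correct belief.
beliefs$belief_score     <- beliefs$GT_right_guess1 + beliefs$GT_right_guess2
beliefs$belief_bonus_eur <- 2 * beliefs$belief_score

# Distribution of scores, keeping empty levels visible.
table(factor(beliefs$belief_score, levels = 0:2))
```

Note that a score of 1 does not distinguish a correct first-order belief from a correct second-order belief.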
Describe the distribution of belief accuracy scores (0 = both beliefs wrong; 1 = one correct; 2 = both correct) and the associated belief bonus payoff. Assess whether belief accuracy is correlated with coordination outcomes at the couple level.
| PD — Belief accuracy: hypothesis comparison | |||||||||
| H1–H2: individual level. H3: couple level. Stat = Spearman ρ or φ (phi). | |||||||||
| | Level | Hypothesis | X | Y | Test | Expected | Stat | p | n |
|---|---|---|---|---|---|---|---|---|---|
| H1 | Individual | Reflective thinkers (high CRT) predict opponent’s choice more accurately | CRT score (0–4) | Belief accuracy (0–2) | Spearman ρ | Positive | 0.041 | 0.835 | 28 |
| H2 | Individual | Players who made more quiz errors have less accurate beliefs | Quiz errors [log(1+x)] | Belief accuracy (0–2) | Spearman ρ | Negative | -0.097 | 0.623 | 28 |
| H3 | Couple | Couples where both players have perfect beliefs coordinate more in Part 2 | Coord. Part 2 (0/1) | Both perfect beliefs (0/1) | Fisher exact + φ1 | Positive | 0.059 | 1.000 | 14 |
| 1 H3 stat = phi coefficient (φ); p-value from two-sided Fisher exact test on 2×2 contingency table. | |||||||||
H1 tests whether more reflective players (higher CRT) are better at predicting their opponent — if strategic reasoning drives belief formation, a positive Spearman ρ is expected. H2 tests whether players who struggled with game comprehension (more quiz errors) hold less accurate beliefs — expected direction is negative. H3 tests whether couples where both players had perfect beliefs were more likely to coordinate in Part 2 — the only couple-level hypothesis; uses Fisher exact given small N and binary outcomes. Note that H1 and H2 operate at the individual level while H3 is at the couple level; they are not directly comparable.
Estimate an ordered logit (proportional-odds model) for the belief accuracy score (0 = both wrong, 1 = one correct, 2 = both correct) using Theory of Mind (MASC) and IRI subscales as predictors. The proportional-odds assumption implies a single log-odds shift per unit increase in each predictor, shared across both thresholds (0→1 and 1→2).
With n = 28 participants (score 0: n=6; 1: n=13; 2: n=9), EPV (events per variable) is computed as min(n₀, n₂) / k, where k is the number of predictors: M1 = 6, M2 = 3, M3 = 2. All are below the commonly recommended minimum of 10, so all results are exploratory and should be treated as hypothesis-generating.
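The EPV figures follow directly from the score counts and the number of predictors in each model. A minimal sketch, with illustrative object names:

```r
# Observed counts per belief accuracy score.
score_counts <- c(`0` = 6, `1` = 13, `2` = 9)

# Number of predictors in each model.
n_predictors <- c(M1 = 1, M2 = 2, M3 = 3)

# EPV as defined in the text: smaller extreme category divided by k.
epv <- min(score_counts[c("0", "2")]) / n_predictors

epv  # M1 = 6, M2 = 3, M3 = 2
```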
\[ \begin{aligned} \text{M1:} \quad & \text{logit}\,P(Y \le j) = \alpha_j - \beta_1\,\text{MASC}_z \\ \text{M2:} \quad & \text{logit}\,P(Y \le j) = \alpha_j - \beta_1\,\text{MASC}_z - \beta_2\,\text{IRI-PT}_z \\ \text{M3:} \quad & \text{logit}\,P(Y \le j) = \alpha_j - \beta_1\,\text{MASC}_z - \beta_2\,\text{IRI-PT}_z - \beta_3\,\text{IRI-PD}_z \end{aligned} \]
MASC = Theory of Mind total score; IRI-PT = Perspective Taking subscale (cognitive empathy — most directly linked to predicting opponents’ decisions); IRI-PD = Personal Distress subscale (self-oriented distress — may impair strategic prediction). All scores z-standardised. OR > 1 shifts probability towards higher belief accuracy.
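For readers unfamiliar with the proportional-odds specification, the sketch below fits the M1 form with `MASS::polr`, which uses the same sign convention as the equations above (logit P(Y ≤ j) = α_j − β x). The data are simulated and deliberately larger than the real sample (n = 200 rather than 28) so the example is stable; `masc_z`, `latent` and `score` are hypothetical objects, not the study variables.

```r
set.seed(1)
n <- 200                                # simulated sample, not the study data
masc_z <- as.numeric(scale(rnorm(n)))   # hypothetical z-scored predictor
latent <- 0.5 * masc_z + rlogis(n)      # latent accuracy with logistic noise
score  <- cut(latent, breaks = c(-Inf, -1, 1, Inf),
              labels = c("0", "1", "2"), ordered_result = TRUE)

# Proportional-odds model; Hess = TRUE is needed for standard errors.
fit <- MASS::polr(score ~ masc_z, Hess = TRUE, method = "logistic")

coefs <- coef(summary(fit))             # columns: Value, Std. Error, t value
beta  <- coefs["masc_z", "Value"]
p_val <- 2 * pnorm(-abs(coefs["masc_z", "t value"]))  # two-tailed z-test on t
or    <- exp(beta)                      # OR > 1 shifts mass to higher scores

c(beta = beta, OR = or, p = p_val)
```

A single β is shared by both thresholds (0|1 and 1|2), which is what the proportional-odds assumption means in practice.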
| PD — Ordered logit: determinants of belief accuracy | |||||||
| DV = belief accuracy score (0/1/2, ordered). n = 28 (score 0: n=6; 1: n=13; 2: n=9). MASC/IRI z-scored. OR > 1 increases probability of higher accuracy.1 | |||||||
| Predictor | β | SE | t | p | OR | OR 2.5% | OR 97.5% |
|---|---|---|---|---|---|---|---|
| M1: MASC only | |||||||
| MASC ToM score (z) | -0.124 | 0.371 | -0.334 | 0.7385 | 0.883 | 0.427 | 1.828 |
| M2: MASC + IRI-PT | |||||||
| MASC ToM score (z) | -0.079 | 0.386 | -0.205 | 0.8372 | 0.924 | 0.428 | 1.988 |
| IRI Perspective Taking (z) | -0.164 | 0.374 | -0.439 | 0.6610 | 0.849 | 0.399 | 1.773 |
| M3: MASC + IRI-PT + IRI-PD | |||||||
| MASC ToM score (z) | -0.063 | 0.399 | -0.157 | 0.8752 | 0.939 | 0.424 | 2.077 |
| IRI Perspective Taking (z) | -0.158 | 0.376 | -0.419 | 0.6749 | 0.854 | 0.400 | 1.801 |
| IRI Personal Distress (z) | 0.056 | 0.350 | 0.160 | 0.8727 | 1.058 | 0.529 | 2.140 |
| 1 Ordered logit (proportional-odds, MASS::polr). p-values from two-tailed z-test on t-statistic. CI from profile likelihood where convergent, otherwise Wald. All models MLE; EPV < 10 — interpret cautiously. | |||||||
MASC ToM (M1–M3): OR = 0.883, p = 0.7385 in M1. Higher Theory of Mind ability was expected to improve belief accuracy by enabling better prediction of opponents’ decisions (OR > 1); the estimated OR is below 1 in all three models and not significant. IRI Perspective Taking (M2–M3): OR = 0.849, p = 0.661 in M2. Cognitive empathy is directly relevant to inferring opponents’ intended strategies, so an OR > 1 would support the link between perspective-taking and prediction accuracy; the estimate is below 1 and not significant. IRI Personal Distress (M3): OR = 1.058, p = 0.8727. Self-oriented distress may interfere with accurate belief formation (OR < 1 expected); the estimate is close to 1. Given EPV ≤ 6 across all models, all estimates carry substantial uncertainty.
This analysis restricts the sample to participants who received a D signal from their opponent (opp_signal_received = D). The dependent variable is whether they nonetheless chose T (cooperate). Given the small subsample size, EPV may be below 10 and Firth penalised logit is applied automatically.
\[ \begin{aligned} \text{M1:} \quad & \text{logit}\,P(T) = \beta_0 + \beta_1\,\text{quiz\_err} \\ \text{M2:} \quad & \text{logit}\,P(T) = \beta_0 + \beta_1\,\text{quiz\_err} + \beta_2\,\text{CRT} \\ \text{M3:} \quad & \text{logit}\,P(T) = \beta_0 + \beta_1\,\text{CRT} \end{aligned} \]
Sample: opp_signal_received = D only. n = 11, events = 4. EPV: M1 = 4, M2 = 2, M3 = 4. All estimated with Firth penalised logit.
| PD — Cooperation when opponent signals D | |||||||
| DV = choice2=T | opp_signal=D. n=11, events=4. EPV: M1=4, M2=2, M3=4. All Firth penalised logit (brglm2).1 | |||||||
| Predictor | β | SE | z | p | OR | OR 2.5% | OR 97.5% |
|---|---|---|---|---|---|---|---|
| M1: Quiz only | |||||||
| Quiz errors [log(1+x)] | 1.028 | 0.646 | 1.59 | 0.112 | 2.79 | 0.79 | 9.91 |
| M2: Quiz + CRT | |||||||
| Quiz errors [log(1+x)] | 0.957 | 0.785 | 1.22 | 0.223 | 2.60 | 0.56 | 12.13 |
| CRT score (0–4) | -0.010 | 0.830 | -0.01 | 0.991 | 0.99 | 0.19 | 5.03 |
| M3: CRT only | |||||||
| CRT score (0–4) | 0.696 | 0.698 | 1.00 | 0.319 | 2.01 | 0.51 | 7.87 |
| 1 Firth penalised logit used for all models (EPV < 10). OR > 1 → increases P(cooperate | opp signals D). 95% Wald CI. | |||||||
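The z, p, OR and Wald interval columns are all functions of β and SE, so any row can be checked by hand. The sketch below does this for the M1 quiz-error row using the rounded values printed in the table; small discrepancies in the last digit are therefore expected.

```r
# Rounded estimates for "Quiz errors [log(1+x)]" in M1, as printed in the table.
beta <- 1.028
se   <- 0.646

z  <- beta / se                                   # Wald z statistic
p  <- 2 * pnorm(-abs(z))                          # two-sided p-value
or <- exp(beta)                                   # odds ratio
ci <- exp(beta + c(-1, 1) * qnorm(0.975) * se)    # 95% Wald CI on the OR scale

round(c(z = z, p = p, OR = or, lower = ci[1], upper = ci[2]), 3)
```

The interval spans more than an order of magnitude, which reflects the small subsample (n = 11, 4 events) rather than any property of the estimator.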
Quiz errors (M2: OR = 2.6, p = 0.223): a higher error rate on the comprehension quiz may reflect lower understanding of the game, potentially increasing naive cooperation even after a D signal. CRT score (M2: OR = 0.99, p = 0.991): more reflective thinkers may be more sensitive to the dominant-strategy argument and less likely to cooperate after receiving a D signal. Neither estimate is significant; with EPV = 2 in M2, the estimates are exploratory and should be interpreted with caution.
---
title: "PD — Prisoner's Dilemma"
subtitle: "GT Behaviour · GTEMO Experiment"
author: "Eric Guerci"
date: today
format:
html:
theme: flatly
toc: true
toc-depth: 2
toc-title: "Contents"
number-sections: false
code-fold: true
code-summary: "Show code"
code-tools: true
fig-width: 9
fig-height: 4
fig-dpi: 150
smooth-scroll: true
embed-resources: true
execute:
echo: true
warning: false
message: false
---
```{r setup}
#| include: false
source("code.R")
```
::: {.callout-tip icon="false"}
##### Payoff matrix — Row payoff, Column payoff
| | **T** (col) | **D** (col) |
|:---:|:---:|:---:|
| **T** (row) | 18, 18 ♦ | 0, 22 |
| **D** (row) | 22, 0 | **8, 8** ★ |
★ = Nash equilibrium (dominant strategy). ♦ = Pareto optimum. D strictly dominates T for both players: the unique Nash equilibrium (D, D) = 8/8€ is Pareto-inferior to mutual cooperation (T, T) = 18/18€. Cheap talk may shift cooperation toward the Pareto-optimal outcome.
:::
---
```{=html}
<details open>
<summary><strong>1 — Equilibria & coordination</strong></summary>
<div style="padding: 0.75em 0.5em 0.5em 0.5em;">
```
### Objective
Describe the **equilibrium outcomes** reached by each couple in Part 1 and Part 2, and test whether cheap talk shifted coordination rates using paired McNemar tests.
::: {.callout-tip icon="false"}
##### Equilibrium labels (PD)
- **Coop-Coop (T,T)** — Pareto-optimal; both cooperate ♦
- **Nash (D,D)** — Nash equilibrium; both defect ★
:::
### Equilibrium distributions
```{r}
#| label: fig-eq-pd
#| fig-cap: "PD — Equilibrium distributions in Part 1 (top) and Part 2 (bottom)."
#| fig-height: 8
#| fig-width: 8
p_eq_pd
```
### Coordination rates: Part 1 vs Part 2
:::: {layout="[1,1]" layout-valign="top"}
```{r}
#| label: tab-coord-rates-pd
coord_pd |>
dplyr::select(part, x, n, pct, ci95) |>
gt::gt() |>
gt::tab_header(title = "PD — Coordination rates: Part 1 vs Part 2",
subtitle = "95% Clopper-Pearson CI") |>
gt::cols_label(part = "Phase", x = "n coordinated", n = "N",
pct = "%", ci95 = "95% CI") |>
gt::tab_style(style = gt::cell_text(weight = "bold"),
locations = gt::cells_column_labels())
```
```{r}
#| label: fig-coord-pd
#| fig-cap: "PD — Coordination rate in Part 1 vs Part 2 (couple level). 95% Clopper-Pearson CI."
#| fig-height: 4
#| fig-width: 4
p_coord_pd
```
::::
### McNemar tests
```{r}
#| label: tab-mc-pd
tab_mc_pd |>
gt::gt() |>
gt::tab_header(title = "McNemar tests — PD (couple level)",
subtitle = "Paired Part 1 vs Part 2") |>
gt::cols_label(label = "Test", statistic = "χ²", p_value = "p-value",
note = "Note") |>
gt::fmt_number(columns = c(statistic, p_value), decimals = 4, rows = !is.na(statistic)) |>
gt::tab_style(style = gt::cell_text(weight = "bold"),
locations = gt::cells_body(columns = p_value,
rows = !is.na(p_value) & p_value < 0.05)) |>
gt::tab_style(style = gt::cell_text(weight = "bold"),
locations = gt::cells_column_labels())
```
```{r}
#| echo: false
coop1_pct <- scales::percent(coop_rates_pd$coop1, accuracy = 0.1)
coop2_pct <- scales::percent(coop_rates_pd$coop2, accuracy = 0.1)
```
::: callout-note
Mutual cooperation (T, T) in Part 1: **`r coop1_pct`** of pairs → Part 2: **`r coop2_pct`**. A significant McNemar result would indicate that cheap talk systematically shifted mutual cooperation between Part 1 and Part 2; only an increase would mean that it lifted mutual cooperation above the Nash equilibrium level.
:::
### Conditioning on session gender
:::: {layout="[1,1]" layout-valign="top"}
```{r}
#| label: tab-cond-coord-pd
tab_cond_coord_pd |>
gt::gt() |>
gt::tab_header(title = "PD — Coordination and cooperation by session gender",
subtitle = "χ² with Monte Carlo simulated p-value (B = 2000), couple level") |>
gt::cols_label(Outcome = "Outcome", Factor = "Factor", Test = "χ²(sim.) test") |>
gt::tab_style(style = gt::cell_text(weight = "bold"),
locations = gt::cells_column_labels()) |>
gt::opt_stylize(style = 1) |>
gt::tab_options(table.font.size = 13)
```
```{r}
#| label: fig-coord-gender-pd
#| fig-cap: "PD — P(Coordination Part 2) by session gender (couple level). Error bars = 95% Clopper-Pearson CI."
#| fig-height: 4
#| fig-width: 4
p_coord_gender_pd
```
::::
```{=html}
</div>
</details>
```
---
```{=html}
<details>
<summary><strong>2 — Choice & signal distributions</strong></summary>
<div style="padding: 0.75em 0.5em 0.5em 0.5em;">
```
### Objective
Describe the **marginal distributions of choices and signals** (Part 1 choice → signal sent → Part 2 choice), and examine how the opponent's signal shapes Part 2 behaviour. All proportions use exact 95% Clopper-Pearson CIs.
### Distributions table
```{r}
#| label: tab-pd-dist
tab_pd
```
### Proportions by phase
```{r}
#| label: fig-pd-dist
#| fig-cap: "PD — Proportions of each choice/signal type with 95% Clopper-Pearson CIs. Left: Part 1 choices; centre: signals sent; right: Part 2 choices."
#| fig-height: 4
p_pd_dist
```
::: {.callout-note icon="false"}
##### Part 1 → Part 2 snapshot
```{r}
#| echo: false
# Percentage of T choices in Part 1 and in Part 2
pd_t1 <- tab_pd_dist_long |> dplyr::filter(phase == "Part 1", level == "T") |> dplyr::pull(pct)
pd_t2 <- tab_pd_dist_long |> dplyr::filter(phase == "Part 2", level == "T") |> dplyr::pull(pct)
# Signal rows sorted by count, so the first row is the most frequent signal
pd_sig <- tab_pd_dist_long |> dplyr::filter(phase == "Signal") |> dplyr::arrange(dplyr::desc(n))
# Fall back to a dash when no matching row exists
pd_t1 <- if (length(pd_t1) == 0) "—" else pd_t1
pd_t2 <- if (length(pd_t2) == 0) "—" else pd_t2
```
Cooperation (T) in Part 1: **`r pd_t1`**. The most frequent signal was **`r pd_sig$level[1]`** (`r pd_sig$pct[1]`). Cooperation in Part 2: **`r pd_t2`**. An increase in T from Part 1 to Part 2 would be consistent with cheap talk successfully promoting Pareto-improving coordination despite the dominance of defection; a decrease would point the other way.
:::
### Within-subject shift: McNemar test (Part 1 vs Part 2)
```{r}
#| label: tab-mcnemar-pd
tab_mcnemar_pd |>
gt::gt() |>
gt::tab_header(
title = "McNemar test — PD: cooperated_part1 vs cooperated_part2",
subtitle = "Paired within-individual"
) |>
gt::cols_label(statistic = "χ²", p_value = "p-value", n = "n", note = "Note") |>
gt::fmt_number(columns = c(statistic, p_value), decimals = 4) |>
gt::tab_style(style = gt::cell_text(weight = "bold"),
locations = gt::cells_body(columns = p_value,
rows = !is.na(p_value) & p_value < 0.05)) |>
gt::tab_style(style = gt::cell_text(weight = "bold"),
locations = gt::cells_column_labels())
```
::: callout-note
A significant McNemar result (p < 0.05) would indicate that pre-play cheap talk systematically shifted individual cooperation rates, consistent with the hypothesis that signals influence behaviour even in PD, where defection is the dominant strategy.
:::
### Opponent signal and the information set before Part 2
::: callout-important
**Decision sequence.** After sending their own signal and *before* making the Part 2 choice, each player observes the **opponent's signal**. The Part 2 decision is taken with a two-dimensional information set: **(own signal sent) × (opponent's signal received)**. The four possible information sets are: T/T, T/D, D/T, D/D.
:::
```{r}
#| label: fig-sig-heatmap-pd
#| fig-cap: "PD — Joint distribution of own signal × opponent signal received. Values show count and share."
#| fig-height: 4
#| fig-width: 7
p_sig_heatmap_pd
```
```{r}
#| label: fig-choice2-oppsig-pd
#| fig-cap: "PD — P(choice₂ = T) stratified by opponent's signal. 95% Clopper-Pearson CI."
#| fig-height: 4
#| fig-width: 6
p_choice2_by_oppsig_pd
```
::: {.callout-note icon="false"}
##### Cheap talk in PD: signal has no credible effect on Part 2 behaviour
In the Prisoner's Dilemma, defection (D) is the dominant strategy for both players, regardless of what the opponent signals. The near-identical cooperation rates (35.3% when the opponent signals T vs 36.4% when the opponent signals D) are consistent with this theoretical prediction, although with only 28 players the comparison is descriptive. The opponent's signal cannot credibly promote cooperation because D strictly dominates T at any belief about the opponent's choice. This means: (1) when the opponent signals T (intending cooperation), most players still defect (64.7%), since D pays more whether or not the signal is believed; and (2) when the opponent signals D, some still cooperate (36.4%), not as a strategic response but likely because of social preferences, altruism, or a misunderstanding of the game. The absence of a signal effect contrasts sharply with Stag Hunt, where mutual cooperation is an equilibrium: there, cheap talk meaningfully shifts behaviour (55.6% when the opponent signals T vs 30.4% when the opponent signals D), because the signal can be credible.
:::
```{r}
#| label: fig-choice2-infoset-pd
#| fig-cap: "PD — P(choice₂ = T) by full information set (own/opp). Colour = own signal. 95% CI. Information sets with no observations are omitted."
#| fig-height: 5
#| fig-width: 9
p_choice2_infoset_pd
```
::: {.callout-note icon="false"}
##### PD interpretation
The critical test is the (T/T) information set, where both players have promised cooperation. Even here, cooperation in Part 2 may fall well below 100%, consistent with the theoretical prediction that cheap talk is non-binding and strategic distrust persists. When the opponent signals D, cooperation would be expected to fall further, although the marginal rates above (35.3% vs 36.4%) show no such drop.
:::
```{r}
#| label: fig-follow-opp-pd
#| fig-cap: "PD — Proportion of players whose Part 2 choice matches the opponent's signal received."
#| fig-height: 4
#| fig-width: 5
p_follow_opp_pd
```
### Conditioning on gender and role
```{r}
#| label: tab-cond-sig-pd
tab_cond_sig_pd |>
gt::gt() |>
gt::tab_header(title = "PD — Choice and signal distributions by gender and role",
subtitle = "χ² with Monte Carlo simulated p-value (B = 2000)") |>
gt::cols_label(Outcome = "Outcome", Factor = "Factor", Test = "χ²(sim.) test") |>
gt::tab_style(style = gt::cell_text(weight = "bold"),
locations = gt::cells_column_labels()) |>
gt::opt_stylize(style = 1) |>
gt::tab_options(table.font.size = 13)
```
```{r}
#| label: fig-cond-sig-pd
#| fig-cap: "PD — P(Signal = T) by gender and role. Error bars = 95% Clopper-Pearson CI. Dashed line = 50%."
#| fig-height: 4
#| fig-width: 9
p_cond_sig_pd
```
```{=html}
</div>
</details>
```
---
```{=html}
<details>
<summary><strong>3 — Signal honesty & consistency</strong></summary>
<div style="padding: 0.75em 0.5em 0.5em 0.5em;">
```
### Objective
Examine whether signals are **honest** (= same as the action eventually taken in Part 2) and **consistent** with Part 1 choices. Assess the prevalence of strategy switches between Part 1 and Part 2.
### Honesty and consistency proportions
```{r}
#| label: tab-sec2-pd
tab_sec2_pd |>
gt::gt() |>
gt::tab_header(title = "PD — Signal honesty and consistency",
subtitle = "95% Clopper-Pearson CI. Each row is a binary indicator (1 = yes, 0 = no).") |>
gt::cols_label(variable = "Measure", n = "n (=1)", N = "N",
pct = "%", ci95 = "95% CI") |>
gt::tab_style(style = gt::cell_text(weight = "bold"),
locations = gt::cells_column_labels()) |>
gt::tab_footnote(
footnote = "1 if the signal sent equals the Part 2 choice (e.g. sent T and chose T in Part 2). Measures whether players followed through on their signal.",
locations = gt::cells_body(columns = variable, rows = 1)
) |>
gt::tab_footnote(
footnote = "1 if the signal sent equals the Part 1 choice (e.g. signalled T and had also chosen T in Part 1). Measures whether the signal reflects past behaviour — independent of Part 2.",
locations = gt::cells_body(columns = variable, rows = 2)
) |>
gt::tab_footnote(
footnote = "1 if the Part 2 choice differs from the Part 1 choice (choice1 \u2260 choice2). Measures switching behaviour across rounds, independent of the signal. N may differ from rows 1\u20132 due to missing values in different variables.",
locations = gt::cells_body(columns = variable, rows = 3)
) |>
gt::tab_options(table.font.size = 13)
```
### Binomial test: signal honesty vs 50%
```{r}
#| label: tab-binom-honest-pd
tab_binom_honest_pd |>
gt::gt() |>
gt::tab_header(title = "Binomial test: P(Honest) vs H₀ = 0.50",
subtitle = "Two-sided test; 95% Clopper-Pearson CI") |>
gt::cols_label(x = "n honest", n = "N", pct = "%", ci95 = "95% CI",
p_value = "p-value") |>
gt::tab_style(style = gt::cell_text(weight = "bold"),
locations = gt::cells_body(columns = p_value,
rows = p_value < 0.05)) |>
gt::tab_style(style = gt::cell_text(weight = "bold"),
locations = gt::cells_column_labels())
```
::: callout-note
In PD, a signal is *honest* if the player's Part 2 action matches what they signalled. The aggregate honesty rate must be interpreted with caution: **honesty is only behaviourally meaningful for T signals**. A player who signals D and then defects is simply playing the dominant strategy, so their "honesty" would have occurred regardless of the signal. By contrast, a player who signals T and then cooperates is making a costly choice (forgoing the defection payoff), so P(choice2 = T | signal = T) is the relevant measure of credible cheap talk. The overall honesty rate conflates these two very different cases and tends to be inflated by the trivially honest D→D cases.
:::
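The conditional measure described in the note can be sketched as follows. This is a minimal illustration on made-up data: the data frame `toy_sig` and its columns `signal` and `choice2` are assumptions for the example, not the objects used to build the tables in this report.

```r
# Hypothetical toy data, one row per player (NOT the experimental data)
toy_sig <- data.frame(
  signal  = c("T", "T", "T", "T", "D", "D", "D", "T"),
  choice2 = c("T", "D", "T", "D", "D", "D", "D", "T")
)

# Overall honesty: the Part 2 action matches the signal sent
honest <- toy_sig$signal == toy_sig$choice2
overall_rate <- mean(honest)

# Conditional honesty: restrict to players who signalled T
sent_T <- toy_sig$signal == "T"
x_T <- sum(toy_sig$choice2[sent_T] == "T")  # signalled T and played T
n_T <- sum(sent_T)                          # all players who signalled T

# Exact two-sided binomial test against 50%;
# binom.test() reports the Clopper-Pearson interval in $conf.int
bt_T <- binom.test(x_T, n_T, p = 0.5)

c(overall = overall_rate, cond_T = x_T / n_T)
bt_T$conf.int
```

In this toy sample the overall rate exceeds the T-conditional rate because every D signal is followed by D, which is the inflation the note warns about.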
### Strategic transition heatmaps
```{r}
#| label: fig-sankey-pd
#| fig-cap: "PD — Strategic transitions. Left: Part 1 → Signal (row %: conditional on Part 1 choice). Right: Signal → Part 2 (row %: conditional on signal sent)."
#| fig-height: 4
#| fig-width: 11
p_sankey_pd
```
### Conditioning on gender and role
```{r}
#| label: tab-cond-honest-pd
tab_cond_honest_pd |>
gt::gt() |>
gt::tab_header(title = "PD — Signal honesty and strategy change by gender and role",
subtitle = "χ² with Monte Carlo simulated p-value (B = 2000)") |>
gt::cols_label(Outcome = "Outcome", Factor = "Factor", Test = "χ²(sim.) test") |>
gt::tab_style(style = gt::cell_text(weight = "bold"),
locations = gt::cells_column_labels()) |>
gt::opt_stylize(style = 1) |>
gt::tab_options(table.font.size = 13)
```
```{r}
#| label: fig-cond-honest-pd
#| fig-cap: "PD — P(Honest) by gender (left) and role (right). Error bars = 95% Clopper-Pearson CI. Dashed line = 50%."
#| fig-height: 4
#| fig-width: 9
p_cond_honest_pd
```
```{=html}
</div>
</details>
```
---
```{=html}
<details>
<summary><strong>4 — Belief accuracy &amp; bonus</strong></summary>
<div style="padding: 0.75em 0.5em 0.5em 0.5em;">
```
::: {.callout-note icon="false"}
##### The two belief questions
After Part 2, each player answered two incentivised questions about their beliefs:
**Belief 1 — First-order belief:** *"What do you think your opponent chose in Part 2?"* (T or D).
Scored correct (`GT_right_guess1 = 1`) if the player's prediction matched the opponent's actual Part 2 choice. Bonus: +2€ if correct.
**Belief 2 — Second-order belief:** *"What do you think your opponent believes you chose in Part 2?"* (T or D).
Scored correct (`GT_right_guess2 = 1`) if the player correctly identified what the opponent believed about the player's own choice. Bonus: +2€ if correct.
The **belief accuracy score** = `GT_right_guess1 + GT_right_guess2` ∈ {0, 1, 2}. The **belief bonus** = score × 2€ ∈ {0€, 2€, 4€}.
:::
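As a minimal sketch of the scoring rule above, the score and bonus can be derived from the two indicator columns. The data frame `toy_beliefs` and the column names `belief_score` and `belief_bonus` are hypothetical; only `GT_right_guess1` and `GT_right_guess2` are named in this report.

```r
# Hypothetical toy data, one row per player (NOT the experimental data)
toy_beliefs <- data.frame(
  GT_right_guess1 = c(1, 0, 1, 0),  # first-order belief correct?
  GT_right_guess2 = c(1, 0, 0, 1)   # second-order belief correct?
)

# Belief accuracy score in {0, 1, 2}
toy_beliefs$belief_score <- toy_beliefs$GT_right_guess1 +
  toy_beliefs$GT_right_guess2

# Belief bonus in euros: 2 per correct belief
toy_beliefs$belief_bonus <- 2 * toy_beliefs$belief_score

# Distribution of scores, keeping empty categories visible
score_dist <- table(factor(toy_beliefs$belief_score, levels = 0:2))
score_dist
```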
### Objective
Describe the distribution of **belief accuracy scores** (0 = both beliefs wrong; 1 = one correct; 2 = both correct) and the associated **belief bonus** payoff. Assess whether belief accuracy is correlated with coordination outcomes at the couple level.
### Belief accuracy distribution
```{r}
#| label: fig-belief-bar-pd
#| fig-cap: "PD — Distribution of belief accuracy scores."
#| fig-height: 5
#| fig-width: 7
p_belief_bar_pd
```
### Hypothesis tests: beliefs, cognitive ability & coordination
```{r}
#| label: tab-belief-hyp-pd
#| echo: false
tab_belief_hyp_gt
```
::: callout-note
**H1** tests whether more reflective players (higher CRT) are better at predicting their opponent: if strategic reasoning drives belief formation, a positive Spearman ρ is expected. **H2** tests whether players who struggled with game comprehension (more quiz errors) hold less accurate beliefs; the expected direction is negative. **H3** tests whether couples in which *both* players had perfect beliefs (score = 2) were more likely to coordinate in Part 2. It is the only couple-level hypothesis and uses Fisher's exact test, given the small N and binary outcomes. Because H1 and H2 operate at the individual level while H3 operates at the couple level, the three results are not directly comparable.
:::
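The two kinds of test named in the note can be sketched with base R as follows. All data and variable names here (`ind`, `cpl`, `both_perfect`, `coordinated`) are hypothetical and serve only to show the calls; the code that produces the table above is not shown in this section and may differ.

```r
# Hypothetical individual-level data (NOT the experimental data)
ind <- data.frame(
  CRT          = c(0, 1, 2, 3, 4, 1, 2, 3),
  belief_score = c(0, 1, 1, 2, 2, 0, 2, 1)
)

# H1-style test: Spearman rank correlation.
# exact = FALSE avoids the exact-p-value warning caused by ties.
h1 <- cor.test(ind$CRT, ind$belief_score,
               method = "spearman", exact = FALSE)

# Hypothetical couple-level data: one row per couple
cpl <- data.frame(
  both_perfect = c(TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE),
  coordinated  = c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE)
)

# H3-style test: Fisher's exact test on the 2x2 contingency table
h3 <- fisher.test(table(cpl$both_perfect, cpl$coordinated))

c(rho = unname(h1$estimate), p_H1 = h1$p.value, p_H3 = h3$p.value)
```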
### Conditioning on gender and role
```{r}
#| label: fig-cond-belief-pd
#| fig-cap: "PD — Belief accuracy score (0/1/2) by gender (left) and role (right). Bars show proportion within each group; labels show % and count. Score 0 = both beliefs wrong, 1 = one correct, 2 = both correct."
#| fig-height: 4
#| fig-width: 10
p_cond_belief_pd
```
```{=html}
</div>
</details>
```
---
```{=html}
<details>
<summary><strong>5 — Econometric models</strong></summary>
<div style="padding: 0.75em 0.5em 0.5em 0.5em;">
```
---
### 5.1 — Determinants of belief accuracy
Estimate an **ordered logit (proportional-odds model)** for the belief accuracy score (0 = both wrong, 1 = one correct, 2 = both correct) using Theory of Mind (MASC) and IRI subscales as predictors. The proportional-odds assumption implies a single log-odds shift per unit increase in each predictor, shared across both thresholds (0→1 and 1→2).
::: {.callout-warning icon="true"}
##### Small-sample caveat
With n = `r n_olog` participants (score 0: n = `r n_olog_0`; 1: n = `r n_olog_1`; 2: n = `r n_olog_2`), events per variable (EPV) is computed as min(n₀, n₂) / k, where k is the number of predictors in the model: M1 = `r epv_B1`, M2 = `r epv_B2`, M3 = `r epv_B3`. All are well below the recommended minimum of 10. **All results are exploratory and should be treated as hypothesis-generating.**
:::
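The EPV rule stated in the caveat amounts to the following arithmetic. The category counts below are invented for illustration; the predictor counts follow the three specifications (one, two and three predictors).

```r
# Hypothetical counts per belief-accuracy score (NOT the experimental data)
n_by_score <- c(`0` = 6, `1` = 12, `2` = 10)

# Number of predictors k in each model
k <- c(M1 = 1, M2 = 2, M3 = 3)

# EPV = min(n0, n2) / k, as defined in the caveat
epv <- min(n_by_score[["0"]], n_by_score[["2"]]) / k
epv
```

With these made-up counts every model falls below 10, and EPV shrinks as predictors are added.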
#### Model specifications
$$
\begin{aligned}
\text{M1:} \quad & \text{logit}\,P(Y \le j) = \alpha_j - \beta_1\,\text{MASC}_z \\
\text{M2:} \quad & \text{logit}\,P(Y \le j) = \alpha_j - \beta_1\,\text{MASC}_z - \beta_2\,\text{IRI-PT}_z \\
\text{M3:} \quad & \text{logit}\,P(Y \le j) = \alpha_j - \beta_1\,\text{MASC}_z - \beta_2\,\text{IRI-PT}_z - \beta_3\,\text{IRI-PD}_z
\end{aligned}
$$
**MASC** = Theory of Mind total score; **IRI-PT** = Perspective Taking subscale (cognitive empathy — most directly linked to predicting opponents' decisions); **IRI-PD** = Personal Distress subscale (self-oriented distress — may impair strategic prediction). All scores z-standardised. OR > 1 shifts probability towards higher belief accuracy.
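One common way to fit a model of this form is `MASS::polr()`, whose parametrisation, logit P(Y ≤ j) = ζ_j − η, matches the sign convention of the specifications above, so `exp(coef)` reads directly as an odds ratio towards higher scores. The sketch below runs on simulated data with assumed variable names (`toy_ol`, `score`); it is not the fitting code behind the tables in this report, which is not shown here.

```r
library(MASS)  # provides polr(); MASS is a recommended package shipped with R

set.seed(1)
n <- 200  # deliberately larger than the study sample so the toy fit is stable

# Simulated z-scored predictors (hypothetical data)
toy_ol <- data.frame(MASC_z = rnorm(n), IRI_PT_z = rnorm(n))

# Latent-variable simulation: MASC_z raises the score, IRI_PT_z has no effect
latent <- 0.8 * toy_ol$MASC_z + rlogis(n)
toy_ol$score <- cut(latent, breaks = c(-Inf, -0.5, 0.8, Inf),
                    labels = c("0", "1", "2"), ordered_result = TRUE)

# M2-style proportional-odds model; Hess = TRUE stores the Hessian for SEs
m2 <- polr(score ~ MASC_z + IRI_PT_z, data = toy_ol,
           Hess = TRUE, method = "logistic")

# Odds ratios with Wald 95% confidence intervals
b  <- coef(m2)
se <- sqrt(diag(vcov(m2)))[names(b)]
or_tab <- cbind(OR    = exp(b),
                lower = exp(b - 1.96 * se),
                upper = exp(b + 1.96 * se))
or_tab
```

Wald intervals are used here for simplicity; with very small samples they can be unreliable, which is consistent with treating the estimates as exploratory.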
#### Coefficient table
```{r}
#| label: tab-olog-pd
tab_olog_gt
```
#### Goodness of fit
```{r}
#| label: tab-gof-olog-pd
tab_gof_olog_gt
```
#### Forest plot: odds ratios
```{r}
#| label: fig-forest-olog-pd
#| fig-cap: "PD — Ordered logit: odds ratios for belief accuracy. OR > 1 increases probability of higher accuracy score. Error bars = 95% CI. Dashed line = OR 1 (no effect). x-axis log scale."
#| fig-height: 4
#| fig-width: 8
p_forest_olog
```
::: {.callout-note icon="false"}
##### Interpretation
```{r}
#| echo: false
m1_or_masc <- tab_olog_all |> dplyr::filter(model == "M1: MASC only", term == "MASC_z") |> dplyr::pull(OR)
m1_p_masc <- tab_olog_all |> dplyr::filter(model == "M1: MASC only", term == "MASC_z") |> dplyr::pull(p_value)
m2_or_iript <- tab_olog_all |> dplyr::filter(model == "M2: MASC + IRI-PT", term == "IRI_PT_z") |> dplyr::pull(OR)
m2_p_iript <- tab_olog_all |> dplyr::filter(model == "M2: MASC + IRI-PT", term == "IRI_PT_z") |> dplyr::pull(p_value)
m3_or_iripd <- tab_olog_all |> dplyr::filter(model == "M3: MASC + IRI-PT + IRI-PD", term == "IRI_PD_z") |> dplyr::pull(OR)
m3_p_iripd <- tab_olog_all |> dplyr::filter(model == "M3: MASC + IRI-PT + IRI-PD", term == "IRI_PD_z") |> dplyr::pull(p_value)
```
**MASC ToM (included in M1–M3; estimate from M1):** OR = `r m1_or_masc`, p = `r m1_p_masc`. Higher Theory of Mind ability may improve belief accuracy by enabling better prediction of opponents' decisions; an OR > 1 is consistent with this interpretation. **IRI Perspective Taking (included in M2–M3; estimate from M2):** OR = `r m2_or_iript`, p = `r m2_p_iript`. Cognitive empathy is directly relevant to inferring opponents' intended strategies, so an OR > 1 would support the link between perspective-taking and prediction accuracy. **IRI Personal Distress (M3):** OR = `r m3_or_iripd`, p = `r m3_p_iripd`. Self-oriented distress may interfere with accurate belief formation (OR < 1 expected). Given EPV ≤ `r epv_B1` across all models, all estimates carry substantial uncertainty.
:::
---
### 5.2 — Cooperation under defection signal
::: {.callout-warning icon="false"}
##### Small-sample note
This analysis restricts the sample to participants who received a **D signal** from their opponent (`opp_signal_received = D`). The dependent variable is whether they nonetheless chose **T (cooperate)**. Given the small subsample size, EPV may be below 10 and Firth penalised logit is applied automatically.
:::
#### Models
$$
\begin{aligned}
\text{M1:} \quad & \text{logit}\,P(T) = \beta_0 + \beta_1\,\text{quiz\_err} \\
\text{M2:} \quad & \text{logit}\,P(T) = \beta_0 + \beta_1\,\text{quiz\_err} + \beta_2\,\text{CRT} \\
\text{M3:} \quad & \text{logit}\,P(T) = \beta_0 + \beta_1\,\text{CRT}
\end{aligned}
$$
**Sample**: `opp_signal_received = D` only. n = `r n_D`, events = `r n_events_D`. EPV: M1 = `r round(epv_D_m1, 1)`, M2 = `r round(epv_D_m2, 1)`, M3 = `r round(epv_D_m3, 1)`. All estimated with Firth penalised logit.
#### Results
```{r}
#| label: tab-logit-D
#| echo: false
#| message: false
tab_D_gt
```
```{r}
#| label: fig-forest-D
#| echo: false
#| message: false
#| fig-cap: "Cooperation when opponent signals D — M1 (quiz only), M2 (quiz + CRT), M3 (CRT only). Odds ratios with 95% CI. OR > 1 increases P(cooperate). All Firth. x-axis log scale."
#| fig-height: 3.5
#| fig-width: 8
p_forest_D
```
::: {.callout-note icon="false"}
##### Interpretation
```{r}
#| echo: false
d_or_quiz <- round(exp(coef(m_coop_D)["log_quiz_err"]), 2)
d_p_quiz <- round(summary(m_coop_D)$coefficients["log_quiz_err", 4], 3)
d_or_crt <- round(exp(coef(m_coop_D)["CRT4"]), 2)
d_p_crt <- round(summary(m_coop_D)$coefficients["CRT4", 4], 3)
```
**Quiz errors** (OR = `r d_or_quiz`, p = `r d_p_quiz`): a higher error rate on the comprehension quiz may reflect poorer understanding of the game, potentially increasing naive cooperation even after a D signal. **CRT score** (OR = `r d_or_crt`, p = `r d_p_crt`): more reflective thinkers may be more sensitive to the dominant-strategy argument and less likely to cooperate after receiving a D signal. Both estimates come from the fitted model `m_coop_D`, which includes both predictors. EPV = `r round(epv_D, 1)`, so estimates are exploratory and should be interpreted with caution.
:::
```{=html}
</div>
</details>
```