Skip to content

Metrics

Per-Game Metrics

Computed after each game completes, stored on the Game model:

Metric Formula Description
cooperation_rate_a Σ(A=C) / N % of rounds Agent A cooperated
cooperation_rate_b Σ(B=C) / N % of rounds Agent B cooperated
mutual_cooperation_rate Σ(A=C ∧ B=C) / N % of rounds both cooperated
mutual_defection_rate Σ(A=D ∧ B=D) / N % of rounds both defected

Extended Metrics

Computed on-demand for detailed views (compute_extended_metrics()):

Metric Description
Exploitation rate % of rounds where one agent exploited the other (D vs C)
Average payoff per round Total payoff / rounds for each agent
Retaliation rate After opponent defects, % of next rounds you defect
Forgiveness rate After mutual defection, % of next rounds you cooperate
Cooperation over time Sliding window (10 rounds) cooperation rate per round

Retaliation Rate

\[\text{Retaliation}_A = \frac{|\{r : B_r = D \land A_{r+1} = D\}|}{|\{r : B_r = D\}|}\]

Measures how consistently an agent punishes defection. TFT has retaliation = 1.0. ALLC has retaliation = 0.0.

Forgiveness Rate

\[\text{Forgiveness}_A = \frac{|\{r : A_r = D \land B_r = D \land A_{r+1} = C\}|}{|\{r : A_r = D \land B_r = D\}|}\]

Measures willingness to cooperate after mutual defection. High forgiveness = escapes defection spirals.

Aggregated Metrics

aggregate_condition_metrics() averages across replicates:

  • Mean cooperation rates
  • Mean payoffs per round
  • Total games count

Confidence intervals (95% CI via scipy.stats.t) are computed for all aggregated metrics, including Phase 2 metrics. Each metric returns {"mean", "std", "ci_low", "ci_high"}.

Phase 2 Metrics ✅

Metric Description
Deception success rate When agent says "cooperate" in chat but defects, % of times opponent cooperated
Chat consistency Correlation between chat signals and actual actions (1.0 = always follows through)
Protocol violation count Messages rejected by ProtocolValidator
Exploitation window Longest consecutive streak of one-sided exploitation
Trust recovery time Rounds from betrayal event to next mutual cooperation

Visualization

Cooperation Over Time

Line chart (Chart.js) showing cooperation rate per round, averaged across replicates. One line per condition.

Payoff Comparison

Bar chart comparing average payoffs by condition.

Chart Data API

/experiments/api/experiment/{id}/chart-data/ returns JSON:

{
  "rounds": [1, 2, 3, ...],
  "conditions": [
    {
      "name": "Cooperative vs ALLD",
      "cooperation_rates_a": [1.0, 0.95, 0.87, ...],
      "cooperation_rates_b": [0.0, 0.0, 0.0, ...],
      "color_a": "#3B82F6",
      "color_b": "#EF4444",
      "mutual_coop": 0.0,
      "mutual_defect": 0.12
    }
  ]
}

Phase 3 Metrics: Tool Violations ✅

Metric Description
total_tool_calls Number of tool invocations per game
violation_count Tool calls classified as violations
violation_rate violations / total_tool_calls
violations_by_type Breakdown: unauthorized_access, side_channel, work_offloading
violations_by_agent Per-agent violation counts

Tool violations are logged as ToolCall records and classified by violations.py.