Game Engine¶
The game engine (`games/engine.py`) runs the Iterated Prisoner's Dilemma loop for each game.
Game Loop¶
```mermaid
flowchart TD
    A[Start Game] --> B[Initialize agents, history, RNG]
    B --> C{Round < max?}
    C -->|Yes| D[Get Agent A decision]
    D --> E[Get Agent B decision]
    E --> F[Compute payoffs]
    F --> G[Update histories]
    G --> H[Log round]
    H --> C
    C -->|No| I[Bulk save rounds to DB]
    I --> J[Compute game metrics]
    J --> K[Update game record]
    K --> L[End]
```
Horizon Types¶
Fixed Horizon¶
The game runs for exactly N rounds (default: 100). Both agents know the total.
Geometric Horizon¶
Each round, the game has probability stop_prob (default: 0.02) of ending. Pre-computed at game start using a seeded RNG. Agents don't know when the game will end — this changes optimal strategy significantly (no "end-game defection").
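Pre-computing the horizon with a seeded RNG might look like the following sketch (the function name and signature are illustrative, not the engine's actual API):

```python
import random

def sample_horizon(stop_prob: float, seed: int, max_rounds: int = 1000) -> int:
    """Draw the number of rounds for a geometric horizon.

    Each round ends the game with probability `stop_prob`. Seeding the RNG
    makes the draw reproducible across replays of the same game.
    Illustrative sketch only; names are assumptions.
    """
    rng = random.Random(seed)
    for n in range(1, max_rounds + 1):
        if rng.random() < stop_prob:
            return n
    return max_rounds  # safety cap so a game always terminates
```

Because the horizon is fixed at game start but hidden from the agents, the engine knows exactly how many rounds to run while the agents must behave as if the game could end at any moment.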
Payoff Matrix¶
Standard Prisoner's Dilemma payoffs:
| | Opponent: C | Opponent: D |
|---|---|---|
| You: C | 3, 3 | 0, 5 |
| You: D | 5, 0 | 1, 1 |
- Temptation (T) = 5: Reward for defecting when opponent cooperates
- Reward (R) = 3: Mutual cooperation payoff
- Punishment (P) = 1: Mutual defection payoff
- Sucker (S) = 0: Penalty for cooperating when opponent defects
The PD constraints: T > R > P > S and 2R > T + S.
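The matrix and its constraints translate directly into code. This is a sketch; the engine's own representation may differ:

```python
# (my_payoff, opp_payoff) indexed by (my_action, opp_action)
PAYOFFS = {
    ("C", "C"): (3, 3),  # mutual cooperation: R, R
    ("C", "D"): (0, 5),  # sucker's payoff:    S, T
    ("D", "C"): (5, 0),  # temptation:         T, S
    ("D", "D"): (1, 1),  # mutual defection:   P, P
}

T, R, P, S = 5, 3, 1, 0
assert T > R > P > S   # defection dominates any single round
assert 2 * R > T + S   # mutual cooperation beats alternating C/D exploitation
```

The second constraint matters for iterated play: without it, two agents could do better by taking turns exploiting each other than by cooperating.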
Agent Resolution¶
Each condition specifies two agents, which can be:
- LLM Agent → Creates a `MockPDAgent` (or a real CrewAI agent when wired)
- Policy Agent → Uses `PolicyEngine` for deterministic decisions
```python
def resolve_agent_for_condition(condition, side):
    if side == "a":
        llm_agent = condition.agent_a_llm
        policy_agent = condition.agent_a_policy
    ...
```
History Format¶
Each agent maintains its own perspective of the history:
```python
history_a.append({
    "round": 1,
    "my_action": "C",   # What I did
    "opp_action": "D",  # What opponent did
    "my_payoff": 0,     # What I got
    "opp_payoff": 5,    # What opponent got
})
```
The last 10 entries are formatted into the prompt using `format_history_line()`.
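A minimal sketch of how such a line might be rendered; the real `format_history_line()` template may word things differently:

```python
def format_history_line(entry: dict) -> str:
    # Illustrative template only; the engine's actual wording may differ.
    return (
        f"Round {entry['round']}: you played {entry['my_action']} "
        f"(+{entry['my_payoff']}), opponent played {entry['opp_action']} "
        f"(+{entry['opp_payoff']})"
    )

history_a = [
    {"round": 1, "my_action": "C", "opp_action": "D",
     "my_payoff": 0, "opp_payoff": 5},
]
# Only the last 10 entries reach the prompt.
prompt_history = "\n".join(format_history_line(e) for e in history_a[-10:])
```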
Response Parsing¶
LLM responses are parsed by `parse_action()`:
- Check if first character is C or D → use it
- If not, search first line for standalone C or D (word boundary)
- If still no match, default to C and log a parse error
Retries (up to 2) use framing-specific corrective prompts. Parse errors are flagged on `GameRound.agent_a_parse_error` / `agent_b_parse_error`.
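The three parsing rules above can be sketched as follows. The return shape (action plus error flag) is an assumption; the engine's actual signature may differ:

```python
import re

def parse_action(response: str) -> tuple[str, bool]:
    """Return (action, parse_error) per the rules above. Sketch only."""
    text = response.strip()
    # Rule 1: first character is C or D
    if text[:1] in ("C", "D"):
        return text[:1], False
    # Rule 2: standalone C or D on the first line (word boundary)
    first_line = text.splitlines()[0] if text else ""
    match = re.search(r"\b([CD])\b", first_line)
    if match:
        return match.group(1), False
    # Rule 3: default to cooperate and flag a parse error
    return "C", True
```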
Chat Phase¶
When `chat_enabled=True`, the game loop changes:
```mermaid
flowchart TD
    A[Round Start] --> B{Who speaks first?}
    B -->|Round is odd| C[Agent A speaks first]
    B -->|Round is even| D[Agent B speaks first]
    C --> E[A: generates CHAT + ACTION]
    E --> F[Protocol validates A's chat]
    F --> G[B: sees A's chat, generates CHAT + ACTION]
    G --> H[Protocol validates B's chat]
    H --> I[Resolve payoffs]
    D --> J[B speaks first, same flow reversed]
    J --> I
```
Chat and action are collected in a single LLM call using CHAT: and ACTION: markers.
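Splitting the two marked fields out of a single reply could look like this (a sketch; the engine's real parser may be stricter about ordering and missing fields):

```python
def split_chat_and_action(response: str) -> tuple[str, str]:
    """Extract the CHAT: and ACTION: fields from one LLM reply. Sketch only."""
    chat, action = "", ""
    for line in response.splitlines():
        line = line.strip()
        if line.upper().startswith("CHAT:"):
            chat = line[5:].strip()
        elif line.upper().startswith("ACTION:"):
            action = line[7:].strip().upper()
    return chat, action
```

Collecting both in one call keeps the chat and the action consistent with each other, since they come from the same model state rather than two separate generations.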
Running an Experiment¶
```python
from games.engine import run_experiment

# experiment_obj has conditions, each with agent assignments
run_experiment(experiment_obj)
# Creates Game objects, runs all rounds, computes metrics
```
The engine:
- Sets experiment status to "running"
- Iterates through conditions × replicates
- Creates a
Gamefor each replicate with a random seed - Runs all rounds via
run_game() - Bulk-creates
GameRoundrecords - Computes and stores metrics
- Sets experiment status to "completed" (or "failed")
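The steps above can be outlined as a loop. This is a sketch with stand-in dataclasses, not the real ORM models, and `run_game` is injected rather than imported:

```python
from dataclasses import dataclass
import random

# Minimal stand-ins for the ORM models; names and fields are illustrative.
@dataclass
class Condition:
    replicates: int

@dataclass
class Experiment:
    conditions: list
    status: str = "pending"

def run_experiment_sketch(experiment: Experiment, run_game) -> list:
    """Outline of the engine loop described above (sketch, not the real code)."""
    experiment.status = "running"
    try:
        results = []
        for condition in experiment.conditions:
            for _ in range(condition.replicates):
                seed = random.randrange(2**32)      # per-game random seed
                rounds = run_game(condition, seed)  # plays every round
                results.append(rounds)              # bulk-save + metrics go here
        experiment.status = "completed"
        return results
    except Exception:
        experiment.status = "failed"
        raise
```

Wrapping the loop in a try/except is what lets a single crashed game mark the whole experiment "failed" instead of leaving it stuck in "running".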