Game Engine

The game engine (games/engine.py) runs the Iterated Prisoner's Dilemma loop for each game.

Game Loop

flowchart TD
    A[Start Game] --> B[Initialize agents, history, RNG]
    B --> C{Round < max?}
    C -->|Yes| D[Get Agent A decision]
    D --> E[Get Agent B decision]
    E --> F[Compute payoffs]
    F --> G[Update histories]
    G --> H[Log round]
    H --> C
    C -->|No| I[Bulk save rounds to DB]
    I --> J[Compute game metrics]
    J --> K[Update game record]
    K --> L[End]
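
The loop above can be sketched roughly as follows. This is an illustrative outline, not the engine's actual API: names like `run_game_sketch` and `PAYOFFS` are invented here, and the real engine persists rounds and metrics to the database.

```python
import random

# Illustrative payoff table: (my_action, opp_action) -> (my_payoff, opp_payoff)
PAYOFFS = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

def run_game_sketch(agent_a, agent_b, max_rounds, seed=0):
    rng = random.Random(seed)          # seeded RNG for reproducibility
    history_a, history_b, rounds = [], [], []
    for rnd in range(1, max_rounds + 1):
        a = agent_a(history_a, rng)    # Get Agent A decision
        b = agent_b(history_b, rng)    # Get Agent B decision
        pa, pb = PAYOFFS[(a, b)]       # Compute payoffs
        # Update each agent's own perspective of the history
        history_a.append({"round": rnd, "my_action": a, "opp_action": b,
                          "my_payoff": pa, "opp_payoff": pb})
        history_b.append({"round": rnd, "my_action": b, "opp_action": a,
                          "my_payoff": pb, "opp_payoff": pa})
        rounds.append((rnd, a, b, pa, pb))  # Log round
    return rounds   # caller bulk-saves rounds and computes metrics

# Two always-cooperate agents earn 3 points each, every round
result = run_game_sketch(lambda h, r: "C", lambda h, r: "C", max_rounds=3)
```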

Horizon Types

Fixed Horizon

The game runs for exactly N rounds (default: 100). Both agents know the total.

Geometric Horizon

Each round, the game has probability stop_prob (default: 0.02) of ending. The horizon is pre-computed at game start using a seeded RNG, so each game's length is reproducible from its seed. Agents don't know when the game will end — this changes optimal strategy significantly, since there is no "end-game defection" incentive.
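
One way to pre-compute a geometric horizon with a seeded RNG (a sketch; the engine's actual helper name and cap may differ):

```python
import random

def draw_horizon(seed, stop_prob=0.02, cap=10_000):
    """Draw the number of rounds for a geometric horizon.

    Each round ends the game with probability stop_prob, so the round
    count follows a geometric distribution (mean ~ 1/stop_prob = 50).
    Pre-computing it at game start, rather than rolling per round,
    keeps the game fully reproducible from its seed alone.
    """
    rng = random.Random(seed)
    for n in range(1, cap + 1):
        if rng.random() < stop_prob:   # the game ends after round n
            return n
    return cap                         # safety cap on game length

# Same seed -> same horizon
assert draw_horizon(42) == draw_horizon(42)
```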

Payoff Matrix

Standard Prisoner's Dilemma payoffs:

            Opponent: C    Opponent: D
You: C      3, 3           0, 5
You: D      5, 0           1, 1
  • Temptation (T) = 5: Reward for defecting when opponent cooperates
  • Reward (R) = 3: Mutual cooperation payoff
  • Punishment (P) = 1: Mutual defection payoff
  • Sucker (S) = 0: Penalty for cooperating when opponent defects

The PD constraints: T > R > P > S (defection is individually tempting, but mutual defection is worse than mutual cooperation) and 2R > T + S (mutual cooperation outperforms alternating exploitation).
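
A quick check that the matrix above satisfies both constraints:

```python
T, R, P, S = 5, 3, 1, 0  # values from the table above

# T > R > P > S: defecting against a cooperator pays best,
# mutual cooperation beats mutual defection, and being the
# sucker pays worst.
assert T > R > P > S

# 2R > T + S: over two rounds, mutual cooperation yields 2R = 6
# per player, while trading (D, C) and (C, D) yields only
# T + S = 5 — so players can't profit by taking turns exploiting.
assert 2 * R > T + S
```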

Agent Resolution

Each condition specifies two agents, which can be:

  • LLM Agent → Creates a MockPDAgent (or real CrewAI agent when wired)
  • Policy Agent → Uses PolicyEngine for deterministic decisions
def resolve_agent_for_condition(condition, side):
    if side == "a":
        llm_agent = condition.agent_a_llm
        policy_agent = condition.agent_a_policy
    ...

History Format

Each agent maintains its own perspective of the history:

history_a.append({
    "round": 1,
    "my_action": "C",      # What I did
    "opp_action": "D",     # What opponent did
    "my_payoff": 0,        # What I got
    "opp_payoff": 5,       # What opponent got
})

The last 10 entries are formatted into the prompt using format_history_line().
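
A hypothetical rendering of one entry — the real format_history_line() output may differ; the format string and the `format_history` helper here are illustrative:

```python
def format_history_line(entry):
    """Render one history entry for the prompt (illustrative format)."""
    return (f"Round {entry['round']}: you played {entry['my_action']} "
            f"(payoff {entry['my_payoff']}), opponent played "
            f"{entry['opp_action']} (payoff {entry['opp_payoff']})")

def format_history(history, window=10):
    """Format the last `window` entries, one per line."""
    return "\n".join(format_history_line(e) for e in history[-window:])

line = format_history_line({"round": 1, "my_action": "C",
                            "opp_action": "D", "my_payoff": 0,
                            "opp_payoff": 5})
# e.g. "Round 1: you played C (payoff 0), opponent played D (payoff 5)"
```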

Response Parsing

LLM responses are parsed by parse_action():

  1. Check if first character is C or D → use it
  2. If not, search first line for standalone C or D (word boundary)
  3. If still no match, default to C and log a parse error

On a parse failure, the engine retries the LLM call (up to 2 retries) with framing-specific corrective prompts. Parse errors are flagged on GameRound.agent_a_parse_error / agent_b_parse_error.
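
The three-step fallback could be implemented along these lines (a sketch, not the engine's exact code — the return shape is an assumption):

```python
import re

def parse_action(response):
    """Return ("C" or "D", parse_error_flag) per the fallback rules."""
    text = response.strip()
    # 1. First character is C or D -> use it
    if text[:1].upper() in ("C", "D"):
        return text[:1].upper(), False
    # 2. Search the first line for a standalone C or D (word boundary)
    first_line = text.splitlines()[0] if text else ""
    m = re.search(r"\b([CD])\b", first_line)
    if m:
        return m.group(1), False
    # 3. No match: default to C and flag a parse error
    return "C", True

assert parse_action("D") == ("D", False)
assert parse_action("I choose C this round") == ("C", False)
assert parse_action("hmm...") == ("C", True)
```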

Chat Phase

When chat_enabled=True, the game loop changes:

flowchart TD
    A[Round Start] --> B{Who speaks first?}
    B -->|Round is odd| C[Agent A speaks first]
    B -->|Round is even| D[Agent B speaks first]
    C --> E[A: generates CHAT + ACTION]
    E --> F[Protocol validates A's chat]
    F --> G[B: sees A's chat, generates CHAT + ACTION]
    G --> H[Protocol validates B's chat]
    H --> I[Resolve payoffs]
    D --> J[B speaks first, same flow reversed]
    J --> I

Chat and action are collected in a single LLM call using CHAT: and ACTION: markers.
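
Splitting the combined response might look like this (the CHAT:/ACTION: markers come from the text above; the splitting logic and function name are illustrative):

```python
def split_chat_and_action(response):
    """Extract the CHAT: and ACTION: fields from one LLM response."""
    chat, action = "", ""
    for line in response.splitlines():
        if line.upper().startswith("CHAT:"):
            chat = line[5:].strip()            # text after "CHAT:"
        elif line.upper().startswith("ACTION:"):
            action = line[7:].strip().upper()  # text after "ACTION:"
    return chat, action

chat, action = split_chat_and_action(
    "CHAT: Let's both cooperate this round.\nACTION: C"
)
assert chat == "Let's both cooperate this round."
assert action == "C"
```

The extracted action string would then go through the same parse_action() fallback as in the no-chat case.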

Running an Experiment

from games.engine import run_experiment

# experiment_obj has conditions, each with agent assignments
run_experiment(experiment_obj)
# Creates Game objects, runs all rounds, computes metrics

The engine:

  1. Sets experiment status to "running"
  2. Iterates through conditions × replicates
  3. Creates a Game for each replicate with a random seed
  4. Runs all rounds via run_game()
  5. Bulk-creates GameRound records
  6. Computes and stores metrics
  7. Sets experiment status to "completed" (or "failed")
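
The seven steps above amount to roughly this control flow. This is a sketch with injected stand-in callables; the real engine works with ORM models (Game, GameRound) rather than dicts:

```python
import random

def run_experiment_sketch(experiment, run_game, save_rounds, compute_metrics):
    """Illustrative control flow for the engine's experiment runner."""
    experiment["status"] = "running"                    # 1. mark running
    try:
        for condition in experiment["conditions"]:      # 2. conditions x
            for rep in range(experiment["replicates"]): #    replicates
                seed = random.randrange(2**32)          # 3. random seed
                rounds = run_game(condition, seed)      # 4. run all rounds
                save_rounds(rounds)                     # 5. bulk-create rounds
                compute_metrics(rounds)                 # 6. store metrics
        experiment["status"] = "completed"              # 7. mark completed...
    except Exception:
        experiment["status"] = "failed"                 #    ...or failed
        raise

exp = {"conditions": ["c1", "c2"], "replicates": 2, "status": "pending"}
saved = []
run_experiment_sketch(exp, lambda c, s: [c], saved.append, lambda r: None)
assert exp["status"] == "completed"
assert len(saved) == 4  # 2 conditions x 2 replicates
```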