Game Theory Simulator

Edit payoff matrices for Prisoner's Dilemma, Stag Hunt, and Chicken Game. Find Nash equilibria automatically and watch cooperation evolve in real-time spatial evolutionary games.

Simulator controls: game presets; a 2×2 payoff matrix (rows: P1, columns: P2; strategies Cooperate (C) and Defect (D)); evolutionary game settings (grid size, mutation rate in %).
Results panel: cooperation rate, generation, average payoff, Pareto optimality.
Strategy legend: Cooperate (C), Defect (D), Tit-for-Tat, Win-Stay.
Theory & Key Formulas
Nash equilibrium: for all players $i$ and all alternative strategies $s_i'$:
$u_i(s_i^*, s_{-i}^*) \geq u_i(s_i', s_{-i}^*)$

Replicator equation:
$\dot{x}_i = x_i(f_i - \bar{f})$

What is Game Theory?

🙋
What exactly is a "Nash Equilibrium"? I see the simulator highlights cells in the payoff matrix when I run it.
🎓
Basically, it's a stable state where no player can get a better payoff by changing their strategy alone. In this simulator, when you click "Find Nash," it checks every cell in the matrix. For instance, in a classic Prisoner's Dilemma, the equilibrium is where both players defect, even though cooperating would be better for both. Try changing the payoffs in the matrix above and see how the highlighted equilibrium cell shifts.
🙋
Wait, really? So the equilibrium isn't always the "best" overall outcome? What's that "Replicator" simulation doing then?
🎓
Exactly! That's the key tension. The Replicator Dynamics simulation shows how strategies evolve over time in a population. When you hit "Simulate," each colored dot is an agent playing a strategy. More successful strategies get copied more often. Try lowering the "Mutation Rate" slider: one strategy will often take over completely. Increase it, and you get a more mixed, unpredictable population.
🙋
So the "Update Rule" changes how they copy each other? What happens if I switch from "Imitate Best" to "Fermi Rule"?
🎓
Great question! "Imitate Best" is deterministic: agents always copy the most successful neighbor. The "Fermi Rule" adds randomness, so an agent sometimes copies a worse-performing neighbor, like making a mistake or experimenting. In practice, this can allow cooperation to survive in harsh environments like the Prisoner's Dilemma. Change the rule while the simulation runs and watch whether the cooperative blue dots can sustain themselves against the defecting red ones.
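To make the difference between the two update rules concrete, here is a minimal Python sketch. It assumes the Fermi rule takes its usual form from the evolutionary game theory literature, where the probability of copying a neighbor is $1 / (1 + e^{(\pi_{\text{own}} - \pi_{\text{neighbor}})/K})$. Whether the simulator uses exactly this form is an assumption, and the function names and the `noise` parameter (playing the role of $K$) are hypothetical.

```python
import math
import random

def fermi_adoption_probability(own_payoff, neighbor_payoff, noise=0.1):
    """Probability of copying the neighbor under the (assumed) Fermi rule."""
    z = (own_payoff - neighbor_payoff) / noise
    z = max(-700.0, min(700.0, z))  # clamp so math.exp cannot overflow
    return 1.0 / (1.0 + math.exp(z))

def imitate_best(own_strategy, own_payoff, neighbors):
    """Deterministic rule: copy the top-scoring neighbor if it beats you.

    neighbors is a list of (strategy, payoff) pairs.
    """
    best_strategy, best_payoff = max(neighbors, key=lambda n: n[1])
    return best_strategy if best_payoff > own_payoff else own_strategy

def fermi_update(own_strategy, own_payoff, neighbors, noise=0.1, rng=random):
    """Stochastic rule: pick one random neighbor, copy it with Fermi probability."""
    strategy, payoff = rng.choice(neighbors)
    if rng.random() < fermi_adoption_probability(own_payoff, payoff, noise):
        return strategy
    return own_strategy
```

With equal payoffs the adoption probability is exactly 0.5. A larger `noise` value flattens the curve, so worse-performing neighbors get copied more often.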

Physical Model & Key Equations

The core condition for a Nash Equilibrium is that each player's strategy is a "best response" to what the others are doing. No one has an incentive to unilaterally deviate.

$$u_i(s_i^*, s_{-i}^*) \geq u_i(s_i', s_{-i}^*)$$

Here, $u_i$ is the payoff for player $i$, $s_i^*$ is their equilibrium strategy, and $s_{-i}^*$ are the strategies of all other players. The inequality must hold for every possible alternative strategy $s_i'$ that player $i$ could choose.
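The condition above can be checked cell by cell in a two-player matrix game. The following is a minimal sketch of that check, not the simulator's actual code. The function name, the list-of-lists layout, and the example payoffs (T=5, R=3, P=1, S=0) are illustrative assumptions, with index 0 standing for Cooperate and 1 for Defect.

```python
from itertools import product

def pure_nash_equilibria(row_payoff, col_payoff):
    """Return all (row, col) cells where neither player gains by deviating alone."""
    n_rows, n_cols = len(row_payoff), len(row_payoff[0])
    equilibria = []
    for r, c in product(range(n_rows), range(n_cols)):
        # Row player: no alternative row pays strictly more against column c.
        row_ok = all(row_payoff[r][c] >= row_payoff[alt][c] for alt in range(n_rows))
        # Column player: no alternative column pays strictly more against row r.
        col_ok = all(col_payoff[r][c] >= col_payoff[r][alt] for alt in range(n_cols))
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria

# Illustrative Prisoner's Dilemma; index 0 = Cooperate, 1 = Defect (assumed layout).
pd_row = [[3, 0], [5, 1]]
pd_col = [[3, 5], [0, 1]]
print(pure_nash_equilibria(pd_row, pd_col))  # [(1, 1)], i.e. (Defect, Defect)
```

The check only finds pure-strategy equilibria. A game such as Matching Pennies returns an empty list because its only equilibrium is mixed.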

Replicator Dynamics describe how the proportion of agents using a strategy changes over time, based on its performance relative to the average.

$$\dot{x}_i = x_i(f_i - \bar{f})$$

$x_i$ is the fraction of the population using strategy $i$. $f_i$ is the fitness (payoff) of strategy $i$, and $\bar{f}$ is the average fitness of the whole population. If a strategy does better than average ($f_i \gt \bar{f}$), its share $x_i$ grows.
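As a rough illustration, the replicator equation can be integrated numerically with a forward-Euler step. This sketch assumes a well-mixed population in which fitness is the expected payoff against a random opponent, $f_i = \sum_j A_{ij} x_j$. That is a simplification of the simulator's spatial grid, and the function names, step size, and example payoffs are hypothetical.

```python
def replicator_step(x, payoff, dt=0.01):
    """One forward-Euler step of dx_i/dt = x_i * (f_i - f_bar)."""
    n = len(x)
    # Fitness of strategy i: expected payoff against a randomly drawn opponent.
    fitness = [sum(payoff[i][j] * x[j] for j in range(n)) for i in range(n)]
    # Population-average fitness f_bar.
    mean_fitness = sum(x[i] * fitness[i] for i in range(n))
    return [x[i] + dt * x[i] * (fitness[i] - mean_fitness) for i in range(n)]

def simulate(x0, payoff, steps=5000, dt=0.01):
    """Iterate the Euler step and return the final strategy shares."""
    x = list(x0)
    for _ in range(steps):
        x = replicator_step(x, payoff, dt)
    return x

# Illustrative Prisoner's Dilemma payoffs to the row strategy; index 0 = C, 1 = D.
pd = [[3, 0], [5, 1]]
final_shares = simulate([0.9, 0.1], pd)
print(final_shares)  # cooperator share shrinks toward 0, defector share toward 1
```

Even when 90% of the population starts as cooperators, defection earns more than the average at every mix, so its share keeps growing.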

Real-World Applications

Auction & Bidding Design: Governments use game theory to design spectrum auctions for mobile networks. The goal is to structure payoffs so the Nash Equilibrium leads to efficient outcomes and fair prices, preventing bidders from gaming the system.

Traffic Flow & Routing: Apps like Waze or Google Maps create a massive game where each driver chooses a route. At the Nash Equilibrium, no single driver can find a faster path by switching routes alone, yet this collective state can be worse (more congestion) than if a central planner assigned routes.

Evolutionary Biology: The Replicator Dynamics model directly applies to animal behavior. For instance, the proportion of "Hawk" (aggressive) vs. "Dove" (peaceful) strategies in a species will evolve based on the payoffs of fights over resources.

Cybersecurity & Network Defense: Security experts model attacks and defenses as a game. A company must allocate limited resources to protect various assets, while an attacker chooses targets. Finding the mixed-strategy Nash Equilibrium helps predict attack patterns and optimize defense spending.

Common Misconceptions and Points to Note

First, let go of the assumption that "the Nash equilibrium is the one and only 'correct answer.'" For example, in the "Stag Hunt" game, there are two pure-strategy Nash equilibria: "everyone cooperates" and "everyone defects." If you change the initial conditions in the simulator, you'll see the convergence shift between these equilibria. This illustrates that in real-world negotiations or markets, different equilibria can be realized depending on initial conditions or historical context (e.g., which technology gained adoption first).
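A short worked derivation shows where the tipping point between the two Stag Hunt equilibria lies. It assumes a well-mixed population under replicator dynamics rather than the simulator's spatial grid, so the grid's actual threshold may differ. Let $x$ be the fraction of cooperators, with payoffs $R$ (mutual cooperation), $S$ (cooperating against a defector), $T$ (defecting against a cooperator), and $P$ (mutual defection):

$$f_C = xR + (1-x)S, \qquad f_D = xT + (1-x)P$$

$$f_C = f_D \iff x^* = \frac{P - S}{(R - T) + (P - S)}$$

For illustrative Stag Hunt payoffs $R=4$, $T=3$, $P=2$, $S=0$, this gives $x^* = 2/(1+2) = 2/3$. Populations starting above two-thirds cooperators drift toward full cooperation, and those starting below drift toward full defection.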

Next, be aware of the pitfalls in setting parameters for evolutionary games. When you set the "update rule" to "best response," the strategy changes on the grid can become extremely fast and chaotic. Considering that real human or biological learning/imitation isn't that perfectly rational, this should prompt you to question whether the model might be overly simplistic. When applying these concepts in practice, remember that the choice of update rule significantly influences the outcomes, so you need to carefully consider the "learning mechanism" of the system you're studying.

Finally, understand that the "ordinal relationship" between payoffs is more fundamental than their "absolute values." In the Prisoner's Dilemma, the relationship between the temptation payoff for defection (T), the reward for mutual cooperation (R), the punishment for mutual defection (P), and the sucker's payoff for unilateral cooperation (S) is T > R > P > S. Even if you drastically increase the numerical value of the "reward R" from 10 to 100 in the simulator, as long as this ordinal relationship holds, the Prisoner's Dilemma structure remains, and defection stays dominant. When tweaking numbers, pay close attention to how this ordering changes.
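The point about ordering can be sketched as a small classifier that looks only at the ranking of the four payoffs. The Prisoner's Dilemma ordering is the one given above. The Stag Hunt and Chicken orderings used here are the common textbook conventions, not something taken from the simulator, and the function name is hypothetical.

```python
def classify_symmetric_game(T, R, P, S):
    """Classify a symmetric 2x2 game by the ordering of its payoffs only."""
    if T > R > P > S:
        return "Prisoner's Dilemma"
    if R > T >= P > S:
        return "Stag Hunt"  # assumed convention: mutual cooperation pays best
    if T > R > S > P:
        return "Chicken"    # assumed convention: mutual defection pays worst
    return "Other"

# Scaling R from 10 to 100 keeps the structure as long as T stays above R.
print(classify_symmetric_game(T=500, R=10, P=1, S=0))   # Prisoner's Dilemma
print(classify_symmetric_game(T=500, R=100, P=1, S=0))  # Prisoner's Dilemma
# Pushing R above T changes the ordering, and with it the game.
print(classify_symmetric_game(T=50, R=100, P=1, S=0))   # Stag Hunt
```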

How to Use

  1. Enter payoff values for each cell: m00r/m00c (both defect), m01r/m01c (row defects, column cooperates), m10r/m10c (row cooperates, column defects), m11r/m11c (both cooperate)
  2. Click Simulate to run 500+ generations of iterated play with random player matching
  3. Monitor Nash equilibrium detection and real-time cooperation rate trending; adjust payoff matrix and re-run to test different strategic structures

Worked Example

Prisoner's Dilemma: m00r=1, m00c=1 (mutual defection), m01r=3, m01c=0 (row defects and exploits, column cooperates and receives the sucker's payoff), m10r=0, m10c=3 (the reverse), m11r=2, m11c=2 (mutual cooperation), so T=3 > R=2 > P=1 > S=0. After 500 generations with tit-for-tat seeding, cooperation rate stabilizes at 78%, Nash equilibrium locked at (Defect, Defect) with average payoff 1.2. Switching to m11r=2.5, m11c=2.5 leaves the Nash equilibrium at (Defect, Defect), since T=3 still exceeds R=2.5, but shifts the evolutionary outcome: cooperation climbs to 92% by generation 450, Pareto optimality improves to 85%.

Practical Notes

  1. Symmetric games (m01r=m10c, m10r=m01c) converge faster; asymmetric payoffs trigger cycling behaviors lasting 200+ generations
  2. With the other payoffs as in the Worked Example (m00r=m00c=1), set m11r and m11c above 1.5 to make cooperation attractive relative to defection; below 1.0, mutual cooperation pays less than mutual defection, so defection dominates
  3. Watch for mixed-strategy equilibria when no pure Nash exists; cooperation rate oscillates rather than stabilizing
  4. Pareto optimality metric flags whether current generation state dominates the (0,0) outcome; useful for Stag Hunt variants
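For the mixed-strategy case in note 3, a 2×2 game's mixed equilibrium can be found with the standard indifference conditions: each player randomizes so that the opponent's two options pay the same. The sketch below is illustrative only. The function name and matrix layout are hypothetical and do not correspond to the simulator's m00r/m00c fields.

```python
def mixed_equilibrium_2x2(row_payoff, col_payoff):
    """Return (p, q): probability that row plays row 0 and column plays column 0.

    Returns None when no interior mixed equilibrium exists.
    """
    A, B = row_payoff, col_payoff
    # Row's mix p must make the column player indifferent between columns.
    denom_p = B[0][0] - B[0][1] - B[1][0] + B[1][1]
    # Column's mix q must make the row player indifferent between rows.
    denom_q = A[0][0] - A[0][1] - A[1][0] + A[1][1]
    if denom_p == 0 or denom_q == 0:
        return None
    p = (B[1][1] - B[1][0]) / denom_p
    q = (A[1][1] - A[0][1]) / denom_q
    if not (0 <= p <= 1 and 0 <= q <= 1):
        return None
    return p, q

# Matching Pennies: no pure Nash equilibrium, so both players mix 50/50.
mp_row = [[1, -1], [-1, 1]]
mp_col = [[-1, 1], [1, -1]]
print(mixed_equilibrium_2x2(mp_row, mp_col))  # (0.5, 0.5)
```

A Prisoner's Dilemma returns `None` here, because a dominant strategy leaves nothing to randomize over.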