---
title: "The St. Petersburg Paradox"
description: "A game with infinite expected value that nobody would pay much to play—and what it teaches us about decision-making"
author: "Adam Fillion"
date: "2026-01-12"
categories: [probability, economics, paradoxes, decision-theory]
draft: false
---
```{python}
#| label: setup
#| code-fold: true
#| code-summary: "Setup code (click to expand)"
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import pandas as pd
import numpy as np
np.random.seed(42)
```
Here is a classic problem in decision theory, first posed by Nicolaus Bernoulli in 1713 [[1]](#references). Imagine I offer you a betting game.
I, the dealer, start with 1 dollar. I flip a fair coin. If heads, the pot doubles (1 -> 2 -> 4 ...). If tails, you win the entirety of the pot.
For example, an immediate tails wins you 1 dollar, while Heads-Heads-Tails wins you 4 dollars.
**How much are you willing to pay to play this game?**
The outcomes span an infinite range: the payout starts at 1 dollar and has no upper bound. A naive way to price a bet is to pay up to its expected value, so let's compute that:
The payout for $n$ coin flips (ending in tails) is $2^{n-1}$ dollars, with probability $\left(\frac{1}{2}\right)^n$:
$$E[X] = \sum_{n=1}^{\infty} \left(\frac{1}{2}\right)^n \cdot 2^{n-1} = \sum_{n=1}^{\infty} \frac{2^{n-1}}{2^n} = \sum_{n=1}^{\infty} \frac{1}{2} = \frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \cdots = \infty$$
Infinite. Intuition should be screaming at you right now: clearly you shouldn't liquidate your life savings for a chance to play.
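To see the divergence concretely, here is a minimal sketch that sums only the first few terms of the series. The helper name `truncated_expected_value` is made up for illustration; it is not part of the post's own code.

```python
def truncated_expected_value(n_terms: int) -> float:
    """Sum the first n_terms terms of E[X] = sum_n (1/2)^n * 2^(n-1)."""
    # Every term is exactly 1/2, so the partial sum is n_terms / 2.
    return sum(0.5 ** n * 2 ** (n - 1) for n in range(1, n_terms + 1))


for n_terms in (10, 100, 1000):
    print(n_terms, truncated_expected_value(n_terms))
# 10 5.0
# 100 50.0
# 1000 500.0
```

The partial sums grow linearly with the number of terms kept, so no finite price ever covers the full expectation.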
That's why this is a **paradox**: the math and the reality of being a money-loving human don't agree. Thankfully, by extending the mathematics a little to better model human behaviour, we can reach a satisfactory result.
**(Approach 1) Discount low probability events**
Humans naturally discount events that are extremely unlikely. If I told you there was a 0.0001% chance of winning a billion dollars, you **might** mentally round that to zero. Let's formalize this: we ignore any outcome with probability below some threshold $p_{min}$.
If we only consider outcomes with probability $\geq p_{min}$, we include flips where $\left(\frac{1}{2}\right)^n \geq p_{min}$, which means $n \leq \log_2\left(\frac{1}{p_{min}}\right)$.
The expected value becomes:
$$E[X]_{clipped} = \sum_{n=1}^{N_{max}} \frac{1}{2} = \frac{N_{max}}{2} \quad \text{where } N_{max} = \lfloor \log_2(1/p_{min}) \rfloor$$
```{python}
#| label: probability-threshold
#| code-fold: true
def expected_value_with_threshold(p_min):
    """Calculate expected value ignoring events with probability < p_min"""
    n_max = int(np.floor(np.log2(1/p_min)))
    return n_max / 2

# Calculate for various thresholds
thresholds = [0.1, 0.01, 0.001, 0.0001, 0.00001]
evs = [expected_value_with_threshold(p) for p in thresholds]

# Create figure with probability distribution
fig = make_subplots(rows=1, cols=2,
                    subplot_titles=("Probability Distribution (clipped at 1%)",
                                    "Expected Value vs Probability Threshold"))

# Left plot: probability distribution with clipping
n_values = np.arange(1, 25)
probabilities = 0.5 ** n_values
payouts = 2 ** (n_values - 1)
contributions = probabilities * payouts  # Each is 0.5

# Highlight clipped region (p < 0.01 means n >= 7, since 2^-7 is about 0.78%)
colors = ['#2E86AB' if p >= 0.01 else '#cccccc' for p in probabilities]
fig.add_trace(
    go.Bar(x=[f"n={n}" for n in n_values[:12]],
           y=probabilities[:12] * 100,
           marker_color=colors[:12],
           name="Probability (%)"),
    row=1, col=1
)
fig.add_hline(y=1, line_dash="dash", line_color="red", row=1, col=1,
              annotation_text="1% threshold")

# Right plot: expected value vs threshold
# (the :g format avoids float noise such as 0.0010000000000000002% in labels)
fig.add_trace(
    go.Scatter(x=[f"{p * 100:g}%" for p in thresholds],
               y=evs,
               mode='lines+markers+text',
               text=[f"${ev:.1f}" for ev in evs],
               textposition='top center',
               line=dict(color='#E94F37', width=2),
               marker=dict(size=10),
               name="Expected Value"),
    row=1, col=2
)

fig.update_xaxes(title_text="Number of flips", row=1, col=1)
fig.update_yaxes(title_text="Probability (%)", row=1, col=1)
fig.update_xaxes(title_text="Minimum probability threshold", row=1, col=2)
fig.update_yaxes(title_text="Expected Value ($)", row=1, col=2)
fig.update_layout(height=400, showlegend=False)
fig.show()
```
With a 1% probability threshold, you'd only consider outcomes up to 6 flips (winning up to \$32), giving an expected value of **\$3.00**. That's much more aligned with what people actually say they'd pay.
This approach is closely related to capping the dealer's maximum payout: both cut off the tail that makes the sum diverge. If you don't trust that the dealer can actually pay you \$100 quadrillion (and you shouldn't), then you're implicitly applying such a cutoff. A casino with a \$1 million cap gives an expected value of only about **\$10** (about \$11 if outcomes beyond the cap still pay out the full \$1 million).
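As a rough check on the capped-payout figure, the sketch below computes the expected value under two readings of a cap. Both function names are hypothetical, and the 200-flip cutoff is an assumption chosen only so the neglected tail is negligible.

```python
def expected_value_ignoring_tail(cap: float, max_flips: int = 200) -> float:
    """Expected payout if outcomes above `cap` are treated as paying nothing."""
    total = 0.0
    for n in range(1, max_flips + 1):
        payout = 2.0 ** (n - 1)
        if payout <= cap:
            total += 0.5 ** n * payout
    return total


def expected_value_paying_cap(cap: float, max_flips: int = 200) -> float:
    """Expected payout if outcomes above `cap` still pay out `cap`."""
    return sum(0.5 ** n * min(2.0 ** (n - 1), cap)
               for n in range(1, max_flips + 1))


cap = 1_000_000
print(f"ignore the tail: ${expected_value_ignoring_tail(cap):.2f}")  # $10.00
print(f"pay the cap:     ${expected_value_paying_cap(cap):.2f}")     # $10.95
```

Either way, a million-dollar bankroll on the dealer's side supports a price of only about ten dollars.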
**(Approach 2) Use a utility function for money**
A more sophisticated approach, proposed by Daniel Bernoulli [[2]](#references), is to recognize that humans don't value money linearly. Bernoulli proposed a *logarithmic* utility of wealth:
$$U(x) = \ln(x)$$
```{python}
#| label: utility-curves
#| code-fold: true
wealth = np.linspace(1, 10000, 1000)

fig = make_subplots(rows=1, cols=2, subplot_titles=(
    'Linear Value (What math assumes)',
    'Logarithmic Utility (What we feel)'
))

fig.add_trace(
    go.Scatter(x=wealth, y=wealth, mode='lines',
               line=dict(color='#E94F37', width=2), name='Linear'),
    row=1, col=1
)
fig.add_trace(
    go.Scatter(x=wealth, y=np.log(wealth), mode='lines',
               line=dict(color='#2E86AB', width=2), name='Log Utility'),
    row=1, col=2
)

fig.update_xaxes(title_text="Dollars", row=1, col=1)
fig.update_xaxes(title_text="Dollars", row=1, col=2)
fig.update_yaxes(title_text="Value", row=1, col=1)
fig.update_yaxes(title_text="Utility", row=1, col=2)
fig.update_layout(height=350, showlegend=False)
fig.show()
```
With log utility, we maximize *expected utility* rather than expected value:
$$E[U] = \sum_{n=1}^{\infty} \left(\frac{1}{2}\right)^n \ln(2^{n-1}) = \ln(2) \sum_{n=1}^{\infty} \frac{n-1}{2^n}$$
This sum converges! Using the identity $\sum_{n=1}^{\infty} \frac{n}{2^n} = 2$, we get:
$$E[U] = \ln(2) \cdot (2 - 1) = \ln(2) \approx 0.693$$
The "certainty equivalent" (the guaranteed amount that gives the same utility) is $e^{\ln 2} = 2$, i.e. **\$2**. This treats the payout as your entire wealth. Accounting for the money you already hold raises the figure: for a bankroll of about \$100 it comes to around **\$4**.
```{python}
#| label: certainty-equivalent
#| code-fold: true
# Calculate expected utility for Bernoulli's log utility
def expected_log_utility():
    """Calculate the expected log utility of the St. Petersburg game"""
    total = 0.0
    for n in range(1, 100):  # Sum enough terms
        prob = 0.5 ** n
        payout = float(2 ** (n - 1))
        total += prob * np.log(payout)
    return total

eu = expected_log_utility()
certainty_equiv = np.exp(eu)

print(f"Expected log-utility: {eu:.4f}")
print(f"Certainty equivalent: ${certainty_equiv:.2f}")
print(f"\nThis means you should value this game the same as")
print(f"a guaranteed ${certainty_equiv:.2f} payout.")
```
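The wealth-adjusted figure can be sketched the same way. This assumes one particular definition, which the post does not spell out: the certainty equivalent $CE$ for a player holding wealth $W$ solves $\ln(W + CE) = E[\ln(W + X)]$. The function name `certainty_equivalent` and the 200-flip cutoff are illustrative choices.

```python
import math


def certainty_equivalent(wealth: float, max_flips: int = 200) -> float:
    """Solve ln(wealth + CE) = E[ln(wealth + X)] for CE under log utility."""
    expected_log = sum(
        0.5 ** n * math.log(wealth + 2.0 ** (n - 1))
        for n in range(1, max_flips + 1)
    )
    return math.exp(expected_log) - wealth


for w in (0, 10, 100, 1000):
    print(f"wealth ${w:>5}: certainty equivalent = ${certainty_equivalent(w):.2f}")
```

With zero prior wealth this reproduces the \$2 result, and the value rises slowly as wealth grows; for a \$100 bankroll it lands near the \$4 figure quoted above.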
This is pretty much all you need to resolve the paradox, but there is still more to talk about if you are allowed to play the game multiple times.
Something interesting happens as the number of game trials increases: the *typical* average payout grows. The expected value of the average is always infinite (it inherits the infinite mean of a single game), but the payout you're actually likely to realize creeps upward — roughly in proportion to $\log_2$ of the number of trials [[4]](#references). This is also why the "Mean" column below is so jumpy while the "Median" climbs smoothly: a handful of rare, enormous payouts dominate the sample mean.
The mathematics here is similar to what happens in other scenarios like a [random walk](/posts/2026-01-12-queue-paradox-final/).
```{ojs}
//| echo: false
function playStPetersburg() {
  let payout = 1;
  while (Math.random() < 0.5) {
    payout *= 2;
  }
  return payout;
}

function simulateAveragePayouts(nTrials, nSimulations) {
  const averages = [];
  for (let i = 0; i < nSimulations; i++) {
    let total = 0;
    for (let j = 0; j < nTrials; j++) {
      total += playStPetersburg();
    }
    averages.push(total / nTrials);
  }
  return averages;
}

// Pre-compute simulations for all trial counts
allTrialData = {
  const sampleSize = 1000;
  const trialCounts = [1, 5, 10, 25, 50, 100, 250, 500];
  const data = [];
  for (const nTrials of trialCounts) {
    const averages = simulateAveragePayouts(nTrials, sampleSize);
    for (const avg of averages) {
      data.push({nTrials, average: avg});
    }
  }
  return data;
}
```
```{ojs}
//| echo: false
// Summary across all trial counts
summaryData = {
  const trialCounts = [1, 5, 10, 25, 50, 100, 250, 500];
  return trialCounts.map(n => {
    const values = allTrialData.filter(d => d.nTrials === n).map(d => d.average);
    return {
      trials: n,
      mean: d3.mean(values),
      median: d3.median(values),
      p90: d3.quantile(values.sort((a,b) => a-b), 0.9)
    };
  });
}
```
```{ojs}
//| echo: false
html`<div style="font-size: 13px; margin-top: 15px;">
<strong>Summary: How the typical payout grows with trials</strong>
<table style="margin-top: 8px; border-collapse: collapse; width: 100%;">
<tr style="border-bottom: 2px solid #333;">
<th style="text-align: left; padding: 4px 8px;">Trials</th>
<th style="text-align: right; padding: 4px 8px;">Mean</th>
<th style="text-align: right; padding: 4px 8px;">Median</th>
<th style="text-align: right; padding: 4px 8px;">90th %ile</th>
</tr>
${summaryData.map(d => html`<tr style="border-bottom: 1px solid #ddd;">
<td style="padding: 4px 8px;">${d.trials}</td>
<td style="text-align: right; padding: 4px 8px;">$${d.mean.toFixed(2)}</td>
<td style="text-align: right; padding: 4px 8px;">$${d.median.toFixed(2)}</td>
<td style="text-align: right; padding: 4px 8px;">$${d.p90.toFixed(2)}</td>
</tr>`)}
</table>
</div>`
```
Therefore, the more trials I am willing to offer you, the more you should be willing to pay for each game. However, we haven't accounted for downside risk yet.
The Kelly Criterion [[3]](#references) addresses exactly this kind of risk-of-ruin tradeoff. Rather than maximizing raw expected value, it maximizes the expected *growth rate* of your wealth, which naturally penalizes bets large enough to risk bankruptcy. The optimal bet turns out to depend on your current wealth and the payout distribution.
The Kelly Criterion tells us to maximize the expected *logarithm* of wealth. With entry fee $f$ and payout $X$, paying to play turns wealth $W$ into $W + X - f$, so your long-run growth rate is:
$$G(f) = E\left[\ln\left(1 + \frac{X - f}{W}\right)\right] = \sum_{n=1}^{\infty} \frac{1}{2^n} \ln\left(1 + \frac{2^{n-1} - f}{W}\right)$$
Here the only decision is whether to pay the fee, not how much to stake, and $G(f)$ falls strictly as the fee rises. So the target isn't a maximum but a *break-even fee* $f^*$, the most you can pay before long-run growth turns negative:
$$G(f^*) = 0$$
Pay less and your wealth compounds; pay more and it bleeds away, despite the game's infinite expected value. We solve for $f^*$ numerically:
```{python}
#| label: kelly-math
#| code-fold: true
from scipy.optimize import brentq

def growth_rate(entry_fee, wealth):
    """Expected log growth rate per game at a given entry fee"""
    g = 0.0
    for n in range(1, 50):  # Sum enough terms
        prob = 0.5 ** n
        payout = 2 ** (n - 1)
        g += prob * np.log(1 + (payout - entry_fee) / wealth)
    return g

# Break-even fee: the largest fee with non-negative growth, G(f) = 0
wealth_levels = [10, 50, 100, 500, 1000, 10000, 100000, 1000000]

print("Break-even Entry Fees (Kelly Criterion):")
print("-" * 48)
for w in wealth_levels:
    f_star = brentq(lambda f: growth_rate(f, w), 0.01, w)
    print(f"Wealth ${w:>9}: break-even fee = ${f_star:.2f} ({100*f_star/w:.2f}% of wealth)")
```
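The root-finding above relies on $G(f)$ changing sign exactly once on the bracket $[0.01, W]$. A short derivation supports that, assuming the series can be differentiated term by term (reasonable for $f < W + 1$, where every logarithm is defined and the terms are bounded by a geometric series):

$$\frac{dG}{df} = -\sum_{n=1}^{\infty} \frac{1}{2^n} \cdot \frac{1}{W + 2^{n-1} - f} < 0, \qquad G(W) = \sum_{n=1}^{\infty} \frac{1}{2^n} \ln\left(\frac{2^{n-1}}{W}\right) = \ln 2 - \ln W$$

For any fee below \$1 every term of $G(f)$ is positive, and $G(W) < 0$ whenever $W > 2$, so a strictly decreasing $G$ crosses zero exactly once between the two endpoints.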
The break-even fee rises with your bankroll. Overpay, though, and wealth trends downward — here's how a few fixed fees play out from a \$100 start, where break-even sits around \$4:
```{python}
#| label: kelly-simulation
#| code-fold: true
def play_st_petersburg():
    """Play one round of the St. Petersburg game"""
    payout = 1
    while np.random.random() < 0.5:  # Keep flipping while heads
        payout *= 2
    return payout

def simulate_kelly_play(entry_fee, initial_wealth, n_games):
    """Simulate wealth evolution with fixed entry fee"""
    wealth = initial_wealth
    history = [wealth]
    for _ in range(n_games):
        if wealth < entry_fee:
            break  # Can't afford to play
        wealth -= entry_fee
        wealth += play_st_petersburg()
        history.append(wealth)
    return history

# Compare different strategies
np.random.seed(42)
initial = 100
n_games = 200
strategies = {
    '$2 (Conservative)': 2,
    '$4 (Moderate)': 4,
    '$8 (Aggressive)': 8,
    '$16 (Very Aggressive)': 16
}

fig = go.Figure()
colors = ['#2E86AB', '#4CAF50', '#FF9800', '#E94F37']
for (name, fee), color in zip(strategies.items(), colors):
    history = simulate_kelly_play(fee, initial, n_games)
    fig.add_trace(go.Scatter(
        x=list(range(len(history))),
        y=history,
        mode='lines',
        name=name,
        line=dict(color=color, width=2)
    ))

fig.add_hline(y=initial, line_dash="dash", line_color="gray",
              annotation_text="Starting wealth ($100)")
fig.update_layout(
    title="Wealth Evolution for Different Entry Fees",
    xaxis_title="Games Played",
    yaxis_title="Wealth ($)",
    height=450
)
fig.show()
```
Let's run many simulations to see the distribution of outcomes:
```{python}
#| label: kelly-distributions
#| code-fold: true
def simulate_many_outcomes(entry_fee, initial_wealth=100, n_games=100, n_simulations=1000):
    """Run many simulations and return final wealth distribution"""
    final_wealths = []
    bankruptcies = 0  # Runs that stop early because the fee is unaffordable
    for _ in range(n_simulations):
        wealth = initial_wealth
        for _ in range(n_games):
            if wealth < entry_fee:
                bankruptcies += 1
                break
            wealth -= entry_fee
            wealth += play_st_petersburg()
        final_wealths.append(max(0, wealth))
    return final_wealths, bankruptcies

np.random.seed(123)
fees = [2, 4, 8, 16]
results = {}
for fee in fees:
    wealths, bankruptcies = simulate_many_outcomes(fee)
    results[fee] = {
        'wealths': wealths,
        'bankruptcies': bankruptcies,
        'median': np.median(wealths),
        'mean': np.mean(wealths)
    }

# Create comparison table
print("Results after 100 games (1000 simulations each):")
print("-" * 60)
print(f"{'Entry Fee':<12} {'Bankruptcies':<15} {'Median Wealth':<15} {'Mean Wealth':<15}")
print("-" * 60)
for fee in fees:
    r = results[fee]
    # With 1000 simulations, count / 10 is the bankruptcy percentage
    print(f"${fee:<11} {r['bankruptcies']/10:>13.1f}% {r['median']:>14.0f} {r['mean']:>14.0f}")
```
```{python}
#| label: kelly-boxplot
#| code-fold: true
fig = go.Figure()
for i, fee in enumerate(fees):
    fig.add_trace(go.Box(
        y=results[fee]['wealths'],
        name=f'${fee} entry',
        marker_color=colors[i]
    ))

fig.add_hline(y=100, line_dash="dash", line_color="gray",
              annotation_text="Started with $100")

# Cap the (log) y-axis at the largest final wealth observed so a handful of
# extreme outliers don't squash the boxes into an unreadable sliver.
y_cap = max(w for fee in fees for w in results[fee]['wealths'])
fig.update_layout(
    title="Final Wealth Distribution After 100 Games<br><sub>1,000 simulations each</sub>",
    yaxis_title="Final Wealth ($)",
    height=450,
    yaxis_type="log",
    yaxis_range=[0, np.log10(y_cap)]  # log10 units: $1 up to the biggest value seen
)
fig.show()
```
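The data bear out the break-even analysis. The payout distribution is the same at every fee, so a higher fee buys no extra upside; it only drains your bankroll faster. At \$2, well below the roughly \$4 break-even for a \$100 start, wealth typically grows and ruin is rare. At \$8 and \$16, the large majority of simulated players can no longer afford the fee before 100 games are up. How far below break-even you insist on paying depends on your risk tolerance.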
## References
1. Peterson, Martin. "[The St. Petersburg Paradox](https://plato.stanford.edu/entries/paradox-stpetersburg/)." *Stanford Encyclopedia of Philosophy*.
2. Bernoulli, Daniel. "[Exposition of a New Theory on the Measurement of Risk](https://www.econometricsociety.org/publications/econometrica/1954/01/01/exposition-new-theory-measurement-risk)." *Econometrica* 22, no. 1 (1738/1954): 23–36. Translated by Louise Sommer.
3. Kelly, J. L. "[A New Interpretation of Information Rate](https://www.princeton.edu/~wbialek/rome/refs/kelly_56.pdf)." *Bell System Technical Journal* 35, no. 4 (1956): 917–926.
4. "[St. Petersburg paradox](https://en.wikipedia.org/wiki/St._Petersburg_paradox)." *Wikipedia*. See the repeated-game analysis, where the total payout after $n$ trials grows like $n\log_2 n$ (a result due to Feller).