Reshmi Ghosh


Senior Applied Scientist, Microsoft (MSAI)

What were my odds? Estimating FIFA 2026 ticket lottery chances probabilistically

Posted May 09, 2026

In the first week of January 2026 I placed several bets in the FIFA World Cup 2026 Random Selection Draw. The two applications I’ll walk through here are representative of those bets — one low-demand group-stage match (selected) and one quarterfinal (not selected). I picked these two because they bracket the interesting range of per-match demand. After the draw closed I started wondering: what were the actual odds on each, given how lopsided demand was across matches?

FIFA published two useful aggregate numbers and almost nothing else:

  1. 500M+ total ticket requests across the 104 matches, and
  2. 77 matches that each drew 1M+ requests.

Per-match counts were never released. So the modeling problem is: given those aggregates and a few well-defined unknowns, can we put a credible interval on the probability of winning a specific application? Yes — and you don’t need anything more sophisticated than a Monte Carlo over informed priors.

→ Try the interactive version — sliders for demand, supply share, and public-pool fraction; the point estimate updates live, and the histogram shows the full posterior over 10,000 simulated draws.

The model in one line

For a single application:

P(win) ≈ q × T_cat / D_cat

where

  - q is the number of tickets requested on the application (q = 1 for mine),
  - T_cat is the supply of tickets in your price category for that match, and
  - D_cat is the total number of tickets requested in that category across all applications.

That’s the whole thing. Everything interesting is in how you put priors on the inputs that feed T_cat and D_cat.
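As a sketch, here's the point estimate with everything plugged in. The way T_cat and D_cat decompose into the tabled inputs is my reading of the model, and every number below is a central-scenario setting, not an output of the full simulation:

```python
# Point estimate of P(win) for a single application. The decomposition of
# T_cat and D_cat follows the assumptions table; all values here are the
# central-scenario settings (illustrative, not the model's outputs).
def p_win(q, capacity, public_frac, supply_share, drainage,
          match_demand, demand_share):
    t_cat = capacity * public_frac * supply_share * (1 - drainage)  # category supply
    d_cat = match_demand * demand_share                             # category demand
    return min(1.0, q * t_cat / d_cat)

# M5 (Gillette Stadium, low-demand group-stage match), central values
p = p_win(q=1, capacity=65_878, public_frac=0.65, supply_share=0.35,
          drainage=0.20, match_demand=800_000, demand_share=0.40)
print(f"{p:.2%}")
```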

What’s known vs. assumed

| Input | Source | How I treat it |
|---|---|---|
| Stadium capacity | FIFA published | Fixed (Gillette: 65,878) |
| Public-pool fraction | Industry rule of thumb | Prior centered at 65%, with uncertainty |
| Pre-sale drainage | FIFA: “nearly 2M tickets sold before phase 3” | Prior centered at 20% drainage of T_cat |
| Category supply share | Logical (Cat 3 is the largest tier) | Prior centered at 35% |
| Category demand share | Cheap-tier demand rises with match popularity (downgrade behaviour) | Interpolates 40% (low-demand) → 55% (high-demand) |
| Per-match demand | Unpublished — bucketed using “1M+ for 77 matches” | M5 lognormal at 800k; M97 lognormal at 5M |
| Global demand sum | FIFA: 500M+ across 104 matches | Per-iteration rescaling enforces Σ ≈ 500M ± 5% |
| q-rule | Unspecified by FIFA | Toggle between linear-in-q and application-equal |

The biggest source of uncertainty is per-match demand. The choice of lognormal priors with wide variance is deliberate — I want the credible intervals to reflect that I genuinely don’t know the demand for any specific match.
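A minimal version of that Monte Carlo looks like the following. The supply-side values are the central-scenario settings, and sigma (the width of the lognormal) is my own assumption — the notebook's actual variance choice isn't reproduced here:

```python
import math
import random
import statistics

def p_win_draws(median_demand, sigma=0.6, n=10_000, seed=42):
    """Monte Carlo draws of P(win) for a q=1 application.
    Lognormal demand prior with the given median; sigma is an assumed width."""
    rng = random.Random(seed)
    t_cat = 65_878 * 0.65 * 0.35 * (1 - 0.20)  # central supply-side values
    draws = []
    for _ in range(n):
        demand = median_demand * math.exp(rng.gauss(0.0, sigma))  # lognormal draw
        draws.append(min(1.0, t_cat / (demand * 0.40)))           # 40% Cat 3 share
    return draws

draws = sorted(p_win_draws(800_000))   # M5's demand prior, median 800k
lo, hi = draws[len(draws) // 20], draws[-len(draws) // 20]
print(f"median {statistics.median(draws):.2%}, 90% CI [{lo:.2%}, {hi:.2%}]")
```

The wide sigma is the whole point: the resulting interval is dominated by demand uncertainty, not by the supply-side inputs.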

Why M5 and M97 are roughly independent

I applied for one ticket per match, two total. The 40-ticket aggregate cap doesn’t bind here, so I can treat the two draws as independent. (If you’d applied for many tickets across many matches, you’d want a sequential-draw correction — at the cap, your earlier wins reduce your remaining lottery participation.)
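Under independence the joint outcome probabilities are just products. Using illustrative point estimates (these are v1's numbers, not the model's current outputs), the ordering matches what actually happened:

```python
# Joint outcome probabilities for two independent draws.
# p5 and p97 are illustrative point estimates, not the model's outputs.
p5, p97 = 0.037, 0.006

outcomes = {
    "win both":         p5 * p97,
    "win M5, lose M97": p5 * (1 - p97),
    "lose M5, win M97": (1 - p5) * p97,
    "lose both":        (1 - p5) * (1 - p97),
}
for name, p in sorted(outcomes.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {p:.4f}")
```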

Results

Pulling the priors together and running 10,000 Monte Carlo draws under the central scenario:

So winning M5 and losing M97 sits comfortably inside the model’s bulk — the second-most-likely joint outcome under the central settings. The model isn’t refuted, and with N=2 it’s also not strongly confirmed.

The interesting comparative result is the demand × public-pool heatmap: P(win) is far more sensitive to per-match demand than to the public-pool fraction. If you knew M97’s true demand to within ±20%, you could collapse the credible interval substantially; tightening your prior on the public-pool fraction, by contrast, barely moves the needle.
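That sensitivity claim can be checked directly with the point formula. The grid ranges here are illustrative (roughly ±60% around M97's 5M demand median, ±15% around the 65% public-pool prior), with the other inputs held at central values:

```python
def p_win(match_demand, public_frac):
    # Single-application formula; other inputs held at central values
    t_cat = 65_878 * public_frac * 0.35 * (1 - 0.20)
    return min(1.0, t_cat / (match_demand * 0.40))

demands = [2e6, 3e6, 4e6, 5e6, 6e6, 7e6, 8e6]   # wide range around 5M
pools   = [0.55, 0.60, 0.65, 0.70, 0.75]        # narrow range around 65%

# Max/min ratio of P(win) along each axis of the heatmap
demand_spread = max(p_win(d, 0.65) for d in demands) / min(p_win(d, 0.65) for d in demands)
pool_spread   = max(p_win(5e6, p) for p in pools)   / min(p_win(5e6, p) for p in pools)
print(demand_spread, pool_spread)
```

Because P(win) is linear in each input, the spread along an axis is just the ratio of that input's plausible extremes — and the plausible range for demand is several times wider than for the public-pool fraction.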

What v2 fixed (after a critique)

A first version of this model gave noticeably more optimistic numbers (M5 ~3.7%, M97 ~0.6%). After someone walked through the assumptions carefully, I rewrote it with four fixes:

  1. Pre-sale drainage — Visa Presale and the Early Draw had moved ~2M tickets before the Random Selection Draw. The original model treated T_cat as fully available; v2 subtracts a 20% drainage prior.
  2. Global 500M demand constraint — FIFA’s reported aggregate is now enforced. Per Monte Carlo iteration, the sampled per-match demands are rescaled so the implied 104-match total respects 500M ± 5%. This forces the priors to be internally consistent in a way v1 didn’t.
  3. Demand-share covariance — A flat 45% Cat 3 demand share was the wrong simplification. Marquee matches see disproportionate downgrade-to-Cat-3 behavior, so v2 interpolates between 40% (low-demand) and 55% (high-demand).
  4. q-rule toggle — v1 silently assumed P(win) ∝ q. The competing model is “each application is one entry regardless of tickets requested.” For my q=1 applications this is a no-op, but the spread for multi-ticket apps is up to 4×, so v2 exposes it explicitly.
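Fix 2 amounts to one rescaling step per Monte Carlo iteration. A sketch, with illustrative bucket medians for the "1M+" and lower-demand matches, and rescaling exactly to 500M rather than enforcing the ±5% band:

```python
import math
import random

TARGET = 500e6  # FIFA's reported aggregate demand across 104 matches

def sample_demands(medians, sigma=0.5, rng=None):
    """One iteration: a lognormal draw per match, then a global rescale so
    the 104-match total matches the 500M anchor. (Exact here; the real
    model only enforces +/-5%, which is a modeling choice.)"""
    rng = rng or random.Random()
    raw = [m * math.exp(rng.gauss(0.0, sigma)) for m in medians]
    scale = TARGET / sum(raw)
    return [d * scale for d in raw]

# 77 "1M+" matches and 27 lower-demand matches; medians are illustrative
medians = [5e6] * 77 + [800_000] * 27
demands = sample_demands(medians, rng=random.Random(0))
print(round(sum(demands)))
```

The rescaling ties the per-match priors together: if you sample an unusually high demand for one match, every other match's demand shrinks a little to keep the total honest.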

Net effect: numbers came down, credible intervals tightened, and the page now carries an assumptions inventory (collapsed under the limitations section) listing every modeling choice — supply, demand, lottery mechanics, calibration anchors, and what’s still open. That last bit matters more than the headline number; you should be able to push back on any single assumption and see the model react.

Why I built it three ways

I put three deployable versions of this in the source repo:

  1. Marimo notebook — live Python in the browser via WASM. Maximum interactivity (sliders re-run the actual simulation), but the bundle is ~30 MB.
  2. Quarto + Plotly — pre-computed grid of scenarios rendered as interactive charts. ~3 MB. No live re-compute.
  3. Vanilla HTML + Plotly.js — model formula reimplemented in JS, Monte Carlo runs on page load. ~3 MB, single file, instant load. This is what’s deployed here.

For a one-shot post like this, the simple HTML version was clearly the right call: zero build step, no Pyodide cold start, and the math is simple enough to port to JS without losing fidelity. The Marimo version is what I’d reach for if I were iterating on the model itself.

What this isn’t

This is not a calibration of FIFA’s actual draw mechanism — they almost certainly stratify by category and residency, and may apply host-country weighting, in ways that aren’t public. It’s a transparent first-principles estimate that anchors on the two aggregate numbers FIFA did publish, and it exposes every assumption as a slider.

Next steps

The single biggest unknown in this model is per-match demand. If FIFA ever publishes post-tournament attendance and sales-by-match data, the natural next step is a proper Bayesian update: sample the prior, compute the likelihood from observed sell-through, and report the updated probability. That would meaningfully tighten the credible intervals, and it may shift the central estimates if FIFA’s released numbers diverge from these priors. I’ll write that follow-up the moment the data lands.
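One simple way to do that update is importance weighting over the existing prior draws. The Gaussian likelihood around an observed demand figure is entirely my assumption — a stand-in for whatever FIFA's released numbers would actually support:

```python
import math
import random

def posterior_p_win(obs_demand, obs_sd, n=20_000, seed=1):
    """Prior-to-posterior update by importance weighting (a sketch).
    Prior: lognormal demand, median 5M (M97). Likelihood: Gaussian around
    an observed demand figure -- an assumed stand-in for real data."""
    rng = random.Random(seed)
    t_cat = 65_878 * 0.65 * 0.35 * (1 - 0.20)  # central supply-side values
    num = den = 0.0
    for _ in range(n):
        demand = 5e6 * math.exp(rng.gauss(0.0, 0.5))                # prior draw
        w = math.exp(-0.5 * ((demand - obs_demand) / obs_sd) ** 2)  # likelihood
        num += w * min(1.0, t_cat / (demand * 0.40))
        den += w
    return num / den

# A tight observation pulls the posterior toward the implied point estimate
r = posterior_p_win(obs_demand=4e6, obs_sd=4e5)
print(f"{r:.3%}")
```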

A few smaller follow-ups in the meantime:

Try it

→ Interactive page — drag the sliders, watch the credible interval update.

→ Source on GitHub — Marimo, Quarto, and HTML versions, plus the README explaining the deployment tradeoffs.

If you applied to the same draw and want to compare numbers, I’d love to hear from you — find me on LinkedIn or Twitter.

— Reshmi