Methodology
This ranking is based on past competition results only.
It does not measure today’s fitness or current training level. Use it as a guide for what division or level an athlete has historically competed at — not as a claim about how fit they are right now.
What this is
A single ranking of Estonian CrossFit athletes, computed from publicly available competition results — the CrossFit Open, Quarterfinals, Semifinals, Games (from games.crossfit.com), plus local Estonian competitions hosted on Competition Corner and Circle21 (Fittest in Tartu, Tallinn Throwdown, Kõu Hybrid Storm, etc.).
We display two metrics per athlete, both derived entirely from recorded competition finishes:
- Skill — a Glicko-2 rating built from head-to-head finishes over the last 3 years. Higher Skill means this athlete has consistently finished ahead of others in shared events. It says nothing about what they can lift or run today.
- Points (CWP): CrossFit-style Worldwide Points, averaged over an athlete’s best recorded event scores, with stage-prestige weighting and time decay applied to each event. Captures how well an athlete has competed, recently.
The practical use: if you’re organising a competition and want to seed divisions, or if you’re an athlete deciding whether to enter RX or Scaled, past competition history is the most honest signal available. That is what this ranking surfaces.
Why these numbers are defensible
- Stage maxima mirror CrossFit’s own published Worldwide Ranking (Open 1k / Quarterfinals 2k / Semifinals 4k / Games 10k). We didn’t invent the prestige ratios — they are the ratios athletes already see on games.crossfit.com.
- Glicko-2 is a peer-reviewed Bayesian rating system used by chess federations and major leaderboards. It tracks both rating (μ) and uncertainty (φ), so athletes with only a few events are flagged provisional rather than ranked at full confidence.
- Cross-division ratios (Scaled = 0.5×RX, Foundations = 0.25×RX) are calibrated against bridge athletes who switched divisions across years. The empirical slope (RX→Scaled = 0.57 on n=105) is within 15% of the default — so the defaults stand until a future calibration moves them.
- Time decay is a smooth exponential with a 2-year half-life, cut to zero after 5 years. There are no hard cliffs that would erase the ranking annually given Estonia’s sparse calendar.
Skill (Glicko-2 μ)
Glicko-2 treats every pair of athletes who appeared in the same event as a head-to-head match (better rank wins). After all events are processed, each athlete has μ (rating mean) and φ (rating deviation, i.e. uncertainty). Defaults are μ=1500, φ=350 for a brand-new athlete; the system converges as events accrue. Athletes with very few events sit at high φ and are tagged “Provisional”.
Glickman’s Glicko-2 paper (PDF) is the canonical reference. We use the standard implementation (Ryan Kirkman’s Python port).
Recency window for Skill: Glicko-2 has no native time decay. Old matches count the same as recent ones, so an athlete who was dominant in 2018 would keep a high rating even if they have declined since. We therefore only feed Glicko-2 events from the last 3 years, so that current form drives Skill.
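The two preprocessing steps described above (drop events older than the window, then expand each event into head-to-head results) can be sketched as follows. This is a minimal illustration, not the site's pipeline: the function names, the `date` key, the 365-day year approximation, and the choice to skip ties are all assumptions.

```python
from datetime import date, timedelta
from itertools import combinations


def recent_events(events, today, window_years=3):
    """Keep only events inside the recency window.

    `events` is a list of dicts with a `date` key (hypothetical shape).
    A year is approximated as 365 days in this sketch.
    """
    cutoff = today - timedelta(days=365 * window_years)
    return [e for e in events if e["date"] >= cutoff]


def pairwise_matches(finishes):
    """Expand one event's {athlete: rank} mapping into (winner, loser) pairs.

    A lower rank number is a better finish. Tied ranks are skipped here,
    which is an assumption; the source does not say how ties are handled.
    """
    matches = []
    for (a, rank_a), (b, rank_b) in combinations(sorted(finishes.items()), 2):
        if rank_a < rank_b:
            matches.append((a, b))
        elif rank_b < rank_a:
            matches.append((b, a))
    return matches
```

An event with N finishers produces N × (N − 1) / 2 pairs, so the resulting (winner, loser) tuples are what a Glicko-2 implementation would consume for one rating period.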
Who shows up here
The pipeline computes Skill / Points for every athlete in the data. The public ranking filters more narrowly:
- Active in the last 3 years: the most recent event finish falls within today − 3 years.
- Has competed in at least one Estonian local comp in that window. Anyone in the world can register for the CrossFit Open, and many athletes register but don’t actually compete. Doing a local throwdown is the signal that someone is an active Estonian competitor, not just an Open registrant.
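The two filter conditions above could be expressed roughly like this. The event dict shape (`date`, `is_estonian_local`) and the function name are hypothetical, and a year is approximated as 365 days.

```python
from datetime import date, timedelta


def is_publicly_ranked(events, today, window_years=3):
    """Return True if an athlete passes both public-ranking filters.

    `events` is a list of dicts with `date` and `is_estonian_local` keys
    (an assumed shape, not the site's actual schema).
    """
    cutoff = today - timedelta(days=365 * window_years)
    recent = [e for e in events if e["date"] >= cutoff]
    active = len(recent) > 0                                   # condition 1
    has_local = any(e["is_estonian_local"] for e in recent)    # condition 2
    return active and has_local
```

In this sketch the second condition implies the first; both are kept explicit so the code mirrors the two bullets.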
Data sources
What feeds the ranking, in rough order of weight:
- CrossFit.com — Open (2017–2026), Quarterfinals (2022–2026), Semifinals (2022–2026), Games. Via the public c3po API. Per-athlete-per-WOD results captured.
- Competition Corner — 100 events ingested across Estonian-hosted (Fittest in Tartu, Tallinn Throwdown, Tallinn Partner Throwdown) and Baltic/Nordic events Estonian athletes have travelled to (Lietuvos čempionatas, LYNX GAMES, Sporko Games, Turku Tuomiopäivä, Sweden Throwdown, French Throwdown, Wodapalooza qualifiers, etc.).
- Circle21 — Fittest in Tartu Teams 2025. Other Circle21 events (Kõu Hybrid Storm, FiT 2026) pending direct ingestion or organiser hand-off.
- Hyrox — recon complete; not ingested yet (their robots.txt disallows scraping; awaiting organiser path).
- Community submissions — anyone can submit a missing event URL or upload a results file via /submit. Reviewed manually at session start; duplicates auto-flagged.
Every athlete’s row on the competitions page links through to the per-event leaderboard with their current Skill / Points overlaid.
Points (CWP)
Every athlete also gets a points number called CWP (CrossFit-style Worldwide Points). Each event the athlete has completed earns an event score, and CWP averages the best of those scores:
points = stage_max × percentile × time_decay
CWP = sum(top-8 event scores) / max(min(events, 8), 5)
Percentile is how high you finished in that event: finishing first in a 100-athlete field gives 1.00; finishing 50th gives 0.50; finishing last gives ~0.00. Stage max is the ceiling of points a perfect finish in that stage can earn (the table below).
OWGR-shaped average. CWP is the mean of your best 8 event scores, with a minimum divisor of 5. This replaces the previous sum-of-top-10 cap and kills volume bias at all event counts:
- 1–4 events: the summed score is divided by 5, not by the actual event count. This prevents a single lucky result from inflating a brand-new athlete’s rating.
- 5–8 events: straight mean of all your results.
- 9+ events: mean of your best 8 only. Competing in more events neither helps nor hurts; only the quality of your top results matters. Athletes who have competed in more than 8 events also see a “Career pts” total on their profile: the uncapped sum of all events ever, a fun accumulation metric.
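The three cases above reduce to one expression: sum the best 8 event scores and divide by the number of scores used, but never by less than 5. A minimal sketch, with hypothetical function and parameter names:

```python
def event_points(stage_max, percentile, time_decay):
    """Score for one event: stage ceiling × finish percentile × decay."""
    return stage_max * percentile * time_decay


def cwp(event_scores, top_n=8, min_divisor=5):
    """Mean of the best `top_n` event scores with a minimum divisor.

    With 1-4 events the sum is divided by 5, with 5-8 events it is a
    straight mean, and beyond 8 only the best 8 are counted.
    """
    if not event_scores:
        return 0.0
    best = sorted(event_scores, reverse=True)[:top_n]
    return sum(best) / max(len(best), min_divisor)
```

For example, a single Open RX win (1,000 points before decay) yields a CWP of 200, while eight results of 800 plus four results of 100 yield 800, because the four weakest scores are ignored.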
Why these stage values
The 1k / 2k / 4k / 10k stage maxima mirror CrossFit’s official Worldwide Ranking system. Using the same shape means a CWP number here lines up with ratios athletes already recognise from games.crossfit.com.
| Stage | Max points | Source |
|---|---|---|
| Open RX | 1,000 | CrossFit official |
| Open Scaled | 500 | Half of RX (calibrated via bridge athletes) |
| Open Foundations | 250 | Quarter of RX (calibrated) |
| Quarterfinals | 2,000 | CrossFit official |
| Semifinals | 4,000 | CrossFit official |
| Games | 10,000 | CrossFit official |
| Local elite (e.g. Fittest in Tartu RX) | 1,500 | Owner-set; between Open and Quarterfinals |
| Local RX | 800 | Owner-set |
| Local scaled | 400 | Owner-set |
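As a configuration sketch, the table above can be held in a single lookup. The key names are hypothetical; the values are the ones listed in the table.

```python
# Stage maxima from the table above. Key names are illustrative only.
STAGE_MAX = {
    "open_rx": 1000,
    "open_scaled": 500,
    "open_foundations": 250,
    "quarterfinals": 2000,
    "semifinals": 4000,
    "games": 10000,
    "local_elite": 1500,
    "local_rx": 800,
    "local_scaled": 400,
}


def stage_max(stage):
    """Look up the points ceiling for a stage, failing loudly on typos."""
    try:
        return STAGE_MAX[stage]
    except KeyError:
        raise ValueError(f"unknown stage: {stage!r}") from None
```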
Time decay
Old results count less. Specifically, every event is multiplied by an exponential decay with a half-life of two years, cut to zero after five years.
| Years since event | Counts as |
|---|---|
| 0 | 100% |
| 1 | ~71% |
| 2 | 50% |
| 4 | 25% |
| 5+ | 0% |
We chose a soft decay (not a hard cliff) because Estonian athletes typically only have 1–2 events per year. A cliff would erase people’s rankings every off-year.
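The decay table can be reproduced with a short function. This is a sketch under the stated parameters (2-year half-life, zero at 5 years and beyond); the function name and the clamping of negative ages to zero are assumptions.

```python
def time_decay(years_since_event, half_life=2.0, cutoff=5.0):
    """Exponential decay with a hard zero at `cutoff` years."""
    if years_since_event >= cutoff:
        return 0.0
    years = max(years_since_event, 0.0)  # assumption: future dates count as "now"
    return 0.5 ** (years / half_life)
```

Evaluating it at 0, 1, 2 and 4 years gives 1.0, about 0.707, 0.5 and 0.25, matching the table; an event 3 years old (not listed) would count for roughly 35%.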
Domain fitness profile
Each athlete’s profile shows three domain scores — Cardio, Strength, and Gymnastics — on a 0–100 scale. 50 represents the average Estonian competitor; above 70 means notably strong in that domain; below 30 means that domain is a relative weakness.
The scores are computed from CrossFit Open WOD descriptions. Each WOD is classified by domain weight (e.g. a row-centric WOD scores high on cardio; a heavy-clean WOD scores high on strength). For each WOD the athlete completed, we compute their percentile finish within the EE field. Domain scores are then the decay-weighted mean of those percentiles across all WODs with that domain classification:
domain_score = Σ(percentile × wod_weight × decay) / Σ(wod_weight × decay) × 100
Only Open WODs have reliable per-WOD scores; local competition WODs are typically scored by total rank only, so domain scores are Open-data driven. Athletes with fewer than 3 Open WODs across their history may see a “—” for domains with no data.
Dominant domain is simply the highest-scoring of the three. It is only shown when the score exceeds 50, so athletes with uniformly mediocre Open results won’t get a misleading label.
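The domain score is a weighted mean, and the dominant-domain rule is a thresholded maximum. A minimal sketch of both, assuming each WOD result arrives as a (percentile, wod_weight, decay) tuple; the function names and the use of `None` for a missing score are illustrative choices.

```python
def domain_score(results):
    """Decay-weighted mean percentile for one domain, on a 0-100 scale.

    `results` is an iterable of (percentile, wod_weight, decay) tuples.
    Returns None when there is no usable data (displayed as a dash).
    """
    results = list(results)
    denominator = sum(weight * decay for _, weight, decay in results)
    if denominator == 0:
        return None
    numerator = sum(pct * weight * decay for pct, weight, decay in results)
    return 100 * numerator / denominator


def dominant_domain(scores, threshold=50):
    """Name of the highest-scoring domain, or None if it does not exceed 50."""
    known = {name: value for name, value in scores.items() if value is not None}
    if not known:
        return None
    best = max(known, key=known.get)
    return best if known[best] > threshold else None
```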
Division fit
The Division fit badge appears on an athlete’s profile when the system detects a pattern of consistently finishing near the top of their division while having a significantly higher Skill rating than the division median. It is a signal — not a ruling — that the athlete may have room to compete at a higher level.
The badge is not an accusation of sandbagging. Many legitimate reasons cause an athlete to stay in a lower division: injury, returning from a break, age, or simply personal preference. The badge just surfaces a statistical curiosity: this athlete wins in their division more often than you’d expect given their overall Skill number.
Specifically, the score increments each time an athlete finishes in the top 10% of a division where their Glicko-2 μ is at least one standard deviation above that division’s median μ. The badge appears at a score of 3 or higher. The score resets with new events as form and division choice evolve.
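The counting rule can be sketched as below. Several details are assumptions because the text leaves them open: "one standard deviation" is taken as the population standard deviation of the division's μ values, "top 10%" is taken as rank ≤ 10% of the field size, and the reset behaviour is not modelled. All names are hypothetical.

```python
from statistics import median, pstdev


def division_fit_score(athlete_mu, finishes):
    """Count finishes that are top-10% in a division the athlete outrates.

    `finishes` is a list of dicts with `rank`, `field_size`, and
    `division_mus` (the Glicko-2 mu of everyone in that division).
    """
    score = 0
    for finish in finishes:
        in_top_decile = finish["rank"] <= 0.10 * finish["field_size"]
        mus = finish["division_mus"]
        # Assumption: spread is measured as the population std of division mu.
        threshold = median(mus) + pstdev(mus)
        if in_top_decile and athlete_mu >= threshold:
            score += 1
    return score


def has_division_fit_badge(score, badge_at=3):
    """The badge appears at a score of 3 or higher."""
    return score >= badge_at
```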
Team events
CrossFit competitions include team divisions (pairs, trios, teams of 4). Team results are handled differently from individual ones:
- CWP contribution: each team member earns a fractional share of the team points.
points = (stage_max / team_size) × percentile × decay
where percentile is the team’s finish within its division (field_size is the number of teams in that division) and team_size is the number of athletes on the team. A pair earns half, a team of 4 earns a quarter.
- Glicko-2 / Skill: team results are excluded from the Skill rating entirely. Because individual contribution within a team is unobservable, using team outcomes to update personal Glicko ratings would be unreliable: a teammate’s skill would inflate or depress your Skill number in unpredictable ways. Individual Skill reflects only solo competition results.
- Team-only athletes — athletes whose only recorded results are team events are tagged Team only on their profile. They have no Skill rating but do have CWP points from their fractional team contributions.
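As a worked example of the fractional share, the per-member formula can be written directly. The function name is hypothetical; the percentile is assumed to be computed over the teams in the division before this function is called.

```python
def team_member_points(stage_max, team_size, percentile, decay):
    """Per-athlete share of a team result.

    The stage ceiling is split evenly across team members before the
    usual percentile and decay factors are applied.
    """
    if team_size < 1:
        raise ValueError("team_size must be at least 1")
    return (stage_max / team_size) * percentile * decay
```

With a Local RX ceiling of 800, each member of a winning pair at a current event earns 400, and each member of a mid-field team of 4 (percentile 0.5) earns 100.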
Cross-division calibration
RX, Scaled, and Foundations are different divisions with different difficulty. To compare athletes fairly, we set stage maxima for Scaled (500) and Foundations (250) at half and a quarter of RX (1000). These multipliers are calibrated against bridge athletes: people who appear in two divisions across years. By regressing percentile-in-Scaled against percentile-in-RX for those athletes, we get an empirical ratio for how much harder one division is than the other. The hobby-project defaults match the rule-of-thumb 1:0.5:0.25 ratio.
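One way such a calibration could be run is a least-squares fit through the origin on bridge-athlete percentiles, followed by a tolerance check against the default multiplier. This is only a sketch: the regression form, the direction of the fit (RX percentile as a function of Scaled percentile), and the function names are assumptions, not the method confirmed by the source.

```python
def through_origin_slope(xs, ys):
    """Least-squares slope of y = k * x with no intercept."""
    sum_xx = sum(x * x for x in xs)
    if sum_xx == 0:
        raise ValueError("need at least one non-zero x value")
    return sum(x * y for x, y in zip(xs, ys)) / sum_xx


def within_tolerance(empirical, default, tolerance=0.15):
    """True if the empirical ratio is within `tolerance` of the default."""
    return abs(empirical - default) / default <= tolerance
```

Applied to the figures quoted on this page, `within_tolerance(0.57, 0.5)` is true (a relative gap of 14%), which is why the 0.5 default is kept.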
Provisional flag
An athlete is flagged P for “Provisional” when either is true:
- Their Glicko-2 rating deviation (φ) is greater than 200, or
- They have three or fewer recorded events.
Provisional athletes are never hidden — they appear in the list with the “P” tag. The filter on the homepage lets you hide them if you want a tighter view of established athletes.
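The flag is a simple either-or rule, sketched here with hypothetical names and the thresholds stated above.

```python
def is_provisional(phi, event_count, phi_threshold=200, max_events=3):
    """Provisional if uncertainty is high OR the athlete has few events."""
    return phi > phi_threshold or event_count <= max_events
```

A brand-new athlete at the default φ = 350 is provisional regardless of event count, and an athlete with exactly three events stays provisional even if φ has already dropped below 200.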
Glicko-2 sanity check
In parallel with CWP we also run a Glicko-2 rating, with one rating period per (year, stage) and pairwise matches inside each event (better rank wins). The Glicko-2 μ ± φ are the rating mean and standard deviation under a Bayesian model — useful as an uncertainty signal and as a cross-check on CWP. When CWP and Glicko-2 strongly disagree for an athlete, it usually indicates a division switch and is worth a manual look.
Known limitations
- Lifetime single-division athletes. If an athlete has only ever competed in Scaled, we don’t actually know how they would do in RX. The Provisional tag is the only honest signal for this.
- 2020 Open data gap. The CrossFit API’s country filter is broken for 2020, so that year is missing. The soft time-decay smooths over the hole.
- Bridge athlete self-selection bias. Athletes who try moving up from Scaled to RX tend to be the confident ones, which biases the cross-division calibration upward.
- Time decay erases inactive athletes. An athlete who hasn’t competed in five years drops to zero CWP. A “Hall of Fame” page may be added later to preserve historical recognition.
- Local-comp stage maxima are human judgments. The 1k/2k/4k/10k ratios are CrossFit’s; the local-comp values (Fittest in Tartu, Tallinn Throwdown, etc.) are owner-set. Transparency on this page is what makes them defensible — not uniqueness of the formula.
- Percentile is computed within the EE field. For the Open, the CrossFit API gives us EE-only results; for Quarterfinals / Semifinals / Games we filter the global field to Estonian athletes after fetching. Percentile is therefore calibrated to the Estonian field, not the global one. Within Estonia this is the right comparison; reading an absolute global rank from CWP is not meaningful.
Future: single-number ranking
Once Skill is properly calibrated, the plan is to make Skill the primary sort column and demote Points (CWP) to a secondary career metric. Calibration here means having international anchors in the dataset: athletes who compete both in the Estonian field and against a known-strength global field. Currently both metrics are shown side-by-side because Skill is still miscalibrated for athletes who compete internationally: their losses against a 2400-rated QF or Semifinals field are treated as losses to a default 1500-rated opponent, so the rating understates their true level. Until that anchor problem is solved, Points and Skill together tell a more complete story than either alone.
Source data
All inputs are public CrossFit Games leaderboards. Each athlete links back to their CrossFit Games profile at games.crossfit.com/athlete/{competitorId}.
If you would like your data removed from this ranking, use the Remove my data link in the site footer.