Polling Methodology

How we compute polling averages, and why every parameter is visible.

The Core Principle

Every polling aggregator applies weighting. Most don't tell you what weights they use or why. Decision Labs publishes the exact algorithm, lets you adjust every parameter in real time, and links directly to the source code. No black boxes.

The Formula

weight(poll) = recency × quality × sample_size × partisan_adj

recency = 0.5 ^ (days_old / 14)

a true 14-day half-life — a poll 14 days old gets half the weight of today's

quality = QUALITY_WEIGHTS[pollster_grade]

A+ → 1.5, A → 1.3, A- → 1.2, A/B → 1.1

B+ → 1.0, B → 0.9, B- → 0.8, B/C → 0.7

C+ → 0.6, C → 0.5, C- → 0.4, unrated → 0.5

sample_size = min(sqrt(n / 600), 1.5)

600 is the "standard" sample; larger polls get a modest boost, capped at 1.5×

partisan_adj = 0.7 if the poll is flagged partisan/sponsored, else 1.0

weighted_average = Σ(value × weight) / Σ(weight)

Each candidate's average is computed independently, over the polls that include them; we do not model “undecided” or redistribute it. The displayed shares therefore reflect each candidate's polled support and need not sum to 100%.
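The formula above can be sketched in a few lines of TypeScript. This is a minimal illustration rather than the contents of src/lib/race/weighting.ts: the Poll shape and the function names pollWeight and weightedAverage are assumptions made for the example, while the constants are copied from the formula.

```typescript
// Illustrative sketch only. The type and function names are assumptions,
// not the actual exports of the canonical weights file.
type Grade =
  | "A+" | "A" | "A-" | "A/B"
  | "B+" | "B" | "B-" | "B/C"
  | "C+" | "C" | "C-" | "unrated";

// Multipliers copied from the grade table above.
const QUALITY_WEIGHTS: Record<Grade, number> = {
  "A+": 1.5, "A": 1.3, "A-": 1.2, "A/B": 1.1,
  "B+": 1.0, "B": 0.9, "B-": 0.8, "B/C": 0.7,
  "C+": 0.6, "C": 0.5, "C-": 0.4, "unrated": 0.5,
};

interface Poll {
  daysOld: number;     // age of the poll in days
  grade: Grade;        // pollster grade, or "unrated"
  sampleSize: number;  // number of respondents (n)
  partisan: boolean;   // flagged partisan/sponsored
  value: number;       // one candidate's share in this poll, in points
}

// weight(poll) = recency x quality x sample_size x partisan_adj
function pollWeight(poll: Poll): number {
  const recency = Math.pow(0.5, poll.daysOld / 14);                    // 14-day half-life
  const quality = QUALITY_WEIGHTS[poll.grade];
  const sampleSize = Math.min(Math.sqrt(poll.sampleSize / 600), 1.5);  // capped at 1.5x
  const partisanAdj = poll.partisan ? 0.7 : 1.0;
  return recency * quality * sampleSize * partisanAdj;
}

// Weighted average for ONE candidate over the polls that include them.
// Returns null when there is nothing to average.
function weightedAverage(polls: Poll[]): number | null {
  let weightedSum = 0;
  let totalWeight = 0;
  for (const poll of polls) {
    const w = pollWeight(poll);
    weightedSum += poll.value * w;
    totalWeight += w;
  }
  return totalWeight > 0 ? weightedSum / totalWeight : null;
}
```

For instance, a same-day B+ poll of 600 respondents showing 48 (weight 1.0) and a 14-day-old B+ poll of 600 showing 44 (weight 0.5) average to about 46.67 under this sketch.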

Why These Weights?

Recency (14-day half-life)

Public opinion changes. A poll from two weeks ago is less informative than one from yesterday. We decay each poll by 0.5 ^ (days_old / 14) — a true 14-day half-life, so a 2-week-old poll gets exactly half the weight of today's, a 4-week-old poll a quarter, and so on. This balance avoids overreacting to individual polls while staying responsive to real movement.

Quality (pollster ratings)

Not all polls are equally reliable. We use Silver Bulletin pollster ratings (the successor to FiveThirtyEight's rating system) as our primary quality measure, with the frozen 2024 FiveThirtyEight ratings as a fallback. These ratings are based on historical accuracy and methodological transparency. Grades map to the multipliers in the table above (A+ → 1.5 down to C- → 0.4).

Unrated pollsters

Polls from pollsters we have not yet graded are treated as unrated (weight 0.5), the same weight as a C-grade poll. We never assign a fake letter grade to fill the gap; an ungraded poll is labeled honestly as unrated and weighted as such.

Sample size (sqrt-scaled)

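Larger samples are more reliable, but the gain diminishes. Under square-root scaling, a poll with 2,400 respondents isn't 4× better than one with 600; before the cap it would count only 2× as much (sqrt(2400 / 600) = 2). The 1.5× cap then holds it at 1.5×. The cap binds from n = 1,350 upward (sqrt(1350 / 600) = 1.5), which prevents massive tracking polls from dominating the average.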

Partisan adjustment

Polls commissioned by partisan organizations (party committees, campaigns, aligned groups) tend to have a systematic lean. We don't exclude them — they still contain real information — but we apply a 30% penalty (×0.7) to their weight when a poll is flagged partisan or sponsored.

Where this applies today: the partisan penalty is currently active on the national generic-ballot average. Candidate-level race polls are not yet partisan-flagged in our data, so the penalty has no effect there until a backfill (tagging historical partisan-sponsored race polls) is completed. We'd rather state this plainly than imply a correction we aren't yet applying.

The Decision Labs Rating

Alongside the polling average, every race carries a Decision Labs Rating — a seven-step competitiveness label (Safe / Likely / Lean for each party, plus Tossup). It is a measure of how competitive a seat is, computed the same way for every race. It is not a win probability or a vote-share forecast. All constants below are symmetric across parties; positive margins denote a Democratic advantage (D−R points).

Expected margin (M)

We start from fundamentals (F) and blend in polling as it accumulates.

presLean = 0.6 × (2024 presidential D−R margin for the state)

the most recent presidential result, regressed 40% toward zero (multiplied by 0.6) before it is applied to the office being rated

lastResult = D−R margin of the most recent same-office election

senate or governor; dropped if older than 6 years (~2 cycles) or unavailable

incumbency = ±4 toward the incumbent's party

applied only if the seat is not open AND the incumbent is on the candidate roster; a retired/departed incumbent counts as an open seat (incumbency 0)

F = 0.55 × presLean + 0.45 × lastResult + incumbency

if lastResult is dropped: F = presLean + incumbency
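A minimal sketch of the fundamentals calculation, assuming a hypothetical FundamentalsInput shape (the field names are invented for illustration and are not taken from src/lib/race/rating.ts). The constants 0.6, 0.55, 0.45, ±4 and the 6-year cutoff come from the definitions above.

```typescript
// Hypothetical input shape. Field names are assumptions for this example.
interface FundamentalsInput {
  presMargin2024: number;             // 2024 presidential D-R margin for the state, in points
  lastResult: number | null;          // D-R margin of the latest same-office election, if any
  lastResultAgeYears: number | null;  // how many years ago that election was held
  openSeat: boolean;
  incumbentOnRoster: boolean;
  incumbentParty: "D" | "R" | null;
}

function fundamentals(input: FundamentalsInput): number {
  const presLean = 0.6 * input.presMargin2024;

  // +/-4 toward the incumbent's party, only when the seat is not open
  // AND the incumbent is actually on the candidate roster.
  let incumbency = 0;
  if (!input.openSeat && input.incumbentOnRoster && input.incumbentParty !== null) {
    incumbency = input.incumbentParty === "D" ? 4 : -4;
  }

  // lastResult is dropped when unavailable or older than 6 years.
  const last = input.lastResult;
  const age = input.lastResultAgeYears;
  if (last === null || age === null || age > 6) {
    return presLean + incumbency;
  }
  return 0.55 * presLean + 0.45 * last + incumbency;
}
```

As a worked case under these assumptions: a state with a D+10 presidential margin, a D+2 same-office result four years ago, and a Democratic incumbent running gives presLean = 6 and F = 0.55 × 6 + 0.45 × 2 + 4 = 8.2.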

P = (the one viable Democrat) − (the one viable Republican) in our weighted average

the two-way margin from the same polling average shown on the race page; undecided excluded. P is formed ONLY from a single coherent two-way general field — see the guard below

w = min(N / 8, 0.80)

poll informativeness; N = number of polls. Weighting climbs with the poll count but caps at 0.80, so fundamentals always keep at least a 20% floor; a race needs at least 2 polls before any poll weight applies (N < 2 ⇒ w = 0)

M = w × P + (1 − w) × F

zero polls, a suppressed poll term, or a lone poll (N < 2) ⇒ M = F (fundamentals only)
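The blend can be sketched as two small functions. The names pollBlendWeight and expectedMargin are assumptions for illustration; a null pollMargin stands in for a suppressed or missing poll term.

```typescript
// w = min(N / 8, 0.80), with no poll weight at all below 2 polls.
function pollBlendWeight(pollCount: number): number {
  if (pollCount < 2) return 0;
  return Math.min(pollCount / 8, 0.8);
}

// M = w * P + (1 - w) * F.
// pollMargin is null when the poll term P is suppressed or unavailable,
// in which case the expected margin is fundamentals only.
function expectedMargin(
  fundamentals: number,
  pollMargin: number | null,
  pollCount: number,
): number {
  if (pollMargin === null) return fundamentals;
  const w = pollBlendWeight(pollCount);
  return w * pollMargin + (1 - w) * fundamentals;
}
```

Under this sketch the cap binds from the seventh poll on (6 / 8 = 0.75, while 7 / 8 = 0.875 is cut to 0.80). With F = 8 and P = 2 over four polls, w = 0.5 and M = 5.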

When the poll term is suppressed

The poll term P is formed only from a single coherent two-way general field — one viable Democrat against one viable Republican. We require two things before we trust polling to move a rating: the race must be in its general phase, and the weighted average must show exactly one Democrat and exactly one Republican clearing a 5-point viability floor. In a primary, runoff, jungle ballot, or any field where more than one candidate of a party is still viable, there is no real head-to-head to read — so the poll term is suppressed and the rating runs on fundamentals only. This prevents the average from inventing a contest (for example, a primary front-runner of one party against a primary front-runner of the other) that appears in zero actual polls.
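The guard described above might look roughly like the following. The Phase values, the CandidateAverage shape, and the function name twoWayPollMargin are hypothetical, and the sketch assumes that "clearing" the floor means an average at or above 5 points.

```typescript
type Party = "D" | "R" | "other";
// Hypothetical phase labels. Only "general" allows a poll term.
type Phase = "primary" | "runoff" | "jungle" | "general";

interface CandidateAverage {
  name: string;
  party: Party;
  average: number;  // weighted polling average, in points
}

const VIABILITY_FLOOR = 5;  // assumption: a candidate at exactly 5 counts as viable

// Returns the two-way D-R margin P, or null when the poll term is suppressed.
function twoWayPollMargin(phase: Phase, field: CandidateAverage[]): number | null {
  if (phase !== "general") return null;

  const viable = field.filter((c) => c.average >= VIABILITY_FLOOR);
  const [dem, ...otherDems] = viable.filter((c) => c.party === "D");
  const [rep, ...otherReps] = viable.filter((c) => c.party === "R");

  // Require exactly one viable Democrat and exactly one viable Republican.
  if (!dem || !rep || otherDems.length > 0 || otherReps.length > 0) return null;
  return dem.average - rep.average;
}
```

Returning null rather than a number keeps the suppressed case explicit, so a caller cannot mistake "no head-to-head" for a tied race.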

Uncertainty (σ)

The rating is deliberately cautious about thin or stale evidence. We widen an uncertainty band σ and then shrink the margin toward Tossup by half of it.

σ = base + sparsity + age + dispersion

base = 3.0

sparsity = 3.5 × (1 − w)

a race with no live poll term carries the full 3.5 sparsity penalty; because w caps at 0.80, this term never falls to zero on the poll path — even a heavily-polled race keeps a residual band of 3.5 × (1 − 0.80) = 0.7

age = +2.0 if the newest poll is older than 30 days

dispersion = standard deviation of recent poll margins (0 if fewer than 2 polls)

The w-driven sparsity term and the age penalty are computed ONLY when the poll term is live (P is formed). In a fundamentals-only rating — including primary season, when the field/phase guard suppresses polls that do exist — the sparsity term uses the full penalty (no informativeness credit from suppressed polls) and the age penalty is skipped, so σ = 3.0 + 3.5 = 6.5 flat, with no age or dispersion noise leaking in from dead polls.

σ is floored at 4.0 when nominees are unsettled (a major party has no candidate in the roster yet); the floor only raises σ, so a fundamentals-only σ of 6.5 is unaffected.
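A sketch of how the σ components combine, under stated assumptions: the SigmaInput shape and function names are invented for the example, the caller decides whether the poll term is live, and the dispersion uses the sample standard deviation (n − 1 denominator), which the text above does not specify.

```typescript
interface SigmaInput {
  pollTermLive: boolean;             // true only when P was formed
  w: number;                         // blend weight, min(N / 8, 0.80)
  newestPollAgeDays: number | null;  // age of the newest poll, if any
  recentMargins: number[];           // recent poll D-R margins, in points
  nomineesUnsettled: boolean;        // a major party has no candidate on the roster
}

// Assumption: sample standard deviation (n - 1). Returns 0 for fewer than 2 values.
function sampleStdDev(values: number[]): number {
  if (values.length < 2) return 0;
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  const squared = values.reduce((sum, v) => sum + (v - mean) ** 2, 0);
  return Math.sqrt(squared / (values.length - 1));
}

function sigma(input: SigmaInput): number {
  const base = 3.0;
  let value: number;
  if (!input.pollTermLive) {
    // Fundamentals-only: full sparsity penalty, no age or dispersion terms.
    value = base + 3.5;
  } else {
    const sparsity = 3.5 * (1 - input.w);
    const stale = input.newestPollAgeDays !== null && input.newestPollAgeDays > 30;
    const age = stale ? 2.0 : 0;
    const dispersion = sampleStdDev(input.recentMargins);
    value = base + sparsity + age + dispersion;
  }
  // The floor only ever raises sigma.
  return input.nomineesUnsettled ? Math.max(value, 4.0) : value;
}
```

With w = 0.80, fresh polls, and recent margins of 2, 4 and 6, this gives 3.0 + 0.7 + 0 + 2.0 = 5.7; a fundamentals-only race gives 6.5 regardless of the other inputs.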

From margin to label

We classify on the σ-shrunk effective margin, so a lead has to clear the noise before it earns a confident label.

effM = sign(M) × max(0, |M| − 0.5 × σ)

|effM| < 3 → Tossup

3 ≤ |effM| < 6 → Lean (D if effM > 0, else R)

6 ≤ |effM| < 11 → Likely

|effM| ≥ 11 → Safe
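The classification step can be sketched as follows. The function names effectiveMargin and ratingLabel and the label strings are assumptions for illustration; the 0.5 × σ haircut and the 3 / 6 / 11 thresholds are the ones listed above.

```typescript
type Label =
  | "Safe D" | "Likely D" | "Lean D"
  | "Tossup"
  | "Lean R" | "Likely R" | "Safe R";

// effM = sign(M) * max(0, |M| - 0.5 * sigma)
function effectiveMargin(margin: number, sigma: number): number {
  return Math.sign(margin) * Math.max(0, Math.abs(margin) - 0.5 * sigma);
}

// Positive margins denote a Democratic advantage.
function ratingLabel(margin: number, sigma: number): Label {
  const effM = effectiveMargin(margin, sigma);
  const size = Math.abs(effM);
  if (size < 3) return "Tossup";
  if (size < 6) return effM > 0 ? "Lean D" : "Lean R";
  if (size < 11) return effM > 0 ? "Likely D" : "Likely R";
  return effM > 0 ? "Safe D" : "Safe R";
}
```

For example, with σ = 6.5 a margin of D+10 shrinks to an effective 6.75 and reads Likely D, while R+7 shrinks to 3.75 and reads Lean R.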

A practical consequence: with a fundamentals-only σ of 6.5 (haircut 3.25), every threshold moves out by 3.25 points on the raw fundamentals margin. Before polling arrives, fundamentals within 6.25 points stay Tossup, margins from 6.25 up to 9.25 read Lean, margins from 9.25 up to 14.25 read Likely, and margins of 14.25 or more read Safe. We still widen the band for thin or stale evidence, so we under-claim rather than over-state a lean we cannot yet support; but a genuinely non-competitive seat is not shrunk all the way to Tossup on fundamentals alone.

What the rating deliberately omits

The poll term reuses our published polling average verbatim — it applies no separate house-effect correction, no margin-of-error propagation, and no likely-voter / registered-voter harmonization beyond what the average already does. These are candidate follow-ups, not silent adjustments.

What We Don't Do

We don't run a forecasting model: we aggregate, we don't predict
We don't apply "house effect" corrections: that requires modeling assumptions we want to avoid
We don't weight by pollster diversity: each poll is treated as independent
We don't interpolate between states: each state's average is computed from that state's polls only

Data Sources

Silver Bulletin — Current pollster ratings (2026)
FiveThirtyEight Archive — Historical polls and ratings (pre-2025, frozen)
RealClearPolitics — Individual poll results

Planned, not currently in use: FiftyPlusOne polling data (pending API access). It does not feed any average or rating today; this page will be updated if that changes.

Source Code

The weights above live in one canonical file. Two aggregation shapes consume those same weights — a candidate-topline aggregator (for individual races) and a two-way margin aggregator (for the national generic ballot). They differ only in what they average, never in how a poll is weighted.

Canonical weights (the single source of truth — recency, quality, sample, partisan): src/lib/race/weighting.ts
Candidate-topline aggregation (powers every race / state / national dashboard): src/lib/race/averages.ts
Two-way margin aggregation (powers the national sensitivity controls): src/lib/polling.ts
Decision Labs Rating (the competitiveness label — fundamentals, blend, and uncertainty band described above): src/lib/race/rating.ts

The rating's poll term is built from the candidate-topline aggregator above (the same average the race pages display); it never re-weights polls with a second table. Every constant on this page — the 0.6 presidential regression, the 0.55 / 0.45 fundamentals blend, the ±4 incumbency bump, the min(N / 8, 0.80) blend weight (with a 2-poll minimum before any weight applies and a 5-point field viability floor), the σ components (3.0 / 3.5 / 2.0 / 4.0 floor), and the 3 / 6 / 11 thresholds — is exported from that one file.