How the engine sees football.
Six signals, one model. xG, xMins, set-piece duties, defensive contributions, bookmaker odds, and a meta-stacker residual that learns from its own mistakes — here’s what each one is, and how it lands on your dashboard.
Expected goals (xG)
— What a player's chances were worth, regardless of whether they went in.
Every shot has a probability of scoring based on its location, body part, the kind of pass that set it up, and the defensive pressure on the shooter. xG sums those probabilities. A 0.2 xG game means the shots taken would, on average, lead to one goal every five matches like that.
Why Onside leans on xG: goals come in lumps. A player who shoots from great positions four games in a row and only scores once is almost certainly going to score more often soon. Their xG predicts next week better than their goals did.
We use shot-level xG from Understat, possession-corrected so big teams that dominate the ball aren’t double-counted.
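In code, that step is just a sum over shots. A minimal sketch, assuming records shaped like an Understat export (the players and numbers below are made up):

```python
import pandas as pd

# Toy shot-level records shaped like an Understat export:
# one row per shot, carrying the model's scoring probability.
shots = pd.DataFrame({
    "player":   ["A", "A", "A", "B"],
    "match_id": [1, 1, 2, 1],
    "xg":       [0.08, 0.31, 0.12, 0.76],
})

# A player's match xG is the sum of their shot probabilities.
per_match_xg = shots.groupby(["player", "match_id"])["xg"].sum()
print(per_match_xg)
```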
Expected minutes (xMins)
— Did the manager actually trust this player to play 90?
A player who is averaging 78 minutes a game is a different beast from one averaging 58. The model needs to know whether the floor of a haul is 0 (subbed at half-time) or 90 (locked in).
Onside uses a Bayesian EWMA over the last 6 GWs of recorded minutes. New players inherit a position-typed prior (mids start with the league-average midfielder distribution). The output is two numbers: P(any minutes) and E(minutes | any minutes).
Both numbers feed every downstream component — goals, assists, clean sheets, defensive contributions, bonus — because none of them can happen if the player isn’t on the pitch.
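A minimal sketch of that blend, where the appearance prior, prior strength, and decay rate are illustrative stand-ins rather than production values:

```python
import numpy as np

def expected_minutes(recent_minutes, prior_mean, prior_weight=2.0, decay=0.7):
    """Blend a position-typed prior with an EWMA of the last ~6 GWs.

    recent_minutes: minutes per GW, most recent last, e.g. [90, 90, 67, 0, 90, 78]
    prior_mean:     league-average minutes for the player's position
    prior_weight:   how many 'virtual gameweeks' the prior counts for
    decay:          per-gameweek downweighting of older observations
    """
    mins = np.asarray(recent_minutes, dtype=float)
    w = decay ** np.arange(len(mins) - 1, -1, -1)   # newest GW gets weight 1.0
    played = (mins > 0).astype(float)

    # P(any minutes): weighted share of GWs with an appearance,
    # shrunk toward a permissive 0.8 prior.  (Illustrative numbers.)
    p_play = (0.8 * prior_weight + (w * played).sum()) / (prior_weight + w.sum())

    # E(minutes | any minutes): prior blended with the weighted mean
    # over GWs where the player actually appeared.
    w_played = w * played
    cond_mean = (w_played * mins).sum() / w_played.sum() if w_played.sum() else prior_mean
    e_mins = (prior_weight * prior_mean + w_played.sum() * cond_mean) / (
        prior_weight + w_played.sum()
    )
    return p_play, e_mins

print(expected_minutes([90, 90, 67, 0, 90, 78], prior_mean=65.0))
```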
Set-piece roles
— Who actually takes the corner, free-kick, and penalty when it matters.
Set-piece duties are the most stable predictor of FPL points after minutes. A player who's on penalties scores 2–3 more goals a season than an identical player who isn't.
Onside maintains two layers of set-piece data: a curated JSON (manually corrected after every manager change) and an FBRef-derived scrape (weekly cron, automated). The curated layer wins per (team, role) cell; the scrape catches what we missed.
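The precedence rule is a one-line merge. A sketch with made-up takers: in Python, the later source wins a dict merge, so the curated layer goes last.

```python
# Curated JSON wins per (team, role) cell; the FBRef-derived scrape
# fills in any cell the curation doesn't cover.  (Names illustrative.)
curated = {
    ("ARS", "penalties"): "Saka",
    ("ARS", "corners"):   "Rice",
}
scraped = {
    ("ARS", "corners"):    "Odegaard",   # overridden by curation
    ("ARS", "free_kicks"): "Rice",       # survives: no curated entry
    ("MCI", "penalties"):  "Haaland",
}

set_piece_takers = {**scraped, **curated}
assert set_piece_takers[("ARS", "corners")] == "Rice"
assert set_piece_takers[("ARS", "free_kicks")] == "Rice"
```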
The signal feeds into the goals (penalties + direct FKs) and assists (corners) projections separately, so a player who takes corners but not penalties gets their assist component lifted without inflating their goal expectation.
Defensive contributions (DefCon)
— FPL 2025/26 introduced a new rule that pays defenders + mids for tackles and clearances.
DefCon points are scored when a defender accumulates 10 'defensive contributions' (tackles + clearances + interceptions + blocks) in a single match, or when a midfielder hits 12. The reward is +2 FPL points.
The signal is new enough that residual models trained on old seasons miss it. Onside trains the DefCon expectation per player from the current season’s actual contribution rate, then converts to an expected reward using the FPL threshold.
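One way to make that conversion concrete: treat per-match contributions as roughly Poisson around the season rate (an assumption of this sketch, not a documented engine choice) and take the tail probability above the threshold.

```python
from math import exp, factorial

def expected_defcon_points(rate, threshold, reward=2.0):
    """E[DefCon points] = reward * P(contributions >= threshold).

    rate:      season-to-date contributions per match
               (tackles + clearances + interceptions + blocks)
    threshold: 10 for defenders, 12 for midfielders
    Assumes Poisson match-to-match variation; a modelling
    choice for this sketch, not necessarily the engine's.
    """
    p_below = sum(rate**k * exp(-rate) / factorial(k) for k in range(threshold))
    return reward * (1.0 - p_below)

# A defender averaging 8 contributions a match:
print(round(expected_defcon_points(8.0, threshold=10), 2))  # ~0.57
```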
It's a small signal in absolute terms (~0.5 xPts for a top defender) but it's the difference between a budget centre-back at 2.0 xPts and 2.5 xPts — enough to flip a transfer decision.
Bookmaker odds
— What sharp money thinks will happen on Saturday.
Bookmakers are excellent at projecting team-level xG and match-level outcomes. They incorporate injury news, weather, manager rotation rumours, and a dozen other signals before they publish a line.
Onside scrapes match-level over/under, both-teams-to-score, and team-favourite probability from the public odds feed (Pinnacle-style, vig-corrected). We translate those into team-level xG estimates that feed the per-player goal and assist projections.
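The odds-to-xG step in miniature: strip the vig from a two-way over/under market, then back out the total-goals mean it implies under a Poisson assumption. The prices are made up, and this covers only the over/under leg; the real translation also folds in BTTS and favourite probability.

```python
from math import exp, factorial

def devig(odds_over, odds_under):
    """Strip the bookmaker margin from a two-way over/under market."""
    p_over, p_under = 1 / odds_over, 1 / odds_under
    return p_over / (p_over + p_under)

def poisson_cdf(k, lam):
    return sum(lam**i * exp(-lam) / factorial(i) for i in range(k + 1))

def total_goals_lambda(p_over_2_5, lo=0.1, hi=7.0):
    """Back out the total-goals mean implied by P(goals >= 3),
    assuming totals are Poisson.  (A sketch assumption.)"""
    for _ in range(60):                      # bisection
        mid = (lo + hi) / 2
        if 1 - poisson_cdf(2, mid) < p_over_2_5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

p = devig(1.80, 2.05)              # illustrative prices
lam = total_goals_lambda(p)
print(round(p, 3), round(lam, 2))  # implied P(over 2.5) and match-total xG
```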
Why this matters: the engine often disagrees with itself on a single number (xG says 1.4, odds say 1.7). The blend is calibrated weekly — at GW1 we lean ~70% on odds, by GW30 we lean ~70% on our own xG. The handover is gradual.
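The handover can be as simple as a weight schedule. In this sketch the ~70% anchors at GW1 and GW30 come from above; the linear shape is an assumption.

```python
def odds_weight(gameweek, start=0.7, end=0.3, handover_gw=30):
    """Linear handover from ~70% odds at GW1 to ~30% odds
    (i.e. ~70% own xG) by GW30.  Schedule shape is illustrative."""
    t = min(max((gameweek - 1) / (handover_gw - 1), 0.0), 1.0)
    return start + t * (end - start)

def blended_xg(gw, odds_xg, model_xg):
    w = odds_weight(gw)
    return w * odds_xg + (1 - w) * model_xg

print(round(blended_xg(1, 1.7, 1.4), 2))   # early season: leans on odds
print(round(blended_xg(30, 1.7, 1.4), 2))  # late season: leans on own xG
```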
Meta-stacker residual (engine v5)
— A depth-2 gradient-boosted model that learns where the other five signals are wrong.
After computing every signal above, Onside runs them through a depth-2 GBT trained on cross-season residuals — the gap between the simple-sum projection and the actual FPL points the player scored.
Cross-season validation (train v5 on 23/24, predict 24/25, then reverse) shows an 11.8–13.6% out-of-sample MAE improvement over the previous v4 ensemble. The model self-explains via TreeSHAP — every player drawer shows the top-3 features that pushed the projection up or down.
The residual layer is intentionally small (max-depth 2, only 9 features) because the upstream signals are already strong. We use it to correct, not to predict.
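A sketch of that correction layer on synthetic data, using scikit-learn's gradient boosting with the depth and feature count quoted above; the data, hyperparameters, and interaction term are all illustrative.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in: one row per player-gameweek, 9 features
# (the upstream signals), and a residual target equal to actual
# FPL points minus the simple-sum projection.
X = rng.normal(size=(5000, 9))
simple_sum = X[:, :5].sum(axis=1)                       # base projection
actual = simple_sum + 0.3 * X[:, 0] * X[:, 1] + rng.normal(0, 1, 5000)
residual = actual - simple_sum

# Depth-2 trees can express at most pairwise interactions, so the
# layer corrects the base projection rather than re-predicting points.
stacker = GradientBoostingRegressor(max_depth=2, n_estimators=200,
                                    learning_rate=0.05)
stacker.fit(X, residual)

corrected = simple_sum + stacker.predict(X)
# For the player drawer, shap.TreeExplainer(stacker) can attribute
# each correction to its top-3 features.
```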
Every prediction logged.
We don’t hide the misses. The calibration dashboard shows every gameweek’s residual, the rolling MAE, and the per-position breakdown.