Model Methodology · FiveStat

How our models were built.

How our models work, what they get right, and exactly where we know they'll go wrong. No disclaimers - just the specifics.

2024/25 Backtest Results

Our walk-forward accuracy metrics across the full 2024/25 Premier League season - 379 matches, no future data used.

Outcome accuracy
51.7%
vs 40.9% home-always baseline
Moneyline accuracy
68.2%
286 decisive matches
Avg RPS
0.205
vs 0.2369 naive · +0.0319 gained
O/U 2.5 accuracy
56.5%
379 matches · 2024/25
Correct score acc
9.8%
Most probable scoreline
xG MAE
0.681
Goals vs predicted xG per team


The Methodology

The above metrics are from a 'walk-forward' backtest run over the full 2024/25 Premier League season (379 available matches). For each gameweek, the model was trained only on data available up to that point: season data from 2016 to the present plus the completed 2024/25 gameweeks. No future data is used in any prediction, mirroring how the model operates in production.

The Outcome accuracy - 51.7%

The model correctly predicted the most likely match outcome (home win, draw, or away win) in 51.7% of fixtures. By outcome: home win 71.6%, away win 64.1%. Key insight: a draw is almost never the single most likely outcome for any probabilistic model, because draws occupy the fewest cells in the scoreline matrix - only 8 of the 64 possible scorelines are draws.

The Ranked Probability Score - 0.205

RPS measures probabilistic accuracy across all three outcomes, rewarding confident correct predictions and penalising confident incorrect ones. Key: lower is better. An equal-weight model scores 0.2369 across the season. Our model's improvement of +0.0319 over that baseline reflects meaningful probability calibration beyond simple outcome picking.
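As an illustrative sketch (not the production code), the RPS for a single match can be computed from cumulative probability differences over the ordered outcomes home win, draw, away win. Note that the 0.2369 baseline quoted above is a season average over the actual outcome mix; a single equal-weight prediction scores 5/18 ≈ 0.278 against a home win and 1/9 ≈ 0.111 against a draw.

```python
def ranked_probability_score(probs, outcome):
    """RPS over ordered outcomes (home win, draw, away win).

    probs:   probabilities in outcome order, summing to 1
    outcome: index of the observed outcome (0 = home, 1 = draw, 2 = away)
    Lower is better; 0 is a perfect, fully confident prediction.
    """
    observed = [1.0 if i == outcome else 0.0 for i in range(len(probs))]
    cum_p = cum_o = total = 0.0
    # sum squared gaps between cumulative predicted and observed distributions
    for p, o in zip(probs[:-1], observed[:-1]):
        cum_p += p
        cum_o += o
        total += (cum_p - cum_o) ** 2
    return total / (len(probs) - 1)

# equal-weight prediction scored against an actual home win
baseline_home = ranked_probability_score([1/3, 1/3, 1/3], outcome=0)
```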

The Moneyline accuracy - 68.2%

Excluding draws and measuring only decisive fixtures, the model correctly identified the winning team in 68.2% of the 286 matches that produced a winner. This is the most relevant metric for win market betting.

The Correct score rate - 9.8%

This is the highest-probability scoreline in each heatmap; it matched the actual result in 9.8% of fixtures. The industry benchmark for correct-score models on EPL fixtures is around 5-8%, making this a strong result given the model uses only historical goals and xG data.

Match prediction model

01
Historical data

Premier League match results from 2016/17 to present are loaded. Goals scored and goals conceded are extracted per match and assigned home or away, forming each team's base ratings.

Note: Newly promoted teams are excluded from GW1 predictions. The model requires a minimum number of Premier League matches in the historical dataset to generate reliable ratings - promoted sides have no top-flight data and are added once they have played enough matches to produce stable estimates.

02
Base attack and defence ratings

ATT and DEF ratings are computed per team. ATT is the average of a team's home and away goals-scored rates; DEF is the same for goals conceded. These ratings represent the team's underlying quality.

ATT = (avg_home_goals_for + avg_away_goals_for) / 2
DEF = (avg_home_goals_against + avg_away_goals_against) / 2
03
Recent form adjustment

Form ratings are computed from the most recent 20 matches and blended with the base ratings. The blending parameter α (0.30) controls how much weight is placed on recent form versus long-run quality. At 0.30, the model places 70% weight on the team's long-run historical average and 30% on recent form - keeping predictions stable and avoiding overreaction to short-term runs.

blended_att = (1 − α) × base_att + α × recent_att
blended_def = (1 − α) × base_def + α × recent_def
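The blending step above can be sketched as a one-line helper (a minimal illustration; the rating values below are made up for the example):

```python
ALPHA = 0.30  # weight placed on recent form vs long-run quality

def blend(base: float, recent: float, alpha: float = ALPHA) -> float:
    """Blend a long-run base rating with a recent-form rating."""
    return (1 - alpha) * base + alpha * recent

# a team averaging 1.6 goals long-run but 2.2 over its last 20 matches
blended_att = blend(1.6, 2.2)  # 0.7 * 1.6 + 0.3 * 2.2 = 1.78
```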
04
Computing Expected goals (xG)

Team xG is derived using a multiplicative Dixon-Coles model - ATT rating × opponent DEF rating. Both ratings are MLE-fitted and normalised to a league mean of 1.0, so the product directly gives expected goals in the correct scale without any further calibration step. The xG is then adjusted for team-specific home field advantage (derived from the last 20 home vs away matches, capped at ±15%) and any manual adjustments for material squad changes mid-season.

xG = blended_att × opponent_blended_def × home_advantage_multiplier
05
Bivariate Poisson simulation

Scorelines are simulated using a bivariate Poisson distribution rather than two independent Poisson processes. Parameter λ₃ (0.05) introduces positive correlation between home and away goals, producing more realistic draw probabilities. Independent Poisson models can underestimate draws as they treat the outcomes as independent - we need to account for the tactics and dynamics of football (losing teams will push forward, level or ahead teams can sit deeper) which increases the likelihood of draws like 1-1 and 2-2.

P(X=i, Y=j) = Σₖ [ Poisson(i−k; λ₁) × Poisson(j−k; λ₂) × Poisson(k; λ₃) ]

where λ₁ = home_xg − λ₃, λ₂ = away_xg − λ₃, and the sum runs over k = 0..min(i,j). This yields an 8×8 scoreline probability matrix, which is normalised to sum to 1.
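A minimal sketch of the construction above, using the common representation X = X₁ + X₃, Y = X₂ + X₃ with independent Poisson components (the shared X₃ term is what induces the positive goal correlation):

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(K = k) for K ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def bivariate_poisson_matrix(home_xg, away_xg, lam3=0.05, max_goals=7):
    """Scoreline matrix m[i][j] = P(home = i, away = j), 0..max_goals each."""
    lam1, lam2 = home_xg - lam3, away_xg - lam3
    m = [[0.0] * (max_goals + 1) for _ in range(max_goals + 1)]
    for i in range(max_goals + 1):
        for j in range(max_goals + 1):
            # sum over the shared component k = 0..min(i, j)
            m[i][j] = sum(
                poisson_pmf(i - k, lam1) * poisson_pmf(j - k, lam2) * poisson_pmf(k, lam3)
                for k in range(min(i, j) + 1)
            )
    # truncation at max_goals leaves a tiny tail; renormalise to sum to 1
    total = sum(sum(row) for row in m)
    return [[p / total for p in row] for row in m]
```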

06
Outcome probability

Match outcome probabilities are derived from the scoreline matrix:

P(home win) = Σ P(i,j) where i (home) > j (away)
P(draw) = Σ P(i,j) where i = j
P(away win) = Σ P(i,j) where j > i

Clean sheet probabilities are derived as P(away goals = 0) for home clean sheet and P(home goals = 0) for away clean sheet. Over 2.5 probability is Σ P(i,j) where i+j > 2.
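The derivations above reduce to sums over regions of the scoreline matrix. A minimal sketch, assuming a normalised matrix m[i][j] = P(home = i, away = j):

```python
def outcome_probs(matrix):
    """Derive market probabilities from a normalised scoreline matrix."""
    home = draw = away = over25 = 0.0
    n = len(matrix)
    for i in range(n):
        for j in range(n):
            p = matrix[i][j]
            if i > j:
                home += p
            elif i == j:
                draw += p
            else:
                away += p
            if i + j > 2:
                over25 += p
    home_cs = sum(matrix[i][0] for i in range(n))  # away scores 0
    away_cs = sum(matrix[0][j] for j in range(n))  # home scores 0
    return {"home": home, "draw": draw, "away": away,
            "over_2_5": over25, "home_cs": home_cs, "away_cs": away_cs}
```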

League table simulation

01
Monte Carlo simulation

The remaining unplayed fixtures are simulated by drawing a random outcome from the model's home win / draw / away win probability distribution for each match. Points are awarded to each team based on these simulated results and stored.

02
10,000 runs

The full remaining season is simulated 10,000 times. Each run produces a final league table. The proportion of runs in which a team finishes in each position gives that team's final league position probability. Expected final points (xPTS) are calculated as current points plus the average points gained across all simulated runs.
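The two steps above can be sketched as follows. This is an illustrative simplification (the fixture tuple layout and team names are invented for the example; the production model also tracks positions, not just points):

```python
import random
from collections import defaultdict

def simulate_season(fixtures, current_points, n_runs=10_000, seed=42):
    """Monte Carlo over remaining fixtures.

    fixtures:       list of (home, away, p_home, p_draw, p_away) tuples
    current_points: dict of team -> points already banked
    Returns dict of team -> expected final points (xPTS).
    """
    rng = random.Random(seed)
    totals = defaultdict(float)
    for _ in range(n_runs):
        pts = dict(current_points)
        for home, away, p_home, p_draw, _p_away in fixtures:
            r = rng.random()  # sample an outcome from the match distribution
            if r < p_home:
                pts[home] += 3
            elif r < p_home + p_draw:
                pts[home] += 1
                pts[away] += 1
            else:
                pts[away] += 3
        for team, p in pts.items():
            totals[team] += p
    return {team: total / n_runs for team, total in totals.items()}
```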

03
Limitations

The simulation uses static match probabilities - it does not re-estimate team ratings between simulated fixtures. This means a team on a strong run whose ratings haven't yet absorbed that form will be slightly undervalued, and vice versa. Results should be read as a snapshot of the current probability distribution, not a forecast that updates dynamically as simulated rounds play out. In practice, this tends to compress the variance of final points distributions slightly toward the mean.

FPL Planner

01
Player Picks - projected xG & xA

Attacking players are ranked by their projected xG for the upcoming gameweek. A player's projected xG is their adjusted xG share (season xG share blended with recent-form performance) multiplied by the team xG the model has assigned to each fixture. Projected xA follows the same approach but uses a player's share of team shot-creating actions and expected assists from the shots data. Both metrics can be viewed across the next 1, 3, or 5 gameweeks, with the multi-GW values representing the sum across all fixtures in that window.

player_xG = adjusted_xg_share × team_xg_vs_opponent
player_xA = adjusted_xa_share × team_xg_vs_opponent
02
Team xG - Attacking Fixture Strength

Each team's expected goals output is taken directly from the match prediction model for their upcoming fixtures. Teams are ranked from highest to lowest xG, giving an at-a-glance view of which teams have the best attacking opportunity in the selected gameweek window. Green bars indicate a high expected output, yellow shows moderate, and grey shows poor.

03
Team xGA - Defensive Fixture Strength

xGA (expected goals against) is the xG the model assigns to a team's opponent, showing how many goals a team is expected to concede in a given fixture. It is derived by taking the opponent's model xG from the same match prediction. Teams are ranked from lowest to highest xGA, so the teams at the top of the chart face the easiest defensive fixtures. This is the defensive mirror of the Team xG chart and is most useful for identifying FPL defenders and goalkeepers with clean sheet potential.

team_xGA = opponent_xG (from match prediction model)
04
Clean Sheet Probability

Per-match clean sheet probabilities are taken directly from the scoreline matrix produced by the match prediction model, specifically P(opponent goals = 0). For multi-gameweek windows (Next 3, Next 5), these per-match probabilities are summed to give us expected clean sheets (xCS) in that window.

xCS (window) = Σ P(clean sheet) across fixtures in window
05
Fixture Difficulty Rating (FDR)

Each upcoming fixture is colour-coded by difficulty based on the model's win probability for the team in question. Easy (green) reflects a high win probability, Medium (yellow) a relatively competitive fixture, Hard (red) a low win probability, and Blank (grey) is a blank gameweek. Thresholds are set at win probability above 55% for Easy, 35-55% for Medium, and below 35% for Hard.
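The threshold mapping can be sketched as a small helper (an illustration of the banding described above, not the site's actual code):

```python
def fdr_colour(win_prob: float, is_blank: bool = False) -> str:
    """Map the model's win probability onto the FDR colour bands."""
    if is_blank:
        return "grey"    # blank gameweek
    if win_prob > 0.55:
        return "green"   # Easy
    if win_prob >= 0.35:
        return "yellow"  # Medium
    return "red"         # Hard
```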

Bet Value Finder

01
Bookie implied probability

Decimal odds from bookmakers are converted to implied probabilities and then margin-adjusted to remove the overround, giving a fair representation of what the bookmaker believes the true probability to be. Odds are sourced from CheckTheChance and refreshed regularly. Note that each sportsbook updates its prices as fixtures approach, so treat the displayed odds as a snapshot - they will move.

implied_prob = (1 / decimal_odds) / Σ (1 / all_market_odds)
02
Expected Value (EV) calculation

EV expresses the percentage edge our model's probability has over the bookie's margin-adjusted implied probability. A positive EV means our model believes the true probability is higher than the bookie is pricing, suggesting the market may be undervaluing that outcome. A negative EV means the bookie's price implies a higher probability than our model estimates.

EV (%) = (model_prob − bookie_implied_prob) × 100
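The two formulas above combine into a short sketch. This uses the multiplicative overround-removal method implied by the formula (the example odds are invented):

```python
def fair_probs(decimal_odds):
    """Remove the overround from a full market's decimal odds.

    Raw implied probabilities (1 / odds) sum to more than 1 because of the
    bookmaker's margin; dividing each by the total rescales them to sum to 1.
    """
    raw = [1 / o for o in decimal_odds]
    overround = sum(raw)
    return [r / overround for r in raw]

def ev_percent(model_prob: float, bookie_prob: float) -> float:
    """Percentage-point edge of the model over the margin-adjusted price."""
    return (model_prob - bookie_prob) * 100

# e.g. a 1X2 market priced at 2.10 / 3.40 / 3.60
fair = fair_probs([2.10, 3.40, 3.60])
```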
03
Limitations

EV is a model-driven signal and does not account for line movement or all bookmaker prices. It should be interpreted as an indicator of where our model disagrees with the market, not as a guarantee of profit. Critically, a positive EV bet losing - or even losing several times in a row - does not mean the signal was wrong. Short-run variance dominates any small sample of bets, and a genuine edge only becomes statistically meaningful across hundreds of outcomes. If you are using this tool for any real-money purpose, that distinction matters. Gamble Responsibly!

F1 race prediction model

01
Driver Pace Rating (DPR)

DPR measures a driver's true pace independent of the car they are in, using only intra-team comparisons. For each completed race with valid lap data, the model filters to clean representative laps (removing lap 1, pit entry/exit laps, and outliers via IQR per driver), takes each driver's median clean lap time, and computes the percentage delta versus their teammate. Because both drivers share the same car and the same circuit conditions on the same day, this removes the majority of equipment bias from the comparison.

per_race_delta = (driver_median − teammate_median) / teammate_median × 100

Negative = faster than teammate (better). These per-race deltas are then aggregated across the current season plus the two prior seasons, weighted by recency decay (0.85 per race back in time), to produce a single DPR value per driver. Reliability bands are applied: 8+ races = high confidence, 4–7 = caution, fewer than 4 = low.
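The recency-weighted aggregation can be sketched as below (a minimal illustration of the decay scheme described above; input ordering is an assumption):

```python
def recency_weighted_dpr(per_race_deltas, decay=0.85):
    """Aggregate per-race teammate deltas into a single DPR value.

    per_race_deltas: percentage deltas, most recent race first.
    Negative = faster than teammate. Each race back in time is
    down-weighted by a factor of `decay`.
    """
    weights = [decay ** i for i in range(len(per_race_deltas))]
    weighted = sum(d * w for d, w in zip(per_race_deltas, weights))
    return weighted / sum(weights)
```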

02
Constructor Pace Index (CPI)

CPI measures each constructor's outright pace relative to the full field. For each race, the field median lap time is computed from all clean laps. Each constructor's two drivers' medians are averaged to produce a constructor median. The delta versus the field median is then recency-weighted using the same 0.85 decay as DPR. A negative CPI means the team is running faster than the field average - lower is better.

CPI_delta = (constructor_median − field_median) / field_median × 100

CPI captures overall car performance across a season and is the primary input representing team competitiveness in the race prediction model.

03
Blending pace with grid position

For each driver, a combined pace score is derived by blending DPR and CPI. This score is then blended with the driver's qualifying grid position using a circuit-specific overtaking index. The overtaking index runs from 0.0 (grid position almost entirely determines result, e.g. Monaco at 0.10) to 1.0 (pure pace dominates, e.g. Monza at 0.80). This reflects the real-world constraint that a fast driver starting 15th at Monaco has far less opportunity to recover position than the same driver at Spa.

race_score = (1 − OI) × grid_score + OI × pace_score

When qualifying data is not yet available (pre-qualifying mode), grid positions are estimated from the pace ranking directly.

04
Monte Carlo simulation

The race is simulated 5,000 times. In each run, each driver draws a pace sample from a normal distribution centred on their race score with a standard deviation of 0.18, representing the inherent randomness of race conditions, safety car timing, tactical strategy, and general chaos. An 8% DNF probability is applied per driver per simulation, reflecting the historical F1 mechanical and incident failure rate. Win, podium (top 3), and points finish (top 10) probabilities are calculated as the proportion of simulations in which each driver achieves each outcome.

P(win) = simulations where driver finishes 1st / N_simulations
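A minimal sketch of the simulation loop, assuming a lower race score means faster (consistent with grid position and negative-is-faster pace deltas feeding into it; the driver names are placeholders):

```python
import random

def simulate_race(race_scores, n_sims=5000, sigma=0.18, dnf_prob=0.08, seed=7):
    """Monte Carlo race simulation.

    race_scores: dict of driver -> combined race score (lower = faster).
    Returns dict of driver -> {"win", "podium", "points"} probabilities.
    """
    rng = random.Random(seed)
    stats = {d: {"win": 0, "podium": 0, "points": 0} for d in race_scores}
    for _ in range(n_sims):
        finishers = []
        for driver, score in race_scores.items():
            if rng.random() < dnf_prob:
                continue  # DNF: driver excluded from this classification
            # pace sample: normal noise around the race score
            finishers.append((rng.gauss(score, sigma), driver))
        finishers.sort()  # lowest sampled score finishes first
        for pos, (_, driver) in enumerate(finishers, start=1):
            if pos == 1:
                stats[driver]["win"] += 1
            if pos <= 3:
                stats[driver]["podium"] += 1
            if pos <= 10:
                stats[driver]["points"] += 1
    return {d: {k: v / n_sims for k, v in s.items()} for d, s in stats.items()}
```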
05
Limitations

The model has no visibility of race strategy, weather forecasts, tyre allocation, or team orders. DPR reliability is lower for drivers new to a team or in their first season, since the teammate comparison requires both drivers to have raced in comparable conditions. Safety cars, VSC periods, and red flags introduce variance the model cannot price in pre-race. Treat the output as a pace-and-grid-based prior that will be overridden by race-day information the model cannot see.

F1 race analysis

01
Stint Efficiency Score

Stint efficiency measures how much faster or slower each driver ran relative to what the tyre compound would predict at that age. For each compound used in the race, a linear degradation model is fitted across all drivers' clean laps on that compound: lap time as a function of tyre age. Each driver's actual lap times are then compared to the model's predicted times lap-by-lap. The mean residual (predicted minus actual) gives the efficiency score for that stint - positive means the driver is outperforming the tyre model, negative means they are underperforming it.

efficiency = mean(predicted_lap_time − actual_lap_time) per stint

This separates tyre management skill from raw pace: a driver who consistently extracts more from an ageing tyre than the compound average will show a positive score regardless of their outright lap time.
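The fit-and-residual step can be sketched with a plain least-squares line (an illustration of the approach described above; the data shapes are assumptions):

```python
from statistics import mean

def stint_efficiency(compound_laps, driver_laps):
    """Mean residual vs a linear degradation model fitted to all clean laps.

    compound_laps: (tyre_age, lap_time) pairs across all drivers on a compound
    driver_laps:   (tyre_age, lap_time) pairs for the stint being scored
    Positive = the driver beat the compound-average degradation curve.
    """
    ages = [a for a, _ in compound_laps]
    times = [t for _, t in compound_laps]
    a_bar, t_bar = mean(ages), mean(times)
    # ordinary least squares: lap_time ~ slope * tyre_age + intercept
    slope = sum((a - a_bar) * (t - t_bar) for a, t in compound_laps) / \
            sum((a - a_bar) ** 2 for a in ages)
    intercept = t_bar - slope * a_bar
    # predicted minus actual, averaged over the stint
    return mean((slope * a + intercept) - t for a, t in driver_laps)
```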

02
Teammate pace delta

The per-race teammate comparison used to compute DPR is also surfaced directly on the race report page. This gives the raw head-to-head median clean lap delta for that specific race, independent of the multi-race weighted average. It provides a single-race lens on intra-team performance that the aggregate DPR smooths over.

race_delta = (driver_median − teammate_median) / teammate_median × 100

F1 Fantasy Planner

01
Expected Fantasy Points (xFP)

xFP projects a driver's expected fantasy points for an upcoming race using the race prediction model. Win, podium, and points-finish probabilities from the simulation are multiplied by the standard F1 fantasy scoring weights for each outcome and summed to produce a single expected points value per driver. xFP is not a guarantee of score - it represents the probability-weighted average outcome across all simulated races.

xFP = Σ (outcome_probability × fantasy_points_for_outcome)
02
Season projection

Season xFP sums actual fantasy points earned in completed races with projected xFP across all remaining races. This gives a full-season expected points total per driver, combining what has already happened with what the model expects to happen. Transfer targets over the next 1, 3, or 5 races are derived from xFP over that horizon, providing a window-adjusted view of who offers the best projected return in each planning period.

03
Limitations

xFP inherits all of the race prediction model's limitations - it does not know about weather, strategy, or team orders. Fantasy scoring can also be affected by fastest lap bonuses and qualifying positions which the model only partially captures. Treat xFP as a starting point for decision-making, not a definitive ranking.

Where the model struggles

Probability is not a promise

A prediction of 65% does not mean the favoured outcome will happen. It means that across a large number of similar matches, the favoured side wins roughly 65% of the time - and loses 35%. A single match resolving against the model's favourite is not evidence the model is wrong any more than a coin landing tails is evidence the coin is biased. The correct way to evaluate probabilistic predictions is over many outcomes, not individual ones. The xTable makes this explicit: teams sitting below their expected position are not necessarily being unlucky this week - they may simply be in the 35% of simulations where their opponents performed above expectation. Any match, betting decision, or fantasy pick that uses a model probability should be understood in those terms.

Newly promoted sides

Teams returning to the Premier League have limited historical data from which to generate ATT and DEF ratings. Until they have played enough matches to produce a reliable estimate - typically 6 to 8 gameweeks - their ratings are generated from average league values, which almost certainly misrepresents their true quality; we flag this under the relevant fixtures early in the season. Predictions for these fixtures should be interpreted with wider uncertainty than the output probability implies. This is a deliberate trade-off: initialising from lower-league data would introduce more noise than starting from league averages.

Mid-season manager changes

The model currently has no feature to detect or respond to managerial appointments. Team ratings are driven entirely by results, so a tactical reset following a new appointment won't register until it produces a sustained change in outcomes (if any). This is the single scenario most likely to produce a model miss on a team whose underlying quality has improved faster than the data can reflect.

Fixture Importance

The model has no concept of what a result will mean to either side. A team with nothing to play for after GW36 may press less intensely than their rating suggests, but the model still assigns them the same probability it would in a match where points still matter. This cuts in several directions. Teams chasing a title or a top-four finish routinely field full-strength sides in every Premier League fixture, often regardless of European commitments, and the model handles these reasonably well.

The harder cases are: sides with European football already secured who rotate ahead of a final; teams mathematically safe from relegation with six games left who visibly ease off; and mid-table sides with no upward or downward pressure who meet a team scrapping for a Champions League place - the model will likely underestimate the motivated side and overestimate the comfortable one. Relegation run-ins are particularly prone to this: a side one point above the drop will frequently outperform their season-average ratings in must-win fixtures, producing upsets the model systematically underweights.

European rotation compounds the issue for elite clubs - a team playing Thursday in Europe and Sunday in the Premier League will field a different eleven to the one their ATT and DEF ratings reflect, and the model cannot see that coming.

Draws

Draws are the hardest outcome to predict for any probabilistic model of football, and this one is no exception. The bivariate Poisson with Dixon-Coles correction improves draw calibration against a naive independent Poisson, but the model's average draw probability (~26%) still slightly underestimates the EPL's actual draw rate. The structural reason is that draws emerge from match dynamics - tactical adjustments, defensive shape after going ahead - that no pre-match model can capture. Treat draw probabilities as directionally useful rather than precisely calibrated.

Early season (GW1–GW6)

The form component of team ratings is computed from the last 20 matches. At the start of a new season, the current-season sample is small and the model leans heavily on historical data that may now be stale - squads change significantly in the summer transfer window. Predictions in the opening six gameweeks carry meaningfully more variance than the rest of the season, and the backtest metrics above should not be assumed to hold equally across all gameweeks. The walk-forward methodology mitigates this, but cannot eliminate it.

What this means for the backtest numbers

The performance metrics above are real and have not been cherry-picked - they reflect the full 2024/25 season under walk-forward conditions with no future data. But they are averages. The model performs best on mid-table fixtures between established sides with stable squads and no European distraction, and worst on promoted sides, rotation-heavy periods in November–March, and matches immediately following a manager change. If you are using the model's output for any purpose that requires you to understand its distributional accuracy rather than its average accuracy, those caveats matter.