Expected Goals, Newcastle United and the Premier League
‘Expected Goals stats tell the very real story’ says The Mag after each Newcastle United fixture!
Before The Mag makes further comment on this subject, it gives an excellent summary of what this metric is:
‘Expected Goals is widely agreed to be the best way of measuring how well Premier League clubs play in any particular game.
To get a better look at how sides are doing, the Expected Goals (xG) metric allows you to get a better picture of just how teams are performing.
Expected goals (xG) is a statistic used to work out how many goals should be scored in a match.
With every single shot awarded an xG value based on the difficulty of the attempt, with factors including distance from goal, type of shot and number of defenders present affecting the value.
The higher the xG of a particular shot, the more likely a goal should be scored from that shot.
The xG value of every shot in a game is then used to calculate the expected goals in a particular match.
So rather than just the usual basic statistics of how many shots each team has, Expected Goals factors in where shots were taken from and how good a chance was and whether defenders in the way etc.’
I often find some of the comments bemusing.
Some commentators try to provide a further synopsis of the metric. Most of them are pretty good, but some fall short. My own comments have in the past concerned the fact that football is unpredictable, and it is actual goals that are going to win matches.
That is largely the point. Although goals can come from any number of unexpected outcomes, expected goals can explain just how unlikely these were.
Unless it’s astrophysics, when I’m confronted with something I don’t quite understand, I like to delve further. So here is my attempt at explaining this metric, which has fast become an alternative way of viewing the game and forms part of something called data analytics.
In a business sense, data analytics concerns the conversion of raw data into actionable insights, using a range of tools, technologies, and processes to find trends, solve problems and improve decision making by using data.
Translating that into football, whilst the terminology may appear to be new and sound like it’s straight from a Harvard manual, the reality is that we’ve been talking Expected Goals since football began. How many times have you heard, “he should have buried that” or “he should’ve had a hat-trick today” or even, “their keeper’s kept them in it”. While watching a game, our intuition tells us which chances are more or less likely to be scored. Considerations will likely include proximity to the goal, the angle, whether it was a one-on-one, a header or even the positioning of the keeper.
Expected goals (or xG) measures the quality of a chance by calculating the likelihood that it will be scored by using information on similar shots in the past.
Opta, the British sports analytics company that was founded in 1996 to analyse Premier League football matches and now provides data for more than 30 sports in over 70 countries, has a huge database that is used to measure xG on a scale between zero and one, where zero represents a chance that is impossible to score, and one represents a chance that a player would be expected to score every single time.
For instance, a chance from the halfway line isn’t as likely to result in a goal as a chance from inside the penalty area, and xG assigns numbers to these different scenarios. Suppose the chance from inside the box is assigned an xG of 0.1. This means that a player would, on average, be expected to score one goal from every ten shots in this situation, or 10% of the time.
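For anyone who likes to see the arithmetic, here is a minimal Python sketch of that idea. The function name and the second list of shot values are my own made-up illustrations, not Opta data; only the 0.1 figure comes from the example above.

```python
def expected_goals(shot_xg_values):
    """Add up per-shot xG values to get the expected goals for those shots."""
    return sum(shot_xg_values)


# Ten shots from the same spot in the box, each rated 0.1 as in the example.
ten_box_shots = [0.1] * 10
print(round(expected_goals(ten_box_shots), 2))  # 1.0, i.e. one goal per ten shots

# A hypothetical set of shots from one match (values invented for illustration).
hypothetical_match_shots = [0.35, 0.1, 0.03, 0.22]
print(round(expected_goals(hypothetical_match_shots), 2))
```

The same adding-up is how a team's xG for a whole match is built from its individual shots.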
The database kept by Opta continues to grow, and at last count its xG model used a machine-learning technique powered by around one million shots. It evaluates how over 20 variables affect the likelihood of a goal being scored. Some of the most important factors are listed below:
Distance to the goal.
Angle to the goal.
Goalkeeper position (which assesses the likelihood that the keeper is able to make a save).
The clarity the shooter has of the goal mouth, based on the positions of other players.
The amount of pressure the shooter is under from the opposition defenders.
Shot type, such as which foot the shooter used or whether it was a volley/header/one-on-one etc.
Pattern of play (e.g. open play, fast break, direct free-kick, corner kick, throw-in etc.).
Information on the previous action, such as the type of assist e.g. through ball, cross etc.
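To give a feel for how factors like these could be turned into a number between zero and one, here is a toy sketch. It is emphatically not Opta's model: I have swapped in a simple logistic formula, used only three of the factors, and invented every weight purely for illustration.

```python
import math

# Invented weights, for illustration only. These are NOT Opta's values.
INTERCEPT = 0.5
WEIGHT_DISTANCE_M = -0.15   # each extra metre from goal lowers the chance
WEIGHT_HEADER = -0.9        # headers are assumed harder than shots with the foot
WEIGHT_PRESSURE = -0.6      # a defender closing in is assumed to lower the chance


def toy_xg(distance_m, is_header=False, under_pressure=False):
    """Squash a weighted sum of shot factors into a probability between 0 and 1."""
    score = (
        INTERCEPT
        + WEIGHT_DISTANCE_M * distance_m
        + WEIGHT_HEADER * is_header
        + WEIGHT_PRESSURE * under_pressure
    )
    return 1.0 / (1.0 + math.exp(-score))


# A close-range shot should rate higher than a long-range one.
print(toy_xg(6) > toy_xg(25))
# The same chance taken as a header should rate lower.
print(toy_xg(10) > toy_xg(10, is_header=True))
```

A real model would learn its weights from the historical shots rather than have someone make them up, which is where the database of around one million shots comes in.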
The main criticisms of xG often appear in scenarios where the metric isn’t being applied correctly, the most common of which is at the game level. A team having a higher xG total in a match doesn’t necessarily imply that they should’ve won the game. xG is only measuring chance quality and not the expected outcome of the game.
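A rough way to see why the higher xG total doesn't guarantee a win is to work out the chance of each result from the individual shots. The sketch below assumes every shot is an independent event, which is a simplification of my own, and the shot lists are invented.

```python
def goal_distribution(shot_xg_values):
    """Probability of scoring exactly 0, 1, 2, ... goals from independent shots."""
    dist = [1.0]  # before any shot, zero goals is certain
    for p in shot_xg_values:
        new = [0.0] * (len(dist) + 1)
        for goals, prob in enumerate(dist):
            new[goals] += prob * (1 - p)   # this shot is missed
            new[goals + 1] += prob * p     # this shot is scored
        dist = new
    return dist


def match_outcome_probabilities(home_shots, away_shots):
    """Return (home win, draw, away win) probabilities from two shot lists."""
    home = goal_distribution(home_shots)
    away = goal_distribution(away_shots)
    home_win = draw = away_win = 0.0
    for h, ph in enumerate(home):
        for a, pa in enumerate(away):
            if h > a:
                home_win += ph * pa
            elif h == a:
                draw += ph * pa
            else:
                away_win += ph * pa
    return home_win, draw, away_win


# Hypothetical match: the home side has 15 poor chances (xG 1.5 in total),
# the away side has two good ones (xG 0.9 in total).
home_win, draw, away_win = match_outcome_probabilities([0.1] * 15, [0.5, 0.4])
print(round(home_win, 2), round(draw, 2), round(away_win, 2))
```

Even with the bigger xG total, the home side is left with a sizeable chance of drawing or losing.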
Another misconception is in the literal interpretation of the metric name. Goals aren’t ‘expected’ to occur exactly as the likelihood predicts. Fractions of goals cannot be scored either. The name “expected goals” is derived from the mathematical concept of “expected value”: the average outcome you would see if the same situation were repeated many times. Take penalties for instance. These are the most consistent shot in football and are given a constant value reflective of their historical conversion rate. That rate is clearly less than 1.0, and the xG assigned to a penalty is 0.79.
Take the toss of a coin. It is 50% likely to land on heads and 50% likely to land on tails (the expected heads or the expected tails per toss is therefore 0.5). This outcome doesn’t happen exactly with each toss of the coin, but over many coin tosses, the share of each outcome should follow this pattern closely (or regress to the mean). The same applies to expected goals. Variance from the expected value is inevitable, particularly over the short run.
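The short run versus long run point can be sketched with a simple simulation. Here I treat every penalty as an independent event scored with probability 0.79, the penalty value quoted earlier. The function name, the seed and the sample sizes are arbitrary choices of mine.

```python
import random


def conversion_rate(num_shots, xg_per_shot, rng):
    """Simulate num_shots independent shots and return the share that go in."""
    goals = sum(1 for _ in range(num_shots) if rng.random() < xg_per_shot)
    return goals / num_shots


rng = random.Random(2024)  # fixed seed so the run is repeatable
for n in (10, 100, 10_000, 100_000):
    # Small samples can land well away from 0.79; large ones settle close to it.
    print(n, round(conversion_rate(n, 0.79, rng), 3))
```

Ten penalties can easily produce six goals or ten, while a hundred thousand will sit very close to 79%.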
Suppose the hypothetical situation where two players had the same number of shots (let’s say 40) but scored ten and five goals respectively. By quantifying the quality of each player’s chances, xG adds context that goes beyond traditional metrics such as shots on target. With this additional context, it might be the case that the player who scored ten goals has underperformed (from the chances he’s had, the average player might have had an xG tally of 12.2), whereas the player with five goals might have overperformed compared to the average player, whose xG tally might have been 3.4 from the same chances.
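Put as arithmetic, the comparison is just goals minus xG. A small sketch using the hypothetical figures above (the player labels and function name are mine):

```python
def xg_difference(goals, xg):
    """Positive means overperforming the chances, negative means underperforming."""
    return goals - xg


# Hypothetical players from the example: both had 40 shots.
players = {
    "Player A": {"goals": 10, "xg": 12.2},
    "Player B": {"goals": 5, "xg": 3.4},
}

for name, stats in players.items():
    diff = xg_difference(stats["goals"], stats["xg"])
    print(name, round(diff, 1))
# Player A -2.2
# Player B 1.6
```

So the player with twice as many goals is the one who has fallen short of his chances.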
Over the short term (a typical season), we might expect a player (or team) who has overperformed against their xG to regress back to scoring in line with their expectation. It’s important to recognise, though, that they have already ‘banked’ this overperformance, and nothing obliges them to underperform later to cancel it out. Expecting that kind of compensation is known as the Gambler’s Fallacy, which is when an individual erroneously believes that a certain random event is somehow either less or more likely to happen, based on the outcome of a previous event or series of events.
Think about it this way. If a roulette wheel landed on black ten times in a row, the likelihood of the next spin of the wheel landing on black is still 0.5 (or 0.4737 if you want to be pedantic and you’re gambling in the United States), but the previous ten occasions where the wheel landed on black have already happened. In other words, if a player (or team) has already scored more than their expected goals total halfway through the season, they are likely to finish the season above their xG total as well, because of the variation banked to that point.
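Both halves of that argument can be checked with a few lines. The roulette figures follow from the wheel layouts (18 black pockets out of 37 on a single-zero wheel, 38 on an American double-zero wheel). The season numbers are invented, and the projection assumes the player simply scores in line with xG from here on.

```python
# Chance of black on any single spin, whatever happened on earlier spins.
p_black_european = 18 / 37   # single-zero wheel
p_black_american = 18 / 38   # double-zero wheel
print(round(p_black_european, 4))  # 0.4865
print(round(p_black_american, 4))  # 0.4737


def projected_season_goals(goals_so_far, xg_remaining):
    """Banked goals plus what the remaining chances are expected to yield."""
    return goals_so_far + xg_remaining


# Hypothetical player at the halfway point: 10 goals from 7.0 xG,
# with another 7.0 xG worth of chances expected in the second half.
season_xg = 7.0 + 7.0
projection = projected_season_goals(10, 7.0)
print(projection, season_xg)  # 17.0 14.0
```

The three goals of overperformance from the first half carry through to the final total, even if the second half goes exactly as expected.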
Useless or otherwise, and we’ll all have an opinion whether we like it or not, the xG metric is here to stay, until the next fad that is. That’s data analytics for you.
For me, the variation between the xG value and the actual is useful intelligence that adds to our understanding of the modern game. Is a player scoring less than he should be? Who is getting chances from high xG situations? Which striker is struggling with their finishing? Which team’s form suggests they should be higher in the league table?
To bring this back to Newcastle United, here are some stats for you from the season so far, where I’ve analysed our top goal scorers’ performance against their xG tally (NB – all references are to the Premier League, unless otherwise stated):
On balance, this looks largely positive, with only Miggy scoring less than his xG tally. At a team level, here’s how the top seven in the Premier League table, as it currently stands, look:
As ever, Man United are the gift that keeps on giving and their variation is larger than that of the likes of Sheffield United (-0.8), Burnley (+1.1) and Luton Town (+1.3), all of whom currently occupy the relegation places.