I invented unexpected goals
— Dingosports (@dingosports) January 10, 2015
I was messing around on the internet one day, reading up on the “Expected Goals” metric that has become the darling of the stats world, when I, like all great explorers before me, “discovered” a site called “Possession with Purpose”. On that site I found an analyst who was creating charts with R² values above 85%. In other words, I found an analyst who had created a metric which showed an almost perfect correlation between his index and the outcome of matches. That analyst’s name is Chris Gluck.
Let’s step back for a second and define a few things. Think of this as a brief history of stats in football: we started with “stats suck” and “the only stat that matters is goals scored.” From there we progressed to “stats don’t tell the whole story” and “stats are like a bikini: what they cover up is more important than what they reveal.” Since then, I have noticed that even traditional writers will now latch on to a stat, like, say, tackles made, and present it as if it had meaning all by itself: “Denilson made 93/93 passes!” That’s where the average football fan is right now: arguing over stats that may or may not have any meaning.
But there is a special brand of analyst out there who is just now starting to impress on the rest of the world that there might be another way. It started, as near as I can tell, with a stat called TSR. Yes, TSR is both the publishing company for the original Dungeons and Dragons and a nerdy stat in soccer: “Total Shots Ratio.”
Total Shots Ratio is simply the ratio of a team’s shots taken to the total shots in its matches (shots taken plus shots allowed). This metric was shoved into the limelight when Grantland published an article about it in August 2013. It’s funny to me, because in my time searching around for an analytic that showed a strong correlation between one event and the League table, I discarded total shots ratio early on as not passing my smell test. After all, teams like Tottenham and Liverpool routinely topped the TSR tables and yet sat lower in the actual table.
TSR was a decent metric, and you often see me talk about how many shots Arsenal take compared to shots allowed because, well, because shots taken v. shots allowed shows some relative strength in each of those areas. For example, Arsenal played Barcelona a few years back and the Catalans took 19 shots to Arsenal’s 0. There was no doubt who dominated that match even if the score line was 3-1. How 3-1? A Busquets own goal. You don’t even need shots to score goals! But TSR’s usefulness was ultimately as a metric which led to another metric: “Expected Goals.”
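To make the definition concrete, here is a minimal sketch of TSR in Python. The function name and the 0.5 fallback for a match with no shots at all are my own assumptions, not part of any official definition; the shot counts are the Arsenal v. Barcelona numbers mentioned above.

```python
def total_shots_ratio(shots_for: int, shots_against: int) -> float:
    """Share of all shots in a match (or season) taken by one team."""
    total = shots_for + shots_against
    if total == 0:
        # Assumption: with no shots by either side, call it an even split.
        return 0.5
    return shots_for / total


# Arsenal v. Barcelona: the Catalans took 19 shots to Arsenal's 0.
arsenal_tsr = total_shots_ratio(shots_for=0, shots_against=19)
barcelona_tsr = total_shots_ratio(shots_for=19, shots_against=0)
print(f"Arsenal TSR: {arsenal_tsr:.2f}, Barcelona TSR: {barcelona_tsr:.2f}")
```

A TSR above 0.5 means a team out-shoots its opponents. It says nothing about how good those shots were.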
The problem with TSR is that it didn’t take into account the type of shot (header, footer), the distance of the shot from goal, what type of pass produced the shot (through ball v. cross), and other factors. Some very smart people, with a lot of time on their hands, started looking at those other factors and created the “Expected Goals” metric.
Michael Caley, a Spurs fan, created the most widely accepted “xG” metric in the analytic community. I’ve linked to his sortable fancy stats League table in his name. You should have a look at it; it’s quite fun. For example, using the shots data, Arsenal are “bang on” the expected goals scored but are lagging a bit in expected goals allowed, by about 4. Southampton, on the other hand, are doing better defensively than we would expect given their opponents’ shot locations and quality. They have allowed just 15 goals this season and Caley’s model predicted they would have let in 20 by January 1st. If you saw the Southampton match against Man United the other day, you will have had it confirmed that Southampton rode their luck a bit: Juan Mata should have scored at least one of the two huge chances presented to him.
Expected Goals carries a strong correlation both to points per game (R²=.73) and goal difference (R²=.79) and is already setting the soccer stats world on fire. People are crafting their own xG metrics (I could do one pretty easily) and this is going to be something that football fans talk about for quite a while, I suspect.
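If R² is new to you, here is a rough sketch of how such a number can be computed. I am assuming the figures above are the squared Pearson correlation between a metric and a league outcome (which is what R² amounts to for a simple straight-line fit); the function name and the toy numbers are hypothetical and are not real league data.

```python
def r_squared(xs: list[float], ys: list[float]) -> float:
    """Squared Pearson correlation between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a series with no variation has no correlation")
    return (sxy * sxy) / (sxx * syy)


# Made-up numbers for illustration only: one metric value and one
# points-per-game value per team.
metric = [0.62, 0.58, 0.51, 0.47, 0.40]
points_per_game = [2.2, 1.9, 1.6, 1.5, 0.9]
print(f"R² = {r_squared(metric, points_per_game):.2f}")
```

An R² of 1.0 means the metric lines up with the outcome perfectly; 0 means there is no straight-line relationship at all.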
But, and I know I shouldn’t do this but I do, why are Arsenal in 5th place on the League table if their xGR (expected goals ratio) should have them in 2nd place? Bad luck? Errors? “Unexpected goals?”
It’s because the metric only looks at one outcome that it has this inherent problem. Goals are capricious. Look at Fernando Torres. He went from beast-mode to least-mode in the span of a few weeks. He was getting chances, he just wasn’t putting them away. And that was happening over and over again.
I’m not saying expected goals is crap, I’m just pointing out that teams can create great scoring chances and not put them away. That happens more often than we like to admit! Teams can also concede goals when they probably shouldn’t due to a number of reasons.
Which is why I was hunting around on the internet looking for different models. And boy howdy did I find one!
I’ve now spent several hours on the phone with Chris from Possession with Purpose and I had to have his model explained to me several times before I got it. Instead of looking at parts of the game, Chris broke the whole game into parts and then re-pieced them back together to form a more comprehensive analytics tool.
The way Chris explains it on his web site is elegant. The goal of a football match is to win, to earn three points. In order to do that you have to score goals. But in order to score goals, you have to create shots. Shots that are on target. And in order to create shots, you have to have penetrative possession. And in order to have penetration, you have to have possession of the ball and move the ball toward the opposition’s goal! In JPG format his theory looks like this:
Possession is 50% in football. Forget the possession numbers you see all the time, each team will have an equal number of possessions in a game. Some possessions may be longer than others but even a team like Arsenal eventually concede the ball.
Ball retention, on the other hand, averages about 79% in the Premier League; that is the pass completion rate. Of those completed passes, 25% are passes in the opposition final third, the ones which create goal scoring opportunities. Only 15% of those final third passes result in shots, 32% of shots in the Premier League are on frame, and 32% of those shots on goal are scored.
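To get a feel for how quickly that chain narrows, here is a back-of-the-envelope sketch that simply multiplies the league-average rates quoted above. The stage labels and function name are mine, and chaining the rates like this mixes passes and shots, so treat it as a rough illustration of the funnel and not as Chris’s index or his method.

```python
# League-average rates quoted in the paragraph above.
FUNNEL = [
    ("passes completed", 0.79),
    ("completed passes in the final third", 0.25),
    ("final third passes leading to a shot", 0.15),
    ("shots on frame", 0.32),
    ("shots on goal scored", 0.32),
]


def cumulative_rates(stages: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Running product of the stage rates, one entry per stage."""
    running = 1.0
    result = []
    for name, rate in stages:
        running *= rate
        result.append((name, running))
    return result


for name, share in cumulative_rates(FUNNEL):
    print(f"{name:<40} {share:.4%}")
```

The running product drops below one percent by the shots-on-frame stage, which is the point of the funnel: very little of what a team does with the ball ends up as a goal.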
This is how a coach would look at the game. In fact, this is how a coach does look at the game: Chris Gluck is a coach and has coached at many levels of football throughout his life. So it’s no surprise that Chris created a tool which measures more than just expected goals.
Just to put this philosophy another way, the point of offense in a football match is to (this is a quote from his site, linked above):
- Gain possession of the ball
- Move the ball
- Penetrate the opponent’s defending final third
- Generate a shot taken
- That ends up on target and,
- Gets past the keeper
In order to measure those things, Chris looks at the following measurable events (again, this is a quote from his site):
- Possession percentage
- Passing Accuracy across the Entire Pitch
- Passing Percentage within and into the Opponents Final Third compared to overall possession (i.e. = Penetration)
- Shots Taken per Percentage of Penetration
- Shots on Goal per Shots Taken
- Goals Scored per Shots on Goal
But that’s only half the game. The other half of the game is limiting the opposition’s ability to do all of those things to you!
And if you take those two metrics together, his Attacking Possession With Purpose Index (APWP) and his Defending Possession With Purpose Index (DPWP), you can form a Composite Possession With Purpose index, and that index has an R² value after week 21 in the English Premier League of…. wait for it… .92.
That is absurdly accurate and, along with his body of work so far, a major reason why he was asked to present at the World Conference on Science and Soccer. Now, before you run off and try to create your own index using this idea, you should be aware that Chris retains Copyright and PWP is Trademarked. Ok? Don’t be a douche.
Chris created a composite index of the English Premier League for an article he wrote just yesterday.
As you can see, his index matches very closely to the League table. There are two teams (Chelsea and City) who have pulled away from the pack with regards to their ability relative to the League average. Below those two teams is a group of three (MUFC, Soton, and Arsenal) who are fighting for third and fourth place.
Looking at just offense, we see why Arsenal are second on the “expected goals” table: Arsenal are a very strong offensive team.
Remember, this is measuring pass success rate, pass success rate in the final third, the ratio of shots generated by those final third passes, the rate of shots on goal, and the goals scored. By those measures, Arsenal are easily right up there with teams like Man City and Chelsea.
The problem at Arsenal is defense.
As we know from watching matches this year, The Arsenal are struggling with team defense. It’s something that Naveen has pointed out several times in his tactics column, and it shows up here in Chris Gluck’s DPWP index, which has Arsenal with only the 8th best defense in the league.
Take the two metrics together and you can see why Arsenal are in 5th place: their offense is good enough for 3rd but their defense is bad enough for 8th. Possession With Purpose isn’t just looking at shot ratios, it isn’t just looking at shot locations, it is looking at all of the qualities that go into attack and defense. When we do that, we see that Arsenal are about exactly where they should be in the League table, 5th, because their defense isn’t stopping the opposition from getting penetrative passes, shots on target, and scoring goals.
I quite like the Possession With Purpose index. It’s accurate. Crazy accurate. And it uses the game’s most vital stats while eschewing the less important stats that people tend to think are important but aren’t. That’s a topic we will cover a bit tomorrow.
Follow Chris Gluck on twitter @chrisgluckpwp
Chris hosts his own web site possessionwithpurpose.com and also writes for the Columbian and Stumptown Footy on SBNation.
All words from his site are COPYRIGHT, All Rights Reserved. PWP – Trademark