I have been doing work on in-match metrics relevant to soccer, but lately I have been looking for ways to extract information, however imperfect, from the limited amount of data that we obtain from a match. Goran of the Waiting for the Equalizer blog came up with a goalkeeper metric that weighted goals allowed by the league position of the opponent. I've decided to apply the same concept to goalscorers.
Like Goran's goalkeeper metric, this one requires two pieces of data: the number of goals scored by a player, and the league position of the opposing team at the time of the match. (Yes, a team's position can change during the day, especially in leagues where matches kick off at different times, but I use league positions at the start of the day's play.) The difference is that instead of using a simple linear expression to weight goal totals, I use a somewhat more sophisticated function. The rationale is the same: goals scored against the top teams in the league should count more than goals scored against the cellar-dwellers, and goals scored against the top team in the league (at that time) should count most of all. Here's the function I used to express this concept:
Here, Xi is the cardinal league position out of N teams.
This is how the function looks when we plot it as a function of league position in an 18-team league. The solid black line represents the above function, the dotted red line its inverse.
The weighted goal metric, in the end, is the sum of goals scored against an opponent, multiplied by that weighting function of the opponent's league position at the start of the current round:
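For readers who think more easily in code than in spreadsheet formulas, here is a minimal Python sketch of that sum. The names `weighted_goals` and `linear_weight` are my own illustrative choices, and `linear_weight` is only a simple linear stand-in so the example runs; it is not the weighting function described above, which should be passed in through the `weight` argument instead.

```python
from typing import Callable, Iterable, Tuple


def linear_weight(position: int, n_teams: int) -> float:
    """Stand-in weight (an assumption, NOT the function from this post).

    Returns 1.0 for the league leader, falling linearly to 1/N for last place.
    """
    return (n_teams - position + 1) / n_teams


def weighted_goals(
    matches: Iterable[Tuple[int, int]],
    n_teams: int,
    weight: Callable[[int, int], float] = linear_weight,
) -> float:
    """Sum goals scored in each match, weighted by the opponent's position.

    Each entry in `matches` is (goals_scored, opponent_position), where the
    position is the opponent's league position at the start of that round.
    """
    return sum(goals * weight(position, n_teams) for goals, position in matches)


# Hypothetical player in a four-team group: two goals against the leader,
# one goal against the last-placed team.
example = [(2, 1), (1, 4)]
print(weighted_goals(example, n_teams=4))  # 2.25 with the stand-in weight
```

Keeping the weight as a parameter means the same loop works whether the weighting is linear, like Goran's, or something more curved.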
Handling league positions at the beginning of the season is tricky. All of the teams have identical records, but placing them in a tie for first place would make goals count for too much, and placing them in a tie for last place would make them count for too little. I split the difference by setting the initial league position to the middle of the table, N/2 (retaining the integer part if necessary).
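The opening-round rule can be sketched as a small lookup. This is only an illustration under my own assumptions: the function name `position_at_round_start`, the `positions_after` dictionary layout, and the team labels are all hypothetical, and integer division stands in for "retaining the integer part".

```python
from typing import Dict


def position_at_round_start(
    team: str,
    round_number: int,
    positions_after: Dict[int, Dict[str, int]],
    n_teams: int,
) -> int:
    """League position of `team` at the start of `round_number` (1-indexed).

    `positions_after[r][team]` is assumed to hold the team's position once
    round r has been completed. Before any matches are played, every team
    is placed in the middle of the table, N // 2.
    """
    if round_number == 1:
        return n_teams // 2
    return positions_after[round_number - 1][team]


# Made-up standings after round 1 of a four-team group.
positions_after = {1: {"Team A": 1, "Team B": 2, "Team C": 3, "Team D": 4}}

print(position_at_round_start("Team C", 1, positions_after, n_teams=4))  # 2
print(position_at_round_start("Team C", 2, positions_after, n_teams=4))  # 3
```

For an 18-team league the same rule starts every team in 9th place.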
I tried this concept out for a mini-league of four teams, which made it easy to chart league position after every round of the competition. The order of competition was also easy to compile, as were the goalscorers themselves. The UEFA Champions League group stage is a great test case, so I decided to use this season's Group D, which consisted of Barcelona, Rubin Kazan, FC Copenhagen, and Panathinaikos. Developing the chart in OpenOffice Calc was quite labor-intensive, but the formulas were not as complicated to set up as I thought they would be. Here is a list of the goalscorers in Group D, ordered by the weighted goal metric:
| Player | Team | Goals | Weighted goals |
|---|---|---|---|
| Christian Noboa | Rubin Kazan | 2 | 1.755 |
Now, the usual caveats apply: it's only a four-team league, and the sample size isn't large enough. Even so, I am very encouraged by the fact that the leading scorer on the weighted list is exactly the one everyone expects. As Blake Wooster said at the MIT SSAC, you don't need analytics to know that Lionel Messi is a great player. Nevertheless, it's a great validation of the statistic! I am impressed that Christian Noboa ranked so highly, and he has proven to be good value for Kazan's investment in him. Pedro also ranks very highly on the list, and in the transfer markets as well.
So this metric has some potential, but putting it together for a league would be a big challenge.
CORRECTIONS: The notation in my metric was a little off; I needed to consider the league position of the opponent at the start of the current round, not the end. I recomputed the statistic and republished the table. The corrections don't change the top three, but they do create some interesting movement throughout the rest of the table. It's very interesting that the two highest-ranked Copenhagen players are the ones valued very highly in the transfer market.