Final Pythagorean for 2009-2010 Dutch Eredivisie

This season's Dutch Eredivisie was a fascinating championship to follow, not just because of the storyline of Steve McClaren finding rehabilitation at FC Twente, but also because of Ajax's gaudy statistics in finishing second. I had my doubts that the Pythagorean formula would be able to predict point totals in the face of such a skewed scoring record. Here's the league table with Pythagorean estimations included (GP = games played, GF = goals for, GA = goals against, +/- = actual points minus the Pythagorean estimate):

Team                  GP   GF  GA  Pts  Pythag  +/-
Twente Enschede       34   63  23   86      71  +15
Ajax Amsterdam        34  106  20   85      72  +13
PSV Eindhoven         34   72  29   78      69   +9
Feyenoord Rotterdam   34   54  31   63      61   +2
AZ Alkmaar            34   64  34   62      63   -1
SC Heracles Almelo    34   54  49   56      49   +7
FC Utrecht            34   39  33   53      50   +3
FC Groningen          34   48  47   49      47   +2
Roda JC Kerkrade      34   56  60   47      44   +3
NAC Breda             34   42  49   46      42   +4
Heerenveen            34   44  64   37      35   +2
VVV Venlo             34   43  57   35      38   -3
NEC Nijmegen          34   35  59   33      32   +1
Vitesse Arnhem        34   38  62   32      32    0
ADO Den Haag          34   38  59   30      34   -4
Sparta Rotterdam      34   30  66   26      25   +1
Willem II Tilburg     34   36  70   23      28   -5
RKC Waalwijk          34   30  80   15      21   -6

I did not expect the Pythagorean expectation to predict point totals well in such a scenario, and that turned out to be the case. Twente, Ajax, and PSV each outperformed their projections by at least nine points. Ajax outperformed by 13 points, the equivalent of roughly four wins. But look at the difference in predicted points between Twente and Ajax: it was just one point, the same margin that separated them in the final table, although the estimate had Ajax narrowly ahead. Ajax lost the league title because Twente also played way over their heads during the season; the Tukkers earned 15 more points than their statistical expectation.
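For readers who want to check the arithmetic, here is a minimal Python sketch that recomputes the +/- column from the table and picks out the large over-performers. The variable names are purely illustrative, and the Pythagorean estimates are copied from the table rather than recomputed, since the formula itself is not reproduced in this post.

```python
# (team, actual points, Pythagorean estimate), copied from the league table
table = [
    ("Twente Enschede", 86, 71),
    ("Ajax Amsterdam", 85, 72),
    ("PSV Eindhoven", 78, 69),
    ("Feyenoord Rotterdam", 63, 61),
    ("AZ Alkmaar", 62, 63),
    ("SC Heracles Almelo", 56, 49),
    ("FC Utrecht", 53, 50),
    ("FC Groningen", 49, 47),
    ("Roda JC Kerkrade", 47, 44),
    ("NAC Breda", 46, 42),
    ("Heerenveen", 37, 35),
    ("VVV Venlo", 35, 38),
    ("NEC Nijmegen", 33, 32),
    ("Vitesse Arnhem", 32, 32),
    ("ADO Den Haag", 30, 34),
    ("Sparta Rotterdam", 26, 25),
    ("Willem II Tilburg", 23, 28),
    ("RKC Waalwijk", 15, 21),
]

points = {team: pts for team, pts, _ in table}
estimate = {team: pythag for team, _, pythag in table}

# Residual = actual points minus Pythagorean estimate (the +/- column)
residuals = {team: points[team] - estimate[team] for team in points}

# Teams that beat their estimate by nine points or more
big_overperformers = [team for team, diff in residuals.items() if diff >= 9]

# Twente-minus-Ajax gap, in the final table and in the estimates
actual_gap = points["Twente Enschede"] - points["Ajax Amsterdam"]
estimated_gap = estimate["Twente Enschede"] - estimate["Ajax Amsterdam"]

print(big_overperformers)
print(actual_gap, estimated_gap)
```

Only the top three clubs clear the nine-point threshold, and the two gaps have the same size but opposite signs: Twente finished one point ahead, while the estimates put Ajax one point ahead.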

There could be a couple of reasons for the huge discrepancy in point totals. First, the scoring distribution for the top teams this season may have been more skewed than is typical for most domestic leagues. I took a cursory glance at the result matrix for the Dutch league, and my first impression is that Twente had a fairly typical goal distribution, but Ajax's goal distribution was so heavily skewed that a Weibull distribution may not have described it accurately. (I need to do a more extensive analysis to find out if that was indeed the case.) The second reason could be that the Pythagorean expectation does not take into account the spread (variance) of the scoring distribution. The variance corresponds to scoring consistency during the season, and accounting for it could add a couple of points to the Pythagorean expectation. The approach is similar to that presented by Kerry Whisnant in his Pythagorean extension for baseball.

This year's Dutch league was characterized by a majority of teams with lopsided goal differences and a handful of teams with nearly even scoring and defensive records. Only five clubs in the 18-team top flight had goal differences between -10 and +10. Heracles earned seven more points than were expected of them, but they earned their place in the European playoffs. At the other end of the table, Waalwijk (automatic relegation place) and Willem II (relegation playoffs) had poor seasons even by their own statistical expectations, while Sparta Rotterdam performed about as expected.
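The goal-difference count is easy to verify from the GF and GA columns. The short Python sketch below does so; the names `goals`, `goal_diff`, and `near_even` are just illustrative choices.

```python
# (team, goals for, goals against), copied from the league table
goals = [
    ("Twente Enschede", 63, 23),
    ("Ajax Amsterdam", 106, 20),
    ("PSV Eindhoven", 72, 29),
    ("Feyenoord Rotterdam", 54, 31),
    ("AZ Alkmaar", 64, 34),
    ("SC Heracles Almelo", 54, 49),
    ("FC Utrecht", 39, 33),
    ("FC Groningen", 48, 47),
    ("Roda JC Kerkrade", 56, 60),
    ("NAC Breda", 42, 49),
    ("Heerenveen", 44, 64),
    ("VVV Venlo", 43, 57),
    ("NEC Nijmegen", 35, 59),
    ("Vitesse Arnhem", 38, 62),
    ("ADO Den Haag", 38, 59),
    ("Sparta Rotterdam", 30, 66),
    ("Willem II Tilburg", 36, 70),
    ("RKC Waalwijk", 30, 80),
]

# Goal difference for every club
goal_diff = {team: gf - ga for team, gf, ga in goals}

# Clubs with a nearly even record: goal difference between -10 and +10
near_even = [team for team, gd in goal_diff.items() if -10 <= gd <= 10]

print(len(near_even), near_even)
```

The five clubs inside that band are Heracles, Utrecht, Groningen, Roda JC, and NAC Breda, all mid-table finishers.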

Essentially, teams at the top of the table win matches that might have been draws, while teams at the very bottom lose matches that could have been draws. At least that observation held up in the Eredivisie this season.
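One rough way to put a number on that pattern is to compare the average over- or under-performance at each end of the table, and to correlate final points with the +/- column. The Python sketch below is only a crude check: the `pearson` helper is a hand-rolled illustration, and because the residual is itself built from actual points, some positive correlation is to be expected.

```python
from math import sqrt

# Actual points and Pythagorean estimates, in final-table order
points = [86, 85, 78, 63, 62, 56, 53, 49, 47, 46, 37, 35, 33, 32, 30, 26, 23, 15]
pythag = [71, 72, 69, 61, 63, 49, 50, 47, 44, 42, 35, 38, 32, 32, 34, 25, 28, 21]

# Residual = actual points minus estimate (the +/- column)
residuals = [p - e for p, e in zip(points, pythag)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


r = pearson(points, residuals)

# Average residual for the top three and bottom three clubs
top3 = sum(residuals[:3]) / 3
bottom3 = sum(residuals[-3:]) / 3

print(round(r, 2), round(top3, 2), round(bottom3, 2))
```

The top three clubs beat their estimates by about 12 points on average, while the bottom three fell short by about 3, which is consistent with the observation above.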