Goalscoring variances: What’s the difference between champion and relegated teams?

Last year I presented a scatter plot of goalscoring variances and color-coded the dots by each team's points-per-game average.  The regions overlapped quite a bit, but some trends were evident, especially between defensive goal statistics and league performance.  I'm going to expand on that previous post and present the data in a different manner by considering the relationship between goal statistics and league position.

Some background on the data.  I present end-of-season match result data from 20 European leagues (including all of the Big Five leagues) in the 2009-10 European season (2009 in Scandinavia and Russia), which provides a data set of 330 clubs.  These are the same data I used to test my Soccer Pythagorean metric.  I created histograms of the teams' offensive and defensive goal records and calculated the respective means and variances.  I color-coded the teams by end-of-season position in their domestic leagues by scaling the position to a uniform 0-to-20 scale:

Color Code = (i-1)/(N-1) * 20

where i = league position and N = number of teams in league.  On this scale, 0 represents a league champion and 20 the bottom team. The number of relegated teams in the European domestic leagues varies so it was less work to come up with a scale from top to bottom in the league.  Relegated teams in general will have a color code of 18 or higher.  The scale is arbitrary, of course, so if you don't like it, you are welcome to choose something else.
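As a minimal sketch, the scaling above can be written as a small Python function.  The name color_code and its argument names are my own choices for illustration, and the input checks are an added assumption rather than part of the original calculation.

```python
def color_code(position: int, n_teams: int) -> float:
    """Scale a league position to the 0-to-20 color scale.

    Implements Color Code = (i-1)/(N-1) * 20, where i is the league
    position and N is the number of teams in the league.
    """
    # Assumption: reject inputs for which the formula is undefined or meaningless.
    if n_teams < 2:
        raise ValueError("need at least two teams to scale positions")
    if not 1 <= position <= n_teams:
        raise ValueError("position must lie between 1 and n_teams")
    return (position - 1) / (n_teams - 1) * 20


print(color_code(1, 20))   # champion of a 20-team league: 0.0
print(color_code(20, 20))  # bottom of a 20-team league: 20.0
```

Because the scale depends only on relative position, leagues of different sizes can share one colorbar.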

Below is a scatter plot of the mean vs variance data.  On the left is a plot of offensive goal scoring (mean on horizontal axis, variance on vertical axis), and on the right, a plot of defensive goal scoring.  The colorbar is shown on the far right side: champion clubs are color-coded dark blue and bottom clubs dark red.  (The dark red dot at the origin in the offensive plot and at (3,0) in the defensive plot represents Ankaraspor, who were expelled from the Turkish Süper Lig in 2009-10.)
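For readers who want to reproduce one point of such a plot, here is a rough sketch of the per-team calculation in Python.  The function name goal_stats is hypothetical, and the use of the population variance is an assumption on my part; the sample variance would shift the values slightly.

```python
from statistics import fmean, pvariance


def goal_stats(goals_per_match):
    """Return (mean, variance) of a team's goals per match.

    Pass goals scored for the offensive plot, or goals conceded for the
    defensive plot.  Assumption: population variance (divide by n).
    """
    goals = list(goals_per_match)
    if not goals:
        raise ValueError("need at least one match")
    return fmean(goals), pvariance(goals)


# Illustrative numbers only, not taken from the data set.
mean, var = goal_stats([0, 1, 2, 1])
print(mean, var)  # 1.0 0.5
```

A team that concedes exactly three goals in every match has a mean of 3 and a variance of 0, which is the (3,0) point noted for Ankaraspor in the defensive plot.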


One starts to observe some important differences between the very best and the very worst in a European domestic league.  Bottom teams in general cannot score, and their defending is very inconsistent, which is indicated by their higher variances in the defensive plot.  Champion clubs have the opposite characteristics.  They are not necessarily high scorers on a consistent basis, but they are uniformly the best and most consistent defenders in their league.  Almost all of them keep their defensive statistics inside the unit square of the defensive plot, with mean and variance both at or below one.

In the end, if you want insight on which clubs are likely to win a league championship, don't pay attention to how freely they're scoring.  Look instead at how tightly they're defending.