When I talk about statistics in soccer, I tend to group them into two different classes. I’ve even written posts where I have made such distinctions but without much definition. I’d like to take a step back and discuss what I feel are two important classes of data in soccer: macro-statistics and micro-statistics.
In my view, macro-statistics are those stats that appear in the match report. The most obvious macro-statistics are the goals scored by either team. The record of cautions and expulsions is another, as is the record of substitutions. All of these events are parameterized by the match time at which they occur. Of course, the list of starting and bench players is also a macro-statistic.
In contrast, micro-statistics are the types of statistics that don’t typically show up in a match report yet influence the ones that do. In this class of statistics I would include shots (on/off target), saves, assists, corners, fouls, throw-ins, tackles (successful or not) and passes. This set can be expanded to cover whatever kind of event can be captured, at whatever level of granularity one desires. One can choose to track every event in a match and parameterize it by time, or simply track the total number of such events.
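To make the two levels of granularity concrete, here is a minimal sketch in Python. The `MatchEvent` record, its field names, and the sample events are all made up for illustration; they are not taken from any real data feed. The first form keeps every event with its match time, and the second collapses the same events into totals.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchEvent:
    """One micro-statistic event (hypothetical schema)."""
    minute: int   # match time at which the event occurred
    team: str
    player: str
    kind: str     # e.g. "shot_on_target", "foul", "pass"


# Invented events, for illustration only.
events = [
    MatchEvent(3, "Home", "Player A", "pass"),
    MatchEvent(3, "Home", "Player B", "shot_on_target"),
    MatchEvent(4, "Away", "Player X", "save"),
    MatchEvent(17, "Away", "Player Y", "foul"),
    MatchEvent(18, "Home", "Player A", "shot_off_target"),
    MatchEvent(41, "Home", "Player B", "shot_on_target"),
]


def totals_by_team(event_log):
    """Collapse a time-stamped event log into per-team totals per event kind."""
    counts = Counter()
    for event in event_log:
        counts[(event.team, event.kind)] += 1
    return counts


totals = totals_by_team(events)
print(totals[("Home", "shot_on_target")])
```

The totals can always be rebuilt from the event log, but not the other way around, which is why the time-stamped form is the richer (and costlier) one to collect.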
It is possible to track goalscorers, match participants, and yellow/red card recipients over the whole of a season with macro-statistics. Despite the smaller set of data, it is possible to gain some additional understanding of player and team performance by assessing a player’s influence on the final result given his teammates and the opposition. Micro-statistics provide a much richer set of data for analysis, and with them it is possible to correlate events during a match with the end result. (I should say “attempt to correlate”; it’s much easier said than done!)
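As a rough sketch of the season-long tracking that macro-statistics allow, the Python snippet below tallies goals and cards per player across match reports. The report layout, the player names, and the event labels are assumptions invented for this example, not a real match-report format.

```python
from collections import defaultdict

# Hypothetical match reports: each holds (minute, player, event) tuples.
match_reports = [
    {"match": "Matchday 1",
     "events": [(12, "Player A", "goal"), (55, "Player B", "yellow"),
                (78, "Player A", "goal")]},
    {"match": "Matchday 2",
     "events": [(30, "Player B", "goal"), (64, "Player B", "yellow"),
                (88, "Player C", "red")]},
]


def season_tally(reports):
    """Sum goals, yellow cards, and red cards per player over all reports."""
    tally = defaultdict(lambda: {"goal": 0, "yellow": 0, "red": 0})
    for report in reports:
        for _minute, player, event in report["events"]:
            tally[player][event] += 1
    return dict(tally)


season = season_tally(match_reports)
print(season["Player A"]["goal"])
```

Assessing a player’s influence on the result would need more than this tally (lineups and opposition, at least), so treat it only as the bookkeeping layer.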
Here’s the main difference between macro-statistics and micro-statistics: macro-statistics are relatively easy to find and for the most part are freely available, although that may not be true in Europe. Micro-statistics are more difficult to find (especially for matches played before the 1990s), very difficult to collect, and often kept proprietary by the leagues and governing bodies.