Manchester City Football Club have launched an exciting initiative in which they have released the full set of 2011-12 Premier League touch-by-touch data compiled by Opta Sports. This initiative creates an opportunity to support the development of new team and player performance metrics in football, but also creates challenges in modeling a very rich in-match dataset. We feel the time is right to introduce the Football Match Event Database.
The Football Match Event Database is a schema that extends one I released last year: the Football Match Result Database (FMRD). I describe the FMRD schema in detail, with a history of the project here and the database design here. The mission statement of the Match Event database is similar:
The objective of the Football Match Event Database (FMED) is to create a schema that maintains the individual micro events within a football match that lead to its macro events in order to support football research activities for the benefit of analysts, clubs, league organizations, media organizations, and other members of the football industry.
In contrast to the historical event data captured by the FMRD, the FMED captures the most basic events within a football match, the ones that produce those historical events. These basic events make up the fine-grained data streams that data companies such as Opta, Prozone/Amisco, StatDNA and Match Analysis produce.
Here are the data captured by the FMRD:
- Historical data of matches that occur within football competitions, whether among national teams or domestic clubs, and within league, knockout or group phases of competitions.
- Top-level data on the football match, including match date, competition name and stage, phase-specific details, participating teams, venues, and environmental conditions.
- Macro-events that occur during a match, including goals, penalties, disciplinary incidents, and substitutions.
- Personnel such as players, managers, and match referees.
The FMED goes further:
- Maintains time- and location-stamped data of all field-of-play events that occur during a football match.
- Distinguishes between touch events and non-touch events.
- Maintains all occurrences of the following events:
  - Legal ball touches
  - Pass attempts
  - Direct free kicks
  - Indirect free kicks
  - Goal kicks
  - Offside decisions
  - Corner kicks
- Maintains all information specific to the events listed above.
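To make the list above concrete, here is a minimal sketch of what a single event record might look like in Python. It is not the FMED schema itself: the class and field names (`MatchEvent`, `period`, `seconds`, `x`, `y`) are hypothetical, and treating offside decisions as the only non-touch event is an assumption made purely for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EventType(Enum):
    """Event categories taken from the list above."""
    BALL_TOUCH = "legal ball touch"
    PASS_ATTEMPT = "pass attempt"
    DIRECT_FREE_KICK = "direct free kick"
    INDIRECT_FREE_KICK = "indirect free kick"
    GOAL_KICK = "goal kick"
    OFFSIDE = "offside decision"
    CORNER_KICK = "corner kick"


# Assumption: which events count as non-touch is an illustrative guess,
# not something the schema description specifies.
NON_TOUCH_EVENTS = {EventType.OFFSIDE}


@dataclass(frozen=True)
class MatchEvent:
    """One time- and location-stamped field-of-play event (hypothetical layout)."""
    match_id: int
    event_type: EventType
    period: int                 # 1 = first half, 2 = second half
    seconds: float              # time elapsed within the period
    x: Optional[float] = None   # location may be omitted altogether
    y: Optional[float] = None

    @property
    def is_touch(self) -> bool:
        return self.event_type not in NON_TOUCH_EVENTS


events = [
    MatchEvent(1, EventType.PASS_ATTEMPT, 1, 12.4, 35.0, 60.5),
    MatchEvent(1, EventType.OFFSIDE, 1, 47.0, 82.0, 40.0),
]

# Keep only the touch events, e.g. as input to a passing metric.
touches = [e for e in events if e.is_touch]
```

Keeping the touch/non-touch distinction as a derived property rather than a stored flag is just one option; a real schema could equally store it as a column.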
I believe that the most important detail is the tracking of spatial data associated with a touch event. By default, events are located on a normalized 100×100 pitch, but the location fields can instead hold numbered regions or be left unused altogether. Where it is used, though, spatial tracking permits deeper analytics than summary statistics alone.
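As a rough illustration of how normalized coordinates and numbered regions relate, the sketch below maps a point on the 100×100 pitch to a region number and to metres. The function names, the 6×3 grid, the row-by-row numbering, and the 105 m × 68 m pitch are all assumptions chosen for the example, not part of the FMED.

```python
from typing import Tuple


def to_region(x: float, y: float, cols: int = 6, rows: int = 3) -> int:
    """Map normalized (0-100) coordinates to a 1-based region number.

    Regions are numbered row by row; the 6x3 default grid is an assumption.
    """
    if not (0.0 <= x <= 100.0 and 0.0 <= y <= 100.0):
        raise ValueError("coordinates must lie in [0, 100]")
    # Clamp so that x == 100 or y == 100 falls in the last column/row.
    col = min(int(x / 100.0 * cols), cols - 1)
    row = min(int(y / 100.0 * rows), rows - 1)
    return row * cols + col + 1


def to_metres(x: float, y: float,
              length_m: float = 105.0, width_m: float = 68.0) -> Tuple[float, float]:
    """Scale normalized coordinates to an assumed physical pitch size."""
    return x / 100.0 * length_m, y / 100.0 * width_m


centre_region = to_region(50.0, 50.0)
centre_metres = to_metres(50.0, 50.0)
```

Because the stored coordinates are normalized, the same event data can be rescaled to any pitch dimensions or binned into any grid after the fact.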
We developed a desktop application to input data into FMRD-formatted databases, as well as a software library to accomplish the task programmatically. Much of that library will be used to develop the library for FMED-formatted databases, and we’ll harmonize the differences between the two schemas over time.
More details on the FMED will be revealed later.