Last month I gave a progress report on analytics projects and software under development. Today I’m pleased to announce that Marcotti-Light — the light version of Soccermetrics’ match database schemas — has been open-sourced on GitHub.
Marcotti-Light is used to create databases that track full-time scorelines in all types of competitions and matches that involve either clubs or national teams. It also tracks penalty shootout results and even administrative points deductions applied to teams. I’ve used Marcotti-Light (or rather its ancestors) to build the database for the ResultsPage website and a couple of other internal projects.
Previously, Marcotti-Light and all of the other Marcotti data schemas consisted of hand-coded SQL and a library of routines for reading from and writing to the database. This year I switched to writing the data models with SQLAlchemy, which means that I no longer have to carry around SQL scripts or custom code for read/write operations. I can also define collections of common data models that are shared by the club and national team data models. Most importantly, it is much easier to write test suites for these data models.
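To give a feel for what this looks like, here is a minimal sketch of the idea, not the actual Marcotti-Light code. It assumes SQLAlchemy 1.4 or later, and every class, table, and column name in it (`MatchMixin`, `ClubMatch`, `NationalMatch`, `build_database`, and so on) is hypothetical. Shared columns live in a mixin, the club and national team models add their own fields, and the database itself is only created when the models are bound to an engine.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class MatchMixin:
    """Columns shared by every match model (hypothetical field names)."""
    id = Column(Integer, primary_key=True)
    home_goals = Column(Integer, nullable=False, default=0)
    away_goals = Column(Integer, nullable=False, default=0)


class ClubMatch(MatchMixin, Base):
    """Full-time scoreline of a match between two clubs."""
    __tablename__ = "club_matches"
    home_club = Column(String(60), nullable=False)
    away_club = Column(String(60), nullable=False)


class NationalMatch(MatchMixin, Base):
    """Full-time scoreline of a match between two national teams."""
    __tablename__ = "national_matches"
    home_country = Column(String(60), nullable=False)
    away_country = Column(String(60), nullable=False)


def build_database(url="sqlite:///:memory:"):
    """Create the tables described by the models and return a session."""
    engine = create_engine(url)
    Base.metadata.create_all(engine)  # the models emit the CREATE TABLE statements
    return sessionmaker(bind=engine)()


# A throwaway in-memory database is enough to exercise the models in a test.
session = build_database()
session.add(ClubMatch(home_club="Club A", away_club="Club B",
                      home_goals=2, away_goals=1))
session.add(NationalMatch(home_country="Country X", away_country="Country Y"))
session.commit()

club_result = session.query(ClubMatch).one()
national_result = session.query(NationalMatch).one()
```

Because the schema is just Python classes, a test suite can build a fresh in-memory SQLite database for each test and throw it away afterwards, with no SQL scripts to maintain.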
There are some things worth mentioning:
- Marcotti-Light, and the other Marcotti data schemas, are used to BUILD databases. They are NOT databases themselves!
- Use of these models requires knowledge of Python.
- If you feel that there are other values not listed in the data models that should be tracked, fork the repository and customize it yourself. You can even send a pull request, but expect requests to add betting-related fields (e.g. odds data) to be turned down.
- A wiki describing use cases will be available in the near future.
So again, you can find the new Marcotti-Light repository at this link. The redesigned versions of the rest of the Marcotti family will come online soon.