Football weather data added to Soccermetrics ProjectData repository

As most of you know, I have a repository on GitHub called ProjectData which stores open data from a variety of soccer analytics projects.  Last night I added a folder to store weather data at football matches.

Weather has an impact on the outcome of outdoor sporting events such as soccer and all of the football codes, yet with few exceptions (J-League, Major League Soccer) I’ve rarely seen weather conditions mentioned in a match report, and I’ve seen even fewer examples of structured data sets available online.  I want to change that by collecting weather data on football matches, making those data publicly available, and encouraging others to contribute and do the same.

I’ve started with data files from six competitions in the 2016-17 season — Argentina’s Primera A, English Premier League, German Bundesliga, Italian Serie A, French Ligue 1, and Spanish Primera (LaLiga).  In these files I have fields for adding weather data such as temperature and relative humidity at kickoff, which are tracked by the Match Conditions model of the Marcotti football data schema.  (I didn’t add the fields for weather conditions at kickoff, halftime, or end of match because those are harder to get and I haven’t decided whether I really need them.) Some of the files have been partially updated, and I do plan on adding data, but others are empty and I’m sure that there are competitions that you’d like to see in this folder.  For that I need your help.

I created this repository with the goal of encouraging collaboration and data sharing.  If you think this is an interesting idea, here’s what you should do.

  • Make a fork of the current repository so that you have your own copy to work with.
  • Create a branch of the repository so that you have a space to make changes.  If you don’t like the changes, you can just delete the branch with no worries.
  • Make your contributions by adding files or editing current ones.  A few things to keep in mind:
    • Enter kickoff temperature in degrees Celsius.  Use one decimal place if you can (12.6 deg C instead of 13).
    • Enter relative humidity as a percentage.
    • If you do want to add the predominant weather condition, use the terms described in WeatherConditionType of the Marcotti repository.  These are standard weather descriptions from the National Weather Service in the US.
    • If you want to add other weather data such as wind speed, I suggest that you put wind magnitude and direction in two separate columns. Please use metric units (km/h for wind, millibar for pressure, mm for precipitation, etc.).
  • When you’ve completed your contribution, make a commit with an adequate description of what you did (“added data” is not adequate) and submit a pull request.  I (and hopefully other contributors) will review it and if everything checks out I will approve it. Check the discussion panels for follow-up comments and questions.
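
To make the data guidelines above concrete, here is a minimal Python sketch of a check you could run on a file before committing it. It is only an illustration under stated assumptions: the column names (kickoff_temp, kickoff_humidity) are hypothetical and may not match the files in the folder, and the plausible temperature range is my own guess, chosen mainly to catch values entered in Fahrenheit by mistake. Two small conversion helpers are included in case your source reports imperial units.

```python
import csv

# Hypothetical column names; the real files may use different headers.
TEMP_COL = "kickoff_temp"
HUMID_COL = "kickoff_humidity"

# Assumed plausibility window for kickoff temperature in degrees Celsius.
MIN_TEMP_C, MAX_TEMP_C = -40.0, 55.0


def fahrenheit_to_celsius(temp_f):
    """Convert Fahrenheit to Celsius, rounded to one decimal place."""
    return round((temp_f - 32.0) * 5.0 / 9.0, 1)


def mph_to_kmh(speed_mph):
    """Convert miles per hour to km/h, rounded to one decimal place."""
    return round(speed_mph * 1.609344, 1)


def check_row(row):
    """Return a list of problems in one match record (empty if none).

    Blank fields are allowed, since many records are not filled in yet.
    """
    problems = []
    temp = (row.get(TEMP_COL) or "").strip()
    humid = (row.get(HUMID_COL) or "").strip()

    if temp:
        try:
            value = float(temp)
        except ValueError:
            problems.append("temperature is not a number")
        else:
            if not MIN_TEMP_C <= value <= MAX_TEMP_C:
                problems.append("temperature outside plausible Celsius range")

    if humid:
        try:
            value = float(humid)
        except ValueError:
            problems.append("humidity is not a number")
        else:
            if not 0.0 <= value <= 100.0:
                problems.append("humidity is not a percentage between 0 and 100")

    return problems


def check_file(handle):
    """Map CSV line numbers to the problems found on that line."""
    report = {}
    # The header occupies line 1, so data rows start at line 2.
    for line_no, row in enumerate(csv.DictReader(handle), start=2):
        problems = check_row(row)
        if problems:
            report[line_no] = problems
    return report
```

You would call check_file on an open file handle, for example inside `with open(path, newline="") as handle:`, and fix any reported lines before submitting the pull request.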

Once again, the football weather data folder is in the ProjectData repository and I look forward to your contributions.  I always receive emails and questions from people seeking to volunteer in some way with the projects that I do — well, now is your chance.
