[Originally written 31 October. Bumped to the top with updates.]
A few nights ago I posted a link to the MIT Sports Analytics Conference to be held in March, and later included a link to the call for papers for the event. I am thinking very seriously about submitting a paper to that conference. It will likely be on the soccer Pythagorean formula, especially if I can demonstrate some kind of success in predicting points won at the end of a season.
Here's where I could use some assistance.
I'd like to study the ability of the soccer Pythagorean to predict team performance in the 2008-09 European season (2008 in the case of MLS or J-League). To do this I will need goals scored and allowed data for all of the teams in the league during that season. Here are the leagues that I'd like to work with:
- Spanish Primera Liga
- German 1.Bundesliga
- Italian Serie A
- USA Major League Soccer
- Japanese J.League
The data aren't all that difficult to obtain, but they are not always arranged in column format, so they take some time to collect. The format should look something like this spreadsheet. Don't worry about the histogram data; that command is very simple to write. What I need most is the raw data. Once I have it, processing will be very quick now that my curve-fitting and Pythagorean codes are written.
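For anyone unsure what "column data format" means in practice, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the column names (`team`, `goals_for`, `goals_against`), the placeholder teams, and the made-up goal totals. The calculation is also not the soccer Pythagorean itself, which has to account for draws and is not spelled out in this post; it is the classic baseball-style Pythagorean fraction with a hypothetical exponent of 2, used only to show why one row per team is convenient to process.

```python
import csv
import io

# Hypothetical column-format data: one row per team, made-up numbers.
SAMPLE = """team,goals_for,goals_against
Team A,60,30
Team B,45,45
Team C,30,60
"""


def pythagorean_fraction(goals_for, goals_against, exponent=2.0):
    """Classic (baseball-style) Pythagorean fraction.

    Stand-in only: the exponent is an assumed value, and this form
    ignores draws, so it is not the soccer-specific formula.
    """
    gf = goals_for ** exponent
    ga = goals_against ** exponent
    return gf / (gf + ga)


def load_rows(text):
    """Parse column-format text into (team, goals_for, goals_against) tuples."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        (row["team"], int(row["goals_for"]), int(row["goals_against"]))
        for row in reader
    ]


rows = load_rows(SAMPLE)
fractions = {team: pythagorean_fraction(gf, ga) for team, gf, ga in rows}

for team, frac in fractions.items():
    print(f"{team}: {frac:.3f}")
```

With the data laid out this way, adding a league is just a matter of appending rows, which is why the arrangement matters more than the source.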
I plan on compiling most of the data myself, but if anyone is willing to help, I would be very appreciative. You would also receive a mention in any publications that result. Everyone stands on the shoulders of giants, and I am a very strong believer in acknowledging assistance in everything I publish.
The deadline for paper submission to this conference is the second week in December, so we're going to have to move quickly. This isn't the only project that I'm working on, so I would like to make the most of my time.
ONE OTHER THING: I'll present a few of the results here, but I'm going to have to embargo most of the major findings until after the conference, assuming that my paper gets accepted. If it's not accepted I'll publish all of the results here in mid-January.
UPDATE: I'm starting to write the paper now. The submission deadline is 14 December, but I'm working on a number of other publications, so I need a head start if I'm going to finish all of them before the Christmas break. I have a lot of data now; thanks to all who helped me compile some of it. I still don't have MLS goal data for this season or last. I can live without it, but it would be nice to have, considering that this paper would be presented to a predominantly American audience.
UPDATE #2: I have MLS data now. Thanks!