[This is the third post in a multi-part review of the 2010 MIT Sloan Sports Analytics Conference.]
For the first time in the brief history of the MIT Sloan Sports Analytics Conference, the organizers put together a session for the presentation of research papers. The papers were split into two categories: those written by academic researchers (students or faculty), and those written by independent researchers or people working for a company. (I would imagine that the sports teams keep their analytics close to the vest and do not publish them until it's obvious that they no longer give the team a competitive advantage.)
Almost 40 papers were reviewed for the research paper contest, from which a final group of four was selected: two from academic submitters and two from non-academics. Assuming a 50/50 split of submissions between the two categories, a paper had to be in the top 10% of its category (two out of roughly 20) to be selected for presentation. There were eight slots for presentations during the day, and the second half was devoted to invited papers. It seems that the conference limited the number of presentations for the paper contest so that the winner could be announced before the main afternoon session, when most of the attendees would be present.
Of the four finalists, two were papers on basketball and two on baseball. The first paper was on new approaches to fantasy baseball management; it appeared to consist of a Monte Carlo simulation of fantasy league selections that prioritized top scores in certain statistics instead of maximum scores in all statistics. The second paper considered the optimization of a basketball offense by viewing it as a network problem. The paper posed an intriguing academic problem, but I doubt that it could be useful in an actual game. How many coaches, much less players, run through all those paths in the course of a basketball game?
The last two papers presented turned out to be the two winners, and coincidentally, the two papers I was most interested in seeing presented. (Unfortunately I could not do so because both presentations took place during the Emerging Analytics session.) The academic winner was titled "Beyond Pythagorean Expectation: How Run Distributions Affect Win Percentage" by Prof. Kerry Whisnant of the Department of Physics at Iowa State University. The paper focused on baseball and offered an extension to the Pythagorean expectation to account for run distribution and slugging percentage. He found that those two factors are inversely related to each other, and that they can add as many as one to two wins to the Pythagorean estimate, which one might call higher-order wins. It seemed to be an interesting paper and I asked Prof. Whisnant for a copy. I told him about my extension of the Pythagorean expectation for soccer and he suggested that I consider the shape of the distribution as well.
The non-academic winner was "Improved NBA Adjusted +/- Using Regularization and Out-of-Sample Testing" by Dr. Joe Sill, a PhD graduate from Caltech who is now an independent consultant. Dr. Sill has a website that presents adjusted plus/minus data for NBA teams and discusses at length the origins of the formula. The basic idea is to break a basketball game into segments during which the same five players are on the floor for a team, treat those five players as one unit, and calculate the plus/minus rating for that unit. The segments are then all considered together in order to arrive at an individual plus/minus rating for each player. What makes the problem hard is its mathematical complexity: the resulting system of equations is difficult to solve without advanced techniques in linear algebra. I thought that it was a very clever solution to the basketball plus/minus problem, and one that should be applied to the soccer plus/minus problem.
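To make the adjusted plus/minus idea a little more concrete, here is a minimal sketch of how such a calculation might be set up. It is not Dr. Sill's actual method or code, which I have not seen in detail. I am assuming that each segment is stored as a tuple of (home player indices, away player indices, scoring margin, possessions), that segments are weighted by possessions, and that the regularization is a simple ridge penalty. The function name `adjusted_plus_minus`, the data layout, and the penalty value are all made up for illustration, and the segment data below are random numbers, not real NBA data.

```python
import numpy as np


def adjusted_plus_minus(stints, n_players, ridge=1.0):
    """Estimate one rating per player from lineup segments (hypothetical layout).

    stints: iterable of (home_ids, away_ids, margin, possessions), where
    margin is the home unit's scoring margin over the segment.
    """
    n_stints = len(stints)
    X = np.zeros((n_stints, n_players))  # +1 for home players, -1 for away
    y = np.zeros(n_stints)               # observed margin of each segment
    w = np.zeros(n_stints)               # weight of each segment

    for i, (home, away, margin, poss) in enumerate(stints):
        X[i, list(home)] = 1.0
        X[i, list(away)] = -1.0
        y[i] = margin
        w[i] = poss

    # Weighted ridge regression: solve (X^T W X + ridge * I) beta = X^T W y.
    # Without the ridge term the matrix is singular, because every row of X
    # sums to zero when both units have the same number of players.
    XtW = X.T * w
    A = XtW @ X + ridge * np.eye(n_players)
    b = XtW @ y
    return np.linalg.solve(A, b)


# Toy demonstration on synthetic segments (made-up numbers only).
rng = np.random.default_rng(0)
n_players = 12
true_rating = rng.normal(0.0, 3.0, n_players)

stints = []
for _ in range(500):
    on_court = rng.choice(n_players, size=10, replace=False)
    home, away = on_court[:5], on_court[5:]
    poss = int(rng.integers(5, 30))
    noise = rng.normal(0.0, 15.0)
    margin = true_rating[home].sum() - true_rating[away].sum() + noise
    stints.append((home, away, margin, poss))

estimate = adjusted_plus_minus(stints, n_players, ridge=5.0)
print(np.round(estimate, 2))
```

The ridge penalty is what keeps the system solvable when some players almost always share the floor. Going by the title of the paper, I would guess that the size of the penalty is chosen by checking predictions on segments held out of the fit, but that is my reading and not something the sketch above attempts.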
I hope that future editions of the conference include more papers in the research paper competition, but that might require a move to a two-day format, and I'm not sure the organizers want to do that at this time.