This page contains proposed collaborative projects for 2017. If you see a project that you’re interested in, please send an email to [email protected] with a CV and/or portfolio and a statement of how you expect to contribute. If you don’t see a project that piques your interest but would still like to pursue a collaborative project, send an email with a sufficiently detailed description of your proposed project.
When collaborative projects are in progress, this page will contain project summaries and updates.
Draft valuation from perspectives of organization behaviors and player performance
Summary: In North American and Australian professional sports leagues, clubs have the opportunity to select amateur talent through a draft system. A draft pick is an asset that can be exercised, traded, or discarded. Understanding a draft pick’s absolute or relative value can inform trading decisions and selections. It might also be possible to identify and exploit market inefficiencies by comparing two kinds of draft valuation: those implied by club trading behavior and those derived from the on-field performance of previously drafted players. Most research has focused on the latter (Swartz et al. (2013), Bohrmann (2015), Hamilton (2016)); there is much less research on the former (Cade and Massey (2004)) and almost none that combines the two perspectives. Possible research directions include the formulation of valuation models that involve league-specific transactions, identification of differences between actual and perceived draft values, or a comparative study of draft valuation between leagues in a single country or between countries. This project could benefit from a collaborator with experience in econometrics or finance.
(Near-)Optimal construction of team squads to maximize potential team performance
Summary: Portfolio optimization is a major problem in quantitative finance: selecting the allocation of bonds, equities, and other assets that best meets investment targets. It might be possible to view a football team in the same way: what is the best way to construct a winning squad? Moreover, what is the best way to construct a winning squad given constraints such as salary caps, multiple player acquisition paths, and different colors of money to credit salary expenditures against the cap? How did previous winning teams build their squads? Are there any similarities that can be exploited? A true optimal solution may not be feasible, so are there heuristics that can achieve a near-optimal result? Doug Fearing’s winning SSAC paper from 2013 could serve as a springboard for this research.
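To make the heuristic-versus-optimal question concrete, here is a minimal sketch that treats squad construction as a knapsack-style problem. Everything in it is an assumption for illustration: the Player fields, the player pool, the single hard salary cap, and above all the idea that squad quality is just the sum of individual ratings (which ignores positions, chemistry, and acquisition paths). It compares a greedy "rating per unit of salary" heuristic against an exhaustive search that is only practical for very small pools.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Player:
    name: str
    salary: float  # hypothetical salary, e.g. in millions
    rating: float  # hypothetical performance rating (higher is better)


def greedy_squad(players, salary_cap, squad_size):
    """Pick players in order of rating per unit salary while the cap allows."""
    ranked = sorted(players, key=lambda p: p.rating / p.salary, reverse=True)
    squad, spent = [], 0.0
    for player in ranked:
        if len(squad) == squad_size:
            break
        if spent + player.salary <= salary_cap:
            squad.append(player)
            spent += player.salary
    return squad


def exhaustive_squad(players, salary_cap, squad_size):
    """Check every squad of exactly squad_size players; exponential cost."""
    best, best_rating = [], float("-inf")
    for combo in combinations(players, squad_size):
        if sum(p.salary for p in combo) > salary_cap:
            continue  # violates the salary cap
        rating = sum(p.rating for p in combo)
        if rating > best_rating:
            best, best_rating = list(combo), rating
    return best


def total_rating(squad):
    return sum(p.rating for p in squad)


# Hypothetical player pool, made up for this example.
pool = [
    Player("A", salary=4.0, rating=8.0),
    Player("B", salary=1.0, rating=3.0),
    Player("C", salary=2.0, rating=5.0),
    Player("D", salary=3.0, rating=4.5),
    Player("E", salary=0.5, rating=1.0),
]

greedy = greedy_squad(pool, salary_cap=6.0, squad_size=3)
optimal = exhaustive_squad(pool, salary_cap=6.0, squad_size=3)

print("greedy: ", [p.name for p in greedy], total_rating(greedy))
print("optimal:", [p.name for p in optimal], total_rating(optimal))
```

On this toy pool the greedy heuristic leaves cap space unused and falls short of the exhaustive optimum, which is the kind of gap the project would try to characterize. For realistic pool sizes the exhaustive search becomes infeasible, and an integer-programming formulation or a better heuristic would be needed.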
Applying Natural Language Processing to football analysis
Summary: Natural Language Processing (NLP) is the application of computational linguistics, computer science, and artificial intelligence to understanding and interacting with human language. There are some obvious applications of NLP to football, such as sentiment analysis of players and managers by parsing formal or informal text, or detecting target words (e.g. “injury”, “strain”, or “sacked”) to automatically generate data relevant to personnel. The objective of this research is to apply NLP concepts to football analysis in novel ways, motivated by Andrew Miller’s CASSIS talk in 2016. Statistical language models use large amounts of text or conversational speech to identify which words are likely to follow certain words or phrases. Is there a greater probability of certain plays or motions occurring than others in football given a previous sequence? Can we use such models to determine how predictable players or teams are?
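As a rough illustration of the language-model analogy, the sketch below builds a bigram model over sequences of match events, treating each event as a "word" and each possession as a "sentence". The event labels and possessions are invented for the example, and the function names are hypothetical. The entropy of the next-event distribution is used here as one possible measure of predictability: lower entropy means the next action is easier to guess.

```python
import math
from collections import Counter, defaultdict


def train_bigram_model(sequences):
    """Count how often each event is immediately followed by each other event."""
    counts = defaultdict(Counter)
    for sequence in sequences:
        for previous, following in zip(sequence, sequence[1:]):
            counts[previous][following] += 1
    return counts


def next_event_probabilities(counts, previous):
    """Estimate P(next event | previous event) from the bigram counts."""
    followers = counts.get(previous, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # event never seen with a follower
    return {event: n / total for event, n in followers.items()}


def entropy_bits(probabilities):
    """Shannon entropy in bits; 0 means the next event is fully predictable."""
    return -sum(p * math.log2(p) for p in probabilities.values() if p > 0)


# Hypothetical possessions, each a sequence of on-ball events.
possessions = [
    ["pass", "pass", "cross", "shot"],
    ["pass", "dribble", "pass", "shot"],
    ["pass", "pass", "pass", "cross"],
]

model = train_bigram_model(possessions)
after_pass = next_event_probabilities(model, "pass")
after_cross = next_event_probabilities(model, "cross")

print("after a pass: ", after_pass, "entropy:", entropy_bits(after_pass))
print("after a cross:", after_cross, "entropy:", entropy_bits(after_cross))
```

A real study would need far more data, smoothing for unseen transitions, and probably longer contexts than a single previous event, but the same counting idea underlies n-gram language models.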
Creating Jupyter dashboards for match analytics reporting
Summary: Project Jupyter is a collaborative open-source initiative to create interactive tools for data science and scientific computing across all programming languages. Its major tools are the Jupyter Notebook, which is a web application that allows the sharing of documents that combine code, text, and media, and JupyterHub, which is a multi-user version of the Notebook. Currently, match analytics are communicated to end-users through static graphics that are either stand-alone or part of a printed report. There are solutions by commercial companies that add interactivity to match reporting, but it appears that the tools are now in place to create an open-source match analytics dashboard/reporting application. Such an application would interact with match databases and analytics libraries to create interactive visualizations of individual and team performance. How would this application be designed? Can we create a suite of specialized widgets for match reporting? How do we manage access privileges? And most importantly, how would such a project be deployed in an organization and what infrastructure is required?
It’s possible that the results of such an effort could be published in a journal, but user conferences for Jupyter and sports analytics are more likely to be better venues for presenting software.