This paper discusses the effect of the Bosman Ruling on the nature of contracts between club and player, using options pricing theory as the modeling framework. It then uses that framework to formulate club strategies for contract negotiations in the post-Bosman landscape. (It's worth asking if the findings continue to hold in 2011.)

Review after the jump.

]]>An exact answer doesn't exist, but I think the answer could be approximated by looking at the problem as a trinomial tree. A trinomial tree is a computational tool for pricing options, but at this point I'm more interested in the conceptual model. A soccer team has three possible match results: win, lose, or draw. (I'm putting aside complicating factors like the away goals rule or penalty kick shootouts.) Because each match has those three possible outcomes, it's possible to visualize a team's path through a competition with the tree below:

That's after two games, and there are nine possible paths that a team can take. After three matches (say, group play at the World Cup), there are 27. In fact, there are 3^{n} possible paths for a team playing n games.
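The path count is easy to check by brute force. Here is a minimal sketch in Python that enumerates every result sequence; the names `RESULTS` and `match_paths` are made up for illustration.

```python
from itertools import product

RESULTS = ("W", "D", "L")  # win, draw, loss

def match_paths(n_games):
    """Return every possible sequence of results over n_games matches."""
    return list(product(RESULTS, repeat=n_games))

print(len(match_paths(2)))  # number of paths after two games
print(len(match_paths(3)))  # number of paths after three games
```

Each path is just a tuple such as `("W", "D", "L")`, so attaching a probability to a path later on is a matter of multiplying the per-match probabilities along it.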

So how might you use this to calculate qualification probabilities? I'm still thinking my way through the process, but you would first have to come up with the probability of winning a match against a given opponent. Perhaps you could use a ranking system (Elo, SPI, even FIFA) to derive win/draw/loss probabilities against an opponent when playing home or away. Certain point totals guarantee defeat in a two-match series (0 or 1 points), and others guarantee success (4 or 6). Point totals of 2 or 3 leave the series to be decided by aggregate goals, the away goals rule, or the penalty kick tiebreaker, which makes any odds calculations complicated in a hurry. If you put those tiebreakers aside for a moment, it's possible to model the series result probabilities as a Markov chain, which is a useful tool for modeling discrete processes where the state of the process at a future step depends only on the state at the current step. The CONCACAF qualification process breaks down into separate Markov chains: up to two two-match series (two for the bottom 22 teams, one for everyone else), one six-match series, and one ten-match series.
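To make the two-match case concrete, here is a rough sketch that sorts the nine possible paths into "advance", "eliminated", and "tiebreak" buckets according to the point totals above. The win/draw/loss probabilities are invented for illustration, and the sketch assumes the two legs are independent, which real series may not be.

```python
from itertools import product

POINTS = {"W": 3, "D": 1, "L": 0}

def series_outcome_probs(home_leg, away_leg):
    """Classify a two-match series by points earned.

    home_leg and away_leg map 'W'/'D'/'L' to our team's probability of
    that result in each leg (hypothetical inputs, assumed independent).
    """
    totals = {"advance": 0.0, "tiebreak": 0.0, "eliminated": 0.0}
    for first, second in product(POINTS, repeat=2):
        prob = home_leg[first] * away_leg[second]
        pts = POINTS[first] + POINTS[second]
        if pts >= 4:        # 4 or 6 points: guaranteed through
            totals["advance"] += prob
        elif pts <= 1:      # 0 or 1 point: guaranteed out
            totals["eliminated"] += prob
        else:               # 2 or 3 points: goals or penalties decide it
            totals["tiebreak"] += prob
    return totals

# Made-up probabilities for a team that is stronger at home.
home = {"W": 0.5, "D": 0.3, "L": 0.2}
away = {"W": 0.3, "D": 0.3, "L": 0.4}
print(series_outcome_probs(home, away))
```

The "tiebreak" bucket is exactly the part that needs a goals model, which is where the odds calculation gets complicated.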

As I said, this can get complicated very quickly, and I know that I need to flesh out all the details. It has the makings of a very intriguing problem — as if I don't have enough to do already.

]]>I promised in that long-ago post to explain MCMC (Markov chain Monte Carlo) simulations, but I never did. It's time to make amends by giving an explanation now.

Monte Carlo simulations are relatively straightforward to explain: they are brute-force operations in which certain quantities in a simulation are varied randomly according to a statistical distribution (often Gaussian, though any distribution will do) and the operations are repeated many times. "Many" can be hundreds, thousands, or even millions of operations, depending on the complexity of the simulations. Monte Carlo simulations are useful for solving problems in which an exact analytic solution is difficult to find or does not exist.
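As a small illustration of the brute-force idea, the sketch below estimates the chance that a team collects at least a given number of points over a few matches by simply replaying those matches many times. The match probabilities, function names, and trial count are all assumptions chosen for the example.

```python
import random

def simulate_points(probs, n_games, rng):
    """Play n_games matches with fixed (win, draw, loss) probabilities."""
    # Each match yields 3, 1, or 0 points, drawn with the given weights.
    return sum(rng.choices((3, 1, 0), weights=probs, k=n_games))

def estimate_prob_at_least(probs, n_games, threshold, n_trials=100_000, seed=0):
    """Monte Carlo estimate of P(points >= threshold) over n_games matches."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    hits = sum(
        simulate_points(probs, n_games, rng) >= threshold
        for _ in range(n_trials)
    )
    return hits / n_trials

# Hypothetical team: 40% win, 30% draw, 30% loss in every match.
print(estimate_prob_at_least((0.4, 0.3, 0.3), n_games=3, threshold=4))
```

For a problem this small the exact answer can be worked out by hand (it is 0.676 for these made-up inputs), so the estimate can be checked against it; the simulation earns its keep once tiebreakers and knockout rounds make the exact calculation unwieldy.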

A Markov chain is a discrete process (a sequence of states over time) that has the Markov property, which states that future states depend only on the present state and not on the path taken to reach it. If you have a discrete process with a set of possible events at each time step, then what the Markov chain allows you to do is evaluate the probabilities that a process will end a certain way. (This is a greatly simplified version, so if I don't have it exactly right, feel free to have at me in the comments.)
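One way to picture this for soccer is to treat a team's accumulated points as the state: the points after the next match depend only on the points now and the result of that match. The sketch below pushes a probability distribution over point totals forward one match at a time. Using points as the state, and the probabilities themselves, are my assumptions for the example.

```python
def step(distribution, probs):
    """Advance a {points: probability} distribution by one match."""
    win, draw, loss = probs
    nxt = {}
    for points, p in distribution.items():
        # From the current total, a match adds 3, 1, or 0 points.
        for gained, q in ((3, win), (1, draw), (0, loss)):
            nxt[points + gained] = nxt.get(points + gained, 0.0) + p * q
    return nxt

def points_distribution(match_probs):
    """Distribution of total points after a list of matches."""
    dist = {0: 1.0}  # every team starts on zero points
    for probs in match_probs:
        dist = step(dist, probs)
    return dist

# Three matches with the same hypothetical (win, draw, loss) probabilities.
dist = points_distribution([(0.4, 0.3, 0.3)] * 3)
for points in sorted(dist):
    print(points, round(dist[points], 4))
```

Unlike a simulation, this gives exact probabilities for each final point total, because the chain only needs the current total and never the full match history.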

So what the MCMC simulation is doing in this instance is evaluating the win/loss/draw probabilities of match-ups in the group phase, then using a Markov chain definition to calculate the probability of each team advancing to the knockout stage. The process is repeated for each ensuing match-up for the rest of the tournament, and the resulting probabilities are calculated to predict a winner. The Monte Carlo simulation comes into play by repeating those previous actions many, many times.
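I don't know how the site implements this, but the group-phase step might look something like the sketch below: play a round robin at random, take the top two, and repeat. The four-team group, the pairwise probabilities, and the random tiebreak are all assumptions made to keep the example short.

```python
import random
from itertools import combinations

def simulate_group(teams, pair_probs, rng):
    """Play one round robin and return the two teams that advance."""
    points = {t: 0 for t in teams}
    for a, b in combinations(teams, 2):
        win_a, draw, win_b = pair_probs[(a, b)]
        outcome = rng.choices(("a", "draw", "b"), weights=(win_a, draw, win_b))[0]
        if outcome == "a":
            points[a] += 3
        elif outcome == "b":
            points[b] += 3
        else:
            points[a] += 1
            points[b] += 1
    # Teams level on points are separated at random (a simplification;
    # real groups use goal difference and other tiebreakers).
    ranked = sorted(teams, key=lambda t: (points[t], rng.random()), reverse=True)
    return ranked[:2]

def advancement_probs(teams, pair_probs, n_trials=20_000, seed=0):
    """Estimate each team's chance of finishing in the top two."""
    rng = random.Random(seed)
    counts = {t: 0 for t in teams}
    for _ in range(n_trials):
        for team in simulate_group(teams, pair_probs, rng):
            counts[team] += 1
    return {t: counts[t] / n_trials for t in teams}

teams = ["A", "B", "C", "D"]
# Hypothetical: the first-listed team in each pairing is a slight favorite.
pair_probs = {pair: (0.4, 0.3, 0.3) for pair in combinations(teams, 2)}
print(advancement_probs(teams, pair_probs))
```

Everything downstream depends on `pair_probs`, so any error in the pairwise probabilities flows straight through to the advancement estimates.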

One downside of the simulation is that it depends heavily on getting the pairwise probabilities right. The site in question uses "expert" analysis and previous head-to-head results, and we all know how right the "experts" are most of the time. Head-to-head results could be equally useless to predict winners because of the lineups and the circumstances surrounding previous matches. But there's not much of an alternative, anyway. It would be interesting to see how pairwise probabilities get calculated; perhaps there could be an opportunity for the newly completed Soccer Pythagorean (which is nothing more than a win/draw probability estimate).

As the website said, the simulation won't do a very good job of predicting a winner, but it might be useful for developing a betting strategy to maximize profits during the tournament. It won't stop the press from shouting "Computer simulation picks Spain/England/Brazil to win World Cup", of course. But perhaps this simulation could provide some opportunities to win a bet or two when I go to Europe this summer.

]]>It's important to note that the simulation didn't do a very good job of predicting the overall champion, but it did a fair job of predicting bets that would pay off well. For the simulation to work well, it's imperative to have a high-fidelity model of head-to-head match outcomes, which is (to put it mildly) extremely difficult to build.

]]>