I've seen a number of websites performing mock draws and projections leading up to the World Cup draw later today. To be brutally honest, I really don't care for most of them. They're nice as far as conversation starters are concerned, but I doubt that they have much utility beyond that.
I have my own simulation of the World Cup draw, but I've decided to do it a different way. I want to look at the probability of a favorable or unfavorable draw for CONCACAF teams in the World Cup, and to examine this problem I'm going to use a Monte Carlo simulation. There's nothing terribly sophisticated about a Monte Carlo method; all we're doing is selecting a random (or to be more accurate, pseudorandom) number, scaling it to some parameter we're interested in, and performing a huge number of calculations to determine the sensitivity to that parameter. It helps to have a simple problem that can be solved very quickly so that you can then do hundreds, or thousands, or hundreds of thousands of calculations in just a few minutes.
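To make the select-and-scale step concrete, here is a minimal Python sketch. It is not the Scilab code attached to this post; the function names, the seed, and the pot size of eight are assumptions I picked for illustration. It scales a uniform pseudorandom number to a ball index and repeats the pick many times, so the observed frequencies should settle near 1/8 each.

```python
import random


def pick_index(n, rng):
    """Scale a uniform number in [0, 1) to an integer index in 0..n-1."""
    return int(rng.random() * n)


def simulate_pick_frequencies(n_balls=8, iterations=100_000, seed=2010):
    """Pick one of n_balls many times and return the observed frequency of each ball."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatable runs
    counts = [0] * n_balls
    for _ in range(iterations):
        counts[pick_index(n_balls, rng)] += 1
    return [count / iterations for count in counts]


frequencies = simulate_pick_frequencies()
```

With enough iterations every entry of `frequencies` sits close to 0.125, which is all a uniform pick from a pot of eight balls means.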
For the World Cup draw, there are a few conditions:
- There cannot be more than one team from a confederation in a group, with the exception of Europe, which may have up to two teams in a group.
- South Africa must be in Group A.
- A South American team must be in South Africa's group.
- An African team must be in Argentina's and Brazil's groups.
Other than these conditions, there is an equal chance that a ball containing the name of a country will be selected from a given pot. Therefore I am using a uniform distribution in my simulation to select a ball from each pot. If the above rules are satisfied, the country is added to the group. I account for the difficulty of the group using a score metric, which is the average ESPN Soccer Power Index for the teams in the group. I could have used any other rating system such as FIFA or Elo; my choice of SPI was arbitrary, and besides, I know this site is read in Bristol so I could use some additional hits.
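The confederation rule and the group score can be sketched in Python roughly as follows. This is an illustration under my own assumptions, not the Scilab code attached to this post: the tuple layout, the function names, the restart-on-dead-end strategy, and the placeholder teams and ratings are all hypothetical, and the sketch leaves out the South Africa, Argentina and Brazil conditions. Picking uniformly among the teams that are still allowed is equivalent to drawing uniformly and redrawing whenever a rule is broken.

```python
import random

# Maximum teams per confederation in one group; any confederation not listed is limited to 1.
MAX_PER_CONFEDERATION = {"UEFA": 2}


def is_allowed(group, team):
    """Check the confederation rule for adding `team` to `group`."""
    _, confederation, _ = team
    same = sum(1 for _, conf, _ in group if conf == confederation)
    return same < MAX_PER_CONFEDERATION.get(confederation, 1)


def draw_pot(groups, pot, rng, max_restarts=1000):
    """Add one team from `pot` to each group, chosen uniformly among allowed teams."""
    for _ in range(max_restarts):
        remaining = list(pot)
        placed = []
        for group in groups:
            candidates = [t for t in remaining if is_allowed(group, t)]
            if not candidates:
                break  # dead end: throw this attempt away and redraw the pot
            team = rng.choice(candidates)
            remaining.remove(team)
            placed.append(team)
        else:
            return [group + [team] for group, team in zip(groups, placed)]
    raise RuntimeError("no valid draw found")


def group_score(group):
    """Average rating of the teams in a group."""
    return sum(rating for _, _, rating in group) / len(group)


# Placeholder teams as (name, confederation, rating); the ratings are made up.
seeded = [[("Seed 1", "UEFA", 85.0)], [("Seed 2", "CONMEBOL", 88.0)]]
pot = [("Team X", "CONMEBOL", 80.0), ("Team Y", "CONCACAF", 76.0)]
result = draw_pot(seeded, pot, random.Random(1))
```

In this toy case Team X cannot join Seed 2's group, so every valid draw puts Team X with Seed 1 and Team Y with Seed 2, and `group_score(result[0])` is 82.5.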
My Monte Carlo simulation uses 10000 iterations and outputs three histograms of the average SPI of the three groups that the CONCACAF representatives are drawn into. Now, any of the three teams can land in any one of the three scenarios, so there is a range of probabilities for an easy or difficult group.
For reference, the ESPN SPI ratings of the CONCACAF representatives are as follows:
- USA: 78.6
- Mexico: 77.0
- Honduras: 75.1
According to this simulation, the CONCACAF side with the best chance of a favorable draw is the USA, assuming that the ESPN SPI actually means something. At best, there is about a 24% probability that the average SPI of the USA's three opponents will exceed the USA's own rating. At worst, there is a 38% probability of that occurring. The dropoff is rather steep for the other CONCACAF sides: Mexico has a 45-70% chance of a difficult draw, and Honduras has a 65-94% chance of being overpowered by their opponents, at least on paper. Now, a group with evenly matched opponents can be just as difficult if not more so, so the probability of a difficult group for all three participants is even higher.
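A percentage like the ones above is just the share of simulated draws in which the opponents' average rating beats the team's own. Here is a small Python sketch of that count; the function name is my own, and the four sample averages are placeholders, not output from the actual simulation.

```python
def probability_of_difficult_draw(opponent_averages, own_rating):
    """Fraction of simulated draws whose opponents' average rating exceeds own_rating."""
    if not opponent_averages:
        raise ValueError("need at least one simulated draw")
    harder = sum(1 for average in opponent_averages if average > own_rating)
    return harder / len(opponent_averages)


USA_SPI = 78.6  # the USA rating quoted in this post

# Placeholder averages only; NOT results from the real 10000-iteration run.
example_averages = [74.2, 79.1, 80.3, 77.5]
example_probability = probability_of_difficult_draw(example_averages, USA_SPI)
```

Two of the four placeholder averages exceed 78.6, so `example_probability` is 0.5. In a real run the list would hold one average per iteration.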
Attached is the code that I wrote. I did a simulation of the entire draw and focused on the CONCACAF representatives; it's straightforward to analyze the probability for other teams from other confederations. I wrote it in Scilab because I was most comfortable with that language. Perhaps I should have written it in R, but I don't know R, and I had the fierce urgency of now.
I hope this gives you a rough idea of each team's chances for a fortunate draw, and even more to talk about. For CONCACAF, it's looking rather difficult for everybody.