MLS Front-Office Efficiency: Making It Better

As you’ve probably noticed, I’ve been quiet for much of the month of March.  I’ve had some other events going on in my life (new day job, mostly), but I’ve been spending most of my time working on improvements to the Major League Soccer Front-Office Efficiency metric.  In particular, I’ve been working on sourcing and cleaning up the salary, player participation, and player transaction data that together present a clearer picture of a player’s tenure at a Major League Soccer club.

At the same time, I had been thinking about refinements to the current formulation of the Front-Office Efficiency metric in order to correct some inconsistencies that I had seen in the results.  These inconsistencies appeared because the volatile nature of the competition (not only changes in the number of clubs, but also changes in the number of regular season matches) makes it challenging to make an apples-to-apples comparison between seasons.  The basic structure of the Front-Office Efficiency equation remains the same, but there were enough changes to the formulation to consider this a major revision.  So I will call this version 4.0 of the Front-Office Efficiency metric.

This is going to be a long and detailed post, so if your attention span is too short, wait until I show some results in a future article.

Anyway, to remind everyone what it is: Front-Office Efficiency is a performance benchmarking tool that assesses a team’s performance relative to a baseline by calculating the ratio between a team’s usable payroll (its input) and league points won (its output).

The term “usable” is important because it communicates the resources available to a team that were actually used in the service of achieving its objective.  It is a simple metric, to be sure, and doesn’t explain everything about the relationship between resources and performance.  The goal of using such a tool is to identify organizations at the extremes so that more sophisticated analysis can be performed on their practices and performance.

The Front-Office Efficiency Metric and Its Terms

Much of the derivation follows that of Version 3, with a few differences.  The newest version of the Front-Office Efficiency expression is

$\mathcal{E}_{FO} = \frac{S_{av}U}{P/N_g}$

Available payroll $$S_{av}$$ is defined as base salary expenditures for players while they are on a team’s roster during the regular season, that is, not loaned out, released, transferred, or retired. Players who are injured, suspended, or on international duty are considered to be on the roster; the assumption is that the club continues to pay the player’s salary in these cases.  I recognize that these assumptions aren’t bulletproof, but I believe that they are reasonable.

To calculate available payroll, the base salary for each player is prorated by the proportion of the season that the player was available for his team.  The league transaction records are used to calculate the number of weeks $$w_k$$ that a player is available for a team, from which the proportion of the season (consisting of $$W$$ weeks) can be calculated.  (A league week is defined from Monday to Sunday, so Week 1 starts the Monday before the opening round of matches.)
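As a rough sketch of the week counting described above, here is one way to turn dates into league weeks in Python.  The function names, the example dates, and the choice to count a partial week as a full week are my assumptions for illustration; the transaction records themselves don't dictate any of this.

```python
from datetime import date, timedelta


def week_one_monday(opening_day: date) -> date:
    """Monday that starts Week 1 (assumed: the Monday on or before opening day)."""
    return opening_day - timedelta(days=opening_day.weekday())


def league_week(day: date, opening_day: date) -> int:
    """League week number of `day`; weeks run Monday to Sunday."""
    return (day - week_one_monday(opening_day)).days // 7 + 1


def weeks_available(joined: date, left: date, opening_day: date, season_weeks: int) -> int:
    """Number of league weeks w_k between two dates, clipped to the season.

    Assumption: a week in which the player is on the roster for even one day
    counts as a full week.
    """
    first = max(1, league_week(joined, opening_day))
    last = min(season_weeks, league_week(left, opening_day))
    return max(0, last - first + 1)


# Hypothetical season: opens on Saturday 8 March 2014 and lasts W = 34 weeks.
opening = date(2014, 3, 8)
# Hypothetical player: signed in the preseason, released on 30 June.
w_k = weeks_available(date(2014, 1, 15), date(2014, 6, 30), opening, season_weeks=34)
print(w_k)
```

A player who joins before Week 1 is simply treated as available from Week 1, and a player whose spell falls entirely outside the season gets zero weeks.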

$S_{av} = \sum_{k=1}^N s_k \frac{w_k}{W}$
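To make the proration concrete, here is a minimal Python sketch of the sum above.  The `PlayerSpell` class and the salary figures are hypothetical; only the formula comes from the expression for $$S_{av}$$.

```python
from dataclasses import dataclass


@dataclass
class PlayerSpell:
    base_salary: float    # s_k, base salary for the season
    weeks_available: int  # w_k, league weeks on the roster


def available_payroll(spells, season_weeks):
    """Sum of base salaries prorated by w_k / W."""
    if season_weeks <= 0:
        raise ValueError("season_weeks must be positive")
    return sum(p.base_salary * p.weeks_available / season_weeks for p in spells)


# Hypothetical two-player roster in a 34-week season: one player available
# all season, the other for half of it.
roster = [PlayerSpell(100_000, 34), PlayerSpell(60_000, 17)]
print(available_payroll(roster, season_weeks=34))  # 130000.0
```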

The team utilization factor $$U$$ is the ratio of two weighted sums: the sum of players’ prorated salaries weighted by the number of minutes played $$m_k$$ in league matches, and the sum of players’ prorated salaries weighted by their maximum possible minutes played $$M_k$$ in league matches.  This factor answers the question, “Of the assets available to a club, what proportion of them was actually deployed during the competition?”

$U = \frac{\sum_{k=1}^N m_k s_k}{\sum_{k=1}^N M_k s_k}$
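Here is a small Python sketch of the utilization factor, mainly to show what the salary weighting does.  The tuple layout and the squad numbers are made up for illustration, and the salary passed in is whatever value you choose to use for $$s_k$$.

```python
def utilization(players):
    """Salary-weighted share of possible minutes that were actually played.

    `players` is an iterable of (salary, minutes_played, max_minutes) tuples,
    i.e. (s_k, m_k, M_k).
    """
    players = list(players)
    deployed = sum(s * m for s, m, _ in players)
    possible = sum(s * big_m for s, _, big_m in players)
    if possible == 0:
        raise ValueError("no possible minutes to utilize")
    return deployed / possible


# Hypothetical squad: an expensive regular starter and a cheap reserve.
squad = [(200_000, 2700, 3060), (50_000, 306, 3060)]
u = utilization(squad)
print(round(u, 3))
```

In this made-up squad only about half of the possible minutes were played, but the utilization comes out well above one half because the expensive player was on the field most of the time.  That is the point of weighting by salary: a bench full of cheap players hurts utilization much less than one idle high earner.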

League points $$P$$ are divided by the number of league matches played $$N_g$$ in order to normalize league points and remove any secular increases in point tallies due to expansion of the competition.

The resulting units of Front-Office Efficiency are currency units per point per game, so divide by the number of games to arrive at the more commonly expressed currency units per point.
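Putting the pieces together, a minimal sketch of the efficiency calculation and the unit conversion might look like this in Python.  The function names and the team figures are hypothetical.

```python
def front_office_efficiency(available_payroll, utilization, points, matches):
    """Usable payroll divided by points per game (S_av * U) / (P / N_g)."""
    if points <= 0 or matches <= 0:
        raise ValueError("points and matches must be positive")
    usable_payroll = available_payroll * utilization
    return usable_payroll / (points / matches)


def cost_per_point(efficiency, matches):
    """Convert currency per (point per game) into currency per point."""
    return efficiency / matches


# Hypothetical team: 3.0M available payroll, 80% utilization,
# 51 points from 34 matches (1.5 points per game).
e_fo = front_office_efficiency(3_000_000, 0.8, points=51, matches=34)
print(round(e_fo))                      # about 1.6M per point-per-game
print(round(cost_per_point(e_fo, 34)))  # about 47,000 per point
```

Dividing by the number of matches just recovers usable payroll divided by total points, which is a handy sanity check on the arithmetic.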

What’s Changed in the Front-Office Efficiency Metric

There are a couple of major changes in the Front-Office Efficiency metric — normalization of inputs and outputs, and removal of baseline payroll and performance terms.

We normalize inputs and outputs in order to make them independent of changes to the Major League Soccer competitions over time.  The number of teams in MLS has changed from 13 in 2007 to 19 in 2014.  (There are 20 teams in 2015 — two were added and one team discontinued operations.)  Correspondingly, the number of matches per team in the regular season has changed from 30 in 2007 to 34 starting in 2010.  The payroll of the individual team doesn’t need to be normalized (it will be inflation-adjusted later), but the league points will be normalized by dividing by the number of matches played by the team.

We removed the baseline payroll and performance terms from the expression because, to be frank, it didn’t make sense to include them.  The idea was that there was some minimum level of payroll and performance from which marginal input/output ratios could be calculated, which is fine, but the procedures for determining such minimum levels are not very clear in the research literature.  Bill Gerrard, in his benchmarking study of the Oakland A’s, used baseline payroll and performance in his calculations, but abandoned them in a 2010 study on payroll and performance in the English Premier League.  Gerrard introduced a standardized win cost metric that is more flexible than his previous approach, so we’ll use that and ditch the baseline terms.

Over time, front-office efficiency will appear to get worse (i.e. the marginal payroll cost per point increases) simply because salaries inflate. To remove the effects of inflation we use Bill Gerrard’s term for standard win cost $$\mathcal{C}_{win}$$, which is the ratio of the team’s front-office efficiency $$\mathcal{E}_{FO}$$, deflated by an inflation factor $$I$$, to the average league efficiency $$\mathcal{E}^L_{base}$$ in the baseline season:

$\mathcal{C}_{win} = \frac{\mathcal{E}_{FO}}{\mathcal{E}^L_{base}}I$

The inflation factor is the ratio of the total available payroll in the baseline year to the total available payroll in our year of interest.  As mentioned in the previous section, we can’t assume that the number of teams in Major League Soccer has remained constant over the period (it hasn’t), so we divide the season’s total payroll by the number of teams.

$I = \frac{S^L_{av,base}}{S^L_{av}}$
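As a quick illustration of the per-team normalization, here is a Python sketch of the inflation factor.  The team counts (13 and 19) are the ones quoted earlier in this post; the payroll totals are invented round numbers, not real MLS figures.

```python
def payroll_per_team(total_available_payroll, n_teams):
    """League available payroll normalized by the number of teams."""
    if n_teams <= 0:
        raise ValueError("n_teams must be positive")
    return total_available_payroll / n_teams


def inflation_factor(base_total, base_teams, year_total, year_teams):
    """Ratio of baseline per-team payroll to per-team payroll in the year of interest."""
    return payroll_per_team(base_total, base_teams) / payroll_per_team(year_total, year_teams)


# Hypothetical totals: 26M across 13 teams in the baseline season,
# 57M across 19 teams in the season of interest.
i_factor = inflation_factor(26_000_000, 13, 57_000_000, 19)
print(round(i_factor, 3))  # 0.667
```

Without the per-team normalization the ratio would be 26/57, which would mix league expansion into what is supposed to be a pure salary inflation term.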

The average league efficiency is computed identically to the team front-office efficiency:

$\mathcal{E}^L = \frac{S^L_{av}U_L}{2*P_L/\mathbf{N}_g}$

The league available payroll $$S^L_{av}$$ is the total available payroll normalized by the number of teams in the competition.  The league utilization factor $$U_L$$ is identical to the team utilization factor except that we consider all of the players in the league at once.

The total points earned by all teams in the league $$P_L$$ are normalized by the total number of matches in the competition $$\mathbf{N}_g$$ and multiplied by 2 to arrive at the average points per game in the league.

The standard win cost is scaled so that $$\mathcal{C}_{win}$$ is set to 100 for the baseline year.
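Here is a short Python sketch of the standard win cost with the scaling applied.  I am assuming that the scaling is a plain multiplication by 100, so that a team matching the league average in the baseline year (where $$I = 1$$) scores exactly 100; the input numbers are hypothetical.

```python
def standard_win_cost(team_efficiency, base_league_efficiency, inflation_factor):
    """Inflation-adjusted team efficiency relative to the baseline league average.

    Assumption: scaled by 100 so the baseline-year league average equals 100.
    """
    if base_league_efficiency <= 0:
        raise ValueError("base_league_efficiency must be positive")
    return 100.0 * team_efficiency / base_league_efficiency * inflation_factor


# Baseline year, league-average team: efficiency equals the baseline and I = 1.
print(standard_win_cost(1_000_000, 1_000_000, 1.0))  # 100.0

# Hypothetical later season: the team spends 1.8M per point-per-game,
# and per-team payroll has grown by half since the baseline (I = 2/3).
print(round(standard_win_cost(1_800_000, 1_000_000, 2 / 3), 1))
```

In the second case the team looks 80% more expensive than the baseline in raw terms, but only about 20% more expensive once salary inflation is taken out.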

The Data Challenge

The biggest challenge to any data analytics project is the quantity and quality of data, and this one is no exception.  Three types of data are required:

• Player participation data — team, position, and minutes played in a season, but also demographic data
• Player transactions — who entered and left a club
• Player salaries — base and guaranteed salaries, and salary type for MLS

Player participation data are the easiest to obtain, and tend to be publicly available.  Player transaction data are also obtainable, but become harder to find the further back you go (it is challenging to find MLS transaction data earlier than 2007).  Player salary data are by far the most difficult to source and verify, if they are even available at all.  In Major League Soccer, salary data are only published by the MLS Players Union twice during the season, and salary data before 2007 are extremely difficult to find online.  Salary data for players who are released in mid-season are just as difficult to find, even if the players are senior internationals.  Fortunately, very few players land in this category, but it is frustrating that salary data on every player to participate in MLS are not available from the Players Union.  Yes, there is reason to take the figures with a grain (or more) of salt.  But until Major League Soccer opens up salary figures for its players, numbers from the Players Union will continue to be all we have to work with.

——-

So that’s the latest version of the Front-Office Efficiency metric. The major changes are normalization of inputs and outputs and introduction of standardized costs to compare front-office efficiencies across seasons.  In a future post, we’ll present results for Major League Soccer and update some infographics as well.