Optimal Big Ten realignment using operations research

Using Solver Foundation I have created several mixed integer programming models which figure out how to realign Big Ten teams as fairly as possible. Using the information I have on hand, this seems to be a good realignment if you want to have competitive divisions with roughly equal attendance:

Division A: Indiana, Iowa, Michigan, Michigan State, Nebraska, Purdue
Division B: Illinois, Minnesota, Northwestern, Ohio State, Penn State, Wisconsin

Overview

The Big Ten is an intercollegiate athletic organization composed of eleven (yep) schools from the Midwest. Big Ten schools compete in a number of sports, but the one that receives the most fan interest is football. Big Ten schools compete in the highest division of college football, the Football Bowl Subdivision. Recently the Big Ten extended an invitation to the University of Nebraska to join the conference as the twelfth member institution starting in 2011. As a result, the Big Ten has decided to realign itself into two six-team divisions. Schools in the same division will be guaranteed to play each other each year, and the champions of each division will play each other in a Big Ten championship game.

There’s been a lot of talk about how the divisions will be selected. There are a number of considerations, including geography, rivalry, revenue, and the overall strength of each team. Realignment will have long-term financial implications for Big Ten schools, and more importantly an emotional impact on football-crazed fans such as myself. So what’s the right way to make such a decision? A recent article on hawkcentral.com piqued my interest:

Kerry Whisnant has spent a chunk of the past few days crunching the numbers, and the results may have some of the Hawkeye faithful scratching their heads. A physics professor at Iowa State by day, Whisnant ran a program last week to try and figure out which teams, based on their overall and conference records since 1993 (when the Big Ten went to 11 schools), should be paired up in order to create two truly, mathematically fair divisions.

Assuming that you a.) preserve in-state rivalries; b.) split Ohio State and Michigan and Iowa and Wisconsin up; and c.) make sure the Buckeyes, Wolverines, Lions and Nebraska Cornhuskers are divided evenly across two divisions, Whisnant’s computer came up with just eight possible scenarios. And in terms of league victories, these were the three most “balanced” alignments:

Option 1 — Division A: Iowa, Minnesota, Nebraska, Illinois, Northwestern, Ohio State (422 conference wins and one tie since 1993); Division B: Michigan, Michigan State, Indiana, Penn State, Purdue, Wisconsin (421 wins and one tie).

Option 2 — Division A: Iowa, Nebraska, Indiana, Purdue, Michigan, Michigan State (422 and one tie); Division B: Ohio State, Penn State, Wisconsin, Minnesota, Northwestern, Illinois (421 and one tie)

Option 3 — Division A: Iowa, Illinois, Michigan, Michigan State, Northwestern, Penn State (419 and one tie); Division B: Ohio State, Wisconsin, Purdue, Nebraska, Indiana, Minnesota (424 and one tie).

(It did not escape my notice that an Iowa State professor is looking into Big Ten realignment! Perhaps it is a welcome distraction from thinking about Iowa State football…)

Operations research techniques have obvious application to this problem. Operations research aims to solve real world problems by modeling them mathematically, providing optimal solutions by means of sophisticated algorithms. My team builds Microsoft Solver Foundation, a software product for modeling and solving operations research problems. Using Solver Foundation I have created several mixed integer programming models which compute optimal realignments given different sets of rules. See the “Experimenting with the Models” section to download an Excel spreadsheet with the models and data.

Models and Results

I have created four different models. All four models share the following rules:

  • Rule 1: Each division has six teams. (This may seem obvious, but a mathematical model has to include everything, even the obvious stuff…)
  • Rule 2: Preserve in-state rivalries. This means that Northwestern and Illinois, Michigan and Michigan State, and Indiana and Purdue must each be placed in the same division.
  • Rule 3: Place Ohio State and Michigan in different divisions, and do the same for Iowa and Wisconsin.
  • Rule 4: Each division must contain exactly two of Ohio State, Michigan, Penn State, and Nebraska.

The models differ according to the criteria that describe which realignment is “best”:

  • Model ConfWins: assign the teams to divisions so that the total number of conference wins since 1993 is as even as possible.
  • Model AllWins: assign the teams so that the total number of wins (including nonconference games) since 1993 is as even as possible.
  • Model Sagarin: assign the teams so that the average Sagarin rating since 1998 is as even as possible.
  • Model Attendance: assign the teams so that the 2009 average attendance is as even as possible. (I have included this criterion as a lame attempt to model financial considerations. Obviously more sophisticated criteria could be used.)

(For Nebraska in the “ConfWins” model we use the number of Big 12 victories.)

One comment before showing the results: I have claimed that this method obtains “optimal” realignments. Football fans may roll their eyes a bit, saying, “here comes another Moneyball-reading, pencil-necked geek thinking he knows better than anyone else.” Most of this may well be true, but all I am saying is that if you use these rules, and these specific criteria, here are the best realignments. The beauty of operations research (and mathematical modeling in general) is that if you don’t like the results, then you can change your assumptions, or add more. The model is only as good as the assumptions behind it: for example, if we wanted to incorporate other factors such as the series history between each pair of teams, or TV ratings, we could do that. Fans (and sportswriters) often complain about the use of computer models to determine the BCS standings. These complaints are legitimate in the sense that the models sometimes lack transparency, are guided by rules that don’t seem to make sense, or are too simplistic. Those complaints may apply here too, but hey, you get what you pay for.

Here are the optimal realignments for each model:

Conference Wins

Division A       Wins     Division B       Wins
Indiana            33     Illinois           45
Iowa               71     Minnesota          44
Michigan           94     Northwestern       59
Michigan State     63     Ohio State        106
Nebraska           96     Penn State         86
Purdue             63     Wisconsin          79
Total             420     Total             419

All Wins

Division A       Wins     Division B       Wins
Illinois           75     Indiana            68
Iowa              119     Michigan          146
Minnesota          92     Michigan State    106
Nebraska          165     Penn State        147
Northwestern       96     Purdue            104
Ohio State        170     Wisconsin         144
Total             717     Total             715

Sagarin

Division A     Rating     Division B     Rating
Indiana         65.55     Illinois        69.56
Iowa            77.46     Michigan        82.98
Minnesota       73.96     Michigan State  75.82
Ohio State      87.67     Nebraska        83.65
Penn State      82.03     Northwestern    69.61
Purdue          77.25     Wisconsin       81.59
Average         77.32     Average         77.20

Attendance

Division A     Attendance     Division B     Attendance
Indiana            41,833     Illinois           59,545
Iowa               70,214     Minnesota          50,805
Michigan          108,933     Northwestern       24,190
Michigan State     74,741     Ohio State        105,261
Nebraska           85,888     Penn State        107,008
Purdue             50,457     Wisconsin          80,109
Average            72,011     Average            71,153

Let’s make some general observations. First of all, none of the proposed realignments seem crazy – that’s a good sign. Second, note that the solutions to the “ConfWins” and “Attendance” models are identical, even though they use very different criteria. That seems to be a happy accident. Third, in every case the teams are divided very evenly according to the model criterion: within 1%. The teams are so evenly divided we might ask ourselves if we are being too simplistic. After all, it doesn’t much matter whether the difference between conference wins is 1 or 3, or whether the difference between Sagarin ratings is 0.1 or 0.4. Might we gain something if we weren’t so focused on a single goal? Lastly, note that the rules of our model greatly restrict the range of possible realignments. Rules 3 and 4 separate certain teams with the aim of making the realignment more balanced. But this seems arbitrary – why not just let the model do the work by specifying better goals?

Let’s try two more models in an attempt to do a little better. First, let’s re-run the Sagarin model, dropping rules 3 and 4 that prevent, for example, Iowa and Wisconsin from being in the same division:

Sagarin (without rules 3 and 4)

Division A     Rating     Division B     Rating
Indiana         65.55     Illinois        69.56
Iowa            77.46     Michigan        82.98
Minnesota       73.96     Michigan State  75.82
Ohio State      87.67     Nebraska        83.65
Purdue          77.25     Northwestern    69.61
Wisconsin       81.59     Penn State      82.03
Average         77.25     Average         77.28

Wisconsin and Penn State flip, which makes a little more sense geographically. But we’re still “overtuning”: who really cares whether the difference in Sagarin ratings is 0.03 or 0.12? Let’s continue to ignore rules 3 and 4, but instead of using a single goal, combine all four goals: all wins, conference wins, Sagarin rating, and attendance. We combine them by “normalizing” each component: for example, dividing the attendance difference by the average attendance. This way we will get a result that is satisfactory across all four criteria.
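To make the idea concrete, the combined objective looks roughly like this, where each “Diff” is the difference between the two divisions for that criterion and each “avg” is the corresponding average value (the notation is mine; the spreadsheet expresses the same idea with the absolute-value trick described in the “Building the Models” section below):

  minimize   |ConfWinsDiff| / avgConfWins  +  |AllWinsDiff| / avgAllWins
           + |SagarinDiff| / avgSagarin    +  |AttendanceDiff| / avgAttendance

Here goes: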

All goals (without rules 3 and 4); the values shown are Sagarin ratings

Division A     Rating     Division B     Rating
Indiana         65.55     Illinois        69.56
Iowa            77.46     Minnesota       73.96
Michigan        82.98     Northwestern    69.61
Michigan State  75.82     Ohio State      87.67
Nebraska        83.65     Penn State      82.03
Purdue          77.25     Wisconsin       81.59

For this model, conference wins are 420 vs. 419, all wins are 708 vs. 724, the difference in average Sagarin rating is 0.28, and the average attendances are within 1,000 fans of each other. For these criteria, this realignment seems like a good choice.

Experimenting with the Models

You can download an Excel spreadsheet with all the models described above here. In order to run the models you will need to install Microsoft Solver Foundation. The free Express version can be downloaded here; you will need Excel 2007 or Excel 2010 and the .NET Framework (which you probably already have on your machine) to install it. Here’s how to try out the model:

  • Install Solver Foundation. Click on either the “Solver Foundation v2.1 – 32-bit” or “Solver Foundation v2.1 – 64-bit” icon. If you aren’t sure which one to select, pick 32-bit.
  • Open the spreadsheet (you will need to actually open it in Excel, not just view it on the web).
  • Notice that Sheet1 has two tables: the top one has information about each Big Ten conference team. The bottom one has a proposed realignment along with information about the total number of wins, Sagarin rating, and attendance. This sheet represents the data used by the Solver Foundation model.
  • You should notice a new “Solver Foundation” tab in the Excel ribbon. Click on it.
  • You should see a “Modeling Pane” on the right hand side of the screen. If not, click on the “Model” button in the Solver Foundation ribbon tab. The modeling pane is the place where you define the model.
  • Click on the “Goals” tab in the Modeling Pane. Notice that there are 5 goals, and that the “balanceConf” goal is checked. This means that we will find a realignment that balances conference wins.
  • Click on the Solve button in the Solver Foundation tab. The results are updated in the bottom table in Sheet1.
  • If you check a different goal and click Solve, you can see how the results change. Make sure only one goal is checked at a time.
  • If you are feeling brave, click on the Constraints tab. You can ignore constraints by unchecking them. If you are feeling really brave you can add new constraints. For example, what if I wanted to make sure that Penn State and Michigan State were assigned to the same division?

Once you get comfortable with the modeling language you can do all kinds of experimentation.
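For example, to answer the question above about Penn State and Michigan State, you could add a constraint with the same shape as the in-state rivalry constraints (the full model is shown in the next section); the constraint name here is just my suggestion:

  psu_msu -> InDivisionA["Penn State"] == InDivisionA["Michigan State"]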

Building the Models

Here’s where it gets technical: let’s review how I built the models in Solver Foundation. Those familiar with operations research will recognize this as a simple mixed integer programming model. Regular readers of my blog know that I like to break down models by identifying the sets of objects involved, the output decision variables we want to compute, the input parameters representing the “data”, and finally the goals and constraints. Once we have these we can easily write down the model in Solver Foundation’s OML modeling language and solve it.

The sets for this model are teams (there are 12) and divisions (there are 2). What I want to figure out is which division each team should belong to. Since there are only two divisions, all I really want to know is whether each team belongs to what I have called “Division A”: either in or out. This can be modeled as a binary (0-1 integer) decision, one for each team. Let’s call this “InDivisionA”; for example, InDivisionA[“Iowa”] = 1 if Iowa is in Division A. The parameters are the number of conference wins, the number of overall wins, the average Sagarin rating, and the attendance for each team. Let’s think about the constraints:

  • Rule 1: Each division has six teams. Since InDivisionA[t] is 1 if a team t is in Division A, just make sure that the sum of InDivisionA[t] over all teams is 6.
  • Rule 2: Preserve in-state rivalries. Make sure that InDivisionA has the same value for teams in the same state. For example, InDivisionA[“Michigan”] == InDivisionA[“Michigan State”].
  • Rule 3: Place Ohio State and Michigan in different divisions, and do the same for Iowa and Wisconsin. For each of these pairs, the sum of InDivisionA should be 1, so that exactly one team from the pair lands in each division.
  • Rule 4: Each division must contain exactly two of Ohio State, Michigan, Penn State, and Nebraska. The sum of InDivisionA over these four teams must be 2.

Now let’s turn to the goals. A simple goal is to minimize the difference in conference wins between divisions. If I want to find the number of conference wins for division A, I simply take each team “t”, multiply InDivisionA[t] by its number of conference wins, and sum over all teams. For division B, I multiply by (1 – InDivisionA[t]) instead. Then using the trick described in this post, I can minimize the absolute value of the difference between the two. Here is the OML model:

Model[
  Parameters[Sets[Any], Teams],
  Parameters[Reals, ConfWins[Teams]],
  Decisions[Integers[0, 1], InDivisionA[Teams]],
  Decisions[Integers,  ConfWinsDiff],
  Goals[Minimize[balanceConf -> ConfWinsDiff]],
  Constraints[
    c1 -> Sum[{t, Teams}, ConfWins[t] * InDivisionA[t]]
             - Sum[{t, Teams}, ConfWins[t] * (1 - InDivisionA[t])] <= ConfWinsDiff,
    c2 -> Sum[{t, Teams}, ConfWins[t] * (1 - InDivisionA[t])] 
             - Sum[{t, Teams}, ConfWins[t] * InDivisionA[t]] <= ConfWinsDiff,
    samecount -> Sum[{t, Teams}, InDivisionA[t]] == Sum[{t, Teams}, (1 - InDivisionA[t])],
    in_state_mi -> InDivisionA["Michigan"] == InDivisionA["Michigan State"],
    in_state_in -> InDivisionA["Indiana"] == InDivisionA["Purdue"],
    in_state_il -> InDivisionA["Illinois"] == InDivisionA["Northwestern"],
    tosu_mich -> InDivisionA["Michigan"] + InDivisionA["Ohio State"] == 1,
    iowa_wisc -> InDivisionA["Iowa"] + InDivisionA["Wisconsin"] == 1,
    tosu_mich_neb_psu -> InDivisionA["Ohio State"] + InDivisionA["Michigan"]
             + InDivisionA["Nebraska"] + InDivisionA["Penn State"] == 2
  ]
]

The goals involving total wins, Sagarin rating, and attendance can all be modeled exactly the same way: I just add parameters that represent (for example) Sagarin rating and add similar constraints and goals. Check out the Excel spreadsheet for the full model.
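For example, here is roughly what a standalone Sagarin-balancing model would look like. This is only a sketch that follows the pattern above: the names Sagarin, SagarinDiff, and balanceSagarin are mine and may differ from those in the spreadsheet, and the rivalry and separation constraints carry over unchanged, so they are omitted here.

Model[
  Parameters[Sets[Any], Teams],
  Parameters[Reals, Sagarin[Teams]],
  Decisions[Integers[0, 1], InDivisionA[Teams]],
  Decisions[Reals, SagarinDiff],
  Goals[Minimize[balanceSagarin -> SagarinDiff]],
  Constraints[
    s1 -> Sum[{t, Teams}, Sagarin[t] * InDivisionA[t]]
             - Sum[{t, Teams}, Sagarin[t] * (1 - InDivisionA[t])] <= SagarinDiff,
    s2 -> Sum[{t, Teams}, Sagarin[t] * (1 - InDivisionA[t])]
             - Sum[{t, Teams}, Sagarin[t] * InDivisionA[t]] <= SagarinDiff,
    samecount -> Sum[{t, Teams}, InDivisionA[t]] == Sum[{t, Teams}, (1 - InDivisionA[t])]
  ]
]

Since each division has exactly six teams, balancing the sum of Sagarin ratings is equivalent to balancing the average rating, so there is no need to divide by six in the goal.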

Author: natebrix

Follow me on twitter at @natebrix.
