Creating Equation Haikus using Constraint Programming

Bob Bosch pointed out (in a response to Shea Serrano, wonderfully…) that 77 + 123 = 200 is a haiku in English:

How many such haikus are there? Well, we can use constraint satisfaction (CSP) to give us an idea (Python code here for the impatient):

A traditional haiku consists of three lines: the first line has five syllables, the second seven, and the last five, for a total of seventeen syllables.

We can write equations that describe our problem. First, let’s define the function s[n] as the number of syllables in the English words for n. For example, s[874] = 7; try saying “eight hundred seventy four” out loud. It’s not that hard in English to compute s[n]; at least, I don’t think I messed it up.

Let’s call A the number in the first line of the haiku, B the second, and C the third. If we want a haiku with the right number of syllables and for the equation it describes to hold, all we need is:

s[A] = 5
1 + s[B] = 7 (since “plus” is one syllable)
2 + s[C] = 5 (since “equals” is two syllables)
A + B = C

If you let A=77, B=123, C=200 as in Bob’s example, you will see the above equations hold.

Using the constraint package in Python, it’s easy to create this model and solve it. Here is the code. A great thing about CSP solvers is that they will not just give you one of the solutions to the model, but all the solutions, at least for cases where A, B, C <= 9999. It turns out there are 279 such haikus, including

eight thousand nineteen
plus nine hundred eighty one
equals nine thousand

and the equally evocative

one hundred fourteen
plus one hundred eighty six
equals three hundred

Of course, this is not the only possible form for an equation haiku. For example, why not:

+ B + C
= D

I invite you to modify the code and find other types of equation haikus!

Screen Shot 2019-09-22 at 2.35.14 PM


Optimizing 19th Century Typewriters

The long title for this post is: “Optimizing 19th Century Typewriters using 20th Century Code in the 21st Century”.

Patrick Honner recently shared Hardmath123’s wonderful article “Tuning a Typewriter“. In it, Hardmath123 explores finding the best way to order the letters A-Z on an old and peculiar typewriter. Rather than having a key for each letter as in a modern keyboard, the letters are laid out on a horizontal strip. You shift the strip left or right to find the letter you want, then press a key to enter it:

Screen Shot 2018-11-26 at 12.40.38 PM.png

What’s the best way to arrange the letters on the strip? You probably want to do it in such a way that you have to shift left and right as little as possible. If consecutive letters in the words you’re typing are close together on the strip, you will minimize shifting and type faster.

The author’s approach is to:

  • Come up with an initial ordering at random,
  • Compute the cost of the arrangement by counting how many shifts it takes to type out three well-known books,
  • Try to find two letters that when you swap them results in a lower cost,
  • Swap them and repeat until you can no longer find an improving swap.

This is a strong approach that leads to the same locally optimal arrangements, even when you start from very different initial orderings. It turns out that this is an instance of a more general optimization problem with an interesting history: quadratic assignment problems. I will explain what those are in a moment.

Each time I want to type a letter, I have to know how far to shift the letter strip. That depends on two factors:

  1. The letter that I want to type in next, e.g. if I am trying to type THE and I am on “T”, “H” comes next.
  2. The location of the next letter, relative to the current one T. For example, if H is immediately to the left of T, then the location is one shift away.

If I type in a bunch of letters, the total number of shifts can be computed by multiplying two matrices:

  • A frequency matrix F. The entry in row R and column C is a count of how often letter R precedes letter C. If I encounter the word “THE” in my test set, then I will add 1 to F(“T”, “H”) and 1 to F(“H”, “E”).
  • A distance matrix D. The entry in row X and column Y is the number of shifts between positions X and Y on the letter strip. For example, D(X, X+1) = 1 since position X is next to position X+1.

Since my problem is to assign letters to positions, if I permute the rows and columns of D and multiply this matrix with F, I will get the total number of shifts required. We can easily compute F and D for the typewriter problem:

  • To obtain F, we can just count how often one letter follows another and record entries in the 26 x 26 matrix. Here is a heatmap for the matrix using the full Project Gutenberg files for the three test books:

Screen Shot 2018-11-26 at 12.58.40 PM.png

  • The distance matrix D is simple: if position 0 is the extreme left of the strip and 25 the extreme right, d_ij = abs(i – j).

The total number of shifts is obtained by summing f_ij * d_p(i),p(j) for all i and j, where letter i is assigned to location p(i).

Our problem boils down to finding a permutation that minimizes this matrix multiplication. Since the cost depends on the product of two matrices, this is referred to as a Quadratic Assignment Problem (QAP). In fact, problems very similar to this one are part of the standard test suite of problems for QAP researchers, called “QAPLIB“. The so-called “bur” problems have similar flow matrices but different distance matrices.

We can use any QAP solution approach we like to try to solve the typewriter problem. Which one should we use? There are two types of approaches:

  • Those that lead to provably global optimal solutions,
  • Heuristic techniques that often provide good results, but no guarantees on “best”.

QAP is NP-hard, so finding provably optimal solutions is challenging. One approach for finding optimal solutions, called “branch and bound”, boils down to dividing and conquering by making partial assignments, solving less challenging versions of these problems, and pruning away assignments that cannot possibly lead to better solutions. I have written about this topic before. If you like allegories, try this post. If you prefer more details, try my PhD thesis.

The typewriter problem is size 26, which counts as “big” in the world of QAP. Around 20 years ago I wrote a very capable QAP solver, so I recompiled it and ran it on this problem – but didn’t let it finish. I am pretty sure it would take at least a day of CPU time to solve, and perhaps more. It would be interesting to see if someone could find a provably optimal solution!

In the meantime, this still leave us with heuristic approaches. Here are a few possibilities:

  • Local optimization (Hardmath123’s approach finds a locally optimal “2-swap”)
  • Simulated annealing
  • Evolutionary algorithms

I ran a heuristic written by Éric Taillard called FANT (Fast ant system). I was able to re-run his 1998 code on my laptop and within seconds I was able to obtain the same permutation as Hardmath123. By the way, the zero-based permutation is [9, 21, 5, 6, 12, 19, 3, 10, 8, 24, 1, 16, 18, 7, 15, 22, 25, 14, 13, 11, 17, 2, 4, 23, 20, 0] (updated 12/7/2018 – a previous version of this post gave the wrong permutation. Thanks Paul Rubin for spotting the error!)

You can get the data for this problem, as well as a bit of Python code to experiment with, in this git repository.

It’s easy to think up variants to this problem. For example, what about mobile phones? Other languages? Adding punctuation? Gesture-based entry? With QAPs, anything is possible, even if optimality is not practical.

Chaining Machine Learning and Optimization Models

Rahul Swamy recently wrote about mixed integer programming and machine learning. I encourage you to go and read his article.

Though Swamy’s article focuses on mixed integer programming (MIP), a specific category of optimization problems for which there is robust, efficient software, his article applies to optimization generally. Optimization is goal seeking; searching for the values of variables that lead to the best outcomes. Optimizers solve for the best variable values.

Swamy describes two relationships between optimization and machine learning:

  1. Optimization as a means for doing machine learning,
  2. Machine learning as a means for doing optimization.

I want to put forward a third, but we’ll get to that in a moment.

Relationship 1: you can always describe predicting in terms of solving. A typical flow for prediction in ML is

  1. Get historical data for:
    1. The thing you want to predict (the outcome).
    2. Things that you believe may influence the predicted variable (“features” or “predictors”).
  2. Train a model using the past data.
  3. Use the trained model to predict future values of the outcome.

Training a model often means “find model parameters that minimize prediction error in the test set”. Training is solving. Here is a visual representation:

Screen Shot 2018-08-19 at 3.40.32 PM

Relationship 2. You can also use ML to optimize. Swami gives several examples of steps in optimization algorithms that can be described using the verbs “predict” or “classify”, so I won’t belabor the point. If the steps in our optimization algorithm are numbered 1, 2, 3, the relationship is like this:

Screen Shot 2018-08-19 at 3.40.39 PM

In these two relationships, one verb is used as a subroutine for the other: solving as part of predicting, or predicting as part of solving.

There is a third way in which optimization and ML relate: using the results of machine learning as input data for an optimization model. In other words, ML and optimization are independent operations but chained together sequentially, like this:

Screen Shot 2018-08-19 at 3.40.46 PM

My favorite example involves sales forecasting. Sales forecasting is a machine learning problem: predict sales given a set of features (weather, price, coupons, competition, etc). Typically business want to go further than this. They want to take actions that will increase future sales. This leads to the following chain of reasoning:

  • If I can reliably predict future sales…
  • and I can characterize the relationship between changes in feature values and changes in sales (‘elasticities’)…
  • then I can find the set of feature values that will increase sales as much as possible.

The last step is an optimization problem.

But why are we breaking this apart? Why not just stick the machine learning (prediction) step inside the optimization? Why separate them? A couple of reasons:

  • If the ML and optimization steps are separate, I can improve or change one without disturbing the other.
  • I do not have to do the ML at the same time as I do the optimization.
  • I can simplify or approximate the results of the ML model to produce a simpler optimization model, so it can run faster and/or at scale. Put a different way, I want the structure of the ML and optimization models to differ for practical reasons.

In the machine learning world it is common to refer to data pipelines. But ML pipelines can involve models feeding models, too! Chaining ML and optimization like this is often useful, so keep it in mind.

Finding Optimal State Capitol Tours on the Cloud with NEOS

My last article showed you how to find an optimal tour of all 48 continental US state capitols using operations research. I used the Python API of the popular Gurobi solver to create and solve a traveling salesman problem (TSP) model in a few seconds.

In this post I want to show you how to use Concorde, the world’s best TSP solver for free on the cloud using the NEOS optimization service. In less than 100 lines of Python code, you can find the best tour. Here it is:


Using NEOS is pretty easy. You need to do three things to solve an optimization problem:

  1. Create a NEOS account.
  2. Create an input file for the problem you want to solve.
  3. Give the input file to NEOS, either through their web interface, or by calling an API.

Let’s walk through those steps for the state capitol problem. If you just want to skip to the punchline, here is my code.

Concorde requires a problem specification in the TSPLIB format. This is a text based format where we specify the distances between all pairs of cities. Recall that Randy Olson found the distances between all state capitols using the Google Maps API in this post. Here is a file with this information. Using the distances, I created a TSPLIB input file with the distance matrix – here it is.

The next step is to submit the file to NEOS. Using the xmlrpc Python module, I wrote a simple wrapper to submit TSPLIB files to NEOS. The NEOS submission is an XML file that wraps the contents of the TSPLIB data, and also tells NEOS that we want to use the Concorde solver. The XML file is given to NEOS via an XML-RPC call. NEOS returns the results as a string – the end of the string contains the optimal tour. Here is the body of the primary Python function that carries out these steps:

def solve_tsp_neos_concorde(dist):
xml = make_neos_concorde(dist)
neos = NeosClient()
result =
return tour_from_neos_concorde_result(result)

When I run this code, I obtain the same tour as in my initial post. Hooray! You can also extend my code (which is based on NEOS documentation) to solve many other kinds of optimization models.

Computing Optimal Road Trips Using Operations Research

Randy Olson recently wrote about an approach for finding a road trip that visits all 48 continental US state capitols. Randy’s approach involves genetic algorithms and is accompanied by some very effective visualizations. Further, he examines how the length of these road trips varies as the number of states visited increases. While the trips shown in Randy’s post are very good, they aren’t quite optimal. In this post I’d like to show how you can find the shortest possible road trips in Python using the magic science of operations research. I suggest you read Randy’s post first to get up to speed!

An “optimal road trip” is an ordering of the 48 state capitols that results in the smallest possible driving distance as determined by Google Maps. This is an example of what is known as the Traveling Salesman Problem (TSP). In his work, Randy has made a couple of simplifying assumptions that I will also follow:

  • The driving distance from capitol A to capitol B is assumed to be the same as from B to A. We know this isn’t 100% true because of the way roads work. But close enough.
  • We’re optimizing driving distance, not driving time. We could easily optimize “average” driving time using data provided by Google. Optimizing expected driving time given a specified road trip start date and time is actually pretty complicated given that we don’t know what the future will bring: traffic jams, road closures, storms, and so on..

These aren’t “bugs”, just simplifying assumptions. Randy used the Google Maps API to get driving distances between state capitols – here’s the data file. Note that Google Maps returns distances in kilometers so you’ll need to convert to miles if that’s your preference.

Randy’s approach to solve this problem was to use a genetic algorithm. Roughly speaking, a genetic algorithm starts with a whole bunch of randomly generated tours, computes their total distances, and repeatedly combines and modifies them to find better solutions. Following the analogy to genetics, tours with smaller total distances are more likely to be matched up with other fit tours to make brand new baby tours. As Randy showed in his post, within 20 minutes his genetic algorithm is able to produce a 48 state tour with a total length of 13,310 miles.

It turns out that we can do better. An inaccuracy in Randy’s otherwise great article is the claim that it’s impossible to find optimal tours for problems like these. You don’t have to look at all possible 48 city road trips to find the best one – read this post by Michael Trick. What we can do instead is rely on the insanely effective field of operations research and its body of 50+ years of work. In an operations research approach, we build a model for our problem based on the decisions we want to make, the overall objective we have in mind, and restrictions and constraints on what constitutes a solution. This model is then fed to operations research software (optimization solvers) that use highly tuned algorithms to find provably optimal solutions. The algorithms implemented solvers rule out vast swaths of possible tours in a brutally efficient manner, making the seemingly impossible routine.

The best-in-class TSP solver is Concorde, and is based on an operations research approach. You don’t need Concorde to solve this TSP – a 48 city road trip is puny by operations research standards. I have chosen to use the Gurobi solver because it is a very powerful solver that includes an easy-to-use Python API, and it has a cloud version. Gurobi even includes an example that covers this very problem! The key to their model is to define a yes-no decision variable for each pair of state capitols. A “yes” value for a decision variable indicates that pair of cities is on the optimal tour. The model also needs to specify the rules for what it means to be an optimal tour:

  • The shorter the total distance of the tour (which is determined by the distances between all of the “yes” pairs of cities), the better. This is the objective (or goal) that we seek to optimize.
  • The traveller will arrive at each capitol from another capitol, and will leave for another capitol. In other words, exactly two decision variables involving a capitol will be “yes”.
  • Tours don’t have cycles: visiting Boise more than once is not allowed.

(Sidebar: If you are not used to building optimization models then the code probably won’t make much sense and you may have no idea what the hell the Gurobi tutorial is talking about. No offense, Gurobi, the tutorial is very well written! The challenge of writing optimization models, which involves writing down precise mathematical relationships involving the decision variables you want solved, is what prevents many computer scientists and data scientists from using the fruits of operations research more often. This is especially the case when the models for classical problems such as the TSP require things like “lazy constraints” that even someone experienced with operations research may not be familiar with. I wrote about this in more detail here. On the other hand, there are a lot of great resources and tutorials out there and it’s simply good practice to rely on proven approaches that provide efficient, accurate results. This is what good engineers do. Anyway, the point is that you can easily steal Gurobi’s sample for this problem and replace their “points” variable with the distances from the data file above. If I wanted to do this with an open source package, or with Concorde itself I could have done it that way too.)

My code, based on Gurobi’s example, is able to find a tour with a total length of 12930 miles, about 380 miles shorter than the original tour. What’s more, it takes seconds to find the answer. Here is my Python code. Here is the tour – click here to explore it interactively.


A text file with the tour is here and a GPX file of the tour is here courtesy of This optimal tour is very close to the one the genetic algorithm came up with. Here is a screenshot for reference:


An interesting twist is that Randy extends the problem to consider both the driving distance and the number of states visited. If we are willing to do a tour of, say, 10 states, then clearly the total distance for the tour will be much shorter than a 48 state tour. Randy has a nice animation showing tours of differing numbers of states, as well as a chart that plots the number of states visited against the estimated driving time. This curve is called the efficient frontier – you sometimes see similar curves in financial models.

The modified problem of finding the shortest tour involving K of 48 total state capitols can also be solved by Gurobi. I extended the optimization model slightly:

  • Introduce new yes-no decision variables for each capitol: “yes” if the capitol is one of the lucky K to be visited.
  • Exactly K of the new decision variables should be “yes”.
  • Fix up the original model to make sure we don’t worry about the other N-K cities not on our mini tour.

(I also had to modify the model because I am running on the cloud and the “lazy constraints” mentioned in Gurobi’s TSP writeup don’t work in the cloud version of Gurobi.)

With this new code in place I can call it for K=3…47 and get this optimal efficient frontier curve:


The distances and tours for all of these mini tours are given here.

What have we learned here? In about 200 lines of Python code we were able to efficiently find provably optimal solutions for the original road trip problem, as well as the “pareto optimization” extension. If you’re a data scientist, get familiar with operations research principles because it will certainly pay off!

SQL as an Optimization Modeling Language

Several years ago, a former (awesome) colleague of mine at Microsoft, Bart de Smet, and I discussed the expressibility of optimization problems using SQL syntax. Most formulations carry over in a straightforward way, for example if we want to solve:

minimize 2 x + y
subject to 
x^2 + y^2 <= 1,
x >= 0.

Then we can express this as

  POWER(X, 2) + POWER(Y, 2) <= 1 AND X >= 0

Through suitable rewriting such a specification could be easily sent to a solver. You get the idea; a range of problem types, and even concepts like warm starting are easily supported. I suppose even column generation could be supported via triggers.

Update 10/1/2015: Friends Of The Blog Jeff, Alper, and colleagues thought of this long before I did. See this paper for more.

Operations Research Tom Swifties

I apologize in advance, and you’re welcome. Add your Tom Swifties in the comments!

  • “Queuing theory says that the registration line will move faster if we all stand here,” Tom said optimistically.


  • “Ah, but what did you expect? It was a grad student talking about Simplex at a sponsored session!” Tom pivoted.


  • “Oh yeah, Gomory would positively rave over your choices,” Tom said cuttingly.


  • “I don’t know the best way to the airport, but I can tell you what I did last time,” Tom said warmly.


  • “I assure you darling, my derivatives are continuous in an extremely large neighborhood near the solution,” Tom said smoothly.


  • “My model says that if I post this on a Tuesday, I’ll get more 20% blog hits,” Tom said analytically.


  • “There may be some error in the data, but don’t worry, we’ve accounted for that,” Tom said robustly.


  • “Just choose the largest slice and be done with it,” Tom said greedily.


  • “Break the constraint if you want, but it’ll cost you,” Tom said softly.


  • “I use Avogadro’s number in all of my big-M constraints,” Tom said unstably.


  • “Things have never been better,” Tom said monotonically nondecreasingly.

Yes, Virginia, there is a difference between optimization and prescriptive analytics

[This post is a nod to this famous editorial. I don’t want you to think I’ve gone crazy. The subject of this post was inspired by the Lustig/Dietrich/Johnson/Dziekan article “The Analytics Journey” and the last bit (regarding skills) by a conversation with my boss.]

Eight-year old Virginia O’Dual writes:

Dear Blogger: I am eight years old. Some of my little friends tell me that prescriptive analytics is just a fancy way to say optimization. My advisor tells me that if it is on your blog it must be so. Tell me, is it so?

Virginia, your little friends are wrong. Analytics is not the same as optimization. I realize it may sound strange to be so blunt – after all, this is merely semantics, right? The internet is filled with people ceaselessly arguing about terminology. Millions (billions?) are afflicted with the disease of thinking that knowing the name of a thing is the same as knowing it. Virginia, you may be asking why this is any different. Am I not falling into the same trap by insisting there’s a difference? On top of that, why pick a fight about “analytics”? The word “analytics” is on everyone’s lips these days: on twitter, in magazines and newspapers, during coffee breaks at conferences. More often than not, rushes of hot air follow. So why on earth am I telling you there’s a difference? Virginia, the reason is that there is a connotation to “prescriptive analytics” that is missing from “optimization”:

Optimization speaks to process and prescriptive analytics to purpose.

[Numerical] optimization is finding the best option from a set of alternatives for a mathematical model. In more staid company it is called “operations research”. In less refined circles some simply speak of “search problems”, but this detracts from the majesty and beauty of the techniques that are so well-known and well-loved by practitioners. The theories and techniques of optimization date back centuries, though they were not labeled as such until much more recently. Optimization usually implies a formal mathematical model and a clearly articulated method that provides a guarantee about the quality of the alternatives are discovered.

Virginia, you read this blog, so you already know all of this. You’re an optimizer. You’re proud. Proud of your heritage, proud of the sound reasoning behind your methods, and proud of the profligate applications of your craft in matters of war, peace, and profit. More or less. You may consider those who use highfalutin terms such as “prescriptive analytics” to be late to the party, particularly when they speak in the language of business rather than in the more precise language of mathematics. Skepticism abounds; in the past I have even called analytics “old wine in a new bottle“. The fact that the terminology comes with a sales pitch doesn’t help: a cluster of computers, consulting services, a web service. If you listen to the hucksters, everything under the sun seems to be an essential part of an analytics “solution”: a filesystem, a database, a charting library. Even worse, the aspirations the “analytics community” speak of seem so depressingly modest. A pivot table is not analytics, we optimizers snort. We can factorize matrices already, thank you very much. Yes, my data fits on my machine. Go away so I can continue to bear the world on my shoulders like Atlas (or Dantzig). We know what you are talking about, we say to them, and each other. You are talking about optimization. (I notice a similar relationship between statisticians and the term “predictive analytics”.)

But prescriptive analytics really is something new. I have come to understand that though optimization is the primary tool of prescriptive analytics, it would be very limiting to conflate the two. Prescriptive analytics refers to turning data into decisions – but not just any decision. Decisions that are to be made by people, answering questions posed by people. This is not always the case for optimization. Sometimes decisions produced by optimization are in service of another software process. Other times the decisions aren’t really decisions in the everyday sense.  Root finding and regression are properly speaking optimization, even if we don’t always label them such. Other times, the answers produced by optimization are far removed from the questions that were originally posed. “Consider a spherical cow“, a professor once said. In many cases, such as scheduling, supply chain optimization, or routing there are so many transformations, simplifications, and layers of rewriting from prose to math to AMPL (or GAMS, or AIMMS) to C and back have been undertaken that it is hard to connect solution to problem. The plot is lost. I wrote my Masters thesis on semidefinite programming formulations for truss topology design – bridge building. During my defense I was asked by a committee member if I would drive over a bridge I had built using the results of my formulation. The correct answer was “hell, no” – because I was trying to solve an optimization problem and not a prescriptive analytics problem. (I wasn’t brave enough to say so at the time, of course.)

Prescriptive analytics implies a specific purpose – building a system that automatically produces decisions to choices that humans face. As a wise man once said: “begin with the decision in mind and see where it leads you“. In the practice of prescriptive analytics this may lead you down unfamiliar roads, but brainpower and software are always provisions for the journey. Software that lies at the center of data, models, algorithms, and computational platforms. Sometimes the job simply boils down to assembling data. Real time data, historical data, disjointed and incomplete data that must be joined, refined, summarized, synthesized and completed. Once this is done we are left with nothing but a measly linear program, or even more pathetically, a sort. There is no shame in this – there are no bonus points awarded for complicated formulations in the world of analytics.

Prescriptive analytics concerns an entire process, and this process typically involves assembling data, building models, evaluating them, and presenting the results. Optimization comes into the play for the middle two steps, but the first and last are every bit as important. In fact, many times the first and last steps – assembling the data and presenting results in a form that people understand – are the most difficult and time consuming ones. And you do need tools to help you to pull them off. The problem with the analytics crowd, and the reason why optimization people sometimes roll their eyes, is that analytics people often focus entirely on the edges and ignore the middle: formulating and solving a real model. In prescriptive analytics we give all steps their proper attention, because without them the goal of delivering actionable plans for success cannot be attained.

So, Virginia, this offers fantastic opportunities to those who choose to take up the challenge of mastering prescriptive analytics – young and old alike. A practitioner of prescriptive analytics needs to understand not only the formal methods of optimization, but much more besides. They need to understand how data is obtained, manipulated, and joined together. They need to understand the art of model building, which is experimental in nature and requires knowledge of the original problem domain. They need to be able to communicate the insights produced by their models in words and in pictures. Finally, they need to be able to understand the business objectives and motivations behind their projects, so that they can not only deliver what is being asked for, but anticipate (and answer) the questions that come next.

Shipbuilders, Sailors, and Passengers

I sometimes trot out a boat analogy when I am asked about a career in data science, which is tenuous since I grew up in Iowa.

Cutty Sark

Building solvers is like shipbuilding. Shipbuilding is an ancient discipline which in more recent times has grown into a sophisticated engineering task. Building ships takes time to do right, and it takes a lot of practice to learn – it helps if you’re an apprentice first. It’s not an art, but it’s not quite a science either. For all of the technology, for all of the engineering, as a shipbuilder there are certain things you just don’t do, because that’s just the way the that you were taught. People long before you have tried it a different way and it just didn’t work. You don’t need to be young to be a shipbuilder – in fact in some ways it might be a little bit better if you are a bit older and wiser. Some people think the very idea of building a boat is a terrible bore (or just too damn hard), but for others it’s captivating. That’s pretty much all they want to do.

Using a solver – modeling – is like sailing. Sailors care very deeply about boats, of course, but that’s not all they care about. If you have ever taken a sailing lesson, one of the first things you will hear about is the weather. Your instructor will tell you to pay attention to the way the wind plays off the water, the trees near the shore, flags on buildings, and so on. A sailor needs to understand how their boat interacts with the wind and the water. When the forces of nature and the sailboat are in harmony, the experience is almost magical. If you’re sailing anywhere interesting then you will also need to know about any hazards along the way, like rocks, mermaids or whatever. Even experienced sailors can’t figure this stuff without some help – navigational aids, or the advice of locals is often helpful. Some people sail for fun, and others sail because it’s their job, but either way there’s a goal in mind. What matters is not the general characteristics of boats, or wind, or water, but the specific conditions that come into play on their voyages. Shipbuilders and sailors evaluate boats differently. Shipbuilders have no idea how their boat is going to be used, so they have to think about the entire range of conditions they may face. They may even design for the worst possible storm. Sailors only care about the voyages that they themselves are on. But since people sail for a reason, the destination and the conditions are often out of the ordinary, and may stress the boat in ways that shipbuilders, or even Hans Mittelmann, may not have anticipated.

Just because you can sail doesn’t mean that you can build a boat. Sailors want boats, not shipbuilders, but you can’t have a boat without a shipbuilder. Good shipbuilders are hard to find. That all seems obvious. The funny thing is that it’s hard for many of us to think of it the other way round. Just because you can build a boat doesn’t mean that you know how to sail. Some can, it’s true…but you can’t bank on it. I certainly wouldn’t try to turn a bunch of shipbuilders into sailors without some sort of training or evaluation. Maybe because we’re so in awe of the few that know how to build ships that we think that they simply must know how to sail. I think things are starting to change, but it seems to me that most operations research graduate students are trained to be shipbuilders. This is not a bad thing. The thing is that once they’re trained to be shipbuilders, they are often hired to be sailors. For my part, I was trained to be a shipbuilder as a CS grad at the University of Iowa. When I joined Solver Foundation at Microsoft several years ago, I was (thankfully) hired on to be a shipbuilder, writing our interior point solvers. The more I got familiar with Solver Foundation and its customers, and especially after I took over leadership of the team, I began to realize that many of our customers were asking us to teach them to sail, or just to sail the damn boat for them. My head was filled with the alphas, mus, and sigmas of the shipyard but they are not always all that useful out on the water. I needed to learn to sail. Now I find that I like sailing more than I do shipbuilding. Go figure. Management is sometimes not as familiar with nautical terminology so these distinctions are sometimes lost. Management often seems more interested in cars than boats.

I’ve neglected to mention the most important group. People who ride on boats – passengers (or voyagers if you are more romantic) – are like people who use models. Some people take trips for fun, like a trip around Puget Sound, or a cruise to Alaska. Other people take trips to get from one place to another, like from Seattle to Victoria. Some voyages don’t involve passengers at all – we’re moving freight.  The important difference is that in each of the cases, the fact that sailors and shipbuilders are involved at all is incidental – a passenger is paying for an experience, or for a service. While you and I are boat enthusiasts – can’t get enough! – most passengers couldn’t give a crap about the type of fabric used to build the sail, or the horsepower on the engines, or how narrow the strait is. The experience is what matters. Sometimes there is only a dim awareness that a boat is involved at all – all they know is that when they go down the ramp, they are in a new place. Solvers help determine how an Amazon package gets to your door. That’s amazing.

There are only so many shipbuilders in the world. There are many more sailors, but even sailors are overwhelmingly outnumbered by passengers. It’s not even close. But I don’t think that means that any one of these groups is any more important, or any more noble or intelligent. Everyone is coming at this from their own perspective, and you have to respect that. Shipbuilders have mastered a craft, and that requires a lot of dedication. Modelers are able to adjust to conditions and get people where they need to go. Passengers have their own lives, and often a particular trip is only a line or two in the larger narrative that is their lives. The sailor shouldn’t mock the guy playing shuffleboard – he’s probably earned the right to kick back a little. The passenger shouldn’t look down on the sailor, either. They will keep you from drowning.

I wonder – which would you rather do? Build boats, sail, or go on a voyage? Where is the most money? Does it matter? Will we ever be able to eliminate the need for boats, voyages, or trips? Beats me.

2011 INFORMS Conference on Analytics

Recently I have been "heads down" on some new stuff we’re working on for Technical Computing. Fortunately I am getting the chance to fly out to Chicago to attend the 2011 INFORMS Conference on Analytics. The annual meeting is a lot of fun – especially because of the sheer volume of great talks – but this conference is perhaps even more relevant to my day-to-day work with Solver Foundation and Technical Computing. Here are the four things I am looking forward to the most:

The technology workshops on Sunday. We’re in the middle of a development cycle so I don’t have a workshop this time, so I will have the chance to sit back and watch! It looks there is going to be some great stuff from Palisade, Gurobi, Ziena, and many more. The workshops are a great opportunity to learn about the latest tools, straight from those who build them. I try to go as many of these as I can.

The Edelman Awards. As somebody trying to build tools to help people solve real-world problems, this is what it’s all about. It’s inspiring, educational, and impressive. I see that Mike Trick is once again a judge for the competition, and I do not envy his job. It’s going to be tough. I am not sure how I would weigh the novelty of a submission versus its business impact. I have found in several cases that simple blunt instruments can be surprisingly effective in practice, even if they make for totally uninteresting papers. We’ve got some outstanding entries this year.

The plenary keynotes, in particular Chuck Holland from UPS. The title of his talk is "Operations Research: Getting the Attention of Senior Executives". This conference was renamed from the "INFORMS Practice Conference" to put a greater emphasis on the field of analytics: using computing to gain insight from data. In this context, what has been traditionally called "operations research" provides many – but not all – of the tools used to make business critical decisions. What’s intriguing about Chuck’s abstract is that it confronts a key problem head-on: bridging the gap between those who supply insight and possible decisions (who are practitioners of operations research and analytics whether they know it or not), and those who actually make these decisions.

The last thing, and perhaps the most important to me, is the chance to talk with other attendees. The INFORMS Analytics conference is cool because it brings together the research community, practitioners, and software vendors. I learn so much just by hearing everyone’s stories, and I always make connections that make an impact on what I do the rest of the year. To top it all off, it’s in Chicago, one of my favorite cities in the whole world.