All models are wrong, but some are useful. Even when it comes to burritos. I had been successfully ignoring the seemingly interminable Burrito Bracket Challenge(*) on Five Thirty Eight until this retweet by Carl Bialik grabbed my attention:
Am I supposed to take any of this seriously? I don’t know. Nevertheless, curiosity got the better of me and I read the Vox article. Matthew Yglesias asserts that Chipotle is better because most people can actually find a Chipotle burrito when they want one, unlike La Taqueria, which is available only to the lucky few within a five-mile radius of the Mission District. In the words of Yglesias:
The best burrito is the burrito you actually want to have in real life, a burrito that is both tasty and available.
OK, so now we’re discussing which definition of “best burrito” is best. Of course, that’s subjective, as I pointed out (in a wildly different context) some time ago. In any decision-making process, especially one involving analytics(**), it is critical to openly discuss the criteria for ranking one decision (or burrito) over another. That is obvious. Less obvious is the fact that we often choose our decision criterion based on how easy it is to evaluate, rather than how relevant it is to the matter at hand.
Let’s return to burritos. The Five Thirty Eight criterion was “best tasting burrito”. Here’s what they had to do to figure that out:
- Decide on a list of attributes to consider, and their weighting.
- Trim the list of all burrito joints in the US to a reasonable number using readily available data.
- Visit each location and sample at least one burrito.
- Take detailed notes on the experience and produce ratings based on the attributes.
- Review and calibrate the ratings.
- Determine the winner by sorting the ratings (and getting Silver’s blessing).
That’s real work! The Vox criterion was “tasty and available” (aka “scalable”). Here’s what they had to do:
- Get a list of chain restaurants that serve burritos. Remove Taco Bell from this list, because it is not tasty.
- Get the number of locations for each.
- Sort the list.
Much easier. A child, or possibly an intern, could do this in about thirty minutes, and you can do it informally in one second by whispering “Chipotle”. A more advanced version of the “scalability” study would be to replace the second step with:
- Calculate the total reach for each chain by finding the number of people who live within five miles of chain locations.
Throw in a D3 map with burrito reach for the top three chains, and you’ve got yourself a nice little post that will get a bunch of views and retweets.
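To show just how little work the basic three-step version involves, here is a minimal Python sketch. The `Chain` type, the `tasty` flag, and every chain name and location count below are made-up placeholders for illustration, not real data.

```python
from typing import NamedTuple


class Chain(NamedTuple):
    name: str
    locations: int  # number of restaurant locations (placeholder values below)
    tasty: bool     # subjective judgment call, supplied by hand


def rank_by_scalability(chains):
    """Drop chains judged not tasty, then sort by location count, largest first."""
    eligible = [c for c in chains if c.tasty]
    return sorted(eligible, key=lambda c: c.locations, reverse=True)


# Hypothetical input: these names and numbers are invented for the example.
chains = [
    Chain("Chain A", 1200, True),
    Chain("Chain B", 5000, False),
    Chain("Chain C", 300, True),
]

ranking = rank_by_scalability(chains)
print([c.name for c in ranking])
```

The whole “analysis” is one filter and one sort, which is rather the point.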
But that would be pointless(***). This global analysis based on a scalability criterion is useless, and it’s useless even though we all know that the “best burrito criterion” question is subjective. The Five Thirty Eight criterion is certainly useful, even if you don’t agree with it: you’ve got an idea of where to find the best tasting burrito in America, based on a reasonable standard. If you’re looking for the best burrito in your region, you can find that too (Northeast? Head to the Bronx). The Vox criterion as evaluated above is not useful because you can’t use it to make decisions. It only tells you that there are tons of Chipotle locations, which you already knew. Such an analysis might be useful for burrito chains themselves, but not for consumers.
If you are looking for a burrito you can actually eat, like, now, then a useful analysis would consist of opening the Yelp app and searching for “burrito” using “Current Location” and sorting by rating. Of course, your analysis and mine based on this criterion would yield different results because we live in different places; the results of my analysis apply only to me.
If we wanted an analysis that would actually be useful in decision making, we’d need to modify our methodology to something like:
- Divide the United States into little squares, say 5 x 5 miles.
- In each square, determine the best “tasty and available” burrito restaurant based on some combination of Yelp reviews and distance.
- Create an interactive map based on these results.
This is much harder to do, but helpful: if you tell me where you are then I can tell you where you should get a burrito. The answer isn’t always Chipotle! Therefore, while the general point that “the best burrito is one that you can actually eat” is quite reasonable, a naive global analysis based on this criterion is quite useless.
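The per-square methodology above can be sketched in Python. Everything specific here is an assumption made for illustration: restaurants sit on a flat grid measured in miles, “some combination of reviews and distance” is taken to be the rating minus a fixed penalty per mile from the center of the square, and the restaurant names, positions, and ratings are invented. A real version would need actual review data and proper geographic distances.

```python
import math
from typing import NamedTuple

CELL_MILES = 5.0  # side length of each square


class Restaurant(NamedTuple):
    name: str
    x: float       # miles east of an arbitrary origin (assumed flat map)
    y: float       # miles north of the same origin
    rating: float  # review score, e.g. 1 to 5 stars


def cell_center(col, row):
    """Center point, in miles, of the square at grid position (col, row)."""
    return ((col + 0.5) * CELL_MILES, (row + 0.5) * CELL_MILES)


def score(restaurant, cx, cy, penalty_per_mile=0.2):
    """Assumed scoring rule: rating minus a linear penalty for distance."""
    distance = math.hypot(restaurant.x - cx, restaurant.y - cy)
    return restaurant.rating - penalty_per_mile * distance


def best_per_cell(restaurants, n_cols, n_rows):
    """Map each (col, row) square to its best-scoring restaurant name.

    Assumes the restaurant list is non-empty.
    """
    best = {}
    for col in range(n_cols):
        for row in range(n_rows):
            cx, cy = cell_center(col, row)
            winner = max(restaurants, key=lambda r: score(r, cx, cy))
            best[(col, row)] = winner.name
    return best


# Hypothetical data: one well-reviewed local spot, one mediocre chain outlet.
restaurants = [
    Restaurant("Local Taqueria", 2.0, 2.0, 4.8),
    Restaurant("Big Chain", 12.0, 2.0, 3.9),
]

results = best_per_cell(restaurants, n_cols=3, n_rows=1)
for cell, name in sorted(results.items()):
    print(cell, name)
```

With these invented numbers, the local spot wins the two nearest squares and the chain wins the square it sits in, so the answer depends on where you are. The penalty weight is the real decision criterion here, and it deserves the same open discussion as any other.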
Choosing decision criteria based on what is simplest rather than most relevant is a fundamental flaw of many analytics applications. I’ve seen it at every company I’ve worked for! For example, it’s common to instrument websites and apps to see how frequently different features, buttons, and pages are used. Data can be collected and we can see, for example, that 70% of the time users will choose to leave a product download page rather than register. If we collect additional information we can infer demographic information, the device they are using, and so on. If you’re trying to figure out how to modify the website, or which features to add to a product, you may turn to this data. But this data tells you only how users use what you’ve got, not what they would like to see. To answer that question you need different data, obtained from A/B testing, surveys, a competitive analysis, and so on. Sorting whatever data you have nearby and making a cool chart out of it is not good analytics.
Software doesn’t really help you determine what data you’ll need to answer the questions you care about. It can help you access, process, visualize, and summarize, but that’s it. Our current emphasis on data visualization and infographics obscures this point. In the burrito case, it’s easier to make pretty pictures out of the “most scalable” data than the “best tasting” data, even though the former is less useful for decision making. Software vendors aren’t helping either: the emphasis on “storytelling” with data skims over the fact that “analytics stories” must have a “moral”, otherwise they merely entertain rather than inform.
(*) Congratulations, La Taqueria.
(**) because the process for making the decision itself is understood by fewer people, and if you change your mind you will have to ask a geek to go change their code.
(***) other than to attract page views…