SQL as an Optimization Modeling Language

Several years ago, a former (awesome) colleague of mine at Microsoft, Bart de Smet, and I discussed the expressibility of optimization problems using SQL syntax. Most formulations carry over in a straightforward way. For example, if we want to solve:

minimize 2 x + y
subject to 
x^2 + y^2 <= 1,
x >= 0.

then we can express the constraints as

  POWER(X, 2) + POWER(Y, 2) <= 1 AND X >= 0

Through suitable rewriting, such a specification could easily be sent to a solver. You get the idea: a range of problem types, and even concepts like warm starting, are easily supported. I suppose even column generation could be supported via triggers.
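
To make the idea concrete, here is a small sketch (in Python rather than SQL, and purely illustrative) of what a solver on the receiving end might do with the example above, using projected gradient descent:

```python
# Solve:  minimize 2x + y  subject to  x^2 + y^2 <= 1,  x >= 0
# (the example above), via projected gradient descent.

def project(x, y):
    """Euclidean projection onto {x >= 0} intersected with the unit disk."""
    if x < 0:
        # Closest feasible point lies on the segment x = 0, -1 <= y <= 1.
        return 0.0, max(-1.0, min(1.0, y))
    norm = (x * x + y * y) ** 0.5
    if norm > 1.0:
        return x / norm, y / norm  # scale back onto the disk
    return x, y

def solve(steps=2000, t=0.01):
    x, y = 0.5, 0.5  # feasible starting point
    for _ in range(steps):
        # Gradient of 2x + y is the constant vector (2, 1).
        x, y = project(x - t * 2.0, y - t * 1.0)
    return x, y

x, y = solve()
# Converges to the optimum (0, -1), objective value -1.
```

The projection here happens to be exact for this particular feasible set; a production pipeline would of course hand the rewritten specification to a real solver rather than roll its own.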

Update 10/1/2015: Friends Of The Blog Jeff, Alper, and colleagues thought of this long before I did. See this paper for more.


Programming for Data Scientists – Guidelines

As a data scientist, you are going to have to do a lot of coding, even if you or your supervisor do not think that you will. The nature of your coding will depend greatly on your role, which will change over time. For the foreseeable future, an important ingredient for success in your data science career will be writing good code. The problem is that many otherwise prepared data scientists have not been formally trained in software engineering principles, or even worse, have been trained to write crappy code. If this sounds like your situation, this post is for you!

You are not going to have the time, and may not have the inclination, to go through a full crash course in computer science. You don’t have to. Your goal should be to become proficient and efficient at building software that your peers can understand and your clients can effectively use. The strategy for achieving this goal is purposeful practice by writing simple programs.

For most budding data scientists, it is a good idea to develop an understanding of the basic principles of software engineering, so that you will be prepared to succeed no matter what is thrown at you. More important than the principles themselves is putting them into practice as soon as you can. You need to write lots of code in order to learn to write good code, and a great way to start is to write programs that are meaningful to you. These programs may relate to a work assignment, to an area of analytics that you know and love, or to a “just for fun” project. Many of my blog posts are the result of me trying to develop my skills in a language that I’m trying to learn, or am not very good at.

A mistake many beginners make is to be too ambitious in their “warm-up” programs. Your warm-up programs should not be intellectually challenging, at least in terms of what they are actually trying to do. They should focus your attention on how to effectively write solutions in your programming environment. Doing a professional-grade job on a simple problem is a stepping-stone to greater works. Once you nail a simple problem, choose another one that exercises different muscles, rather than a more complicated version of the problem you just solved. If your warm-up was a data access task, try charting some data, or writing a portion of an algorithm, or connecting to an external library that does something cool.

There are many places where you can find summaries of software engineering basics: books, online courses, and blogs such as Joel Spolsky’s (here’s one example). Here I will try to summarize a few guidelines that I think are particularly important for data scientists. Many data scientists already have formal training in mathematics, statistics, economics, or a hard science. For this group, the basic concepts of computing (memory, references, functions, control flow, and so on) are easily understood. The challenge is engineering: building programs that will stand the test of time.

Be clear. A well-organized mathematical paper states its assumptions, uses good notation, has a logical flow of connected steps, organizes its components into theorems, lemmas, corollaries, and so on, succinctly reports its conclusions, and produces a valid result! It’s amazing how often otherwise talented mathematicians and statisticians forget this once they start writing code. Gauss erased the traces of the inspiration and wandering that led him to his proofs, preferring instead to present a seamless, elegant whole with no unnecessary pieces. We needn’t always go that far, whether in math or in programming, but it’s good to keep this principle in mind: don’t be satisfied with stream-of-consciousness code that runs without an error. A number of excellent suggestions for writing clear code are given here [pdf], in particular to use sensible names and to structure your code logically into small pieces. 

Keep it simple. Don’t write code you think you will need, write the code you actually need. Fancy solutions are usually wrong, or at least they can be broken up into simpler pieces. If you have made it too simple, you will figure it out soon enough, whereas if you have made things too complicated you will be so busy trying to fix your code that you may never realize it.

Pretend that you are your own customer. If you are writing a library for others to use, you can start by writing example programs that use the library. Of course at the beginning these examples won’t work – the point is to force yourself to understand how your solution will be used so that you can make it as easy and fun to use as possible. It also forces you to think about what should happen in the case of user or system errors. You may also discover additional tasks that your solution should carry out in order to make life simpler. These examples may pertain not only to the entire solution, but to small portions of your solution. By writing tests and examples early – that is, by practicing unit testing and test-driven development – you can ensure high quality from the start.
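
As a sketch of this example-first habit: the function name and behavior below are invented for illustration, but the pattern is to write the client code (which doubles as a test) before, or together with, the implementation:

```python
# "Example first": the client code at the bottom was imagined before the
# implementation existed, forcing decisions about edge cases and errors.

def moving_average(values, window):
    """Trailing moving average; fails loudly on bad input."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(values) < window:
        return []
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]

# The "example program" written up front: short, concrete, and it pins
# down behavior for normal input, short input, and user error.
assert moving_average([1, 2, 3, 4], 2) == [1.5, 2.5, 3.5]
assert moving_average([1, 2], 5) == []
try:
    moving_average([1, 2, 3], 0)
except ValueError:
    pass  # bad input should raise, not silently return nonsense
```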

Learn how to debug. Most modern languages are associated with development environments that have sophisticated debuggers. They allow you to step through your code bit by bit, inspecting the data and control flow along the way. Learn how to set breakpoints, inspect variables, step in and out of functions, and all of the keyboard shortcuts associated with each. Computational errors in particular can be very hard to catch by simply reviewing code, so you’ll want to become adept at using the debugger efficiently.

Write high performance code. I wrote about this subject in a previous post. The key is to measure. Rico Mariani does an awesome job of describing why measurement is so important in this post. Rookie data scientists frequently spend too much time tuning their code without measuring.

Add logging and tracing to your code. Your code may run as part of a web application, a complicated production system, or may be self-contained, but it’s always a good idea to add logging and tracing statements. Logging is intended for others to follow the execution of your code, whereas tracing is for your own benefit. Tracing is important because analytics code often has complicated control flow and is more computationally intensive than typical code. When users run into problems, such as incorrect results, sluggish performance, or “hangs”, often the only thing you have to go on are the trace logs. Most developers add too little tracing to their code, and many add none at all. Just like the rest of your code, your trace statements should be clear, easy to follow, and tuned for the application. For example, if you are writing an optimization algorithm then you may wish to trace the current iterate, the error, the iteration number, and so on. Sometimes the amount of tracing information can become overwhelming, so add switches that help you control how much is actually traced.
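
A minimal sketch of switchable tracing, using Python’s standard logging module (the algorithm and names are illustrative):

```python
import logging

# The "switch" controlling how much is traced is simply the log level:
# flip INFO to DEBUG below to see per-iteration detail.
logging.basicConfig(format="%(levelname)s %(name)s: %(message)s")
trace = logging.getLogger("solver")
trace.setLevel(logging.INFO)

def solve(target=1e-6):
    """Toy iterative loop that traces the iterate, error, and iteration."""
    x, error, iteration = 1.0, 1.0, 0
    while error > target:
        x, iteration = x / 2.0, iteration + 1
        error = abs(x)
        # Per-iteration detail only appears when DEBUG tracing is on.
        trace.debug("iter=%d x=%.6g error=%.6g", iteration, x, error)
    trace.info("converged after %d iterations (error=%.3g)", iteration, error)
    return x

result = solve()
```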

In future posts in this series, I will talk about developing as part of a team, choice of language and toolset, and the different types of programming tasks a data scientist is likely to encounter over the course of their career.

Better to Be Right Than Fast

Mae West said that too much of a good thing is wonderful. For us shipbuilders who write numerical code, that is certainly true of speed and accuracy. How seldom we find ourselves in the happy situation of a piece of code that is both fast enough and accurate enough! A colleague and I were chatting about speed and accuracy today and I realized that when I am building software, I prefer a piece of code that is accurate but slow over one that is less accurate but faster. With profiling and careful thought applied to new code, it’s usually pretty easy to make it faster. Addressing a widespread numerical issue often requires a complete re-think.

If I am simply using the software (rather than building it), then all bets are off; it depends on what I am trying to do.
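
A small illustration of preferring accurate-but-slow code: Python’s math.fsum computes an exactly rounded sum at extra cost per element, while the builtin sum can silently lose small terms when magnitudes vary wildly.

```python
import math

# Each triple sums to exactly 1.0, so the true total is 10_000.0.
values = [1e16, 1.0, -1e16] * 10_000

fast = sum(values)            # naive left-to-right floating-point sum
accurate = math.fsum(values)  # exactly rounded sum of the same data

# Naive summation loses every 1.0 term (1e16 + 1.0 rounds back to 1e16),
# so `fast` is 0.0 while `accurate` recovers the true total.
```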

A distasteful analogy for automated analytics systems

I will probably regret writing this post.

So perhaps you have seen these touchless toilets in airports. They flush by themselves. Exhibit A:


Amazing. Wonderful. Sanitary. See that button on the left? That flushes the old fashioned way. You need that. Because no matter how good the sensor is, how elegant your solution for activating the sensor, no matter how impeccably designed the sensor is, you are going to need that button. And when you need that button, boy, you really need that button.

The same thing goes when you are building an “automated” analytics system that hides all of the math and complexity from your poor user. You’re going to need that button, for scarily similar reasons.

Testing at Microsoft

The other day I got into a discussion with a colleague (also ex-Microsoft) about software testing (or QA, if you like) at Microsoft. Here is a short description of my experiences. Microsoft is a big place, and I left a couple of years ago, so your experiences may differ from mine.

Bugs are tracked in a bug tracking system similar to JIRA. Key fields (mandatory for every bug / feature request / whatever) are:

  • Priority
  • Severity
  • Repro Steps
  • Product Version (e.g. Office 10, Office 11, Office 12)
  • Milestone (e.g. M1, M2, M3, Beta, RTM)
  • Resolution (Fixed, Not Repro, Won’t Fix, By Design, …)
  • Status (Open, Resolved, Closed)

Typical workflow:

  • Tester identifies a problem.
  • Tester creates a bug, filling out the key fields, and assigns it to the right developer (or lead; see Triage below).
  • If applicable, the bug is triaged and reassigned to the correct developer.
  • Developer either fixes the problem or identifies the reason why it should not be fixed, and assigns the bug back to the tester as Resolved.
  • Tester tests the fix according to their team’s policy.
  • If the fix is acceptable, it is Closed. Otherwise it is reactivated (Status = Open) and assigned back to the developer.
  • If after closing the bug comes back due to a code regression, the bug is reopened and assigned back to the developer. This happens even if the regression is long after the original fix.
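
The workflow above can be sketched as a tiny state machine (states and actions mirror the bullets; the code is illustrative only):

```python
# Bug lifecycle as a state machine: Status x action -> new Status.
TRANSITIONS = {
    ("Open", "resolve"): "Resolved",     # developer fixes or declines
    ("Resolved", "close"): "Closed",     # tester accepts the fix
    ("Resolved", "reactivate"): "Open",  # tester rejects the fix
    ("Closed", "regress"): "Open",       # a regression reopens the bug,
                                         # even long after the original fix
}

def step(status, action):
    """Advance a bug's status; illegal transitions fail loudly."""
    key = (status, action)
    if key not in TRANSITIONS:
        raise ValueError(f"illegal transition: {action} from {status}")
    return TRANSITIONS[key]

status = "Open"
for action in ["resolve", "reactivate", "resolve", "close", "regress"]:
    status = step(status, action)
# A regression after Closed lands the bug back at "Open".
```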

Many teams conduct a “waterfall” style process whereby the development for a product release is divided in advance into milestones, and feature development is assigned to each milestone. Typically, team leadership (program management aka PM, development, and QA) agrees on release criteria: the conditions under which the product will be released. Sample categories:

  • Certain key experiences work as defined.
  • Automated testing for certain key experiences.
  • The number of bugs matching a certain query (for example, open P1-P3 bugs) is zero.
  • Code coverage is at least X%.
  • The top K bugs from the previous release have been fixed.
  • Performance and scalability for key experiences meet certain requirements (e.g. no page takes longer than K seconds to load).
  • Security review completed.
  • Globalization review completed.
  • Etc.

A release manager (who could be PM/Dev/QA) is often responsible for tracking this checklist as development proceeds. It is also common for there to be milestone exit criteria, essentially a less restrictive version of the release criteria. The purpose of milestone exit criteria in the waterfall methodology is to make sure that the product is in a close-to-shippable state even during development.

Based on the release or milestone exit criteria, it is possible to identify a bug query that tracks all issues that, per agreement, must be fixed prior to the end of the milestone or release. This query must return zero records before moving to the next milestone or release; otherwise the release is delayed. As the end of the milestone approaches, the team begins to monitor this query closely. The purpose of triage meetings is to review incoming bugs to determine whether they meet the bar for the release. Triage meetings always include a PM, dev, and QA representative. Actions taken in triage may include:

  • Resolving bugs as Won’t Fix or By Design based on review,
  • Raising or lowering the priority of the bug,
  • Assigning bugs to the correct developer,
  • Changing the Product Version or Milestone for the bug, thereby moving the bug out of the current release,
  • Reassigning to the tester for additional information,
  • Assigning to the developer with instructions to investigate the cost and risk of a fix.

Near the end of a release, triage may also review proposed code changes before accepting them into the release branch, summoning the developer or tester involved to explain. If triage is not in effect, then QA simply assigns the bug to the developer responsible for the area. The release manager periodically sends out a glidepath for the current milestone or release. This shows the current progress towards the zero bug goal, and may identify specific bugs that are the goal for the next period. If insufficient progress towards the goal is being made, corrective action is taken, including:

  • Cutting features
  • Reducing scope
  • Throwing developers in “bug jail”: any developer with more than X bugs with Pri/Sev of Y is forbidden from working on new features until their bug count is reduced.

As developers complete their feature work for a release, they take bugs from other developers to help with the glidepath.

The team identifies a date prior to the release called ZBB (Zero Bug Bounce). This is the day when the bug query is supposed to come up empty. Typically, the day after ZBB, new bugs come in or certain resolved bugs are reactivated. These bugs are tracked in real time, and no changes other than those related to these bugs are accepted. Then final release testing proceeds until the code is released (after signoff from PM, Dev, and Test). In the old days this was called “RTM” – Release to Manufacturing.
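
The release-gating bug query described above can be sketched as a simple filter; the field names and records here are invented for illustration:

```python
# A "must fix before release" query over hypothetical bug records.
bugs = [
    {"id": 1, "priority": 1, "status": "Open",     "milestone": "Beta"},
    {"id": 2, "priority": 3, "status": "Resolved", "milestone": "Beta"},
    {"id": 3, "priority": 4, "status": "Open",     "milestone": "Beta"},
    {"id": 4, "priority": 2, "status": "Closed",   "milestone": "Beta"},
]

def release_blockers(bugs, milestone):
    """P1-P3 bugs that are not yet Closed for the given milestone."""
    return [b for b in bugs
            if b["milestone"] == milestone
            and b["priority"] <= 3
            and b["status"] != "Closed"]

blockers = release_blockers(bugs, "Beta")
# The release gate: this list must come up empty before shipping.
```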

Engineering teams should have an Analyst role

Happy New Year! I look forward to a fun year of blogging – I have a number of hopefully interesting posts brewing involving: text analytics, college athletic conference comparisons, Big Data, the Open Source movement, former employers, and of course, chessboxing. The caveat is that I only post when it doesn’t cut into my regular work, and I really can’t spare much time these days. So we’ll see how much of that I get to!

Thanks to Dare Obasanjo I found an interesting response to a Quora question on “What does a Product Manager at Facebook do?”

The top response includes a breakdown of the various engineering roles at Facebook, including “Analyst”:

Analyst: when we’d get too carried away in debates in meetings, one of the eng managers would often remark: "warning: we are entering a data-free zone."  The meaning was that without grounding our arguments in data, we’re just talking about opinions.  The analysts at FB are crucial for keeping everyone grounded in actual numbers.  How well/badly are we doing?  What should be our measure of success?  How do we tell if something is broken?  Analysts play a huge role at Facebook, which will continue to be true as the company grows larger.

This strikes me as a pretty good idea.

The obvious counter is, “Do you really need a separate job description for this? Shouldn’t everyone on the team be an Analyst? Shouldn’t everyone use data to inform decisions?” Well, yes, certainly. But I like the idea of a defined role that attaches this responsibility to a particular person. After all, everyone on an engineering team should be concerned about quality, yet most agree that it is a good idea to have a Test/QA job function. Just as an effective QA team builds a culture that values quality, an effective analyst has the potential to build a culture of data-driven decision making. Additionally, having an Analyst role allows for specialization in techniques (regression, data mining, optimization, data collection) and tools – just as many engineering teams have a “performance guru” who can profile anything, anywhere, any time.

On the other hand, I’m speculating. I have never worked on a team with such a role. Have you?

Defining “Better”

I hereby award the “Tweet of INFORMS 2012” prize to Marc-Andre Carle:

#orms is definitely the science of better: each software presented at #INFORMS2012 is better than its competitors 🙂

(Thanks to Michael Trick for the tip!)

Okay, so what is “better” anyway? I get the sense that for many operations research insiders, “better” is another word for “faster”, but that is wrong, wrong, wrong. “Better” means different things to different people. For example:

  • More accurate.
  • Less prone to failure.
  • Easier to use by a broader set of people.
  • Faster to develop a solution.
  • Easier to integrate with other systems.
  • Better supported.
  • Cheaper.
  • Easier to customize and modify.
  • Easier to share.
  • Uses less memory.
  • Unencumbered by intellectual property concerns.

as well as…

  • Faster.

“Better” is a multiobjective problem: most of us actually want many of the things on the list. How we weight the various factors depends in part on what the software is being used for:

  • For academic research,
  • For rapid prototyping,
  • To create a model for a consulting engagement,
  • For a production system.

Some of these factors can be measured (and are, thanks to the tireless efforts of Hans Mittelmann and others) while others are more subjective. Even if we are focusing exclusively on “faster”, the picture remains complicated. In a production system what matters is how quickly users get the results they want. So we care about not only the time that the solver takes, but also:

  • How long it takes to retrieve the data and assemble the model to be solved,
  • The predictability of response time over different user requests,
  • How the solver performs in the face of many simultaneous requests.

Solver runtime differences of 5-10% don’t matter that much, generally speaking. I like to categorize how long an operation takes in real-world terms (I stole this idea from somebody else, but I don’t remember who):

  • instantaneous (subsecond)
  • the time it takes to check espn.com and/or twitter (5-10 seconds)
  • get coffee (a few minutes)
  • have lunch (30 minutes or so)
  • overnight
  • a weekend

It’s usually not worth making an engineering decision based solely on performance if you can’t move to a different bucket. You probably have better things to do.
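
Here is the bucket idea as a sketch; the thresholds are my own rough reading of the list above:

```python
# Map a runtime in seconds to its real-world category. The cutoffs are
# illustrative, not canonical.
def bucket(seconds):
    if seconds < 1:
        return "instantaneous"
    if seconds <= 10:
        return "check espn.com and/or twitter"
    if seconds <= 5 * 60:
        return "get coffee"
    if seconds <= 45 * 60:
        return "have lunch"
    if seconds <= 16 * 3600:
        return "overnight"
    return "a weekend"

# A 5-10% speedup rarely changes the bucket: 40 minutes and 38 minutes
# are both "have lunch".
```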

Employee performance reviews done right

Hiring is the most important activity for any organization. That subject has been covered many times in many ways, so I won’t cover it here. Performance evaluation is in the top five, especially because it is linked to compensation. Yes, here’s my blog post about assigning numbers to people.

The back story for this post is that I recently spent all day in performance reviews for our organization. While describing the nature of the process and the details of who said what during today’s session would be great for page views, I’ll steer clear. I want to write about performance reviews because people are almost universally freaked out by them. It’s healthy to do regular, formal, qualitative evaluations of performance…when the evaluations are done the right way.

What are reviews like? Not every reader has been through a formal performance review, at least not in an industry setting. I don’t claim to have had a representative experience either as an employee or manager; I only know what I know. If you want to know more about "how things work" in other places, Google (or Bing) away. I can tell you that my own experiences have been pretty consistent:

  1. Performance is reviewed formally once or twice a year.
  2. Employees fill out a form where they talk about what they’ve done.
  3. Managers rate their employees’ performance and discuss with their peers.
  4. The ratings are sent up the management chain, where a series of calibrations take place to make sure everyone’s grading according to the same curve.
  5. The final ratings come back down the management chain. 
  6. The review is used to determine compensation: pay raises, bonuses, stock, promotions.
  7. Managers and employees have a discussion about the results of the review.

And the circle of life begins anew. Is all of this necessary? Technology people – geeks – really hate this stuff. At the lunch table you will be told that performance reviews are the tool of "pointy hairs" used to suppress and control the free spirited hacker who knows what is right but is not allowed to do it. I am sure there are a thousand smug Dilbert cartoons on the subject. (I despise Dilbert.) Reviews are sometimes used to control, suppress and annoy, but this is a symptom of organizational (and sometimes personal) issues. There are legitimate reasons for reviews. It may sound crass, but:

  • money is a huge motivator,
  • there are limited resources, and
  • there needs to be a process whereby the cash is fairly distributed.

A formal review system at least affords the opportunity to make the process somewhat transparent. If you buy that premise, and you’re in a relatively big organization, then most of the steps above kind of make sense. If you’re in a four person startup, maybe not.

For most of us privileged enough to be in a field like ours, there’s more to it than just money. Different factors motivate us and provide meaning to our work. Since many organizations are publicly owned and therefore laser focused on profit, there is often a tension between employees – people – desiring to find greater meaning at work and the organization’s need to meet its goals. In many cases the things that provide meaning have nothing to do with money. A manager’s job should be to try to thread the needle and do right by both the employee and the organization. Those of you with subscriptions to The Baffler may have snorted chocolate milk out of your noses at this point because this view may sound naive or even exploitive. I kind of get the skepticism, but all I can say is that if I felt I were in a place where doing both weren’t possible, I’d leave.

If the people that comprise an organization really want to be about something, then there is no better way to make a statement than through performance reviews. The statement might be “seemingly thankless work like maintaining a build system is valued”, or “time spent mentoring new employees matters”, or “showboating is lame and counterproductive”, or simply “you’re doing an awesome job and we want you to stay.” If management is a tightrope act balancing individual development and organizational goals, then performance reviews should be seen as the pole. Stabilizing weight hangs down on either side; a helpful burden.

So what is required to do it right? Above all else, everyone who participates in the process needs to be respectful of everyone else and truthful in their dealings. Most of the horror stories that people have about reviews boil down to a failure in one of these two areas. My own personal horror stories certainly were. (My horror stories: plural, redacted from this post, and unrelated to my current employer.) Face it: any system that involves people will fail without respect and truth.

The next most important thing is a shared understanding of organizational values. Sometimes the review process itself can help a team understand and articulate its own values more clearly by virtue of needing to introspect. A common trap is assessing the value of heroism: the coder who threw down a few 80 hour weeks to design and implement a brand new system to meet a deadline. New organizations risk overvaluing heroes; stagnant organizations risk undervaluing them. Organizational values make it possible to evaluate contributions. Obviously, in order to carry out the evaluation you need to have a clear understanding of what people are doing and the associated value for the organization. Sometimes people spend a lot of time doing an outstanding job on tasks that are not particularly important. Who gets the blame for that depends on the situation, and yeah, realistically sometimes part of the review process is assigning blame – or “responsibility” if you want to be more PC about it.

The review process and its results should not take anyone by surprise. The #1 unwelcome surprise is when a manager tells a direct report that they’re getting a bad review when the employee isn’t expecting it. That sucks for everyone, and the blame lies entirely with the manager. It’s incumbent upon the manager to treat their employees with respect and stay up-to-date with how things are going. A manager may be tempted to blame “the system” in such cases: “I thought you’ve done a fine job, but you know how it is with the curve and all…I did what I could for you but I just couldn’t make the case well enough for you.” It’s a reality that differences of opinion exist because no two pairs of heads or hearts are the same, but that’s no excuse for weaseling out of being straight with your team. Hearts and minds should be joined long before the day of judgment arrives. We owe it to each other as a team.

The last important ingredient is peer feedback at multiple levels. When collecting information for a review, the most important resource is an employee’s coworkers. Asking them directly what they think their peer is doing well, or needs to improve, is a smart thing to do. As evaluations are reviewed by upper management, repeating this process is important. We all know that some of us are easy graders and others are tougher, so the goal is to be fair by accounting for these differences. Peer feedback needs to be shared at the end of the process (withholding names unless permission has been received) so that employees understand that the evaluation is based on the team’s input.

It’s easy, right? Be respectful, be truthful, understand your values, know what people are up to, and communicate. No, it’s not easy. It takes practice, but remember that reviews done right have still other benefits. They can inform the hiring process. If you know how to evaluate the employees that you have, you know how to look for (and get) the employees that you want. Reviews really can be a positive learning experience for everyone. The downside is that if important ingredients are missing, or applied in the wrong proportion, reviews can be a nightmarish burden. Don’t be one of those teams, and don’t shrink from the challenge!

Team is everything

This is too long for a tweet, so I will make it an extremely brief blog post.

It’s amazing how often engineering managers will spend all night fixing a bug or polishing a PowerPoint deck, but will not spend an hour thinking about how to build a team that works together effectively.

(And for the record, my current manager does not suffer from this problem. On the contrary: he thinks about this stuff constantly, and it shows.)

Software Engineering in Large Organizations: Design

(Another in a series.)

When you work at a big software company, the design choices that you make today will shape your destiny for years to come. Much more so than at a smaller, more agile unit that may throw out code or rewrite it with abandon. The reason is simple: the tradeoffs are different when a bunch of people are trying to make improvements to the same thing at the same time.

There is always pressure to put the squeeze on design time. I don’t know how many times (on certain teams) I was asked to complete design in one week out of an eight week sprint. Even then, I sometimes had to spend half of that time cleaning up messes from the previous sprint. So much for measure twice, cut once! It’s true that everyone says that they need more time for each phase of a project, whether it’s requirements, design, implementation, or testing. But I really mean what I say – too little time is spent on design. If given the choice I would gladly trade implementation or testing time for design time, because when I design I still control my own destiny – even over requirements. Once coding begins, the team is carried along by the current and making changes is painful. It’s ironic that all of these things are true, yet the pressure to squeeze design time is entirely justified. There is a huge tendency, even among senior developers, to tinker, re-tinker, and re-re-tinker with designs. Making them bigger, smaller, more intricate, more abstract, and so on.

You can strike a balance by ensuring that:

  • Dedicated time is allocated for design (and only design). Reducing this time means reducing project scope.
  • There is a clearly defined deliverable (such as a design document and/or a presentation).
  • Design starts on an individual or feature team basis and includes more people only as more confidence is developed.
  • Design reviews (informal or formal) always happen before the designer is comfortable with what they have. This ensures that feedback comes after there is sufficient “meat” in the design, but before the designer has completely fallen in love with their ideas.

It’s common to underdesign important things and overdesign unimportant things. The tips above help to avoid both.

It’s also appropriate to change your design approach based on the type of component you’re dealing with. I wouldn’t design an airplane the same way I would design a toothbrush. UI frameworks are rewritten over and over. Calculation engines are not. Business rules are often changed or extended. Databases are often joined or merged but rarely discarded. Glue code is glue code.

At a big company, platform considerations are important but rarely under your control. Someone else has already made those decisions, and you are likely stuck. So you’ll have to learn to design under those constraints, but think of it as a challenge and not a burden (even though it is both).