# Don’t be a hero when trying to solve set covering problems

Please note that Solver Foundation is no longer supported.

A colleague posted an interesting question to an internal discussion board the other day:

Given a set of code blocks changed between two versions of the same binary, and approximated code coverage for the new binary, what are the optimal tests that cover all changed part of the new binary and are least costly to run?

Example: (A “1” in a cell (Bi, Tj) means that block i is tested by test j)

 B1 B2 B3 B4 B5 B6 Cost T1 1 1 1 1 1 1 50 T2 0 1 1 1 1 1 20 T3 1 1 1 0 0 0 15 T4 0 0 0 1 1 1 15

The optimal solution is T3, T4 with minimized cost 15 + 15 = 30. The best greedy solution that we have is T2, T3 with cost 20 + 15 = 35.

The colleague who posted this question also posted a mathematical formulation for the problem:

Binary integer problem statement:

One constraint for each binary block that needs to be covered to make sure we archive maximum coverage. One variable for each test in the list that we try to optimize.

Block 1:   X1 + X3 > 0
Block 2:   X1 + X2 + X3 > 0
Block 3:   X1 + X2 + X3 > 0
Block 4:   X1 + X2 + X4 > 0
Block 5:   X1 + X2 + X4 > 0
Block 6:   X1 + X2 + X4 > 0

X1, X2, X3, X4 = {0, 1}

Cost function to optimize: 50*X1 + 20*X2 + 15*X3 + 15*X4

It’s simple to model this problem using Solver Foundation. Let’s break it down:

• We have sets of tests and code blocks.
• One input parameter is a matrix of 0-1 values that indicate which tests exercise which blocks.
• Another input is the cost of running each test.
• We want to decide whether or not to run each test.
• Our constraint is to make sure that each block is covered.
• We want to minimize the total cost of running the active tests.

Here is the OML. If you run Excel, change the editing mode to manual, paste the OML, then paste the table above and set up the data binding then you can run it for yourself. If you play around you will see that you can easily solve much larger instances.

```Model[
Parameters[
Sets, Tests, Blocks
],

Parameters[
Integers[0, 1],
Cover[Tests, Blocks] // the cells in the matrix below
],

Parameters[
Integers[-Infinity, Infinity],
Cost[Tests] // the cost column
],

Decisions[
Integers[0, 1],
RunTest[Tests] // whether to run test i
],

Constraints[
Constraint1 -> Foreach[{b, Blocks}, Sum[{t, Tests}, Cover[t, b] * RunTest[t]] >= 1] // ensure coverage
],

Goals[
Minimize[
Goal1 -> Sum[{t, Tests}, Cost[t] * RunTest[t]] // minimize cost
]
]
]```

I posted this solution to the alias, and the subsequent feedback was very interesting! The original poster was pleased to have a solution and followed up privately with some questions about the installation requirements for Solver Foundation,etc. He was happy. Another contributor recognized this as a set-covering problem, assumed it couldn’t be solved efficiently, and started proposing LP relaxations and heuristics. Another went into details about complexity theory and the difficulty of finding polynomial time approximation algorithms for set covering problems. Another simply doubted that the proposed solution would work (but didn’t try it). While these are interesting topics for discussion, it loses sight of the fact that the situation is that an engineer wants to solve a problem. Mixed integer programming solvers (like the ones inside Solver Foundation) are incredibly robust and mature; the products of decades of research. A one-off implementation of a specialized algorithm is likely to be less efficient and less robust than a highly tuned, professional grade solver. If an off the shelf solution works, then grab it with both hands and move on with life! Don’t try and be a hero (unless you plan to get a research paper out of it).