Optimization In Google Sheets

Good news for those of you who use spreadsheets to do analytics: Google recently announced a Linear Optimization add-on for Google Sheets, and now Frontline Systems has released a free Solver add-on for Google Sheets that solves not only linear optimization problems, but nonlinear ones as well. It has roughly the same capabilities as the Solver App for Excel Online. If you know how to use Excel’s Solver, then you know how to use this. (Disclaimer: I participated in the development of both the Google Sheets and Excel Online apps during my tenure as CTO of Frontline. I think they are great.)

Here’s how to get started with the Solver add-on for Google Sheets.

Step 1: Insert the Add-on. Create a new Google Sheet (for example by going to drive.google.com and clicking “New”). Then, under the Add-ons menu, click “Get add-ons..”. Search for “solver” and you will see both the Google and Frontline apps:

image

Click on the button next to Solver. (Hi Edwin!) Now “Solver” will appear under the Add-ons menu. When you click on it, a pane will show up on the right-hand side of your screen.

image

Step 2: Create an optimization model. You can use the task pane to define the variables, objective, and constraints of your optimization model. Clicking on the “Insert Example” button will paste a sample problem into your sheet. Here’s what it looks like: it’s a production planning problem where we want to determine the number of TVs, Stereos, and Speakers to build in order to maximize profit.

image

In the task pane on the right you can see that the profit cell (F13) has been selected as the objective we are maximizing. As with Excel’s Solver, you can define the constraints by clicking on them in the “Subject To” section.

image

Step 3: Solve. Clicking on Solve will call Frontline’s Simplex solver to solve your model in the cloud (specifically, on Windows Azure). The variables B3:D3 will be updated, as will any formulas that depend on those values. As you can see, profit goes up:

image

WINNING. If you fool around with the app you will see that you can solve models with arbitrary formulas, not just linear models. And it’s free! Go check it out.
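If you’d like to see the shape of such a model outside a spreadsheet, here is a minimal Python sketch of a production planning problem with the same three products. Everything numeric in it (profits, parts usage, inventory) is an illustrative assumption, not data read from the add-on’s example sheet, and it uses a plain exhaustive search rather than the Simplex method the add-on calls.

```python
# Hypothetical data: these numbers are assumptions for illustration only.
PROFIT = {"tv": 75, "stereo": 50, "speaker": 35}

# Parts needed to build one unit of each product (hypothetical).
USAGE = {
    "chassis":      {"tv": 1, "stereo": 1, "speaker": 0},
    "picture_tube": {"tv": 1, "stereo": 0, "speaker": 0},
    "speaker_cone": {"tv": 2, "stereo": 2, "speaker": 1},
    "power_supply": {"tv": 1, "stereo": 1, "speaker": 0},
    "electronics":  {"tv": 2, "stereo": 1, "speaker": 1},
}

# Parts on hand: these are the constraints (hypothetical).
INVENTORY = {"chassis": 450, "picture_tube": 250, "speaker_cone": 800,
             "power_supply": 450, "electronics": 600}


def max_units(product, remaining):
    """Largest whole number of `product` buildable from the remaining parts."""
    return min(remaining[part] // USAGE[part][product]
               for part in USAGE if USAGE[part][product] > 0)


def solve():
    """Exhaustive search over TVs and stereos; returns (profit, tvs, stereos, speakers)."""
    best = (0, 0, 0, 0)
    for tvs in range(max_units("tv", INVENTORY) + 1):
        after_tv = {p: INVENTORY[p] - USAGE[p]["tv"] * tvs for p in INVENTORY}
        for stereos in range(max_units("stereo", after_tv) + 1):
            left = {p: after_tv[p] - USAGE[p]["stereo"] * stereos for p in after_tv}
            # Speakers earn positive profit, so build as many as the leftovers allow.
            speakers = max_units("speaker", left)
            profit = (PROFIT["tv"] * tvs + PROFIT["stereo"] * stereos
                      + PROFIT["speaker"] * speakers)
            best = max(best, (profit, tvs, stereos, speakers))
    return best


print(solve())  # -> (25000, 200, 200, 0)
```

The decision variables are the three build quantities, the objective is total profit, and each part’s inventory is one constraint, which is the same structure you define in the task pane. Brute force only works here because the problem is tiny; a real solver scales far better.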


401k Simulation Using Analytic Solver Platform

You can build a pretty decent 401k simulation in a few minutes in Excel using Analytic Solver Platform:

image

Let’s give it a shot! You can download the completed workbook here.

First, let’s build a worksheet that calculates 401k balances for 10 years. At the top of the worksheet let’s enter a yearly contribution rate:

image

Let’s compute 401k balances for the next 10 years, based on this contribution. A simple calculation for the balance for a given year involves five factors:

  1. The 401k balance for the previous year.
  2. The rate of return for the 401k.
  3. The previous year’s salary.
  4. The rate of increase in the salary (your raise).
  5. The rate of contribution (entered above).

In row 6 we will enter the starting values for return, salary increase, salary, and balance in columns B, C, D, E respectively. For now let’s assume:

  • Return = 0.05
  • Salary Increase = 0.05
  • Balance = 5,000
  • Salary = 100,000

With a couple of small assumptions, the new balance is old balance * (1 + return) + contribution rate * salary. In the next row we will compute Year 1, using these formulas:

  • Salary = D6 * (1 + C6). This simply means that this year’s salary is last year’s adjusted by the raise. (Obviously salary could be modeled differently depending on when the raise kicks in.)
  • Balance = E6*(1 + B6)+D6*$B$3. There are two terms. The first is the old balance grown by the portfolio return. The second is the salary in D6 (the previous row’s salary) times the contribution rate; point this term at D7 instead if you want the contribution based on the salary after the raise.

We can fill these values down, giving us the 401k balance for the entire period:

image
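For readers who think better in code, here is a minimal Python sketch of the same fill-down calculation. The contribution rate of 0.06 is an assumption (the value in the post only appears in a screenshot), and the function and parameter names are made up for illustration.

```python
def project_balances(years=10, contribution_rate=0.06, annual_return=0.05,
                     raise_rate=0.05, start_salary=100_000.0,
                     start_balance=5_000.0):
    """Return (year, salary, balance) rows; year 0 is the starting row (row 6)."""
    rows = [(0, start_salary, start_balance)]
    salary, balance = start_salary, start_balance
    for year in range(1, years + 1):
        # Mirrors Balance = E6*(1 + B6) + D6*$B$3: the contribution uses
        # the previous row's salary.
        balance = balance * (1 + annual_return) + salary * contribution_rate
        # Mirrors Salary = D6 * (1 + C6).
        salary = salary * (1 + raise_rate)
        rows.append((year, salary, balance))
    return rows


for year, salary, balance in project_balances():
    print(f"Year {year:2d}  salary {salary:12,.2f}  balance {balance:12,.2f}")
```

Each pass through the loop plays the role of one filled-down row, so changing a default argument is the equivalent of editing a cell in row 6 or the contribution rate at the top of the sheet.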

Here’s the thing: we don’t actually know what our portfolio return and salary increases will be in future years. They’re uncertain. We can use Analytic Solver Platform to turn the wild guesses in columns B and C into probability distributions. Using simulation we can then determine the most likely range for future 401k balances.

For portfolio return, a reasonable thing to do is to go back and look at past performance. Rates of return for the S&P 500 (and other financial instruments) are given on this page. Using the “From Web” feature of Power Query (or by simply copy-pasting) you can bring this data into another Excel worksheet with no sweat:

image

Now let’s turn this historical data into a probability distribution we can use in our model. Select the S&P 500 historical return data and select Distributions –> Distribution Wizard in the Analytic Solver Platform tab:

image

Fill in the first page of the wizard:

image

Select “continuous values” in the next step, “Fit the data” in the next, and then pick an empty cell for “Location” in the final step. In the cell that you selected, you will see a formula something like this:

=PsiWeibull(3.55593208704872,0.692234009779183, PsiShift(-0.509633992648591))

This is a Weibull distribution that fits the historical data. If you hit “F9” to recalculate the spreadsheet you will see that the value for this cell changes as a result of sampling from this distribution. Each sample is a different plausible yearly return. Let’s copy this formula in place of the 0.05 values we entered in column B of our original spreadsheet. If we click on the “Model” button in the Analytic Solver Platform ribbon, we will see that these cells have been labeled as “Uncertain Variables” in the Simulation section.
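To get a feel for what sampling from this fitted distribution means, here is a small Python sketch that draws from a shifted Weibull with the same three numbers. It assumes the PsiWeibull arguments are (shape, scale) and that PsiShift simply adds a constant to each draw; that is my reading of the formula, not something the post spells out.

```python
import math
import random

# Parameters copied from the fitted formula; the (shape, scale) order is an assumption.
SHAPE = 3.55593208704872
SCALE = 0.692234009779183
SHIFT = -0.509633992648591


def sample_return(rng):
    """One plausible yearly return: a Weibull draw moved left by SHIFT."""
    # Note the argument order: random.weibullvariate takes (scale, shape).
    return rng.weibullvariate(SCALE, SHAPE) + SHIFT


rng = random.Random(42)  # fixed seed so the draws are repeatable
samples = [sample_return(rng) for _ in range(20_000)]

sample_mean = sum(samples) / len(samples)
# Mean of a shifted Weibull: scale * Gamma(1 + 1/shape) + shift.
theoretical_mean = SCALE * math.gamma(1 + 1 / SHAPE) + SHIFT

print(f"sample mean      {sample_mean:.4f}")
print(f"theoretical mean {theoretical_mean:.4f}")
print(f"worst draw       {min(samples):.4f}")
```

Each call to sample_return is the analogue of pressing F9 once: a different plausible return every time, with no draw ever falling below the shift.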

For Salary Increase we will do something simpler. Let’s just assume that the increase will be between 2% and 7% each year. Enter =PsiUniform(0.02, 0.07) in cell C6, and fill down.

The last thing we need to do is to define an “output” for the simulation, called an Uncertain Function. When we define Uncertain Functions, we get nice charts and stats for these cells when we run a simulation. Click on the Balance entry for Year 10, then click on the arrow next to the “+” in the Model Pane, and then Add Uncertain Function. Your Model Pane will look something like this:

image

And your spreadsheet will look something like this:

image

Now all we need to do is click Simulate in the ribbon. Analytic Solver Platform draws samples for the uncertain variables (and evaluates everything in parallel for fast performance) and then displays a chart of the different possible 401k balances. As you can see, the possible balances vary widely but are concentrated around $100,000:

image
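The simulation itself is easy to sketch in plain Python: repeat the ten-year calculation many times, drawing a fresh return and raise each year. This is only a rough stand-in for what Analytic Solver Platform does. The 0.06 contribution rate and the (shape, scale) reading of the PsiWeibull arguments are assumptions, and the helper names are made up.

```python
import random

# Fitted return distribution from the post; (shape, scale) order is an assumption.
SHAPE = 3.55593208704872
SCALE = 0.692234009779183
SHIFT = -0.509633992648591


def simulate_final_balance(rng, years=10, contribution_rate=0.06,
                           start_salary=100_000.0, start_balance=5_000.0):
    """One trial: walk the rows of the sheet with random returns and raises."""
    salary, balance = start_salary, start_balance
    for _ in range(years):
        annual_return = rng.weibullvariate(SCALE, SHAPE) + SHIFT  # column B
        raise_rate = rng.uniform(0.02, 0.07)                       # column C
        balance = balance * (1 + annual_return) + salary * contribution_rate
        salary *= 1 + raise_rate
    return balance


def run_simulation(trials=10_000, seed=1):
    """Return the final balances of all trials, sorted ascending."""
    rng = random.Random(seed)
    return sorted(simulate_final_balance(rng) for _ in range(trials))


def percentile(sorted_values, q):
    """Nearest-rank style percentile of an already sorted list, 0 <= q <= 1."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]


balances = run_simulation()
summary = {q: percentile(balances, q) for q in (0.05, 0.50, 0.95)}
for q, value in summary.items():
    print(f"{int(q * 100):2d}th percentile: {value:12,.0f}")
```

The Year 10 balance returned by each trial plays the role of the Uncertain Function, and the percentiles are a crude text version of the chart.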

Here’s the great thing: you can now build out this spreadsheet to your heart’s content to build simulations that incorporate more factors. If you want to get really fancy, you can correlate yearly returns. Check out the extensive help on solver.com for more.

Simulating data for a logistic regression model in Excel

Rick Wicklin from SAS recently wrote a very nice article about simulating data for a logistic regression model. (I have been a loyal reader for years.) I thought it would be interesting to see if we can do the same thing in Excel using Analytic Solver Platform. Yes!

Download the completed workbook here.

Download Analytic Solver Platform here.

Simulating Logistic Data

As Rick describes in his post, the first step is to generate random explanatory and response variables. He breaks this down as follows:

1. Assign the design matrix (X) of the explanatory variables. This step is done once. It establishes the values of the explanatory variables in the (simulated) study.

2. Compute the linear predictor, η = X β, where β is a vector of parameters. The parameters are the "true values" of the regression coefficients.

3. Transform the linear predictor by the logistic (inverse logit) function. The transformed values are in the range (0,1) and represent probabilities for each observation of the explanatory variables.

4. Simulate a binary response vector from the Bernoulli distribution, where each 0/1 response is randomly generated according to the specified probabilities from Step 3.

We can do this in Excel using Analytic Solver Platform’s built-in probability distributions.

Step 0. In Rick’s example, the parameters for the simulated logistic model are (2, 4, –1). So go get Analytic Solver Platform, fire up Excel, and enter 2, 4, –1 in cells A1:C1 of a new worksheet (Sheet1):

image

Step 1. Let’s create the design matrix.

  • Create a new worksheet and type in column headers x0, x1, x2 in row 1.
  • Go to the next row. x0 is the intercept, so fill in 1 for A2.
  • x1 is a uniform random number between 0 and 1, so use the Analytic Solver Platform formula =PsiUniform(0,1). (Excel’s RAND() would also work here.)
  • x2 is normally distributed. Enter =PsiNormal(0,2).
  • Now you have something like this. Your values will be different because x1 and x2 are random.

image

Step 2. Create the linear predictor. Add a column header “eta” and enter the following formula: =SUMPRODUCT(Sheet1!$A$1:$C$1,Sheet2!A2:C2). This multiplies the parameters for the simulated logistic model with the design matrix values, and sums them up.

Step 3. Transform the linear predictor by the inverse logit function. Add a column “mu” with the inverse logit formula: =EXP(D2)/(1+EXP(D2)).

Step 4. Simulate a binary response vector. You can use the PsiBernoulli function to simulate 0-1 values. Add a column “y” with the formula =PsiBernoulli(E2). Your spreadsheet now looks something like this:

image

Step 5. Now you’ve got one simulated point. Copy and paste this row down as far as desired, for example a few hundred rows. If you use Excel’s “Fill Down” feature then make sure that column A is a column of “1” values, not “1, 2, 3, …”. Now you will have a big table with your simulated data.

Step 6. Freeze it! You may have noticed that the values of certain columns jiggle around. This is because new random values are being generated every time Excel recalcs. Click on the “Analytic Solver Platform” ribbon tab, click Tools, and click Freeze. This will lock the current samples in place.

image
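Steps 0 through 6 can be sketched in a few lines of Python using only the standard library. This is not how Analytic Solver Platform works internally, just the same recipe written as code: the function names are made up, and the fixed seed stands in for Freeze.

```python
import math
import random

BETA = (2.0, 4.0, -1.0)  # Step 0: the "true" parameters


def inverse_logit(eta):
    """Same value as EXP(eta)/(1+EXP(eta)), arranged to avoid overflow."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


def simulate_row(rng):
    # Step 1: one row of the design matrix.
    x0 = 1.0                 # intercept
    x1 = rng.uniform(0, 1)   # like PsiUniform(0,1)
    x2 = rng.gauss(0, 2)     # like PsiNormal(0,2): mean 0, standard deviation 2
    # Step 2: linear predictor, the SUMPRODUCT of parameters and x values.
    eta = BETA[0] * x0 + BETA[1] * x1 + BETA[2] * x2
    # Step 3: transform to a probability.
    mu = inverse_logit(eta)
    # Step 4: Bernoulli draw with success probability mu.
    y = 1 if rng.random() < mu else 0
    return {"x0": x0, "x1": x1, "x2": x2, "eta": eta, "mu": mu, "y": y}


def simulate(n, seed=0):
    """Steps 5 and 6: n rows, with a fixed seed so the sample stays put."""
    rng = random.Random(seed)
    return [simulate_row(rng) for _ in range(n)]


rows = simulate(500)
print(rows[0])
print("share of y = 1:", sum(r["y"] for r in rows) / len(rows))
```

Calling simulate with the same seed always returns the same table, which is the code equivalent of freezing the samples.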

Exploring the Data

Now we can use XLMiner to explore and model the simulated data! Click on the XLMiner ribbon tab, and then Explore –> Chart Wizard:

image

Select “Scatter Plot” as the chart type. Then select x2 for the y-axis, x1 for the x-axis, and color by y. You will get a nice color-coded scatter plot that you can customize to your heart’s content interactively:

image

Running Logistic Regression

Now we can run logistic regression to recover the coefficients for the simulated data. Click on the XLMiner ribbon tab and select Classify –> Logistic Regression. The simulation data should be preselected in the “Data range” box. Select x1 and x2 as Input Variables and y as the Output Variable. Your dialog will look something like this:

image

By default, XLMiner logistic regression assumes an intercept, so you can simply click Finish. The logistic regression runs, and the results are reported in an output worksheet. If you examine the coefficients, you should see that they are rather close to (2, 4, –1)!

image
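If you are curious what recovering the coefficients involves, here is a self-contained Python sketch that simulates data as above and fits a logistic regression with Newton’s method. The post does not say which algorithm XLMiner uses, so treat this as a generic textbook stand-in; all names are made up for illustration.

```python
import math
import random


def inverse_logit(z):
    """Logistic function, arranged to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate(n, beta=(2.0, 4.0, -1.0), seed=0):
    """Return a list of (x, y) pairs with x = (1, x1, x2)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = (1.0, rng.uniform(0, 1), rng.gauss(0, 2))
        mu = inverse_logit(sum(b * xi for b, xi in zip(beta, x)))
        data.append((x, 1 if rng.random() < mu else 0))
    return data


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_logistic(data, iterations=25, tol=1e-8):
    """Maximum likelihood fit via Newton's method, starting from all zeros."""
    k = len(data[0][0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k                      # gradient of the log-likelihood
        hess = [[0.0] * k for _ in range(k)]  # negative Hessian, X' W X
        for x, y in data:
            p = inverse_logit(sum(b * xi for b, xi in zip(beta, x)))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (y - p) * x[i]
                for j in range(k):
                    hess[i][j] += w * x[i] * x[j]
        step = solve_linear(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta


estimate = fit_logistic(simulate(5000))
print([round(b, 3) for b in estimate])
```

The printed estimates should land in the neighborhood of (2, 4, –1), though not exactly on them, since they are fit to one random sample. More rows tighten the estimates; fewer rows loosen them.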

Since everything is in Excel, you can perform additional analysis, build charts, or even score new data.

Download the complete workbook here, and download Analytic Solver Platform here.

Presenting Analytic Solver Platform 2014-R2

In 2014 Frontline Systems released the newest version of its flagship product, Analytic Solver Platform. You can download a free trial of Analytic Solver Platform here.

Analytic Solver Platform makes it easy to learn from your data and make good decisions quickly. You don’t have to learn a new programming language, suffer through a complex deployment process, or abandon what you already know: you can grab data from your desktop, the web, or the cloud and build powerful predictive models in minutes from Excel.

image

In this release of Analytic Solver Platform you’ll find world-class time series, prediction, classification, data cleaning, and clustering methods in XLMiner. XLMiner’s 30+ data mining methods have been rewritten from the ground up, combining the latest advances in machine learning with a straightforward Excel interface. Data sets that crash more expensive competitive products run flawlessly in XLMiner. Better yet, XLMiner produces reports with all the information you need to make the business case for your findings, including built-in charts and visualizations.

image

image

Analytic Solver Platform works with Microsoft Power BI to turn data into insight. My recent post showed how cloud-hosted data can be ingested, cleaned, and mined for insight in minutes. Analytic Solver Platform supplements Power Query’s data cleaning with additional methods to help you categorize, clean, and handle missing data, and provides built-in connectors that let you sample and score with popular data sources including Power Pivot.

Finally, Analytic Solver Platform helps you bridge the gap between experimentation and production deployment. Using Analytic Solver Platform with SharePoint allows your organization to audit and version your models. Use Frontline’s Solver SDK to integrate simulation and optimization in your application whether you use C++, C#, or web technologies. The latest version of Solver SDK will provide support for the popular F# language, allowing your team to build predictive models at a fraction of the development cost and lines of code.

image

Give it a try!