Marketing Mix Analytics I – Data Acquisition
At last spring’s INFORMS Analytics Conference I was invited to speak about Marketing Mix Analytics at Nielsen. I thought I would (belatedly) summarize my talk for those who were not able to attend.
Nielsen’s mission is to provide the most complete understanding of what consumers watch and buy. My team builds analytics solutions that use watch and buy information to help advertisers understand where their sales come from. Our primary analytical tool to do this is called marketing mix modeling. This first post summarizes what marketing mix is about, and how modeling teams assemble the data necessary for a mix model.
Marketing mix models measure the impact of marketing (and other drivers) on sales. Simply put, we obtain sales data in partnership with our clients, and find matching time series data for everything that we believe affects sales: their advertising, whether TV, radio, or online; their trade activity, such as features and displays in grocery stores; their pricing and discounts; and events, holidays, and industry trends. Once we obtain all of this data, we build a big regression model that predicts sales based on all of these factors. This has the effect of attributing the dips and spikes in sales to corresponding dips and spikes in activity. A big ad appears in the paper: sales spike. We assume some portion of the spike is because of the ad. When we run the regression model we obtain a decomposition of sales according to the various factors in the model, based on their coefficients. This allows us to make statements such as “7% of your sales are due to your TV advertising,” or, “you lost 3% of your sales due to your competitor’s pricing strategy.”
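To make the regression-and-decomposition idea concrete, here is a minimal sketch in Python. It is an illustration only, not Nielsen's actual model: the data is synthetic, the driver names (tv, display, price) and the helper fit_and_decompose are hypothetical, and the model is a plain linear regression fit by ordinary least squares, whereas production mix models are typically more elaborate.

```python
import numpy as np

def fit_and_decompose(sales, drivers):
    """Fit sales ~ base + drivers by ordinary least squares.

    Returns the coefficients and each term's share of total sales.
    `drivers` maps a (hypothetical) driver name to a weekly series.
    """
    names = list(drivers)
    # Design matrix: a column of ones for the base, then one column per driver.
    X = np.column_stack([np.ones(len(sales))] + [drivers[n] for n in names])
    coefs, *_ = np.linalg.lstsq(X, sales, rcond=None)
    # Each term's weekly contribution is coefficient * driver level.
    contributions = X * coefs
    shares = contributions.sum(axis=0) / sales.sum()
    labels = ["base"] + names
    return dict(zip(labels, coefs)), dict(zip(labels, shares))

# Synthetic data: three years of weekly observations (assumed values).
rng = np.random.default_rng(0)
weeks = 156
tv = rng.uniform(0, 100, weeks)                    # weekly TV activity
display = rng.integers(0, 2, weeks).astype(float)  # in-store display on/off
price = rng.uniform(2.5, 3.5, weeks)               # average shelf price
sales = 1000 + 2.0 * tv + 150 * display - 120 * price + rng.normal(0, 20, weeks)

coefs, shares = fit_and_decompose(
    sales, {"tv": tv, "display": display, "price": price}
)
for name, share in shares.items():
    print(f"{name}: {share:.1%} of sales")
```

Because the model includes a base term, the shares add up to 100% of sales. The price share comes out negative here simply because its coefficient is negative; in practice a decomposition would usually be stated relative to a reference level, which this sketch does not attempt.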
These kinds of statements are useful by themselves but they’re even better when you turn them into decisions that affect the future. This is done by chaining models together to provide additional insight. A marketing mix model produces coefficients and decompositions (“decomps”), which characterize past sales for historical levels of, for example, TV advertising. We can turn those into sales response curves which predict sales for any level of activity, even levels of advertising that were not conducted historically. These curves are the basis for forecasting and optimization models for media planning. Moving from raw sales, advertising, and pricing data to a coordinated, targeted media plan is a huge leap, and it is not without challenges.
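As a rough illustration of how response curves feed a planning decision, the sketch below assumes a diminishing-returns curve of the form beta * (1 - exp(-spend / scale)) for each channel and searches a grid for the best split of a fixed budget between two channels. The curve shape, the parameter values, and the function names response and best_split are all assumptions made for this example; the post does not specify a functional form or an optimization method.

```python
import numpy as np

def response(spend, beta, scale):
    """Assumed response curve: incremental sales with diminishing returns.

    beta is the ceiling on incremental sales; scale controls how quickly
    the curve saturates. Both are hypothetical parameters.
    """
    return beta * (1.0 - np.exp(-np.asarray(spend, dtype=float) / scale))

def best_split(budget, curve_a, curve_b, steps=1001):
    """Grid-search the split of a budget between two channels."""
    a = np.linspace(0.0, budget, steps)  # candidate spend on channel A
    total = response(a, *curve_a) + response(budget - a, *curve_b)
    i = int(np.argmax(total))
    return a[i], budget - a[i], total[i]

# Hypothetical (beta, scale) pairs for two channels.
tv_curve = (500.0, 40.0)
online_curve = (300.0, 15.0)

tv_spend, online_spend, incremental = best_split(100.0, tv_curve, online_curve)
print(f"TV: {tv_spend:.1f}, online: {online_spend:.1f}, "
      f"incremental sales: {incremental:.1f}")
```

A grid search is used only because it is easy to read with two channels. With many channels and constraints, a proper optimization routine would be the more natural choice.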
Textbooks and websites will tell you that marketing mix modeling is old hat, but doing it right is hard work. First of all, getting the data is difficult. The point is for the analyst team and client to dream up everything that can impact sales…and obtain matching, correct time series data covering up to three years. Some data, like TV or radio advertising, can be sourced from within Nielsen. Sales, revenue, and margin data come from a combination of client and MMM vendor sources. Other data such as industry trends, macroeconomic data, and so on may come from third parties. Cleaning and verifying data is always hard, but it’s particularly hard in marketing mix because of its dimensionality. The modeled product dimension may be at the brand, sub brand, or even the PPG (price promoted group) level – a collection of UPCs. Sales data is sometimes modeled down to the store level via grocery store scanner reports. The variety and intricacy of the data used for a “straightforward” mix necessitates a data review between the analyst team and client before modeling even begins. This step alone – getting the data – sometimes takes half of the total cycle time in a mix engagement. Time is money, so defining workflows and procedures that result in quick, accurate, repeatable data acquisition is good for vendor and client alike.
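One small piece of a repeatable data review can be automated. The sketch below, which uses only the Python standard library, checks that each series has a value for every week in the modeling window and flags misaligned dates and negative values. The weekly grain, the checks chosen, and the names expected_weeks and review_series are assumptions for illustration, not a description of any actual Nielsen workflow.

```python
from datetime import date, timedelta

def expected_weeks(start, n_weeks):
    """Week-start dates for the modeling window."""
    return [start + timedelta(weeks=i) for i in range(n_weeks)]

def review_series(name, series, weeks):
    """Report basic problems in one series.

    `series` maps a week-start date to a value. The checks are
    illustrative: missing weeks, dates outside the expected weekly
    grid, and negative values.
    """
    week_set = set(weeks)
    return {
        "name": name,
        "missing": [w for w in weeks if w not in series],
        "unexpected": sorted(d for d in series if d not in week_set),
        "negative": sorted(d for d, v in series.items() if v < 0),
    }

# Made-up example: four weeks starting Monday, January 2, 2023.
weeks = expected_weeks(date(2023, 1, 2), 4)
sales = {weeks[0]: 100.0, weeks[1]: 120.0, weeks[3]: 90.0}  # week 3 missing
tv = {w: 10.0 for w in weeks}
tv[date(2023, 1, 5)] = 5.0  # a Thursday: not aligned to the weekly grid

for name, series in [("sales", sales), ("tv", tv)]:
    print(review_series(name, series, weeks))
```

Running the same checks on every incoming file gives the analyst team and client a shared list of issues to walk through in the data review.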