
Bayesian Marketing Mix Modeling in Python via PyMC3


Deciding where to allocate the marketing budget is a crucial challenge for businesses, and marketing mix modeling is a tool for tackling it. Relying solely on gut feelings, a company may spend significant amounts of money on TV ads, radio ads, and web banners without any assurance of effectiveness.


However, this approach could result in overspending on certain channels that may already be saturated or provide a negative return on investment. To optimize the allocation of resources, businesses need to assess the impact of each media spending channel on their sales or key performance indicators (KPIs).


Marketing mix modeling entails starting with a dataset of media spendings and supplementing it with control variables that could impact the KPI, such as holidays, weather, or product prices. Once the dataset is established, a KPI is selected, typically sales or the number of new customers, and a predictive model is built. This process helps companies determine where to allocate their marketing budget and which channels provide the best returns.

In Bayesian modeling, you can incorporate prior knowledge about a parameter, for example a probability p, and end up with a density estimate of p: an entire distribution instead of a single value. This distribution reflects a trade-off between prior knowledge and observed data. In contrast, many estimators and models rely on the maximum likelihood approach, which returns a single number. If you flip a coin 10 times and observe 8 heads, the maximum likelihood estimate for the probability of heads is 80%.

Bayesian models can also extrapolate well, and they are a good fit for marketing mix modeling, where they help with the problem that hyperparameter estimates are often unstable. The saturation and carryover functionality can be defined in PyMC3 language, so that prior knowledge about these effects becomes part of the model.
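
To make the coin example concrete, here is a minimal sketch in plain Python. The Beta(2, 2) prior is an assumption chosen purely for illustration (a mild belief that the coin is roughly fair); with a Beta prior, the posterior for p is again a Beta distribution, so no sampler is needed.

```python
# Coin example from the text: 8 heads in 10 flips.
heads, flips = 8, 10
tails = flips - heads

# Maximum likelihood: a single number.
mle = heads / flips

# Bayesian: Beta(a, b) prior on p. The values a = b = 2 are a hypothetical choice.
prior_a, prior_b = 2.0, 2.0

# Conjugate update: the posterior is Beta(a + heads, b + tails).
post_a = prior_a + heads
post_b = prior_b + tails

# Summaries of the posterior distribution.
post_mean = post_a / (post_a + post_b)
post_var = (post_a * post_b) / ((post_a + post_b) ** 2 * (post_a + post_b + 1))
post_sd = post_var ** 0.5

print(f"Maximum likelihood estimate: {mle:.3f}")
print(f"Posterior mean: {post_mean:.3f}, posterior sd: {post_sd:.3f}")
```

The posterior mean lands between the prior mean of 0.5 and the maximum likelihood estimate of 0.8, which is the trade-off between prior knowledge and observed data described above.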


First, let us grab our dataset.

Then, we have to define the saturation and carryover functionality, similar to the last article. In PyMC3 language, it might look like this:
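
Inside a PyMC3 model, these transformations have to be written with tensor operations. To show what they compute, here is a plain NumPy sketch. It assumes an exponential saturation curve 1 - exp(-a * x) and a geometrically decaying carryover over a fixed number of periods; the function names, the default length of 21, and the order (carryover first, saturation second) are assumptions for illustration, not necessarily the exact model code.

```python
import numpy as np

def saturate(x, a):
    """Exponential saturation: 0 at zero spend, approaching 1 as spend grows.

    A lower `a` gives a more slowly increasing curve.
    """
    return 1.0 - np.exp(-a * np.asarray(x, dtype=float))

def carryover(x, strength, length=21):
    """Geometric carryover: spend in period t still has weight strength**k in period t + k."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    for lag in range(min(length, len(x))):
        # Shift the spend series by `lag` periods and add it with decayed weight.
        out[lag:] += strength ** lag * x[: len(x) - lag]
    return out

# A single burst of spend in the first period (made-up numbers).
spend = np.array([100.0, 0.0, 0.0, 0.0])
carried = carryover(spend, strength=0.5, length=3)
print(carried)                    # values 100, 50, 25, 0
print(saturate(carried, a=0.01))  # saturated effect per period
```

In the Bayesian model, the saturation parameter and the carryover strength are not fixed numbers as in this sketch but random variables with priors.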



Analysis of the Model Output

Afterward, we can look at the usual pictures. Let us start with the posterior distributions.



They all have a nice unimodal (=one peak) shape. We can also explore how pairs of variables behave together.



Above, we can see that the saturation strength and the regression coefficient are not independent but negatively correlated: the higher the coefficient, the lower the saturation parameter tends to be. This makes sense because a higher coefficient can compensate for a more slowly increasing saturation curve (=lower sat_TV) and vice versa.
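
The compensation effect can be reproduced with a small NumPy experiment. This is a sketch under assumptions: the contribution is taken to be coef * (1 - exp(-sat * x)), spend is scaled to the range 0 to 1, and the "true" values coef = 1 and sat = 1 are made up. For each fixed coefficient, a grid search finds the saturation parameter that best reproduces the same curve.

```python
import numpy as np

def contribution(x, coef, sat):
    """Channel contribution with an exponential saturation curve (assumed form)."""
    return coef * (1.0 - np.exp(-sat * x))

# Hypothetical scaled spend values and a made-up "true" curve.
x = np.linspace(0.0, 1.0, 50)
target = contribution(x, coef=1.0, sat=1.0)

# Candidate saturation parameters to try.
sat_grid = np.linspace(0.05, 2.0, 400)

def best_sat(coef):
    """Saturation parameter on the grid that best matches the target for a fixed coefficient."""
    errors = [np.sum((contribution(x, coef, s) - target) ** 2) for s in sat_grid]
    return float(sat_grid[int(np.argmin(errors))])

for coef in (1.0, 1.5, 2.0):
    print(f"coef = {coef:.1f} -> best sat = {best_sat(coef):.3f}")
```

The larger the coefficient, the smaller the best-fitting saturation parameter, which is the negative correlation visible in the posterior.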



Here, we can see why hyperparameter optimization might have problems. Every dot in this picture is a potential model that you could find with a hyperparameter optimization.


For a truly unique best model, we would rather see a point cloud tightly concentrated around a single point (car_TV_true, sat_TV_true). Here, however, we see that the TV carryover strength can have reasonable values between 0.4 and 0.5, depending on the saturation parameter.



So it looks like the model picked up something useful. I will not go into detail about how to evaluate the performance of the model any further here; we can do this in the future.

Channel Contributions

We have dealt with distributions so far, but for our favorite channel contributions picture, let us take the means to end up with a single value again. Since we introduced some channel contribution variables in the PyMC3 code, we can easily extract them now using a short compute_mean function.
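
A minimal sketch of such a helper, assuming the posterior draws of one channel's contribution are already available as a NumPy array of shape (chains, draws, observations). The signature and the random stand-in data below are hypothetical; in practice the array would be pulled out of the sampled trace.

```python
import numpy as np

def compute_mean(samples):
    """Collapse posterior draws of shape (chains, draws, n_obs) to one mean per observation."""
    samples = np.asarray(samples, dtype=float)
    n_obs = samples.shape[-1]
    # Merge chains and draws into one axis, then average over all draws.
    return samples.reshape(-1, n_obs).mean(axis=0)

# Stand-in for real posterior draws: 2 chains, 500 draws, 3 observations (made-up numbers).
rng = np.random.default_rng(0)
fake_tv_contribution = rng.normal(loc=[10.0, 20.0, 30.0], scale=1.0, size=(2, 500, 3))

print(compute_mean(fake_tv_contribution))  # one mean contribution per observation
```

Taking the mean discards the uncertainty in the posterior, so this is only meant for the single-valued contribution chart.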


