Taking an Agent-Based Approach to Innovation Diffusion

Simulating how a new product gains popularity after introduction to the market.

Introduction

The ideation and creation of a new product is typically thought of as an extremely difficult process. Once the product or service has gone through rigorous testing, and you and your peers believe it will be a huge success with your target audience, you are ready to set things in motion. But potentially even more difficult than creating the product or service is winning those first customers - especially if the price tag is high or if the market is crowded with grizzled veteran competitors. Strategic planning and forecasting are vital in the phase prior to product launch and can increase the likelihood of gaining traction in the market. Diffusion modeling can provide decision-makers with critical information about the impact of their strategic choices: competitive pricing, mass media advertising versus targeted marketing strategies, time-oriented goal setting (when to expect to break even or start profiting, when to ramp up or slow down manufacturing or sales efforts as the product matures, etc.), where to concentrate sales efforts to spread product popularity efficiently, and so on. The purpose of this short paper is to introduce an agent-based approach to forecasting new product diffusion. My goal is to eventually propose a method of optimizing the parameters of the model to increase its utility.

Preface

Much of my understanding of innovation diffusion with agent-based modeling/simulation (ABM/ABS) comes from Elmar Kiesling's research, especially his dissertation, as well as the following paper:

Stummer, C., Kiesling, E., Günther, M., & Vetschera, R. (2015). Innovation diffusion of repeat purchase products in a competitive market: An agent-based simulation approach. European Journal of Operational Research, 245(1), 157-167.

The Bass Diffusion Model

To provide some more context, I want to briefly explain the Bass Model and the variables that are included in it. These same variables will be used in the ABS design and are critical to understanding the process of diffusion.

The diffusion of an innovation traditionally has been defined as the process by which that innovation is "communicated through certain channels over time among the members of a social system" (Rogers, 1983, p. 5).

Rogers, E. M. (1983). Diffusion of Innovations (3rd ed.). The Free Press, New York, NY, USA.

Mahajan, V., Muller, E., & Bass, F. M. (1991). New product diffusion models in marketing: A review and directions for research. In N. Nakićenović & A. Grübler (Eds.), Diffusion of Technologies and Social Behavior. Springer, Berlin, Heidelberg.

As a company or startup introducing a new product, the marketing and sales strategies you use to support your product have a major effect on its acceptance. In the definition above, you might conceptualize the communication as advertising and the members of a social system as your target audience or market. In the Bass model, the diffusion of a new product can be thought of as a contagious process that begins with mass communication and is then either adopted or rejected under word-of-mouth influence. These two forces correspond to the two parameters used in the model. Bass called the external influence (e.g., mass media and advertising) the "coefficient of innovation" and the internal influence (e.g., product reviews and word-of-mouth) the "coefficient of imitation", suggesting that there are innovators or opinion leaders and imitators or followers.

You have the power to affect external influence with your advertising and marketing efforts, but internal influence is controlled by your audience. Here we can visualize how these two parameters interact in the Bass model:

In [27]:
from IPython.display import Image
Image("ext-and-int-influence-adoptions.png")
Out[27]:

The diffusion process is driven heavily by external influence and the innovators in your market at first, but the product is then propagated by the imitators as a result of internal influence. Assuming the product is received positively by the innovators and they don't spread negative reviews, it will continue to gain popularity as more and more consumers follow their purchasing decision.

Here is another look at diffusion using the Bass model as p (external influence) and q (internal influence) vary:

In [28]:
from IPython.display import Image
Image("adoption-curves.png")
Out[28]:

N(t) is the typical Bass S-curve and n(t) is a skewed bell curve of new adoptions per period. This visualization is a clear example of how much your company's advertising efforts (a larger p) influence the adoption process. Likewise, a larger q value, i.e., stronger internal influence, can speed up diffusion.
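If you want to reproduce curves like these yourself, the aggregate Bass model has a well-known closed-form solution for the cumulative adoption fraction, $F(t) = \frac{1 - e^{-(p+q)t}}{1 + (q/p)e^{-(p+q)t}}$. Here is a minimal sketch that plots it for a few illustrative (p, q) pairs; the helper name bass_F and the parameter values are my own choices, not taken from the figure above.

import numpy as np
import matplotlib.pyplot as plt

def bass_F(t, p, q):
    # closed-form cumulative adoption fraction of the aggregate Bass model
    e = np.exp(-(p + q) * t)
    return (1 - e) / (1 + (q / p) * e)

t = np.linspace(0, 25, 250)
for p, q in [(0.01, 0.3), (0.01, 0.5), (0.05, 0.3)]:
    plt.plot(t, bass_F(t, p, q), label=f"p={p}, q={q}")
plt.xlabel("Time")
plt.ylabel("Cumulative fraction of adopters F(t)")
plt.legend()
plt.show()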

Agent-based Research in the Area of Innovation Diffusion

The Bass model described above is one way, and the most popular way, of modeling diffusion at an aggregate level. There are strengths and weaknesses to modeling diffusion like this. It offers analytical tractability and gives you a way of analyzing the market as a whole. However, aggregate models fail to capture how much consumers differ from each other - varying purchasing power, personal preferences, availability of information, etc. Agent-based modeling and simulation (ABMS) aims to alleviate these limitations. In my opinion, Elmar Kiesling describes the use of ABMS in innovation diffusion modeling beautifully:

"It is a bottom-up modeling approach that aims to capture emergent phenomena in complex systems on the macro-level by simulating the behavior and interactions of entities on the micro-level. Hence, a key distinguishing feature of this approach is that it does not examine relationships between macro-level variables directly, but rather aims to capture the behavior of individuals explicitly by modeling the rules they employ and the interactions they engage in, with the aim of obtaining a bottom-up causal model."

The appeal of ABMS is clear, and for these reasons it is the method I used to model new product diffusion. In his dissertation, Kiesling provides an agent-based formulation of the Bass model that I used directly for my simulation. Here is what the algorithm looks like:

In [29]:
from IPython.display import Image
Image("ABM Bass Diffusion.png")
Out[29]:

Since it is Kiesling's formulation, I want to do it justice by letting him explain its elements and underlying principles in his own language rather than my more casual voice. Again, this is taken from his dissertation, which is referenced in the preface; see pages 22-24 to follow along.

This model consists of M agents indexed by i = 1, . . . , M, each of which is in either of two states: “potential adopter” or “adopter”. We use a set of variables x = ($x_1$, . . . , $x_M$) ∈ {0, 1}$^M$ to describe the agents’ adoption state (i.e., $x_i$ = 1 iff agent i has adopted). In the Bass model, each actor’s probability to adopt at time $t + \Delta t$, given that it has not adopted by time t, is described by the hazard model in Equation 2.1. In the analogous agent-based formulation in discrete time, we can use the agents’ explicit state variables $x_i$ rather than the cumulative distribution function of adoptions $F(t)$. Agent i’s probability to transition from the non-adopter to the adopter state is given as a function of the state of the system x as follows:

In [30]:
from IPython.display import Image
Image("ABM Bass Math formula.png")
Out[30]:
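Since the images may not render everywhere, here is my own LaTeX transcription of the two equations referenced above, based on the surrounding description and the code that follows (so treat it as my reading rather than a verbatim copy). Equation 2.1 is the standard Bass hazard, and Equation 2.5 is the agent-level adoption probability:

$$\frac{f(t)}{1 - F(t)} = p + q\,F(t)$$

$$f_i(x) = \left(p + \frac{q}{M}\sum_{j=1}^{M} x_j\right)\left(1 - x_i\right)$$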

Analogously to the Bass model, the probability of agent i to adopt, given that it has not adopted so far, depends linearly on an independent external influence p and an internal influence q that depends on the fraction of prior adopters. The formulation implies homogeneity and global interconnectedness, i.e., each agent’s individual probability of adoption is influenced uniformly by the adoption state of all other agents. Obviously, $f_i(x)$ = 0 ∀ i for which $x_i$ = 1 and $f_i(x)$ ∈ [0, 1] ∀ i for which $x_i$ = 0, i.e., all agents that have already adopted remain in the adopter state and all agents that have not adopted may switch their state with the same probability in the current period. Algorithm 1 presents a discrete time/synchronous updating formulation of the Bass model. The latter is achieved by a temporary variable $\bar{x}$ which is used to store the new state of the system until the end of the period, when the actual updating occurs. In each time period t until the simulation horizon T, the algorithm decides for each agent i whether or not it adopts based on the adoption probability according to Equation 2.5 and a random value rand drawn from X(ω) ∼ U(0, 1) (lines 7-8). If an agent adopts, the temporary variable $\bar{x}$ is updated accordingly (line 9). As soon as all agents have made their adoption decisions, the state of the system is updated (line 12). Then, the cumulative number of adopters by time t is determined by summing over x (line 13) and stored in a vector adoptions, which the algorithm returns after iterating over all periods (line 15).

By now my hope is that you have a basic understanding of diffusion modeling at both the aggregate and the disaggregate level, as well as the elements of the ABMS formulation used in the rest of this introduction. As another preface, I am somewhat new to programming in Python, so please forgive me if there are errors or if my code is inefficiently organized - please let me know if there are edits I should make to improve performance or just make it better in general!

In [31]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

def dfsn_abm(p, q, M, T):
    # p: external influence, q: internal influence,
    # M: number of agents, T: number of periods.
    # Returns the cumulative number of adopters at the end of each period.
    adopt = []
    x = np.zeros(M)        # current adoption states (1 = adopter)
    x_temp = np.zeros(M)   # buffer so all agents update synchronously
    for t in range(T):
        for i in range(M):
            # adoption probability for agent i (zero if it has already adopted)
            prob = (p + q * (x.sum() / M)) * (1 - x[i])
            if np.random.uniform(0, 1) <= prob:
                x_temp[i] = 1
        x = x_temp.copy()  # update the system state at the end of the period
        adopt.append(x.sum())
    return adopt
In [32]:
sim_dfsn = [dfsn_abm(0.01, 0.3, 1000, 25) for _ in range(25)]
In [33]:
df = pd.DataFrame(np.array(sim_dfsn))
df = df.transpose()

df.plot(legend = None)
Out[33]:
<matplotlib.axes._subplots.AxesSubplot at 0x1e91046bb00>
In [34]:
df['time'] = range(1,26)
cols = df.columns.tolist()
cols = cols[-1:] + cols[:-1]
df = df[cols]
In [35]:
a = df.columns[range(1,26)]
b = ['sim ' + str(i) for i in range(1,len(a)+1)]
d = dict(zip(a, b))
df = df.rename(columns = d)
df
Out[35]:
time sim 1 sim 2 sim 3 sim 4 sim 5 sim 6 sim 7 sim 8 sim 9 ... sim 16 sim 17 sim 18 sim 19 sim 20 sim 21 sim 22 sim 23 sim 24 sim 25
0 1 12.0 17.0 12.0 10.0 9.0 10.0 6.0 6.0 7.0 ... 11.0 13.0 11.0 10.0 9.0 7.0 9.0 4.0 14.0 8.0
1 2 30.0 42.0 22.0 25.0 26.0 22.0 21.0 19.0 19.0 ... 19.0 30.0 32.0 29.0 26.0 17.0 23.0 15.0 32.0 24.0
2 3 42.0 68.0 42.0 41.0 51.0 42.0 42.0 40.0 41.0 ... 35.0 45.0 46.0 51.0 58.0 28.0 35.0 26.0 45.0 43.0
3 4 64.0 96.0 59.0 65.0 70.0 60.0 62.0 73.0 57.0 ... 53.0 73.0 62.0 82.0 87.0 48.0 55.0 40.0 67.0 70.0
4 5 97.0 140.0 88.0 92.0 110.0 94.0 97.0 97.0 72.0 ... 86.0 108.0 98.0 122.0 132.0 72.0 84.0 75.0 104.0 112.0
5 6 133.0 196.0 130.0 125.0 159.0 135.0 137.0 129.0 102.0 ... 115.0 168.0 142.0 162.0 180.0 115.0 113.0 115.0 169.0 160.0
6 7 187.0 267.0 188.0 171.0 217.0 191.0 186.0 192.0 142.0 ... 152.0 220.0 195.0 231.0 228.0 165.0 165.0 161.0 225.0 225.0
7 8 245.0 347.0 255.0 223.0 269.0 266.0 241.0 252.0 201.0 ... 223.0 311.0 249.0 296.0 310.0 227.0 211.0 222.0 291.0 296.0
8 9 326.0 436.0 331.0 284.0 338.0 326.0 304.0 329.0 262.0 ... 305.0 393.0 327.0 390.0 390.0 273.0 271.0 283.0 367.0 390.0
9 10 408.0 515.0 403.0 355.0 424.0 386.0 391.0 399.0 344.0 ... 391.0 474.0 409.0 491.0 470.0 342.0 345.0 365.0 465.0 461.0
10 11 507.0 607.0 503.0 441.0 501.0 463.0 477.0 494.0 428.0 ... 495.0 566.0 488.0 568.0 563.0 409.0 430.0 449.0 555.0 560.0
11 12 590.0 694.0 588.0 522.0 594.0 538.0 566.0 583.0 512.0 ... 584.0 650.0 553.0 661.0 665.0 502.0 524.0 542.0 631.0 646.0
12 13 661.0 762.0 677.0 611.0 685.0 617.0 654.0 678.0 586.0 ... 651.0 736.0 631.0 745.0 736.0 592.0 605.0 628.0 716.0 722.0
13 14 731.0 829.0 761.0 696.0 757.0 708.0 730.0 754.0 663.0 ... 716.0 799.0 710.0 805.0 800.0 667.0 692.0 719.0 781.0 784.0
14 15 800.0 878.0 825.0 766.0 818.0 779.0 792.0 809.0 723.0 ... 773.0 844.0 778.0 859.0 857.0 734.0 768.0 791.0 827.0 832.0
15 16 862.0 910.0 871.0 830.0 867.0 833.0 843.0 851.0 795.0 ... 838.0 889.0 831.0 894.0 892.0 792.0 841.0 841.0 869.0 878.0
16 17 906.0 932.0 897.0 874.0 898.0 877.0 877.0 896.0 844.0 ... 884.0 927.0 878.0 925.0 915.0 851.0 891.0 876.0 899.0 914.0
17 18 934.0 952.0 916.0 909.0 926.0 920.0 912.0 922.0 884.0 ... 930.0 948.0 913.0 942.0 938.0 896.0 920.0 907.0 935.0 931.0
18 19 952.0 965.0 943.0 934.0 954.0 938.0 934.0 943.0 908.0 ... 950.0 962.0 934.0 953.0 953.0 933.0 949.0 937.0 952.0 952.0
19 20 961.0 975.0 958.0 955.0 962.0 953.0 952.0 959.0 936.0 ... 962.0 971.0 952.0 971.0 968.0 950.0 963.0 954.0 966.0 957.0
20 21 972.0 983.0 965.0 968.0 970.0 966.0 970.0 972.0 955.0 ... 975.0 978.0 971.0 976.0 982.0 967.0 971.0 970.0 974.0 968.0
21 22 986.0 985.0 976.0 974.0 978.0 976.0 981.0 981.0 966.0 ... 983.0 983.0 984.0 982.0 990.0 977.0 977.0 980.0 980.0 979.0
22 23 989.0 988.0 979.0 984.0 985.0 982.0 986.0 989.0 978.0 ... 989.0 985.0 986.0 988.0 994.0 979.0 982.0 983.0 986.0 988.0
23 24 992.0 991.0 985.0 992.0 989.0 991.0 988.0 993.0 981.0 ... 994.0 990.0 992.0 992.0 996.0 986.0 994.0 988.0 991.0 989.0
24 25 994.0 997.0 988.0 993.0 992.0 994.0 994.0 996.0 988.0 ... 996.0 993.0 996.0 995.0 997.0 990.0 995.0 989.0 993.0 993.0

25 rows × 26 columns

In [36]:
plt.figure(figsize=(20,10))
for column in df.drop('time', axis=1):
    plt.plot(df['time'], df[column], marker='.', linewidth=1, alpha=0.3, label=column)

plt.title("ABM Formulation of Bass Diffusion Model", loc='left', fontsize=12, fontweight=0, color='orange')
plt.xlabel("Time")
plt.ylabel("Adopters")
plt.show()
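To summarize the 25 noisy trajectories in a single curve, you can also plot the mean adoption path across runs. This is a quick sketch of my own, assuming the df built above is still in memory:

mean_curve = df.drop('time', axis=1).mean(axis=1)  # average adopters per period across the 25 runs
plt.plot(df['time'], mean_curve, color='black', linewidth=2)
plt.xlabel("Time")
plt.ylabel("Mean adopters across runs")
plt.show()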

To run another simulation, let's consider a new company that might not have many resources to market or advertise the product heavily, but whose word-of-mouth influence is significant.

In [37]:
sim_dfsn2 = [dfsn_abm(0.005, 0.4, 1000, 25) for _ in range(25)]
In [38]:
df = pd.DataFrame(np.array(sim_dfsn2))
df = df.transpose()

df.plot(legend = None)
Out[38]:
<matplotlib.axes._subplots.AxesSubplot at 0x1e9114f5b38>
In [39]:
df['time'] = range(1,26)
cols = df.columns.tolist()
cols = cols[-1:] + cols[:-1]
df = df[cols]
In [40]:
a = df.columns[range(1,26)]
b = ['sim ' + str(i) for i in range(1,len(a)+1)]
d = dict(zip(a, b))
df = df.rename(columns = d)
In [41]:
plt.figure(figsize=(20,10))
for column in df.drop('time', axis=1):
    plt.plot(df['time'], df[column], marker='.', linewidth=1, alpha=0.3, label=column)

plt.title("ABM Formulation of Bass Diffusion Model", loc='left', fontsize=12, fontweight=0, color='orange')
plt.xlabel("Time")
plt.ylabel("Adopters")
plt.show()

We can see that in the first run of the model the time to full adoption was somewhere between periods 20 and 25. With a larger q, the time to full adoption decreases to somewhere between periods 15 and 20.

Lastly, let's consider a company with a large budget for mass communication and advertising of the new product - resulting in a large p.

In [42]:
sim_dfsn3 = [dfsn_abm(0.05, 0.3, 1000, 25) for _ in range(25)]
In [43]:
df = pd.DataFrame(np.array(sim_dfsn3))
df = df.transpose()

df.plot(legend = None)
Out[43]:
<matplotlib.axes._subplots.AxesSubplot at 0x1e91046b630>

Again, here we can see that the time to full adoption is cut further, falling in the range of periods 10 to 15. As a marketer, it is nice to see the fruits of your labor and how much marketing efforts really impact the popularity growth of a new product.
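To put rough numbers on these visual comparisons, here is a small helper of my own (the name time_to_adoption and the 95% threshold are my choices, not from Kiesling's work) that averages, for each scenario, the first period in which cumulative adoption crosses the threshold:

def time_to_adoption(runs, M=1000, threshold=0.95):
    # average, over simulation runs, of the first period in which
    # cumulative adopters reach threshold * M (runs that never do are skipped)
    times = []
    for run in runs:
        hits = [t + 1 for t, n in enumerate(run) if n >= threshold * M]
        if hits:
            times.append(hits[0])
    return sum(times) / len(times) if times else None

for name, runs in [("p=0.01, q=0.3", sim_dfsn), ("p=0.005, q=0.4", sim_dfsn2), ("p=0.05, q=0.3", sim_dfsn3)]:
    print(name, time_to_adoption(runs))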

Future Research

This might seem like a common-sense idea: obviously, if you are able to exert greater external influence on your target market, they are more likely to purchase and adopt your product, and to some extent I would agree. In my future research I hope to address this and create something that allows us to peel back one more layer of granularity. Eventually I want to find the optimal value of p based on the type of agent, which would result in a more micro-targeted approach to marketing. Additionally, we would be able to discover the ideal mix of innovators and early adopters to target so that they propagate the new product faster throughout the market.