In Chapter 3 we presented market-share attraction models in detail. As we tried to describe the market and competitive structures realistically, more and more complex models had to be introduced, ending with a cross-effects model in which each piece of information (e.g., each price or each feature) plays a unique role both for the brand to which it refers and for every other competitor. From a practical point of view, however, these complex models are not useful unless one can calibrate them from the actual market performance of brands. Calibration establishes the value or importance of each of these roles in determining the market performance of each brand. In this chapter we review techniques for estimating the parameters of attraction models. We begin with the most basic models, i.e., the simple-effects form of the MCI and MNL models, and then proceed to more complex models such as differential-effects and cross-effects models. To remind the reader, the general specification of simple-effects attraction models is given below.
$$
s_i = \frac{A_i}{\sum_{j=1}^m A_j}\,, \qquad
A_i = \exp(\alpha_i + \varepsilon_i)\,\prod_{k=1}^K f_k(X_{ki})^{\beta_k} \tag{5.1}
$$
where:

s_i = the market share of brand i ;
A_i = the attraction of brand i ;
m = the number of brands;
X_ki = the value of the k th explanatory variable (k = 1, 2, …, K) for brand i ;
f_k = a monotone transformation of X_ki ;
α_i, β_k = parameters, and ε_i is an error term.
We may choose either the MCI or the MNL model, depending on whether f_k is an identity transformation or an exponential transformation. We will often use the MNL model below to simplify the presentation; the corresponding derivations for the MCI model are straightforward. Before presenting the use of regression analysis, we first discuss other estimation techniques applicable to model (5.1).
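To make the model concrete, here is a minimal numerical sketch of how shares follow from attractions under the two choices of f_k. All parameter and variable values are illustrative, not estimates from the text.

```python
import math

# Illustrative parameters (not estimates from the text): 3 brands, K = 2 variables.
alpha = [0.5, 0.2, 0.0]                  # brand-specific intercepts alpha_i
beta = [-2.0, 0.8]                       # beta_k
X = [[1.10, 0.95, 1.00],                 # X_1i, e.g., relative price of brand i
     [1.50, 1.00, 1.20]]                 # X_2i, e.g., a (positive) feature index

def shares(f):
    # Attraction A_i = exp(alpha_i) * prod_k f(X_ki)^beta_k (eq. 5.1, error omitted),
    # then s_i = A_i / sum_j A_j.
    A = [math.exp(alpha[i]) * math.prod(f(X[k][i]) ** beta[k] for k in range(len(beta)))
         for i in range(len(alpha))]
    return [a / sum(A) for a in A]

mci = shares(lambda x: x)                # MCI: f_k is the identity
mnl = shares(math.exp)                   # MNL: f_k is the exponential
print([round(s, 4) for s in mci], [round(s, 4) for s in mnl])
```

By construction, either choice of f_k yields shares that are positive and sum to one, which is the logical-consistency property emphasized later in the chapter.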
The maximum-likelihood approach to parameter estimation assumes that the data are obtained from a random sample (sample size n) of individuals who are asked to choose one brand from a set of brands (i.e., the choice set). (See Haines, George H., Jr., Leonard S. Simon & Marcus Alexis [1972], ``Maximum Likelihood Estimation of Central-City Food Trading Areas,'' Journal of Marketing Research, IX (May), 154-59; also see McFadden [1974].) The resultant data consist of the number of individuals who selected brand i, ni (i = 1, 2, …, m). This describes a typical multinomial choice process. In order to use this type of data, we must modify the definition of model (5.1) slightly. We assume that the probability pi, rather than the market share si, that an individual chooses brand i is specified as follows. (See sections 2.8 and 4.1 for discussions of when market shares and choice probabilities are interchangeable.)
$$
p_i = \frac{A_i}{\sum_{j=1}^m A_j}
$$
Clearly pi is a function of the parameters of the model, that is, the α's and β's. We may write the likelihood for a set of observed choices n1, n2, …, nm as
$$
L = \frac{n!}{n_1!\,n_2!\cdots n_m!}\ \prod_{i=1}^m p_i^{\,n_i} \tag{5.2}
$$
and the logarithm of the likelihood function as
$$
\log L = \log n! - \sum_{i=1}^m \log n_i! + \sum_{i=1}^m n_i \log p_i
$$
By maximizing L or log L with respect to the parameters of the model, we obtain their maximum-likelihood estimates. The maximum-likelihood technique extends to cases where observations are taken in more than one choice situation (multiple time periods, locations, customer groups, etc.), provided that an independent sample of individuals is drawn in each choice situation. For example, if a series of independent samples is drawn over time, the log-likelihood function may be written as
$$
\log L = \sum_{t=1}^T \Big[\log n_t! - \sum_{i=1}^m \log n_{it}! + \sum_{i=1}^m n_{it}\log p_{it}\Big]
$$
where nit and pit are, respectively, the number of individuals who chose brand i in period t and the probability that an individual chooses brand i in period t; nt is the sample size in period t; and T is the number of periods under observation.
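The log-likelihood above can be computed directly. The counts and probabilities below are made up for illustration; note that within each period the probabilities chosen here equal the observed proportions, which is exactly the multinomial maximum-likelihood solution.

```python
import math

# Multinomial log-likelihood over T independent samples:
# log L = sum_t [ log n_t! - sum_i log n_it! + sum_i n_it log p_it ].
counts = [[30, 50, 20],            # n_it for T = 2 periods, m = 3 brands (made up)
          [25, 45, 30]]
probs  = [[0.30, 0.50, 0.20],      # p_it; here equal to n_it / n_t
          [0.25, 0.45, 0.30]]

def log_likelihood(counts, probs):
    ll = 0.0
    for n_t, p_t in zip(counts, probs):
        ll += math.lgamma(sum(n_t) + 1)                 # log n_t!
        ll -= sum(math.lgamma(n + 1) for n in n_t)      # - sum_i log n_it!
        ll += sum(n * math.log(p) for n, p in zip(n_t, p_t))
    return ll

print(round(log_likelihood(counts, probs), 4))
```

Perturbing the probabilities away from the observed proportions lowers the log-likelihood, which is how a numerical maximizer would locate the estimates.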
The maximum-likelihood procedure is a useful technique for parameter estimation in that the properties of the estimates are well known (see Haines, et al. [1972]), but we choose not to use it in this book for several reasons. First, since the likelihood and log-likelihood functions are nonlinear in the parameters α's and β's, the maximum-likelihood procedure requires a nonlinear mathematical-programming algorithm to obtain parameter estimates. Besides being cumbersome to use, such an algorithm does not ensure that the global maximum of the likelihood function is always found. Second, we will primarily be using POS data in calibrating the model. Since POS data generated at a store include multiple purchases made in a period by the same customers, the observed ni's may not follow the multinomial-distribution assumptions which underlie the likelihood function. Third, in most cases we will be using observed market shares, that is, the proportions of purchases of brand i based on an unknown but large total number of purchases. (Neither multiple purchases in a single shopping trip, nor purchases of a brand on each of multiple shopping trips within a single reporting period (e.g., a week), fit well with the multinomial-sampling assumptions; yet both occurrences are common in POS data. When POS data are analyzed at the store-week level, the market shares are not subject to the sampling variation with which maximum-likelihood procedures deal so well; only the specification error requires special treatment. Section 5.4 presents generalized least-squares (GLS) procedures to cope with these issues.) The regression techniques developed in the next section are more easily adapted to this type of data than the maximum-likelihood procedure.
We will present estimation procedures based on regression analysis in the next section, but the fact that logit models can be estimated by first applying a log-linear transformation and then applying a regression procedure has been known for a long time. We review some of these earlier procedures before turning to the approach which we believe is the most convenient.
Over thirty-five years ago, Berkson showed that a logistic model of binary choice becomes linear in its parameters under the so-called logit transformation. (Berkson, Joseph [1953], ``A Statistically Precise and Relatively Simple Method of Estimating the Bioassay with Quantal Response, Based on the Logistic Function,'' Journal of the American Statistical Association, 48 (September), 565-99.) Suppose that each individual in a sample (of size n) independently chooses object 1 with probability p1, given by
$$
p_1 = \frac{b_0 \prod_{k=1}^K X_k^{\,b_k}}{1 + b_0 \prod_{k=1}^K X_k^{\,b_k}}
$$
where:

X_k = the value of the k th explanatory variable (k = 1, 2, …, K);
b_0, b_k = parameters.
If the logit transformation is applied to the above model, we have
$$
\log\frac{p_1}{1-p_1} = \log b_0 + \sum_{k=1}^K b_k \log X_k \tag{5.3}
$$
That equation (5.3) is linear in the parameters log b0 and bk (k = 1, 2, …, K) suggests the use of regression analysis. But, since the probability p1 is unobservable, it must be replaced in the left-hand side of (5.3) by p̂1, the proportion of individuals in the sample who selected object 1. The final estimating equation takes the following form.
$$
\log\frac{\hat p_{1t}}{1-\hat p_{1t}} = \log b_0 + \sum_{k=1}^K b_k \log X_{kt} + e_t \tag{5.4}
$$
The subscript t indicates the t th subgroup from which the p̂1's are calculated. The error term et is the difference between the logit transforms of p̂1 and p1, and is known to be a function of p1 and of the sample size per subgroup from which p̂1 is calculated. (To examine the properties of the error term, expand the left-hand side of (5.4) in a Taylor series about p1, keep the first two terms, and apply the mean-value theorem.)
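Berkson's procedure can be sketched in a few lines for K = 1, with illustrative parameter values. Using exact probabilities rather than sampled proportions removes the error term, so ordinary least squares on the logit transform recovers the parameters exactly; with sampled p̂'s, the error term et of (5.4) would appear.

```python
import math

# Illustrative "true" parameters (not from the text) and a grid of X values.
b0, b1 = 2.0, -1.5
X = [0.5, 0.8, 1.0, 1.3, 1.7, 2.2]
p = [b0 * x**b1 / (1 + b0 * x**b1) for x in X]   # exact choice probabilities

# Regress logit(p) on log X with closed-form simple OLS.
y = [math.log(pi / (1 - pi)) for pi in p]        # the logit transformation
z = [math.log(x) for x in X]
zbar, ybar = sum(z) / len(z), sum(y) / len(y)
slope = (sum((zi - zbar) * (yi - ybar) for zi, yi in zip(z, y))
         / sum((zi - zbar) ** 2 for zi in z))
intercept = ybar - slope * zbar
print(round(slope, 6), round(math.exp(intercept), 6))   # recovers b1 and b0
```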
Berkson's method was extended to the estimation of the parameters of multinomial logit (MNL) models by Theil. (Theil, Henri [1969], ``A Multinomial Extension of the Linear Logit Model,'' International Economic Review, 10 (October), 251-59.) Assume a multinomial choice process in which each individual independently selects object i with probability pi from a set of m objects in a single trial, and let pi be specified by an MNL model
$$
p_i = \frac{\exp\!\big(\alpha + \sum_{k=1}^K \beta_k X_{ki} + \varepsilon_i\big)}
{\sum_{j=1}^m \exp\!\big(\alpha + \sum_{k=1}^K \beta_k X_{kj} + \varepsilon_j\big)}
$$
This model differs from (5.1) in that a single parameter α is specified instead of m parameters α1, α2, …, αm. Theil noted that
$$
\log\frac{p_i}{p_1} = \sum_{k=1}^K \beta_k (X_{ki} - X_{k1}) + (\varepsilon_i - \varepsilon_1)
$$
where object 1 is an arbitrarily chosen base object, and suggested the following estimation equation, which is linear in the parameters β1, β2, …, βK.
$$
\log\frac{\hat p_{it}}{\hat p_{1t}} = \sum_{k=1}^K \beta_k (X_{kit} - X_{k1t}) + e^{*}_{it} \tag{5.5}
$$
where p̂it is the proportion of individuals who chose object i in the sample, and e*it is the combined error term. The subscript t indicates the t th subsample. Equation (5.4) is obviously a special case of (5.5) in which the number of objects in the choice set, m, equals 2. The total degrees of freedom for this estimation equation is (m - 1)T, where T is the number of subsamples. It is known that the variances of the e*it's are unequal; McFadden [1974] studied a method for correcting this problem. The estimation technique which we propose in the next section is a variant of Theil's method. Both Theil's method and ours yield identical parameter estimates with identical properties, but we believe that our method has an advantage in ease of interpretation.
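Theil's observation can be verified numerically. With illustrative parameter values and no error terms, the log-odds against a base object are exactly linear in the differences of the explanatory variables:

```python
import math

# Illustrative MNL setup: m = 3 objects, K = 2 variables (values made up).
beta = [-1.5, 0.6]
X = [[5.0, 4.8, 5.2],       # X_1i for objects 1..3
     [1.0, 0.0, 1.0]]       # X_2i
alpha = 0.4                  # common alpha cancels in the ratio

u = [alpha + sum(b * X[k][i] for k, b in enumerate(beta)) for i in range(3)]
p = [math.exp(ui) / sum(math.exp(uj) for uj in u) for ui in u]

# Check: log(p_i / p_1) = sum_k beta_k (X_ki - X_k1), exactly.
lhs = [math.log(p[i] / p[0]) for i in range(3)]
rhs = [sum(b * (X[k][i] - X[k][0]) for k, b in enumerate(beta)) for i in range(3)]
print([round(l - r, 12) for l, r in zip(lhs, rhs)])
```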
As we noted in Chapter 2, model (5.1) becomes linear in its parameters under the log-centering transformation. Take the MNL model, for example. First, take the logarithm of both sides of (5.1).
$$
\log s_i = \alpha_i + \sum_{k=1}^K \beta_k X_{ki} + \varepsilon_i - \log\Big(\sum_{j=1}^m A_j\Big)
$$
If we sum the above equation over i (i = 1, 2, …, m) and divide by m, we have
$$
\log \tilde s = \bar\alpha + \sum_{k=1}^K \beta_k \bar X_k + \bar\varepsilon - \log\Big(\sum_{j=1}^m A_j\Big)
$$
where s̃ is the geometric mean of the si, and ᾱ, X̄k and ε̄ are the arithmetic means of αi, Xki and εi, respectively, over i. Subtracting the above equation from the preceding one, we obtain the following form, which is linear in its parameters.
$$
\log\frac{s_i}{\tilde s} = (\alpha_i - \bar\alpha) + \sum_{k=1}^K \beta_k (X_{ki} - \bar X_k) + (\varepsilon_i - \bar\varepsilon)
$$
Similarly, the application of the log-centering transformation to the MCI model results in
$$
\log\frac{s_i}{\tilde s} = (\alpha_i - \bar\alpha) + \sum_{k=1}^K \beta_k \log\frac{X_{ki}}{\tilde X_k} + (\varepsilon_i - \bar\varepsilon)
$$
where X̃k is the geometric mean of Xki over i. Since these two equations are linear in the parameters αi* = (αi - ᾱ) (i = 1, 2, …, m) and βk (k = 1, 2, …, K), one may estimate them by regression analysis.
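The log-centered MCI form can be checked numerically: when shares are generated from the attraction model, the identity holds exactly. The parameter values below are illustrative.

```python
import math

# Illustrative MCI parameters: 3 brands, one variable (e.g., relative price).
alpha = [0.3, 0.0, -0.3]
beta = -2.0
X = [1.2, 1.0, 0.9]

A = [math.exp(a) * x**beta for a, x in zip(alpha, X)]   # attractions (error omitted)
s = [ai / sum(A) for ai in A]

gm = math.exp(sum(math.log(si) for si in s) / len(s))   # geometric mean s-tilde
gx = math.exp(sum(math.log(xi) for xi in X) / len(X))   # geometric mean X-tilde
abar = sum(alpha) / len(alpha)

# log(s_i/s~) = (alpha_i - alpha-bar) + beta * log(X_i/X~), term by term.
lhs = [math.log(si / gm) for si in s]
rhs = [(a - abar) + beta * math.log(x / gx) for a, x in zip(alpha, X)]
print([round(l - r, 12) for l, r in zip(lhs, rhs)])
```

Note also that the log-centered shares sum to zero over brands, a property used below.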
Suppose that we obtain market-share data for T choice situations. In the following we often let subscript t indicate the observations in period t, but this is simply an example; the data need not be time series, and choice situations may be stores, areas, customer groups, or combinations such as store-weeks. Applying the log-centering transformation to the market shares and the marketing variables for each situation t creates the following variables:

s*it = log(sit / s̃t)
X*kit = (Xkit - X̄kt) for the MNL model, or log(Xkit / X̃kt) for the MCI model,

where s̃t is the geometric mean of sit over i in situation t, and X̄kt and X̃kt are, respectively, the arithmetic and geometric means of Xkit over i in situation t.
Using the above notation, the regression models actually used to estimate the parameters are specified as follows.
MNL Model:
$$
\log\frac{s_{it}}{\tilde s_t} = \sum_{j=2}^m \alpha'_j\, d_j + \sum_{k=1}^K \beta_k (X_{kit} - \bar X_{kt}) + e^{*}_{it} \tag{5.6}
$$
MCI Model:
$$
\log\frac{s_{it}}{\tilde s_t} = \sum_{j=2}^m \alpha'_j\, d_j + \sum_{k=1}^K \beta_k \log\frac{X_{kit}}{\tilde X_{kt}} + e^{*}_{it} \tag{5.7}
$$
where e*it = (eit - ēt) and ēt is the arithmetic mean of eit over i in period t. The variable dj is a dummy (binary-valued) variable which takes the value 1 if j = i and 0 otherwise. Note that the estimated values of αi′ (i = 2, 3, …, m) from (5.6 - 5.7) are not estimates of the original parameters αi, but estimates of the differences (αi - α1), where brand 1 is an arbitrarily chosen base brand. Thus we have shown that the parameters of attraction model (5.1) are estimable by the simple log-linear regression models (5.6 - 5.7). However, as the discussion of Berkson's and Theil's methods suggests, the error term e*it in those regression models may not have equal variance for all i and t. We return to this problem in a later section.
In earlier work (Nakanishi, Masao & Lee G. Cooper [1982], ``Simplified Estimation Procedures for MCI Models,'' Marketing Science, 1, 3 (Summer), 314-22) we showed that the regression models (5.6 - 5.7) are in turn equivalent to the following regression models.
MNL Model:
$$
\log s_{it} = \alpha_1 + \sum_{j=2}^m \alpha'_j\, d_j + \sum_{u=2}^T \gamma_u\, D_u + \sum_{k=1}^K \beta_k X_{kit} + e_{it} \tag{5.8}
$$
MCI Model:
$$
\log s_{it} = \alpha_1 + \sum_{j=2}^m \alpha'_j\, d_j + \sum_{u=2}^T \gamma_u\, D_u + \sum_{k=1}^K \beta_k \log X_{kit} + e_{it} \tag{5.9}
$$
The variable Du is another dummy variable which takes the value 1 if u = t and 0 otherwise. The corresponding models (5.6 - 5.7) and (5.8 - 5.9) yield identical sets of estimates of the αi′'s and βk's, and in this sense they are redundant. But one advantage of (5.8 - 5.9) is that it is not necessary to apply the log-centering transformation to market shares and marketing variables before the regression analysis is performed, which reduces the need for pre-processing of the data. If the number of choice situations, T, is reasonably small, it is perhaps easier to use (5.8 - 5.9). If T is so large that the specification of the dummy variables Du (u = 2, 3, …, T) becomes cumbersome, then the use of (5.6 - 5.7) is recommended. In addition, the properties of the error term eit in (5.8 - 5.9) are easier to analyze than those of e*it in (5.6 - 5.7).
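The equivalence of the two formulations can be checked numerically. The sketch below uses synthetic data with made-up parameter values: it fits the dummy-variable regression with period dummies, then fits the within-period-centered regression (dependent variable and all remaining regressors demeaned within each period), and the brand and price coefficients coincide, as the standard partitioned-regression (Frisch-Waugh) result implies. The small normal-equations solver is only for self-containment.

```python
import math
import random

random.seed(7)
m, T, beta_true = 3, 6, -2.0
alpha = [0.4, 0.0, -0.4]

# Synthetic log shares from the linear form of the model plus period effects.
rows, y = [], []
for t in range(T):
    gamma_t = random.uniform(-1, 1)                     # period effect gamma_t
    X = [random.uniform(4.5, 5.5) for _ in range(m)]    # e.g., log prices
    for i in range(m):
        y.append(alpha[i] + gamma_t + beta_true * X[i] + random.gauss(0, 0.05))
        rows.append((t, i, X[i]))

def ols(Xmat, yv):
    # Solve the normal equations (X'X)b = X'y by Gauss-Jordan elimination.
    k = len(Xmat[0])
    A = [[sum(r[a] * r[b] for r in Xmat) for b in range(k)] for a in range(k)]
    v = [sum(r[a] * yi for r, yi in zip(Xmat, yv)) for a in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]; v[c], v[p] = v[p], v[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [ar - f * ac for ar, ac in zip(A[r], A[c])]
                v[r] -= f * v[c]
    return [v[c] / A[c][c] for c in range(k)]

# (5.8)-style design: intercept, brand dummies d2..dm, period dummies D2..DT, X.
full = [[1.0] + [1.0 if i == j else 0.0 for j in range(1, m)]
        + [1.0 if t == u else 0.0 for u in range(1, T)] + [x]
        for (t, i, x) in rows]
b_full = ols(full, y)

# (5.6)-style: demean y and every remaining regressor within each period.
def center(vals):
    out = []
    for t in range(T):
        block = vals[t * m:(t + 1) * m]
        mu = sum(block) / m
        out += [v - mu for v in block]
    return out

yc = center(y)
cols = list(zip(*full))
centered = [center(list(c)) for c in cols[1:m]] + [center(list(cols[-1]))]
b_cent = ols([list(r) for r in zip(*centered)], yc)

print([round(b, 6) for b in b_full[1:m]], round(b_full[-1], 6))
print([round(b, 6) for b in b_cent[:m - 1]], round(b_cent[-1], 6))
```

The two printed coefficient sets agree, illustrating why (5.6 - 5.7) and (5.8 - 5.9) are interchangeable.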
Leaving theoretical issues aside for a while, let us look at the actual procedures one must follow for parameter estimation. Given a standard statistical package, such as SAS(R), the first task is to arrange the data so that the package's regression program can handle regression models (5.6 - 5.7) and (5.8 - 5.9).
Suppose that we have market-share data for m brands in T choice situations (periods, areas, customer groups, etc.), together with data on the accompanying marketing activities. Market-share data may be in the form of percentages (or proportions) or in absolute units. If one ignores for the moment the problems of unequal error variances and nonzero error covariances in regression models (5.6 - 5.9), it is immaterial whether the market-share data are in absolute units or in percentages, because the log-centering transformation yields identical parameter estimates regardless of whether it is applied to proportions or to the actual numbers of units sold. (This property of log-centering is called homogeneity of the 0th degree.) The estimated value of α1 in (5.8 - 5.9) is the only term affected by the choice between proportions and actual numbers, and it does not influence the values of market shares estimated by the inverse log-centering transformation. Table 5.1 is an example of market-share data generated by a POS system.
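The homogeneity property can be checked in a few lines. With illustrative unit-sales figures, log-centering the unit sales and log-centering the corresponding shares give identical transformed values, so the scale of the dependent variable cannot affect the slope estimates.

```python
import math

units = [40.0, 220.0, 15.0, 60.0]            # units sold per brand (made up)
shares = [u / sum(units) for u in units]     # the corresponding proportions

def log_center(v):
    logs = [math.log(x) for x in v]
    mu = sum(logs) / len(logs)               # log of the geometric mean
    return [l - mu for l in logs]

# The constant scale factor cancels in the centering, leaving identical values.
print([round(a - b, 12) for a, b in zip(log_center(units), log_center(shares))])
```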
These data were obtained at a single store over 14 weeks (i.e., T = 14). There are five national and two regional brands of margarine (m = 7). Brand 2 is the same product as brand 1, and brand 4 the same as brand 3, but in larger packages: all brands come in half-pound (225g) packages except brands 2 and 4, which are one-pound (450g) packages. The market shares do not sum to one, presumably because private-label brands are not listed here. Market shares are volume shares, computed by first converting the numbers of units sold to weight volumes and then computing each brand's weight-volume share. Inspection of the table shows that the market is obviously very price-sensitive.
We will now estimate the price elasticity of market shares based on attraction model (5.1). (Table 5.1 also gives the average daily sales volume of margarine in this store, expressed in half-pound-package equivalents.) The first step is to create a data set which includes the dummy variables dj (j = 2, 3, …, m) and Du (u = 2, 3, …, T) so that regression models (5.8 - 5.9) may be used. We chose (5.8 - 5.9) because the number of periods, T, is reasonably small (= 14). Table 5.2 shows a partial listing of the data set arranged for estimation with the REG procedure in the SAS(R) statistical package.
Market-share and price data are taken from Table 5.1, and the logarithms of shares and prices are added. In addition, two sets of dummy variables - week dummies and brand dummies - are included in the data set. To save space, only the week dummies (D1 - D5) for the first five weeks are reported. If the reader examines the pattern of the two sets of dummy variables, their meaning should be self-explanatory. The dummy
Table 5.1: POS Data Example (Margarine)
Week | Brand 1 | Brand 2 | Brand 3 | Brand 4 | Brand 5 | Brand 6 | Brand 7 | Ave. Daily Vol.
1 share | 4 | 51 | 3 | 3 | 0 | 1 | 9 | 83 |
price | 192 | 139.5 | 158 | 146 | 163 | 128 | 148 | |
2 share | 2 | 75 | 2 | 1 | 0 | 0 | 5 | 103 |
price | 192 | 140 | 158 | 170 | 163 | 128 | 148 | |
3 share | 3 | 48 | 1 | 1 | 21 | 0 | 13 | 98 |
price | 192 | 138.5 | 158 | 170 | 100 | 138 | 133 | |
4 share | 4 | 44 | 24 | - | 0 | 0 | 11 | 72 |
price | 192 | 139 | 139 | 170 | 163 | 148 | 128 | |
5 share | 5 | 23 | 10 | 1 | - | 26 | 7 | 84 |
price | 192 | 139 | 141 | 170 | 163 | 128 | 128 | |
6 share | 6 | 6 | 3 | 2 | 0 | 36 | 13 | 61 |
price | 192 | 176 | 158 | 170 | 163 | 128 | 128 | |
7 share | 4 | 5 | 5 | 3 | - | 12 | 20 | 74 |
price | 192 | 179 | 163 | 170 | 163 | 128 | 128 | |
8 share | 3 | 2 | 2 | 2 | 41 | 8 | 11 | 107 |
price | 192 | 169 | 185 | 161 | 100 | 134 | 128 | |
9 share | 8 | 5 | 3 | 10 | - | 21 | 17 | 57 |
price | 192 | 168 | 188 | 129.5 | 163 | 138 | 128 | |
10 share | 19 | 3 | 1 | 47 | - | 5 | 8 | 77 |
price | 178 | 179 | 188 | 120 | 163 | 138 | 128 | |
11 share | 12 | 2 | 2 | 19 | 0 | 18 | 15 | 65 |
price | 178 | 179 | 188 | 136.5 | 163 | 138 | 128 | |
12 share | 6 | 47 | 1 | 5 | 0 | 10 | 9 | 87 |
price | 180 | 139.5 | 188 | 149 | 163 | 141 | 128 | |
13 share | 2 | 23 | 1 | 13 | 26 | 6 | 5 | 120 |
price | 192 | 139 | 188 | 137 | 100 | 138 | 128 | |
14 share | 28 | 15 | 10 | 19 | 3 | 3 | 6 | 107 |
price | 132 | 139 | 144 | 134 | 109 | 143 | 128 | |
a Brand 2 is the 1 lb. package of brand 1. | ||||||||
b Brand 4 is the 1 lb. package of brand 3. | ||||||||
c Market Share in %. | ||||||||
d Price per 1/2 pound in Yen. |
Table 5.2: Data Set for Estimation
Week | Brand | Share | Log Share | Price | Log Price | D1 | D2 | D3 | D4 | D5 | d1 | d2 | d3 | d4 | d5 | d6 | d7
1 | 1 | 4 | 1.38629 | 192 | 5.25750 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
1 | 2 | 51 | 3.93183 | 139.5 | 4.93806 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
1 | 3 | 3 | 1.09861 | 158 | 5.06260 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
1 | 4 | 3 | 1.09861 | 146 | 4.98361 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
1 | 5 | 0 | . | 163 | 5.09375 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
1 | 6 | 1 | 0.00000 | 128 | 4.85203 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
1 | 7 | 9 | 2.19722 | 148 | 4.99721 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
2 | 1 | 2 | 0.69315 | 192 | 5.25750 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
2 | 2 | 75 | 4.31749 | 140 | 4.94164 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
2 | 3 | 2 | 0.69315 | 158 | 5.06260 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
2 | 4 | 1 | 0.00000 | 170 | 5.13580 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
2 | 5 | 0 | . | 163 | 5.09375 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
2 | 6 | 0 | . | 128 | 4.85203 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
2 | 7 | 5 | 1.60944 | 148 | 4.99721 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
3 | 1 | 3 | 1.09861 | 192 | 5.25750 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
3 | 2 | 48 | 3.87120 | 138.5 | 4.93087 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
3 | 3 | 1 | 0.00000 | 158 | 5.06260 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
3 | 4 | 1 | 0.00000 | 170 | 5.13580 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
3 | 5 | 21 | 3.04452 | 100 | 4.60517 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
3 | 6 | 0 | . | 138 | 4.92725 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
3 | 7 | 13 | 2.56495 | 133 | 4.89035 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
4 | 1 | 4 | 1.38629 | 192 | 5.25750 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
4 | 2 | 44 | 3.78419 | 139 | 4.93447 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
4 | 3 | 24 | 3.17805 | 139 | 4.93447 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
4 | 4 | . | . | 170 | 5.13580 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
4 | 5 | 0 | . | 163 | 5.09375 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
4 | 6 | 0 | . | 148 | 4.99721 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
4 | 7 | 11 | 2.39790 | 128 | 4.85203 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
5 | 1 | 5 | 1.60944 | 192 | 5.25750 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
5 | 2 | 23 | 3.13549 | 139 | 4.93447 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
5 | 3 | 10 | 2.30259 | 141 | 4.94876 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
5 | 4 | 1 | 0.00000 | 170 | 5.13580 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
5 | 5 | . | . | 163 | 5.09375 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
5 | 6 | 26 | 3.25810 | 128 | 4.85203 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
5 | 7 | 7 | 1.94591 | 128 | 4.85203 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
variables for weeks reflect that the influence of a particular week is constant over brands. The dummy variables for brands reflect that the baseline level of attraction for each brand is constant over weeks, and thus independent of variations in market conditions.
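The arrangement in Table 5.2 can be sketched in a few lines of Python. The snippet builds the long-format rows for the first two weeks and first three brands of Table 5.1, with week dummies Du, brand dummies dj, and zero shares treated as missing log shares, as in the SAS data set.

```python
import math

shares = [[4, 51, 3], [2, 75, 2]]        # % shares: weeks 1-2, brands 1-3 (Table 5.1)
prices = [[192, 139.5, 158], [192, 140, 158]]
T, m = len(shares), len(shares[0])

rows = []
for t in range(T):
    for i in range(m):
        s, p = shares[t][i], prices[t][i]
        rows.append({
            "week": t + 1, "brand": i + 1,
            "lshare": math.log(s) if s > 0 else None,   # zero share -> missing
            "lprice": math.log(p),
            **{f"D{u}": int(t + 1 == u) for u in range(2, T + 1)},  # week dummies
            **{f"d{j}": int(i + 1 == j) for j in range(2, m + 1)},  # brand dummies
        })
print(rows[1])
```

Each row corresponds to one brand-week observation, exactly the unit of analysis used by the REG procedure.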
Now we are in a position to estimate the parameters of attraction model (5.1), in which the only marketing variable is price. Letting Pit be the price of brand i in week t, there is only one attraction component, and the MCI version of (5.1) may be written as
$$
A_{it} = \exp(\alpha_i + \varepsilon_{it})\, P_{it}^{\,\beta_p}
$$
which in turn shows that regression model (5.9) is applicable here.
$$
\log s_{it} = \alpha_1 + \sum_{j=2}^{7} \alpha'_j\, d_j + \sum_{u=2}^{14} \gamma_u\, D_u + \beta_p \log P_{it} + e_{it}
$$
Table 5.3 gives the estimation results from the SAS(R) REG procedure.
The dependent variable is, of course, the logarithm of market share. The first part of the output gives the analysis-of-variance results. The most important summary statistic for us is the R2 figure of 0.735 (adjusted R2 of 0.65), which indicates that almost 75% of the total variance in the dependent variable (log of share) is explained by the explanatory variable (log of price, in this case) and the dummy variables d2 through d7 and D2 through D14. The F-test, with a ``Prob > F'' figure of 0.0001, shows that the R2 value is high enough for us to rely on the regression results. (This test is really of the null hypothesis that all the parameters are zero. There is less than a one-in-ten-thousand chance of such results arising if this null hypothesis were true. So we can be confident that something systematic is going on, but it takes a much closer look to understand the sources and meaning of these systematic influences.) Note that the total degrees of freedom (i.e., the available number of observations minus 1) is not 97 but 83. This is because there are observations in the data set (see Table 5.2) for which the market share is zero. Since one cannot take the logarithm of zero, the program treats those observations as missing, decreasing the total degrees of freedom. The problems associated with zero market shares will be discussed in section 5.11.
The second part of the output gives the parameter estimates: the intercept gives the estimate of α1; D2 through D7 give the estimates of
Table 5.3: Regression Results for MCI Equation (5.9)
Model: MODEL1 | |||||
Dep Variable: LSHARE | |||||
Analysis of Variance | |||||
Sum of | Mean | ||||
Source | DF | Squares | Square | F Value | Prob > F |
Model | 20 | 77.33391 | 3.86670 | 8.765 | 0.0001 |
Error | 63 | 27.79373 | 0.44117 | ||
C Total | 83 | 105.12764 | |||
Root MSE | 0.66421 | R-Square | 0.7356 | ||
Dep Mean | 1.92529 | Adj R-Sq | 0.6517 | ||
C.V. | 34.49902 | ||||
Parameter Estimates | |||||
Parameter | Standard | T for H0 | |||
Variable | DF | Estimate | Error | Parm=0 | Prob > |T| |
INTRCPT | 1 | 44.798271 | 4.25812533 | 10.521 | 0.0001 |
D2 | 1 | -0.623847 | 0.29148977 | -2.140 | 0.0362 |
D3 | 1 | -1.485840 | 0.26424009 | -5.623 | 0.0001 |
D4 | 1 | -1.866469 | 0.30893368 | -6.042 | 0.0001 |
D5 | 1 | -3.550847 | 0.61502980 | -5.773 | 0.0001 |
D6 | 1 | -1.971343 | 0.36375236 | -5.419 | 0.0001 |
D7 | 1 | -2.253214 | 0.37405428 | -6.024 | 0.0001 |
DD2 | 1 | 0.254732 | 0.40530020 | 0.629 | 0.5319 |
DD3 | 1 | 0.117670 | 0.38957828 | 0.302 | 0.7636 |
DD4 | 1 | 0.620464 | 0.43444539 | 1.428 | 0.1582 |
DD5 | 1 | 0.269731 | 0.38377375 | 0.703 | 0.4847 |
DD6 | 1 | 0.634560 | 0.38485999 | 1.649 | 0.1042 |
DD7 | 1 | 0.644783 | 0.38546807 | 1.673 | 0.0993 |
DD8 | 1 | 0.243568 | 0.37504599 | 0.649 | 0.5184 |
DD9 | 1 | 0.778571 | 0.38417509 | 2.027 | 0.0469 |
DD10 | 1 | 0.424670 | 0.38363952 | 1.107 | 0.2725 |
DD11 | 1 | 0.742352 | 0.38454418 | 1.930 | 0.0581 |
DD12 | 1 | 0.547800 | 0.38402005 | 1.426 | 0.1587 |
DD13 | 1 | 0.274498 | 0.37351312 | 0.735 | 0.4651 |
DD14 | 1 | -0.214251 | 0.37808396 | -0.567 | 0.5729 |
LPRICE | 1 | -8.337254 | 0.81605692 | -10.217 | 0.0001 |
α2′, …, α7′; DD2 through DD14 give the estimates of γ2, γ3, …, γ14; and the value next to LPRICE gives the estimate of βp. From this table several important facts concerning the competitive structure of margarine in this store can be learned.
First, the estimated price parameter is a large negative value, -8.34, indicating that the customers of this store are highly price-sensitive. The statistical significance of the estimate is shown by the T-value and the ``Prob > |T|'' column, both of which show that the estimate is highly significant. (To be precise, it is significantly different from zero. It should also be noted that the reported probability levels are for two-tailed tests. While nondirectional hypotheses are appropriate for the time-period and brand dummy variables, we often have directional hypotheses about the influences of prices or other marketing instruments; the reported probabilities should be cut in half to assess the significance of one-sided tests.) Recall from Chapter 2 that the parameter value is not the same as the share elasticity for a specific brand. In the case of an MCI model, the latter is given by βp(1 - sit). For example, if a brand has a 20% share, its share elasticity with respect to price is approximately -8.34 × (1 - 0.2) = -6.67, indicating that a 10% price cut should lead to a 66.7% increase in share (from 20% to about 33%).
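The arithmetic of this example can be reproduced directly from the Table 5.3 estimate:

```python
# MCI share elasticity from Chapter 2: beta_p * (1 - s).
beta_p = -8.337254                             # LPRICE estimate, Table 5.3
share = 0.20
elasticity = beta_p * (1 - share)
new_share = share * (1 + elasticity * -0.10)   # approximate effect of a 10% price cut
print(round(elasticity, 2), round(new_share, 3))   # prints -6.67 0.333
```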
Second, the estimates of the brand-specific parameters α2′, …, α7′ are all negative and statistically significant. The true values of α2, …, α7 are estimated by adding the corresponding regression estimates to the estimated value of α1. Since α1 is estimated at 44.8, we know that brand 1 has the strongest attraction, other things being equal. Brand 5 has the weakest attraction, with α1 + α5′ = (44.8 - 3.55) = 41.25. This implies that, other things being equal, brand 1 is 35 times (= exp 3.55) as attractive as brand 5. It is rather interesting that brand 2 (the one-pound package of brand 1) has approximately one-half the attraction (exp(-0.62) ≈ 0.54) of brand 1. Even within a brand, the weaker size has to resort to lower unit prices than the stronger size to gain a larger share.
Third, the estimates of γ2, γ3, …, γT are, with few exceptions (weeks 6, 7, and 11), statistically insignificant. This would normally suggest that the dummy variables D2, D3, …, DT may be deleted from the regression model, which in turn suggests that a multiplicative model of market share (discussed in Chapter 2) would probably have done as well as the attraction (MCI) model in analyzing the data in Table 5.1. However, we chose an attraction model not only because of how well it fits the data but because it represents a more logically consistent view of the market
Table 5.4: Regression Results for MNL Equation (5.8)
Model: MODEL1 | |||||
Dep Variable: LSHARE | |||||
Analysis of Variance | |||||
Sum of | Mean | ||||
Source | DF | Squares | Square | F Value | Prob > F |
Model | 20 | 77.22749 | 3.86137 | 8.719 | 0.0001 |
Error | 63 | 27.90015 | 0.44286 | ||
C Total | 83 | 105.12764 | |||
Root MSE | 0.66548 | R-Square | 0.7346 | ||
Dep Mean | 1.92529 | Adj R-Sq | 0.6504 | ||
C.V. | 34.56501 | ||||
Parameter Estimates | |||||
Parameter | Standard | T for H0 | |||
Variable | DF | Estimate | Error | Parm=0 | Prob > |T| |
INTRCPT | 1 | 11.250720 | 1.01638598 | 11.069 | 0.0001 |
D2 | 1 | -0.743850 | 0.29829963 | -2.494 | 0.0153 |
D3 | 1 | -1.582301 | 0.26788475 | -5.907 | 0.0001 |
D4 | 1 | -1.980421 | 0.31598491 | -6.267 | 0.0001 |
D5 | 1 | -3.087742 | 0.58245966 | -5.301 | 0.0001 |
D6 | 1 | -2.074613 | 0.37148467 | -5.585 | 0.0001 |
D7 | 1 | -2.309865 | 0.37915203 | -6.092 | 0.0001 |
DD2 | 1 | 0.240284 | 0.40596127 | 0.592 | 0.5560 |
DD3 | 1 | 0.133747 | 0.39036655 | 0.343 | 0.7330 |
DD4 | 1 | 0.648161 | 0.43501731 | 1.490 | 0.1412 |
DD5 | 1 | 0.301956 | 0.38439750 | 0.786 | 0.4351 |
DD6 | 1 | 0.665472 | 0.38586824 | 1.725 | 0.0895 |
DD7 | 1 | 0.680742 | 0.38658443 | 1.761 | 0.0831 |
DD8 | 1 | 0.282518 | 0.37617724 | 0.751 | 0.4554 |
DD9 | 1 | 0.829837 | 0.38524729 | 2.154 | 0.0351 |
DD10 | 1 | 0.486656 | 0.38459756 | 1.265 | 0.2104 |
DD11 | 1 | 0.773457 | 0.38552149 | 2.006 | 0.0491 |
DD12 | 1 | 0.555236 | 0.38479525 | 1.443 | 0.1540 |
DD13 | 1 | 0.315302 | 0.37443996 | 0.842 | 0.4029 |
DD14 | 1 | -0.236656 | 0.37918145 | -0.624 | 0.5348 |
PRICE | 1 | -0.053868 | 0.00528884 | -10.185 | 0.0001 |
and competition. (The parameters for the time periods merely serve to ensure that the other parameters are identical to those of the original nonlinear model. This structure guarantees that the model will produce market-share estimates which are always non-negative and always sum to one over all alternatives in a choice situation.) Since our purpose is to estimate the parameters of an attraction model correctly, we are not justified in dropping those dummy variables from the regression equation.
Table 5.4 gives the estimation results for equation (5.8), the MNL version of attraction model (5.1). The independent variables are the same as those of (5.9), except that price itself is used instead of the logarithm of price. The overall pattern of the estimated parameters is very similar to that from (5.9). The estimated value of the price parameter, βp, is -0.054. Recall that the share elasticity with respect to a marketing variable (price in this case) is given by βp Pit(1 - sit). If sit is 0.2 and price is 150 yen for a brand, the price elasticity is approximately -6.5, which agrees well with the estimated elasticity value from equation (5.9).
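The corresponding MNL check can be reproduced directly from the Table 5.4 estimate:

```python
# MNL share elasticity: beta_p * P * (1 - s).
beta_p = -0.053868             # PRICE estimate, Table 5.4
price, share = 150.0, 0.20
print(round(beta_p * price * (1 - share), 2))   # prints -6.46, close to the MCI value
```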
It may be added that regression models (5.8 - 5.9) are equivalent to an analysis-of-covariance (ANCOVA) model of the following form.
MNL Model:
$$
\log s_{it} = \mu + \alpha_i + \gamma_t + \sum_{k=1}^K \beta_k X_{kit} + e_{it}
$$
or
MCI Model:
$$
\log s_{it} = \mu + \alpha_i + \gamma_t + \sum_{k=1}^K \beta_k \log X_{kit} + e_{it}
$$
where:

μ = an overall mean;
αi = the main effect of brand i ;
γt = the main effect of period t.
There is no brand-by-period interaction term because there is only one observation per brand-period combination. The ANCOVA models yield parameter estimates identical to those obtained from models (5.8 - 5.9). This ANCOVA representation clarifies a characteristic of (5.8 - 5.9): an attraction model requires that the period main effects be removed before the parameters of the marketing variables are estimated. If we ignore the properties of the error term (discussed in the next section), the ANCOVA form may be convenient in practice since it does not require the cumbersome specification of brand and period dummy variables.
We have deferred discussion of the error term to this point, though it has been suggested that the error terms in regression models (5.6 - 5.7) and (5.8 - 5.9) may have unequal variances and nonzero covariances and may therefore require special care in estimation. Before we show this, we must make some assumptions about the composition of the error term with respect to the sources of error.
It is important to recognize two sources of error inherent in the estimation of market-share models. Variability due to sampling is clearly one source, but there is another we must consider. Recall that attraction model (5.1) includes an error term, εi, which arises from the omission of relatively minor factors from the specification of the explanatory variables, the Xkit's, in (5.1). We call this source of error the specification error. Considering these sources of error, the error terms in regression models (5.8 - 5.9) may be expressed as
$$
e_{it} = e_{1it} + e_{2it}
$$
where e1it is the specification-error term and e2it is the sampling-error term. (To be precise, the error term in attraction model (5.1) should be written as e1it, but we will not change the notation at this point, for reasons that will become apparent later.) The error term in regression models (5.6 - 5.7) is obtained by subtracting ēt, the mean of eit over i in period t, from eit. Hence we may write
$$
e^{*}_{it} = (e_{1it} - \bar e_{1t}) + (e_{2it} - \bar e_{2t})
$$
where ē1t and ē2t are the respective means of e1it and e2it over i in period t.
We will make the following assumptions regarding the specification-error term, e1it, throughout the remainder of this book: e1it has zero mean for every i and t; e1it has a constant variance over i and t; the e1it's are mutually uncorrelated across brands and periods; and e1it is uncorrelated with the sampling-error term e2it.
We have so far made no assumption about the sampling-error term (except that it is uncorrelated with the specification-error term) because the method of data collection greatly affects the properties of sampling errors. Two basic methods of data collection will be distinguished.
One is the survey method in which a sample is randomly drawn from a universe of consumers/buyers. In this case the unit of analysis is the individuals in the sample. One may ask the respondent which brand he/she selected or how many times he/she purchased each brand in a period. Individual selections or purchases are then aggregated over the sample to yield market-share estimates. It may be noted that the so-called consumer panels - diary or optical-scanner - share essentially the same characteristics as the survey method as a data collection technique because the unit of analysis is an individual consumer or household.
Another basic method concerns data gathered from POS (point-of-sale) systems. It should be emphasized that POS-generated market-share data are based on all purchases made in a store in a period, not on responses obtained from a sample of the store's customers. This means that we need not be concerned with the usual sources of sampling variation (i.e., sampling variations among customers within a store). Our only concern is with sampling variations between stores, since POS data currently available to syndicated-data users are usually based on a sample of stores. We will deal with each type of data collection method in turn.
Let us assume that a series of samples of consumers or buyers is obtained by simple random sampling, with an independent sample drawn for each period (or choice situation). Since the following analysis is confined to a single period, the time subscript t is dropped for simplicity. As noted above, one may ask the respondent either which brand he/she chose or how many times he/she bought each brand in a period. These two questioning techniques must be treated separately.
First consider the case in which each respondent is asked which single brand he/she chose from the set of available brands (the choice set). In this case we may assume that the aggregated responses to the question follow a multinomial choice process. Formally stated, given a sample of size n and the probability p_i that a respondent chooses brand i (i = 1, 2, ..., m, where m is the number of available brands), the joint probability that brand i is chosen by n_i individuals (i = 1, 2, ..., m) is given by
\[ \Pr(n_1, n_2, \ldots, n_m) = \frac{n!}{n_1!\, n_2! \cdots n_m!} \prod_{i=1}^{m} p_i^{n_i} \]
The market-share estimates are p̂_i = n_i/n (i = 1, 2, ..., m). These estimates are subject to sampling variations.
Let us now turn to the properties of the sampling-error term
\[ e_{2i} = \log \hat{p}_i - \log p_i \]
when the market-share estimates, the p̂_i's, are generated by the multinomial process described above. It is well known that, for a reasonably large sample size (n > 30, say), p̂_i is approximately normally distributed with mean p_i and variance p_i(1 − p_i)/n. Given this approximate distribution, we want to know how e_2i is distributed. We will use the same technique as that used by Berkson. First, expand log p̂_i in a Taylor series around log p_i and retain only the first two terms. Then apply the mean-value theorem to obtain
\[ \log \hat{p}_i = \log p_i + \frac{\hat{p}_i - p_i}{p_i^*} \]
where p_i^* is a value between p̂_i and p_i. This shows that, for a reasonably large sample size, log p̂_i is approximately normally distributed with mean log p_i and variance p_i(1 − p_i)/(n p_i^{*2}). The approximation improves as the sample size, n, increases. Thus the sampling error is also approximately normally distributed with mean zero and variance p_i(1 − p_i)/(n p_i^{*2}). Furthermore, owing to the nature of a multinomial process, it is known that e_2i and e_2j (j ≠ i) in the same period are correlated, with approximate covariance −p_i p_j/(n p_i^* p_j^*), where p_j^* is a value between p̂_j and p_j. For a reasonably large sample size, we may take
\[ \mathrm{Var}(e_{2i}) \approx \frac{1 - p_i}{n p_i}\,, \qquad \mathrm{Cov}(e_{2i}, e_{2j}) \approx -\frac{1}{n} \quad (j \neq i) \]   (5.10)
Clearly the variance of the error term is a function of p_i: it equals 1/n when p_i = 0.5 and becomes very large for small values of p_i. For example, if p_i = 0.01, the variance of e_2i is approximately 99/n. This phenomenon is heteroscedasticity in the variance of e_2i. But we must also be concerned with the covariance between e_2i and e_2j, to the extent that 1/n is not negligible.
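The heteroscedasticity just described is easy to see in a simulation. The sketch below (the choice probabilities and sample size are invented for illustration) draws repeated multinomial samples and compares the empirical variance of the sampling-error term with the approximation in (5.10):

```python
# A simulation sketch (probabilities and sample size invented) of the
# sampling-error variance in (5.10): Var(e_2i) ~= (1 - p_i)/(n * p_i).
import numpy as np

rng = np.random.default_rng(0)
p = np.array([0.5, 0.2, 0.2, 0.09, 0.01])   # true choice probabilities
n = 1000                                    # respondents per sample

counts = rng.multinomial(n, p, size=20000)  # 20,000 independent samples
p_hat = counts / n
p_hat = np.where(p_hat == 0, np.nan, p_hat) # guard against log(0)
e2 = np.log(p_hat) - np.log(p)              # sampling-error term e_2i

emp_var = np.nanvar(e2, axis=0)
approx_var = (1 - p) / (n * p)              # 1/n at p = 0.5, 99/n at p = 0.01
print(np.round(emp_var / approx_var, 2))    # each ratio should be near 1
```

Note how the variance for the 1-percent-share brand is roughly a hundred times that of the 50-percent-share brand, exactly the pattern the text describes.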
The above properties of the error term are based on the assumption that each respondent is asked which brand he/she chose in a given choice situation. The properties change considerably if the respondent is instead asked how many units of each brand he/she purchased in a period. The individual responses are aggregated over the sample to yield the number of units of brand i bought by the entire sample, x_i (i = 1, 2, ..., m). The estimate of the market share of brand i is given by ŝ_i = x_i/x, where x is the sum of the x_i's over i. What are the properties of the error term when the logarithm of ŝ_i is used as the dependent variable in regression model (5.8 - 5.9), or when the log-centered value of ŝ_i is used in (5.6 - 5.7)? The answer depends on the assumption we make about the process which generates the x_i's. In general, deriving the properties of the error term is a complicated task, since ŝ_i is a ratio of two random variables, x_i and x, the latter including the former as one of its parts. Fortunately, however, the estimated parameter values of (5.6 - 5.7) do not change if we use the log-centered value of x̄_i, the sample mean of x_i, in place of the log-centered value of ŝ_i in (5.6 - 5.7), since
\[ \log\!\left(\frac{\bar{x}_i}{\tilde{x}}\right) = \log\!\left(\frac{\hat{s}_i}{\tilde{s}}\right) \]
where x̃ and s̃ are the geometric means of x̄_i and ŝ_i over i in a given period. This in turn suggests that in regression model (5.8 - 5.9) we may use log x̄_i as the dependent variable without changing the estimated values of the parameters other than a_1. This reduces considerably our task of analyzing the properties of the error term.
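The invariance claimed here is easy to verify numerically; a minimal sketch with invented unit counts:

```python
# A numerical check (unit counts invented): log-centered unit counts and
# log-centered share estimates are identical, because s_hat_i = x_i / x
# and dividing by the period total x shifts every log by the same constant.
import numpy as np

x = np.array([120.0, 75.0, 40.0, 15.0])        # units bought, by brand
s_hat = x / x.sum()                            # market-share estimates

centered_x = np.log(x) - np.log(x).mean()      # log(x_i / x_tilde)
centered_s = np.log(s_hat) - np.log(s_hat).mean()

print(np.allclose(centered_x, centered_s))     # True
```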
Suppose that the x_i's are generated by an arbitrary multivariate process with means μ_1, μ_2, ..., μ_m and covariance matrix Θ with elements {θ_ij}. Note that the true market share is given by s_i = μ_i/μ, where μ is the sum of the μ_i's over i. The sample mean of x_i, x̄_i, is an estimate of μ_i. We obtain the linear approximation of log x̄_i by the usual method, that is,
\[ \log \bar{x}_i = \log \mu_i + \frac{\bar{x}_i - \mu_i}{x_i^*} \]
where x_i^* is a value between x̄_i and μ_i. If we replace log ŝ_i in the equations leading to (5.8 - 5.9) by log x̄_i, the sampling-error term becomes
\[ e_{2i} = \log \bar{x}_i - \log \mu_i = \frac{\bar{x}_i - \mu_i}{x_i^*} \]
When the sample size is reasonably large, the approximate variances and covariances among the e2i 's are given by
\[ \mathrm{Var}(e_{2i}) \approx \frac{\theta_{ii}}{n \mu_i^2}\,, \qquad \mathrm{Cov}(e_{2i}, e_{2j}) \approx \frac{\theta_{ij}}{n \mu_i \mu_j} \quad (j \neq i) \]
These results agree with those for the multinomial process if we note that, in the latter process, μ_i = p_i and x̄_i = p̂_i (so that θ_ii = p_i(1 − p_i) and θ_ij = −p_i p_j). The variances and covariances of the sampling-error term are clearly functions of the μ_i's and may take large values if μ_i or μ_j is near zero. The existence of heteroscedasticity is obvious.
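The same delta-method result can be checked by simulation. In the sketch below the means and covariance matrix are invented, and the sample means are drawn directly from their large-sample normal distribution:

```python
# A simulation sketch of the delta-method result above; the means mu_i and
# covariance matrix Theta are invented.  Sample means xbar_i are drawn
# directly from their large-sample normal distribution N(mu, Theta/n).
import numpy as np

rng = np.random.default_rng(1)
mu = np.array([8.0, 3.0, 1.0])
theta = np.array([[4.0, -1.0, -0.5],
                  [-1.0, 2.0, -0.3],
                  [-0.5, -0.3, 1.0]])
n = 500                                     # respondents per sample

xbar = rng.multivariate_normal(mu, theta / n, size=50000)
e2 = np.log(xbar) - np.log(mu)              # sampling-error terms

emp = np.cov(e2, rowvar=False)              # empirical (co)variances
approx = theta / (n * np.outer(mu, mu))     # theta_ij / (n mu_i mu_j)
print(np.round(emp / approx, 2))            # entries should be near 1
```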
We now combine the above results with our assumptions on the specification-error term. Under the assumptions of a multinomial choice process and a single choice per individual, the approximate variances and covariances among the e_i's in the same period are given by
\[ \mathrm{Var}(e_i) \approx \sigma_i^2 + \mathrm{Var}(e_{2i})\,, \qquad \mathrm{Cov}(e_i, e_j) \approx \sigma_{ij} + \mathrm{Cov}(e_{2i}, e_{2j}) \]
where Var(e_2i) and Cov(e_2i, e_2j) are given by (5.10). Because of the heteroscedasticity of the error term, the estimated parameters of regression models (5.6 - 5.9) based on the ordinary least-squares (OLS) procedure do not have the smallest variance among the class of linear regression estimators. Nakanishi and Cooper [1974] suggested the use of a two-stage generalized least-squares (GLS) procedure in the case of a multinomial choice situation to reduce the estimation errors associated with regression models (5.6 - 5.9). The interested reader is referred to Appendix 5.14 for the details of this GLS procedure.
When the market-share estimates are obtained from POS systems, it is not necessary to consider the sampling errors within a store; but if our market-share data are obtained by aggregating market-share figures over a number of stores, we should expect variations between stores. This presents a heteroscedasticity problem similar to the one we encountered with survey data. But there are additional problems as well. Each store tends to offer its customers a uniquely packaged set of marketing activities. If we aggregate market-share figures from several stores, we must somehow aggregate the marketing variables over the stores as well. As discussed in Chapter 4, aggregation is safe if the causal conditions (e.g., promotional variables) are homogeneous over the stores, as might be the case when stores within a grocery chain are combined. One should avoid the ambiguity which results from aggregation, either by explicitly recognizing each individual store or by aggregating only over stores (within grocery chains) with relatively homogeneous promotion policies. We will take this approach in the remainder of this book.
Stated more formally, let siht be the market share of brand i in store h in period t , and Xkiht be the value of the kth marketing variable in store h in period t . Regression model (5.6 - 5.7) may be rewritten with the new notation as
MNL Model:
\[ s^*_{iht} = (a_i - \bar{a}) + \sum_{k=1}^{K} b_k \left(X_{kiht} - \bar{X}_{kht}\right) + e^*_{iht} \]   (5.11)
MCI Model:
\[ s^*_{iht} = (a_i - \bar{a}) + \sum_{k=1}^{K} b_k \log\!\left(\frac{X_{kiht}}{\tilde{X}_{kht}}\right) + e^*_{iht} \]   (5.12)
where s*_iht is the log-centered value of s_iht in store h in period t, and X̄_kht and X̃_kht are the arithmetic and geometric means of X_kiht over i in store h in period t.
The main advantage of a disaggregated model such as (5.11 - 5.12) is that we do not have to deal with sampling errors in estimation. Similar expressions may be obtained for (5.8) or (5.9), but in actual applications too many dummy variables would have to be included in the model. It would be necessary to specify (H × T − 1) dummy variables, where H is the number of stores, replacing the (T − 1) period dummy variables in (5.8 - 5.9). With even a moderate number of stores and periods it may become impractical to include all the necessary dummy variables for estimation, in which case the use of models (5.11 - 5.12) is recommended.
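As a small illustration of the disaggregated models, the log-centered dependent variable of (5.11 - 5.12) can be computed by centering log shares within each store-period cell; the share figures below are invented:

```python
# A sketch of computing the log-centered shares s*_iht in (5.11 - 5.12):
# log shares are centered within each store-period cell.  The share
# figures below are invented for two stores and two periods.
import numpy as np

# shares[h, t, i] = share of brand i in store h, period t
shares = np.array([[[0.5, 0.3, 0.2], [0.4, 0.4, 0.2]],
                   [[0.6, 0.2, 0.2], [0.3, 0.5, 0.2]]])

log_s = np.log(shares)
s_star = log_s - log_s.mean(axis=2, keepdims=True)  # center within (h, t)

# by construction the centered values sum to zero over brands in each cell
print(np.allclose(s_star.sum(axis=2), 0.0))         # True
```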
In the preceding section we noted that the error terms in regression models for estimating the parameters of market-share models tend to be heteroscedastic, i.e., to have unequal variances and nonzero covariances. If market-share figures are computed from POS data, the error terms in regression models (5.6 - 5.12) involve only what we call specification errors. Let Σ be the variance-covariance matrix of the specification errors, with variances σ_i² (i = 1, 2, ..., m) on the main diagonal and covariances σ_ij (j ≠ i) as off-diagonal elements. Because the error terms are heteroscedastic, Bultez and Naert proposed an iterative GLS procedure. (Bultez, Alain V. & Philippe A. Naert [1975], ``Consistent Sum-Constrained Models,'' Journal of the American Statistical Association, 70, 351 (September), 529-35.) The steps of an iterative GLS procedure are as follows: (1) estimate the parameters by OLS; (2) compute the residuals and estimate Σ from them; (3) re-estimate the parameters by GLS using the estimated Σ; (4) repeat steps (2) and (3) until the parameter estimates converge.
There is one minor problem in applying this iterative procedure. Recall that in regression model (5.6) the log-centering transformation is applied to the dependent variable; as a result, the variance-covariance matrix of the e*_it's is given by
\[ \Sigma^* = \left(I - \frac{1}{m}J\right) \Sigma \left(I - \frac{1}{m}J\right) \]
where I is an identity matrix and J is a matrix all of whose elements are equal to one. The dimensions of both I and J are m × m, where m is the number of available brands. The matrix Σ̂* computed from OLS residuals is therefore singular and not invertible. Since regression models (5.8 - 5.9) are equivalent to (5.6 - 5.7), the residuals estimated from the former are identical to those estimated from the latter, and hence the estimated covariance matrices are also identical. In general, if both brand-dummy and period- (or store-) dummy variables are included in a regression model, the estimated residual covariance matrix becomes singular. This certainly is an impediment to the GLS estimation procedure, which requires the inverses of estimated covariance matrices.
There are three methods of circumventing this problem. The first is to delete one row and the corresponding column from Σ̂* and invert the result. One observation per period (the one corresponding to the deleted row/column of Σ̂*) is deleted, and the parameters are estimated from the remaining data. The drawback of this technique is that the estimated parameters are transformations of the original parameters, and hence have to be transformed back, a process which is rather cumbersome. The second method is to set to zero those off-diagonal elements of the estimated residual covariance matrix which are nearly zero. Though theoretically less justifiable, it has the merit of simplicity. Usually it is sufficient to set just a few elements to zero before the inverse may be obtained. (If one wishes to be more formal, one may set to zero those elements which are not statistically significantly different from zero.) At the extreme, by setting all off-diagonal elements to zero we obtain an easily implemented weighted least-squares (WLS) procedure which compensates only for differences in the variances of specification errors between brands. The third method is to find a generalized inverse of Σ̂*.
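The singularity discussed above, and the third remedy, can be illustrated in a few lines; the specification-error covariance matrix below is invented:

```python
# A sketch of the singularity problem and the third remedy.  Sigma is an
# invented specification-error covariance matrix; C = I - J/m is the
# log-centering matrix, so Sigma* = C Sigma C has rank m - 1.
import numpy as np

m = 4
Sigma = np.diag([0.3, 0.5, 0.2, 0.4])        # invented error variances
C = np.eye(m) - np.ones((m, m)) / m          # I - (1/m)J

Sigma_star = C @ Sigma @ C                   # covariance of centered errors
print(np.linalg.matrix_rank(Sigma_star))     # m - 1 = 3: singular

# remedy three: use the Moore-Penrose generalized inverse instead
Sigma_pinv = np.linalg.pinv(Sigma_star)
print(np.allclose(Sigma_star @ Sigma_pinv @ Sigma_star, Sigma_star))  # True
```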
As an illustration of the GLS technique, consider the data set given in Table 5.1. The OLS estimation technique applied to regression model (5.8) yielded the parameter estimates in Table 5.3. Residual errors were then computed from these OLS results, and Σ was estimated. The estimated Σ and its inverse are shown below. Those elements of the estimated Σ which were less than 0.3 in absolute value were set to zero before the matrix was inverted.
Covariance Matrix
Brand | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
1 | 0.183164 | -.050222 | -.009048 | -.063581 | 0.102546 | -.142236 | 0.020543 |
2 | -.050222 | 0.386247 | -.233128 | -.057280 | -.156411 | -.261190 | 0.186987 |
3 | -.009048 | -.233128 | 0.302828 | 0.062024 | -.066156 | -.020426 | -.086928 |
4 | -.063581 | -.057280 | 0.062024 | 0.234230 | -.304044 | -.024887 | -.078644 |
5 | 0.102546 | -.156411 | -.066156 | -.304044 | 0.359436 | 0.074360 | 0.020021 |
6 | -.142236 | -.261190 | -.020426 | -.024887 | 0.074360 | 0.880167 | -.444807 |
7 | 0.020543 | 0.186987 | -.086928 | -.078644 | 0.020021 | -.444807 | 0.289530 |
Inverse Covariance Matrix
Brand | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
1 | -62.656 | 13.7604 | -13.3493 | 14.4918 | 47.1276 | -65.075 | -108.934 |
2 | 13.760 | -0.4766 | 3.5862 | -8.6329 | -13.2922 | 12.164 | 17.728 |
3 | -13.349 | 3.5862 | 1.5801 | -2.0956 | 6.5368 | -12.807 | -22.087 |
4 | 14.492 | -8.6329 | -2.0956 | -4.2420 | -14.5748 | 13.098 | 23.917 |
5 | 47.128 | -13.2922 | 6.5368 | -14.5748 | -36.9378 | 45.266 | 76.131 |
6 | -65.075 | 12.1642 | -12.8072 | 13.0983 | 45.2664 | -61.315 | -102.342 |
7 | -108.934 | 17.7275 | -22.0868 | 23.9170 | 76.1315 | -102.342 | -165.360 |
The square root of the above inverse matrix was used to pre-multiply the data matrix for each week, and estimates of the following form are obtained:
\[ \hat{b} = \left(\sum_t X_t' \hat{\Sigma}^{-1} X_t\right)^{-1} \sum_t X_t' \hat{\Sigma}^{-1} y_t \]
where Xt is the independent variable matrix and yt is the vector of the dependent variable for period t . The re-estimated parameter values are shown in Table 5.5.
Table 5.5: GLS Estimates for Table 5.3
Variable | Parameter Estimate | Variable | Parameter Estimate |
Intercept | 45.4977 | ||
D2 | -0.6529 | DD6 | 0.4764 |
D3 | -1.505 | DD7 | 0.4892 |
D4 | -1.8942 | DD8 | 0.0709 |
D5 | -3.4476 | DD9 | 0.6449 |
D6 | -2.0313 | DD10 | 0.5546 |
D7 | -2.2964 | DD11 | 0.6610 |
DD2 | -0.1283 | DD12 | 0.2626 |
DD3 | -0.1412 | DD13 | 0.0022 |
DD4 | 0.4260 | DD14 | -0.3082 |
DD5 | 0.1464 | LOG(PRICE) | -8.4395 |
Table 5.5 gives the so-called two-stage GLS estimates. If necessary, residual errors and Σ may be computed again from the above results, and another set of GLS estimates may be obtained. But since the parameter estimates in Table 5.3 are extremely close to those in Table 5.5, further iterations seem unnecessary. In fact, it has been our experience that OLS and GLS estimates are very similar in many cases. The OLS procedure appears satisfactory in many applications.
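The two-stage computation just described can be sketched as follows; the covariance matrix, design matrices, and coefficients below are all invented, and only the transform-then-OLS mechanics follow the text:

```python
# A sketch of two-stage GLS on invented data: P with P'P = Sigma_hat^{-1}
# pre-multiplies each period's X_t and y_t, and OLS on the transformed,
# stacked data gives b = (sum_t X_t' S^-1 X_t)^-1 sum_t X_t' S^-1 y_t.
import numpy as np

rng = np.random.default_rng(2)
m, T = 3, 8                                  # brands, periods
Sigma_hat = np.array([[0.40, -0.10, -0.10],
                      [-0.10, 0.30, -0.05],
                      [-0.10, -0.05, 0.50]])
W = np.linalg.inv(Sigma_hat)
P = np.linalg.cholesky(W).T                  # so that P.T @ P == W

b_true = np.array([1.5, -2.0])               # invented coefficients
X = [rng.normal(size=(m, 2)) for _ in range(T)]
y = [Xt @ b_true + rng.multivariate_normal(np.zeros(m), Sigma_hat)
     for Xt in X]

Xs = np.vstack([P @ Xt for Xt in X])         # transformed, stacked data
ys = np.concatenate([P @ yt for yt in y])
b_gls, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
print(np.round(b_gls, 1))                    # should be close to b_true
```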
So far we have reviewed estimation techniques applicable to relatively simple attraction models (5.1). We have shown in Chapter 3 that attraction models may be extended to include differential effects and cross effects between brands. In the following sections we will discuss more advanced issues related to the parameter estimation of differential-effects and cross-effects (fully extended) models.
The differential-effects version of attraction model (5.1) is expressed as follows.
\[ s_{it} = \frac{A_{it}}{\sum_{j=1}^{m} A_{jt}}\,, \qquad A_{it} = \exp(a_i + \varepsilon_{it}) \prod_{k=1}^{K} f_k(X_{kit})^{b_{ki}} \]   (5.13)
where either an identity or exponential transformation may be chosen for fk , depending on whether an MCI or MNL model is desired. The chief difference between (5.1) and (5.13) is the fact that parameter bki has an additional subscript i , suggesting that the effectiveness (and hence the elasticity) of a marketing variable may differ from one brand to the next. This is certainly a plausible model in some situations and worth calibrating.
The estimation of the parameters b_ki (i = 1, 2, ..., m) is not extremely complicated: only a slight modification of regression models (5.6 - 5.9) achieves the result. Using the previous definitions of the dummy variables d_j and D_u, the differential-effects versions of regression models (5.6 - 5.7) are given by
MNL Model:
\[ s^*_{it} = \sum_{j=2}^{m} a_j \left(d_j - \frac{1}{m}\right) + \sum_{k=1}^{K} \sum_{j=1}^{m} b_{kj} \left(d_j - \frac{1}{m}\right) X_{kjt} + e^*_{it} \]   (5.14)
MCI Model:
\[ s^*_{it} = \sum_{j=2}^{m} a_j \left(d_j - \frac{1}{m}\right) + \sum_{k=1}^{K} \sum_{j=1}^{m} b_{kj} \left(d_j - \frac{1}{m}\right) \log X_{kjt} + e^*_{it} \]   (5.15)
In regression models (5.14 - 5.15) each independent variable is multiplied by (d_j − 1/m), which equals (1 − 1/m) if j = i and −1/m otherwise. Thus the number of independent variables is (m × K) + m − 1. Note that regression models (5.14 - 5.15) have to be estimated without an intercept term; most regression programs provide this option. (If an intercept term is included, its estimated value will be zero.) We cannot obtain an estimate of a_1 from (5.14) or (5.15), but this poses no problem in computing market shares, since the estimated value of a_i is actually the difference between the true a_i and a_1. (Rather than automatically choosing a_1 as the brand intercept to drop, one can run the regression with all brand intercepts, which will be a singular model, and find the intercept closest to zero as the one to drop.) Similarly, regression models (5.8) and (5.9) may be modified as follows for their respective differential-effects versions.
MNL Model:
\[ \log s_{it} = a_1 + \sum_{j=2}^{m} a_j d_j + \sum_{u=2}^{T} c_u D_u + \sum_{k=1}^{K} \sum_{j=1}^{m} b_{kj}\, d_j X_{kjt} + e_{it} \]   (5.16)
MCI Model:
\[ \log s_{it} = a_1 + \sum_{j=2}^{m} a_j d_j + \sum_{u=2}^{T} c_u D_u + \sum_{k=1}^{K} \sum_{j=1}^{m} b_{kj}\, d_j \log X_{kjt} + e_{it} \]   (5.17)
Regression models (5.14 - 5.15) and (5.16 - 5.17) yield identical estimates of parameters a 's (except a1) and b 's. If the number of periods (or choice situations) is large, (5.14 - 5.15) will be preferred.
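The construction of the (d_j − 1/m)-weighted columns used in (5.14 - 5.15) can be sketched for a single variable; the price matrix below is invented:

```python
# A sketch of building the (d_j - 1/m) columns of (5.14 - 5.15) for one
# variable (log price).  For the observation on brand i, the column for
# brand j is (1 - 1/m) * log(P_jt) when j == i and -log(P_jt)/m otherwise.
import numpy as np

m = 3                                        # brands
prices = np.array([[1.20, 0.99, 1.05],       # rows: periods
                   [1.10, 0.95, 1.05]])      # columns: brands
logp = np.log(prices)

rows = []
for t in range(prices.shape[0]):
    for i in range(m):                       # observation on brand i
        d = (np.arange(m) == i).astype(float)
        rows.append((d - 1.0 / m) * logp[t])
X = np.array(rows)                           # (periods * m) rows, m columns

print(X.shape)                               # (6, 3)
# within each period the columns sum to zero over brands
print(np.allclose(X[:m].sum(axis=0), 0.0))   # True
```

This reproduces the pattern visible in Table 5.8, where each column sums to zero within each week.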
The reader may feel that the following regression models are more straightforward modifications of (5.6 - 5.7), but this is not the case.
MNL Model:
\[ s^*_{it} = \sum_{j=2}^{m} a_j \left(d_j - \frac{1}{m}\right) + \sum_{k=1}^{K} \sum_{j=1}^{m} b_{kj}\, d_j \left(X_{kjt} - \bar{X}_{kt}\right) + e^*_{it} \]   (5.18)
MCI Model:
\[ s^*_{it} = \sum_{j=2}^{m} a_j \left(d_j - \frac{1}{m}\right) + \sum_{k=1}^{K} \sum_{j=1}^{m} b_{kj}\, d_j \log\!\left(\frac{X_{kjt}}{\tilde{X}_{kt}}\right) + e^*_{it} \]   (5.19)
Models (5.18 - 5.19) do not represent an attraction model, but a log-linear market-share model in which the share of brand i is specified as
\[ s_{it} = \exp(a_i + \varepsilon_{it}) \prod_{k=1}^{K} f_k(X^*_{kit})^{b_{ki}} \]
where X*_kit is a centered value of X_kit, that is, (X_kit − X̄_kt) if f_k is an exponential transformation and (X_kit/X̃_kt) if f_k is an identity transformation. While these models themselves may have desirable features as market-share models, models (5.18 - 5.19) are not the estimating equations for (5.13). The difference is that (5.14 - 5.15) log-center the differential-effect variable, while (5.18 - 5.19) log-center the simple-effect variable and then multiply these log-centered variables by the brand-specific dummy variables.
Let us see what these modifications mean using the illustrative data of Table 5.1. The independent variable in this case is price. In order to estimate regression model (5.17) (the MCI version), the data must be arranged as in Table 5.6. Only the dependent variable and part of the explanatory variables (log(price) × brand dummy variables) are shown. The week and brand dummy variables are in the same style as in Table 5.2.
The estimation results are shown in Table 5.7. The fit of the model, as measured by R², improved from 0.736 to 0.826. The gain from adding six more independent variables (LPD1 through LPD7 instead of LOG(PRICE)) may be measured by the incremental F-ratio 4.9386 (= (86.8406 − 77.3339)/(6 × 0.32083)), which is significant at the .99 level (df = 6, 57). This shows that the differential-effects model significantly improves on the explanatory power of the simple-effects model. The estimated parameter values are markedly different from one brand to the next. Looking at the price-parameter estimates, we note that a larger size tends to be more price sensitive than a smaller size even within a brand; brands 2 and 4 have greater values (in absolute terms) than brands 1 and 3. Brand 5 is the most price sensitive, with an estimated value of −24.08, but this may reflect the fact that this brand's share was zero, and hence unavailable for estimation, for 10 weeks out of 14. We shall discuss this issue in a later section. Two brands, 6 and 7, are not price sensitive: their price parameters are not statistically different from zero, as indicated by their respective ``Prob > |T|'' values. As to the estimates of the a's, we may note that they are negatively correlated with the price-parameter estimates over brands, but we will not attempt generalizations on the basis of this single example.
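The incremental F-ratio above can be recomputed directly from the reported sums of squares:

```python
# The incremental F-ratio quoted above, recomputed from the model sums of
# squares (86.8406 for the differential-effects fit, 77.3339 for the
# simple-effects fit), the six added parameters, and the full model's
# mean squared error (0.32083, from Table 5.7).
ss_full, ss_reduced = 86.8406, 77.3339       # model sums of squares
q, mse_full = 6, 0.32083                     # added parameters, MSE

f_ratio = (ss_full - ss_reduced) / (q * mse_full)
print(round(f_ratio, 4))                     # 4.9386
```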
The arrangement of data for estimating model (5.15) is given in Table 5.8. Only the dependent variable and the price × brand dummy
Table 5.6: Data Set for Differential-Effects Model
Week | Brand | Log Share | LPD1 | LPD2 | LPD3 | LPD4 | LPD5 | LPD6 | LPD7 |
(LPD1 - LPD7: Log(Price) × Brand Dummy Variables)
1 | 1 | 1.38629 | 5.2575 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
1 | 2 | 3.93183 | 0.0000 | 4.9381 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
1 | 3 | 1.09861 | 0.0000 | 0.0000 | 5.0626 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
1 | 4 | 1.09861 | 0.0000 | 0.0000 | 0.0000 | 4.9836 | 0.0000 | 0.0000 | 0.0000 |
1 | 5 | . | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 5.0938 | 0.0000 | 0.0000 |
1 | 6 | 0.00000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 4.8520 | 0.0000 |
1 | 7 | 2.19722 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 4.9972 |
2 | 1 | 0.69315 | 5.2575 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
2 | 2 | 4.31749 | 0.0000 | 4.9416 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
2 | 3 | 0.69315 | 0.0000 | 0.0000 | 5.0626 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
2 | 4 | 0.00000 | 0.0000 | 0.0000 | 0.0000 | 5.1358 | 0.0000 | 0.0000 | 0.0000 |
2 | 5 | . | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 5.0938 | 0.0000 | 0.0000 |
2 | 6 | . | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 4.8520 | 0.0000 |
2 | 7 | 1.60944 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 4.9972 |
3 | 1 | 1.09861 | 5.2575 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
3 | 2 | 3.87120 | 0.0000 | 4.9309 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
3 | 3 | 0.00000 | 0.0000 | 0.0000 | 5.0626 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
3 | 4 | 0.00000 | 0.0000 | 0.0000 | 0.0000 | 5.1358 | 0.0000 | 0.0000 | 0.0000 |
3 | 5 | 3.04452 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 4.6052 | 0.0000 | 0.0000 |
3 | 6 | . | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 4.9273 | 0.0000 |
3 | 7 | 2.56495 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 4.8904 |
4 | 1 | 1.38629 | 5.2575 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
4 | 2 | 3.78419 | 0.0000 | 4.9345 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
4 | 3 | 3.17805 | 0.0000 | 0.0000 | 4.9345 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
4 | 4 | . | 0.0000 | 0.0000 | 0.0000 | 5.1358 | 0.0000 | 0.0000 | 0.0000 |
4 | 5 | . | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 5.0938 | 0.0000 | 0.0000 |
4 | 6 | . | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 4.9972 | 0.0000 |
4 | 7 | 2.39790 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 4.8520 |
5 | 1 | 1.60944 | 5.2575 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
5 | 2 | 3.13549 | 0.0000 | 4.9345 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
5 | 3 | 2.30259 | 0.0000 | 0.0000 | 4.9488 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
5 | 4 | 0.00000 | 0.0000 | 0.0000 | 0.0000 | 5.1358 | 0.0000 | 0.0000 | 0.0000 |
5 | 5 | . | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 5.0938 | 0.0000 | 0.0000 |
5 | 6 | 3.25810 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 4.8520 | 0.0000 |
5 | 7 | 1.94591 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 4.8520 |
6 | 1 | 1.79176 | 5.2575 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
6 | 2 | 1.79176 | 0.0000 | 5.1705 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
6 | 3 | 1.09861 | 0.0000 | 0.0000 | 5.0626 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
6 | 4 | 0.69315 | 0.0000 | 0.0000 | 0.0000 | 5.1358 | 0.0000 | 0.0000 | 0.0000 |
6 | 5 | . | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 5.0938 | 0.0000 | 0.0000 |
6 | 6 | 3.58352 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 4.8520 | 0.0000 |
6 | 7 | 2.56495 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 4.8520 |
Table 5.7: Regression Results for Differential-Effects Model (MCI)
Model: MODEL1 | |||||
Dep Variable: LSHARE | |||||
Analysis of Variance | |||||
Sum of | Mean | ||||
Source | DF | Squares | Square | F Value | Prob > F |
Model | 26 | 86.84061 | 3.34002 | 10.411 | 0.0001 |
Error | 57 | 18.28703 | 0.32083 | ||
C Total | 83 | 105.12764 | |||
Root MSE | 0.56641 | R-Square | 0.8260 | ||
Dep Mean | 1.92529 | Adj R-Sq | 0.7467 | ||
C.V. | 29.41967 | ||||
Parameter Estimates | |||||
Parameter | Standard | T for H0 | |||
Variable | DF | Estimate | Error | Parm=0 | Prob > |T| |
INTRCPT | 1 | 36.797056 | 9.03858643 | 4.071 | 0.0001 |
D2 | 1 | 28.212012 | 11.56852482 | 2.439 | 0.0179 |
D3 | 1 | 1.325003 | 11.79902587 | 0.112 | 0.9110 |
D4 | 1 | 12.426886 | 11.09247984 | 1.120 | 0.2673 |
D5 | 1 | 77.155688 | 41.01572380 | 1.881 | 0.0651 |
D6 | 1 | -32.861595 | 16.87525706 | -1.947 | 0.0564 |
D7 | 1 | -43.161568 | 18.20494075 | -2.371 | 0.0211 |
DD2 | 1 | 0.144666 | 0.34886522 | 0.415 | 0.6799 |
DD3 | 1 | 0.160885 | 0.34227101 | 0.470 | 0.6401 |
DD4 | 1 | 0.783674 | 0.38370743 | 2.042 | 0.0458 |
DD5 | 1 | 0.560437 | 0.33938645 | 1.651 | 0.1042 |
DD6 | 1 | 1.070890 | 0.34384160 | 3.114 | 0.0029 |
DD7 | 1 | 1.087786 | 0.34488085 | 3.154 | 0.0026 |
DD8 | 1 | 0.479316 | 0.33693519 | 1.423 | 0.1603 |
DD9 | 1 | 0.997026 | 0.34999923 | 2.849 | 0.0061 |
DD10 | 1 | 0.689770 | 0.35708659 | 1.932 | 0.0584 |
DD11 | 1 | 1.035196 | 0.35245369 | 2.937 | 0.0048 |
DD12 | 1 | 0.565334 | 0.35135768 | 1.609 | 0.1131 |
DD13 | 1 | 0.176690 | 0.34733555 | 0.509 | 0.6129 |
DD14 | 1 | 0.107222 | 0.36223872 | 0.296 | 0.7683 |
LPD1 | 1 | -6.837585 | 1.72929552 | -3.954 | 0.0002 |
LPD2 | 1 | -12.511968 | 1.47178224 | -8.501 | 0.0001 |
LPD3 | 1 | -7.357565 | 1.51269846 | -4.864 | 0.0001 |
LPD4 | 1 | -9.629287 | 1.46960177 | -6.552 | 0.0001 |
LPD5 | 1 | -24.078656 | 8.34529863 | -2.885 | 0.0055 |
LPD6 | 1 | -0.478779 | 2.78016380 | -0.172 | 0.8639 |
LPD7 | 1 | 1.657518 | 3.33624420 | 0.497 | 0.6212 |
Table 5.8: Log-Centered Differential-Effects Data
Week | Brand | Log-Centered Share | LPD1 | LPD2 | LPD3 | LPD4 | LPD5 | LPD6 | LPD7 |
(LPD1 - LPD7: Centered Log(Price) × Brand Dummy Variables)
1 | 1 | -0.232 | 4.381 | -0.823 | -0.844 | -0.831 | 0.000 | -0.809 | -0.833 |
1 | 2 | 2.313 | -0.876 | 4.115 | -0.844 | -0.831 | 0.000 | -0.809 | -0.833 |
1 | 3 | -0.520 | -0.876 | -0.823 | 4.219 | -0.831 | 0.000 | -0.809 | -0.833 |
1 | 4 | -0.520 | -0.876 | -0.823 | -0.844 | 4.153 | 0.000 | -0.809 | -0.833 |
1 | 6 | -1.619 | -0.876 | -0.823 | -0.844 | -0.831 | 0.000 | 4.043 | -0.833 |
1 | 7 | 0.578 | -0.876 | -0.823 | -0.844 | -0.831 | 0.000 | -0.809 | 4.164 |
2 | 1 | -0.769 | 4.206 | -0.988 | -1.013 | -1.027 | 0.000 | 0.000 | -0.999 |
2 | 2 | 2.855 | -1.052 | 3.953 | -1.013 | -1.027 | 0.000 | 0.000 | -0.999 |
2 | 3 | -0.769 | -1.052 | -0.988 | 4.050 | -1.027 | 0.000 | 0.000 | -0.999 |
2 | 4 | -1.463 | -1.052 | -0.988 | -1.013 | 4.109 | 0.000 | 0.000 | -0.999 |
2 | 7 | 0.147 | -1.052 | -0.988 | -1.013 | -1.027 | 0.000 | 0.000 | 3.998 |
3 | 1 | -0.665 | 4.381 | -0.822 | -0.844 | -0.856 | -0.768 | 0.000 | -0.815 |
3 | 2 | 2.108 | -0.876 | 4.109 | -0.844 | -0.856 | -0.768 | 0.000 | -0.815 |
3 | 3 | -1.763 | -0.876 | -0.822 | 4.219 | -0.856 | -0.768 | 0.000 | -0.815 |
3 | 4 | -1.763 | -0.876 | -0.822 | -0.844 | 4.280 | -0.768 | 0.000 | -0.815 |
3 | 5 | 1.281 | -0.876 | -0.822 | -0.844 | -0.856 | 3.838 | 0.000 | -0.815 |
3 | 7 | 0.802 | -0.876 | -0.822 | -0.844 | -0.856 | -0.768 | 0.000 | 4.075 |
4 | 1 | -1.300 | 3.943 | -1.234 | -1.234 | 0.000 | 0.000 | 0.000 | -1.213 |
4 | 2 | 1.098 | -1.314 | 3.701 | -1.234 | 0.000 | 0.000 | 0.000 | -1.213 |
4 | 3 | 0.491 | -1.314 | -1.234 | 3.701 | 0.000 | 0.000 | 0.000 | -1.213 |
4 | 7 | -0.289 | -1.314 | -1.234 | -1.234 | 0.000 | 0.000 | 0.000 | 3.639 |
5 | 1 | -0.432 | 4.381 | -0.822 | -0.825 | -0.856 | 0.000 | -0.809 | -0.809 |
5 | 2 | 1.094 | -0.876 | 4.112 | -0.825 | -0.856 | 0.000 | -0.809 | -0.809 |
5 | 3 | 0.261 | -0.876 | -0.822 | 4.124 | -0.856 | 0.000 | -0.809 | -0.809 |
5 | 4 | -2.042 | -0.876 | -0.822 | -0.825 | 4.280 | 0.000 | -0.809 | -0.809 |
5 | 6 | 1.216 | -0.876 | -0.822 | -0.825 | -0.856 | 0.000 | 4.043 | -0.809 |
5 | 7 | -0.096 | -0.876 | -0.822 | -0.825 | -0.856 | 0.000 | -0.809 | 4.043 |
6 | 1 | -0.129 | 4.381 | -0.862 | -0.844 | -0.856 | 0.000 | -0.809 | -0.809 |
6 | 2 | -0.129 | -0.876 | 4.309 | -0.844 | -0.856 | 0.000 | -0.809 | -0.809 |
6 | 3 | -0.822 | -0.876 | -0.862 | 4.219 | -0.856 | 0.000 | -0.809 | -0.809 |
6 | 4 | -1.227 | -0.876 | -0.862 | -0.844 | 4.280 | 0.000 | -0.809 | -0.809 |
6 | 6 | 1.663 | -0.876 | -0.862 | -0.844 | -0.856 | 0.000 | 4.043 | -0.809 |
6 | 7 | 0.644 | -0.876 | -0.862 | -0.844 | -0.856 | 0.000 | -0.809 | 4.043 |
variables are shown. In addition, we need the (d_j − 1/m) columns, where d_j is the usual brand dummy variable for each brand. Note that all variables sum to zero within each week. Note also that observations for which log(share) is missing are deleted prior to centering. The estimated values of a_2, a_3, ..., a_m, b_p1, b_p2, ..., b_pm based on the data in Table 5.8 are identical to those given in Table 5.7.
Bultez and Naert [1975] reported that estimation of the parameters of a differential-effects model via equations (5.14) and (5.15) is hampered by model-induced collinearity. To see their point, consider the data set shown in Table 5.9.
Table 5.9: Hypothetical Data for Differential-Effects Model
Week | Brand | log share | X1D1 | X1D2 | X1D3 | X2D1 | X2D2 | X2D3 |
(X1D1 - X1D3: X1 × Brand Dummies; X2D1 - X2D3: X2 × Brand Dummies)
1 | 1 | log(s11) | X111 | 0 | 0 | X211 | 0 | 0 |
1 | 2 | log(s21) | 0 | X121 | 0 | 0 | X221 | 0 |
1 | 3 | log(s31) | 0 | 0 | X131 | 0 | 0 | X231 |
2 | 1 | log(s12) | X112 | 0 | 0 | X212 | 0 | 0 |
2 | 2 | log(s22) | 0 | X122 | 0 | 0 | X222 | 0 |
2 | 3 | log(s32) | 0 | 0 | X132 | 0 | 0 | X232 |
3 | 1 | log(s13) | X113 | 0 | 0 | X213 | 0 | 0 |
3 | 2 | log(s23) | 0 | X123 | 0 | 0 | X223 | 0 |
3 | 3 | log(s33) | 0 | 0 | X133 | 0 | 0 | X233 |
. | . | . | . | . | . | . | . | . |
. | . | . | . | . | . | . | . | . |
This data set is for the estimation of regression model (5.16), in which three brands and two independent variables are assumed. (In actual estimation we would need brand and week dummy variables in addition to the variables above.) Collinearity (i.e., high correlation between two or more independent variables) is observed between the independent variables for the same brand, e.g., between X1D1 and X2D1, between X1D2 and X2D2, and between X1D3 and X2D3. The reason for this phenomenon is demonstrated mathematically later in this section, but it is easy to understand. Take variables X1D1 and X2D1, for example. These two variables have many zeroes in common for the same observations (weeks). When one computes the correlation between the two variables, those common zeroes artificially inflate the value of the correlation coefficient.
Because of the potential for artificially inflated correlations, Bultez and Naert warned against careless use of differential-effects models. Their warning was, however, somewhat premature. There are two aspects to the problem: the first concerns numerical analysis, and the second concerns the stability of parameter estimates.
Problems arise in numerical analysis when the cross-products matrix for a regression model becomes singular, or so nearly so that it cannot be inverted accurately. But the cross-products matrix for regression model (5.15) has a unique structure which is robust against the high correlations induced by the model structure. (This is not to say that it is robust against any high correlations.) To simplify the discussion, assume that observations are taken for only three weeks. Then the number of independent variables in the regression will be 11: the intercept term, two week dummy variables, two brand dummy variables, and the six variables X1D1 through X2D3. The cross-products matrix for this set of variables will look as follows.
|
In the above matrix, summation is always over t (in this case over three weeks).
Collinearity in regression becomes a numerical-analysis problem when a crossproducts matrix such as the above is nearly singular, so that its determinant is near zero. Since this matrix is in block form, the critical issue is whether the sub-matrix
$$\begin{bmatrix}
\sum X_{11t}^2 & 0 & 0 & \sum X_{11t}X_{21t} & 0 & 0 \\
0 & \sum X_{12t}^2 & 0 & 0 & \sum X_{12t}X_{22t} & 0 \\
0 & 0 & \sum X_{13t}^2 & 0 & 0 & \sum X_{13t}X_{23t} \\
\sum X_{11t}X_{21t} & 0 & 0 & \sum X_{21t}^2 & 0 & 0 \\
0 & \sum X_{12t}X_{22t} & 0 & 0 & \sum X_{22t}^2 & 0 \\
0 & 0 & \sum X_{13t}X_{23t} & 0 & 0 & \sum X_{23t}^2
\end{bmatrix}$$

(rows and columns ordered X1D1, X1D2, X1D3, X2D1, X2D2, X2D3)
is invertible. This matrix may be put in the form of a block-diagonal matrix by simple row-column operations and thus is invertible, if each of the following three matrices is invertible.
$$\begin{bmatrix} \sum X_{11t}^2 & \sum X_{11t}X_{21t} \\ \sum X_{11t}X_{21t} & \sum X_{21t}^2 \end{bmatrix}, \qquad
\begin{bmatrix} \sum X_{12t}^2 & \sum X_{12t}X_{22t} \\ \sum X_{12t}X_{22t} & \sum X_{22t}^2 \end{bmatrix}, \qquad
\begin{bmatrix} \sum X_{13t}^2 & \sum X_{13t}X_{23t} \\ \sum X_{13t}X_{23t} & \sum X_{23t}^2 \end{bmatrix}$$
This is to say that the sub-matrix is invertible as long as the original correlations between X11 and X21, X12 and X22, and X13 and X23 over t are not too high. This is true even if the apparent (model-induced) correlations between them are high. The important condition for the invertibility of the crossproducts matrix as a whole is that the correlations between the original variables Xkit and Xhit (h ≠ k) over t are not too high to begin with. (If the correlations between the original variables are high, composite measures, such as those based on principal components, will have to be used for any differential-effects market-share model to be effective!) This conclusion does not change if the independent variables are the logarithms of the original Xki's. Thus the numerical-analysis problems created by collinearity in the usual sense are not the real issue in this case.
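The block structure can be checked numerically. In the sketch below (invented data, two variables and three brands as in the text) the determinant of the crossproducts matrix of the six reduced-form columns equals the product of the three per-brand 2x2 determinants, so invertibility depends only on the original within-brand correlations:

```python
import numpy as np

# Sketch of the invertibility argument: the crossproducts matrix of the
# reduced-form columns X1D1..X2D3 has a determinant equal to the product of
# per-brand 2x2 determinants, so it is invertible exactly when no brand's
# original X1, X2 series are perfectly correlated (about zero).
rng = np.random.default_rng(1)
m, T = 3, 3                                    # three brands, three weeks
X1 = rng.uniform(1.0, 2.0, (m, T))             # original variable 1 per brand
X2 = rng.uniform(1.0, 2.0, (m, T))             # original variable 2 per brand

cols = []
for X in (X1, X2):                             # columns X1D1..X1D3, X2D1..X2D3
    for i in range(m):
        col = np.zeros(m * T)
        col[i * T:(i + 1) * T] = X[i]          # nonzero only on brand i's rows
        cols.append(col)
Z = np.column_stack(cols)

full_det = np.linalg.det(Z.T @ Z)              # determinant of the 6x6 matrix
block_dets = [
    np.linalg.det(np.array([[X1[i] @ X1[i], X1[i] @ X2[i]],
                            [X2[i] @ X1[i], X2[i] @ X2[i]]]))
    for i in range(m)
]
print(np.isclose(full_det, np.prod(block_dets)))   # True
```

Reordering the rows and columns brand by brand is a similarity by a permutation matrix, which leaves the determinant unchanged, so the full determinant factors into the per-brand blocks.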
Even though the matrix will usually be invertible, collinearity can still harm the regression estimates. A further look at the source and remedies for collinearity in these models is helpful. Since Bultez and Naert's [1975] discussion of the problem, their warning about collinearity in differential-effects attraction models has been echoed by Naert and Weverbergh and others.Naert, Philippe A. & Marcel Weverbergh [1981], ``On the Prediction Power of Market Share Attraction Models,'' Journal of Marketing Research, 18 (May), 146-153. Naert, Philippe A. & Marcel Weverbergh [1985], ``Market Share Specification, Estimation and Validation: Toward Reconciling Seemingly Divergent Views,'' Journal of Marketing Research , 22 (November), 453-61. Brodie, Roderick & Cornelius A. de Kluyver [1984], ``Attraction Versus Linear and Multiplicative Market Share Models: An Empirical Evaluation,'' Journal of Marketing Research, 21 (May), 194-201. Ghosh, Avijit, Scott Neslin & Robert Shoemaker [1984], ``A Comparison of Market Share Models and Estimation Procedures,'' Journal of Marketing Research, 21 (May), 202-210. Leeflang, Peter S. H. & Jan C. Reuyl [1984a], ``On the Predictive Power of Market Share Attraction Models,'' Journal of Marketing Research 21 (May), 211-215. Leeflang, Peter S. H. & Jan C. Reuyl [1984b], ``Estimators of the Disturbances in Consistent Sum-Constrained Market Share Models,'' Working Paper, Faculty of Economics, University of Gronigen, P.O. Box 9700 AV Gronigen, The Netherlands.While most of these articles also investigated differential-effects versions of multiplicative and linear-additive market-share models, no mention has been made in the marketing literature of possible collinearities in these model forms.
This section shows that the linear-additive and multiplicative versions of differential-effects market-share models suffer from the same sources of collinearities as the MCI and MNL versions. It is shown that the structural sources of collinearity are largely eliminated by two standardizing transformations - zeta-scores or the exponential transform of a standard z-score - discussed in section 3.8.
The three basic specifications of the differential-effects market-share models - linear-additive (LIN), multiplicative (MULT), and multiplicative competitive-interaction (MCI) or attraction versions - are given in equations (5.20 - 5.22) parallel to the definitions in Naert & Weverbergh's [1984] equations:
LIN
$$s_{it} = a_i + \sum_{k=1}^{K} b_{ki}\,X_{kit} + \varepsilon_{it}$$
(5.20)
MULT
$$s_{it} = \exp(a_i)\left(\prod_{k=1}^{K} X_{kit}^{\,b_{ki}}\right)\exp(\varepsilon_{it})$$
(5.21)
and MCI
$$s_{it} = \frac{\exp(a_i)\left(\prod_{k=1}^{K} X_{kit}^{\,b_{ki}}\right)\exp(\varepsilon_{it})}{\sum_{j=1}^{m}\exp(a_j)\left(\prod_{k=1}^{K} X_{kjt}^{\,b_{kj}}\right)\exp(\varepsilon_{jt})}$$
(5.22)
All of these models are reduced to their corresponding simple-effects versions by assuming:
$$b_{ki} = b_k \qquad (i = 1, 2, \ldots, m;\; k = 1, 2, \ldots, K)$$
The reduced form (simply the variables after they are transformed to be ready for input into a multiple-regression routine) resulting from this simplified estimation procedure allows us to see the similarities among all three specifications of the differential-effects model, as seen in Tables 5.2 and 5.9. Note in Table 5.9 that each differential effect has only one nonzero entry in each time period. The difference between the LIN and MULT models is just that the MULT model uses the log of the variable as the nonzero entry while the LIN model uses the raw variable. The difference between the MULT and MCI models is basically that the MCI form incorporates the series of time-period dummy variables from Table 5.2 which ensure that the estimated parameters are those of the original nonlinear model in equation (3.1). Another difference, of course, is that the estimates of market share in the MCI model come from inverse log-centering (Nakanishi & Cooper [1982]), while in the MULT model the exponential transformation of the estimated dependent variable serves as the market-share estimate. Inverse log-centering and the time-period dummy variables guarantee that the MCI model will provide logically consistent market-share estimates (all estimates being between zero and one, and summing to one over all brands in each time period), while neither the LIN nor the MULT model provides logically consistent estimates.
The problem of collinearity can be traced to within-brand effects. There is zero correlation between a time-period dummy variable and a brand-specific dummy variable. Since the time-period dummy variables cannot be a major source of collinearity, the MULT and MCI models do not differ substantially in their sources of collinearity. Nor do the correlations between effects for different brands contribute substantially to collinearity. For m brands the correlation between brand-specific dummy variables for different brands is -1/(m-1). With even ten brands there is only about 1% overlap in variance between the intercepts for different brands. An analogous result holds for the correlations between dummy variables for different time periods. The within-brand effects are analyzed in the next section.
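The -1/(m-1) figure is easy to verify numerically. A minimal check (layout invented for illustration, with brand-week rows stacked as in the reduced form):

```python
import numpy as np

# Correlation between brand-specific dummy variables in the stacked
# (brand x week) layout is exactly -1/(m-1), so with m = 10 brands the
# intercepts for two different brands share only about 1% of variance.
m, T = 10, 5
brand = np.repeat(np.arange(m), T)             # brand index of each row
d1 = (brand == 0).astype(float)                # dummy for brand 1
d2 = (brand == 1).astype(float)                # dummy for brand 2

r = np.corrcoef(d1, d2)[0, 1]
print(round(r, 4), round(r ** 2, 4))           # -0.1111 0.0123
```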
The special problems of jointly longitudinal and cross-sectional analysis have been discussed in psychometrics and econometrics, as well as in the quantitative-analysis areas of education, sociology, and geography. The earliest reference is to Robinson's covariance theorem (Robinson, W. S. [1950], ``Ecological Correlation and the Behavior of Individuals,'' American Sociological Review, 15, 351-357), which was presented by Alker (Alker, Hayward R. Jr. [1969], ``A Typology of Ecological Fallacies,'' in Mattei Dogan & Stein Rokkan (editors), Quantitative Ecological Analysis in the Social Sciences, Cambridge, MA: The M.I.T. Press, 69-86) as:
$$r_{XY} = WR_{XY}\sqrt{1 - E_{XR}^2}\,\sqrt{1 - E_{YR}^2} + ER_{XY}\,E_{XR}\,E_{YR}$$
(5.23)
where WRXY is the pooled within-group correlation of X and Y, ERXY is the between-group (ecological) correlation of the group means, and EXR and EYR are the correlation ratios of X and Y, respectively.
Looking again at Table 5.9 shows that for differential effects within a brand, all the nonzero entries are aligned and all the zero entries are aligned in the reduced form, and there is only one nonzero entry in each time period. This results in very simplified forms for the components of Robinson's covariance theorem. If we let xt and yt be the single nonzero entries in period t for column X and Y , respectively, then for our special case:
$$WR_{XY} = \frac{\sum_t x_t\,y_t}{\sqrt{\sum_t x_t^2}\,\sqrt{\sum_t y_t^2}}$$
This is a congruence coefficient, often used for assessing the agreement between ratio-scaled measures. (Tucker, Ledyard R [1951], ``A Method of Synthesis of Factor Analysis Studies,'' Personnel Research Section Report, No. 984, Washington, D.C.: Department of the Army. Also see Korth, Bruce & Ledyard R Tucker [1975], ``The Distribution of Chance Coefficients from Simulated Data,'' Psychometrika, 40, 3 (September), 361-372.) Because the mean levels of the variables influence the congruence, x and y of the same sign push WRXY toward 1.0 much faster than the simple correlation. For prices (greater than $1.00) and advertising expenditures the reduced form would have a series of positive log-values which might well produce a very large value of WRXY. For these same variables in share form (price-share or advertising-share), the reduced form would have matched negative numbers, which could still lead to large values of WRXY. For variables of consistently opposite signs, WRXY could be pushed toward -1.0 even in cases of modest simple correlations.
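The gap between the congruence coefficient and the simple correlation can be seen directly. A sketch with invented data (log prices above $1.00 and log advertising dollars, both strings of positive numbers as in the discussion above):

```python
import numpy as np

def congruence(x, y):
    # Tucker's congruence coefficient: a "correlation about zero" that is
    # inflated when x and y share a common sign and a nonzero mean level.
    return (x @ y) / np.sqrt((x @ x) * (y @ y))

# Illustrative within-brand reduced-form entries (invented data).
rng = np.random.default_rng(2)
T = 52
x = np.log(rng.uniform(1.50, 2.50, T))         # log price, all positive
y = np.log(rng.uniform(100.0, 200.0, T))       # log advertising, all positive

r = np.corrcoef(x, y)[0, 1]                    # simple correlation: near zero
wr = congruence(x, y)                          # pushed far toward 1.0
print(round(r, 2), round(wr, 2))
```

Even though the two series are statistically unrelated, their common sign and mean level drive the congruence coefficient close to 1.0.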
For both raw variables (e.g., price and advertising) and for marketing variables in their share form (e.g., relative price and advertising share) the correlation ratios EXR2 and EYR2 have a maximum value of 1/m .
$$E_{XR}^2 \leq \frac{1}{m}, \qquad E_{YR}^2 \leq \frac{1}{m}$$
(5.24)
So when correlating two effects within a brand we have at best:
$$r_{XY} = WR_{XY}\left(\frac{m-1}{m}\right) + ER_{XY}\left(\frac{1}{m}\right)$$
(5.25)
Thus the correlation rXY is composed of two parts. A small part, at most 1/m, is due to the simple correlation of the X and Y values for brand j over time periods. A very large part, at least (m-1)/m, is due to the congruence coefficient WRXY. Thus, for raw-score or share-form marketing variables, pairwise collinearity is likely for any two effects within a brand in differential-effects models. But collinearity is not merely a pairwise problem in these models. (For further discussion see Mahajan, Vijay, Arun K. Jain & Michel Bergier [1977], ``Parameter Estimation in Marketing Models in the Presence of Multicollinearity: An Application of Ridge Regression,'' Journal of Marketing Research, 14 (November), 586-591.) Collective collinearity among all the within-brand effects is very likely indeed. This is true for the differential-effects versions of the linear-additive model and the multiplicative model, as well as the MCI model. Fortunately there exist simple remedies, which are the topic of the next section.
The remedies for collinearity were hinted at in the Bultez and Naert [1975] article which first discussed the problem. They said, ``... if the variables have zero means'' the correlations in the extended model would be the same as the correlation in the simple model (p. 532). More precisely, it can be said that if the reduced form of the values for brand i for two different variables each have a mean of zero over time periods, then WRXY is equal to ERXY, and thus rXY would be equal to the simple correlation of the reduced forms of the brand i values. This remedy is not a general solution for all variables in a differential-effects model because forming deviation scores within a brand over time ignores competitive effects. One case where this remedy might be appropriate, however, is for a variable reflecting the promotion price of a brand. This variable would reflect current price as a deviation from a brand's historic average price.
As potential remedies, consider zeta-scores and the exponential transformation of standard scores discussed in Chapter 3 (section 3.8). Both transformations standardize the explanatory variables, making the information relative to the competitive context in each time period. There are several advantages to standardizing measures of marketing instruments in each time period. First, one should remember that the dependent measure (share or choice probability) is expressed in a metric which, while normalized rather than standardized, is still focused on representing within-time-period relations. Representations of the explanatory variables which have a similar within-time-period focus have the advantage of a compatible metric. In this respect, variables expressed in share form have as much of an advantage as zeta-scores or exp(z-scores). Any of the three would be superior to raw scores in reflecting the explanatory information in a way which aligns with the dependent variable. While raw prices might have a stronger relation with category volume or primary demand, relative prices could have more to do with how the total volume is shared among the competitors.
A second advantage applies to standardizations, rather than normalizations. In the reduced form, the means (of a brand over time periods) of a zeta-score or exp(z-score) are likely to be closer to zero than the corresponding means of the reduced form of a normalized variable. Thus WRXY for a zeta-score or exp(z-score) would be less inflated (closer to the value of the simple correlation ERXY) than would be the congruence coefficient for two within-brand effects represented in share form.
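A sketch of the exp(z-score) transform may help: standardize over the competitors within each time period, then exponentiate so the result is positive and usable in the multiplicative models. (The prices and the trend factor below are invented for illustration; the zeta-score of section 3.8 is a related but distinct transform.)

```python
import numpy as np

rng = np.random.default_rng(3)
m, T = 5, 52
trend = np.linspace(1.0, 1.5, T)               # rising price level over the year
P = rng.uniform(1.0, 3.0, (m, T)) * trend      # brand-by-week prices (invented)

def exp_z(X):
    # z-score over competitors within each period, then exponentiate.
    z = (X - X.mean(axis=0)) / X.std(axis=0)
    return np.exp(z)

E = exp_z(P)
# The within-brand means of log exp_z(P) (i.e., of the z-scores) sit near
# zero, while the raw log prices would carry the common upward trend.
print(np.round(np.log(E).mean(axis=1), 2))
```

Because the period mean is removed before exponentiating, the common trend disappears and each brand's reduced-form mean hovers near zero, which is exactly the property that deflates WRXY.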
Table 5.10 provides an empirical demonstration of the effects of zeta-scores and exp(z-scores) on collinearity, compared with raw scores and share scores. The data concern price and advertising measures representing competition among 11 brands in an Australian household-products category (Carpenter, Cooper, Hanssens, and Midgley [1988]). There are 11 differential-price effects, 10 differential-advertising effects, and 10 brand-specific intercepts in a differential-effects market-share model for this category. The tabled values are condition indices reflecting the extent of collinearity or near dependencies among the explanatory variables. A condition index is the ratio of the largest singular value (square root of the eigenvalue) to the smallest singular value of the reduced form of the explanatory variables in the market-share model (Belsley, David A., Edwin Kuh & Roy E. Welsch [1980], Regression Diagnostics: Identifying Influential Data and Sources of Collinearity, New York: John Wiley & Sons, 103-4). The higher the condition index, the worse the collinearity in the system of equations. Belsley, Kuh, and Welsch [1980] develop empirical evidence that weak dependencies are associated with condition indices between 5 and 10, moderate to strong relations are associated with indices of 30 to 100, and indices of 100 or more ``appear to be large indeed, causing substantial variance inflation and great potential harm to regression estimates'' (p. 153). Note in Table 5.10 that for raw scores, Xkit, all three models (LIN, MULT, and MCI) reflect potential problems. These problems are not remedied when marketing instruments are expressed in share form. As a market-share model which uses the share form of marketing instruments becomes more comprehensive, by including more brands, the problems would worsen.
This is because the price shares and advertising shares would, in general, become smaller, thus making the log of the shares negative numbers of larger and larger absolute value. This would press WRXY closer to +1.0.
Table 5.10: Condition Indices Australian Household-Products Example
Transformation of Raw Scores | ||||
Model | Raw Scores | Share Form | Zeta-Scores | Exp(Z-Scores) |
LIN | 3065 | 313 | 61 | 75 |
MULT | 484 | 3320 | 22 | 17 |
MCI | 627 | 3562 | 24 | 23 |
Standardizing within each competitive set using zeta-scores or exp(z-scores) has a dramatically favorable impact on the collinearity of the system of equations. The condition indices for the MULT and MCI models are less than 25. This is below the level indicating moderate collinearity, and far below the danger point. (The absolute standards given by Belsley, Kuh & Welsch [1980] for condition indices are probably too conservative. As the number of variables and observations increases we can expect the ratio of the largest and smallest singular values to grow larger. Further study is needed to see what boundaries are acceptable for large data sets.) Linear or nonlinear trends in the mean level of the raw variables are major contributors to collinearity. By removing the mean level of the raw variables in each time period, the two remedies illustrated in Table 5.10 both eliminate one major source of high (positive or negative) values of WRXY. By standardizing the variance over competitors in each time period, both remedies help keep the mean values for each brand over time nearer to zero.
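The condition index itself is straightforward to compute. The toy reduced form below (brand dummies plus one brand-specific price effect per brand, with invented prices, not the Australian data of Table 5.10) shows the same qualitative pattern: standardizing within each week lowers the index.

```python
import numpy as np

def condition_index(X):
    # Ratio of the largest to the smallest singular value of the matrix of
    # explanatory variables (Belsley, Kuh & Welsch).
    s = np.linalg.svd(X, compute_uv=False)
    return s.max() / s.min()

rng = np.random.default_rng(4)
m, T = 5, 52
P = rng.uniform(1.0, 3.0, (m, T))              # invented brand-by-week prices

def reduced_form(V):
    n = m * T
    D = np.zeros((n, m))                       # brand dummies
    Z = np.zeros((n, m))                       # brand-specific effects
    for i in range(m):
        D[i * T:(i + 1) * T, i] = 1.0
        Z[i * T:(i + 1) * T, i] = V[i]
    return np.hstack([D, Z])

z = (P - P.mean(axis=0)) / P.std(axis=0)       # standardize within each week
ci_raw = condition_index(reduced_form(P))
ci_std = condition_index(reduced_form(np.exp(z)))
print(ci_raw > ci_std)                         # standardizing lowers the index
```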
These basic results mean that, if one standardizes variables in a manner appropriate for these multiplicative models, it is practical to use differential-effects market-share models.
We now come to the estimation problems associated with the fully extended attraction (or cross-effects) model discussed in Chapter 3.
$$\mathcal{A}_{it} = \exp(\alpha_i + \varepsilon_{it})\prod_{j=1}^{m}\prod_{k=1}^{K} f_k(X_{kjt})^{\,b_{kij}}, \qquad s_{it} = \frac{\mathcal{A}_{it}}{\sum_{j=1}^{m}\mathcal{A}_{jt}}$$
(5.26)
As before, the fk in the above equation may be an identity (for an MCI model) or an exponential (for an MNL model) transformation. The most important property of the above model is, of course, the existence of the cross-effect parameters, bkij (i, j = 1, 2, …, m; k = 1, 2, …, K). We are now faced with the seemingly insurmountable problem of estimating (m × m × K) + m parameters.
Surprisingly, estimating the parameters of a cross-effects model is not very difficult, and in some sense easier than estimating the parameters of a differential-effects model. McGuire, Weiss, and Houston (McGuire, Timothy W., Doyle L. Weiss & Frank S. Houston [1977], ``Consistent Multiplicative Market Share Models,'' in Barnett A. Greenberg & Danny N. Bellenger (editors), Contemporary Marketing Thought, 1977 Educators' Proceedings (Series #41), Chicago: American Marketing Association) showed that the following regression models estimate the parameters of (5.26).
MNL Model:
|
(5.27) |
MCI Model:
|
(5.28) |
where s*it is the log-centered value of sit, the share of brand i in period t. The variable dj is the usual brand dummy variable, but its value changes depending on where it is used in the above equation. In the first summation, dj = 1 if j = i, and dj = 0 otherwise; in the second summation, dh = 1 if h = j, and dh = 0 otherwise. It must be pointed out that the b*kij in models (5.27 - 5.28) are not the same as the parameters bkij in model (5.26), but deviations of the form
$$b^*_{kij} = b_{kij} - \bar{b}_{k\cdot j}$$
where b̄k·j is the arithmetic mean of bkij over all brands (i = 1, 2, …, m). But it may be shown that the estimated values of the b*kij's are sufficient for computing the cross elasticities. Recall from Chapter 3 that the elasticity and cross elasticities of brand i's share with respect to a change in the kth variable for brand j are given by
MCI Model:
$$e_{s_i \cdot j} = b_{kij} - \sum_{h=1}^{m} s_h\,b_{khj}$$
MNL Model:
$$e_{s_i \cdot j} = \left(b_{kij} - \sum_{h=1}^{m} s_h\,b_{khj}\right) X_{kj}$$
Take the MCI version, for example. Substitute b*kij for bkij in the above equation.
$$e_{s_i \cdot j} = b^*_{kij} - \sum_{h=1}^{m} s_h\,b^*_{khj} = \left(b_{kij} - \bar{b}_{k\cdot j}\right) - \sum_{h=1}^{m} s_h\left(b_{khj} - \bar{b}_{k\cdot j}\right) = b_{kij} - \sum_{h=1}^{m} s_h\,b_{khj}$$
since the sum of sh over all brands is one. Thus the knowledge of the b*kij 's is sufficient to estimate esi.j for both the MCI-type and MNL-type cross-effects models.
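This sufficiency result is easy to verify numerically. A sketch (single variable, so the k subscript is dropped; the parameter matrix and shares are invented) assuming the Chapter-3 MCI form e = b_kij - sum_h s_h b_khj:

```python
import numpy as np

# Check that replacing b_ij by the deviation b*_ij = b_ij - mean_i(b_ij)
# leaves the MCI cross elasticities unchanged (invented parameters/shares).
rng = np.random.default_rng(5)
m = 4
b = rng.normal(size=(m, m))                    # b[i, j]: effect of brand j on brand i
s = rng.dirichlet(np.ones(m))                  # market shares, summing to one

b_star = b - b.mean(axis=0)                    # deviations from column means

e_from_b = b - s @ b                           # e[i, j] for all pairs at once
e_from_b_star = b_star - s @ b_star
print(np.allclose(e_from_b, e_from_b_star))    # True
```

The column mean subtracted from b cancels exactly because the shares sum to one, which is the algebra shown above.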
Let us apply the regression model proposed by McGuire et al. to the illustrative data in Table 5.1. Since the data necessary for estimation involve 56 variables (including the intercept term), no table of the data set-up is shown. Only the estimation results are given in Table 5.11. The model was estimated without the intercept. The notation for the independent variables, LPiDj, where i and j are appropriate numbers, indicates the effect of the log(price) of the ith brand on brand j's market share. There is a warning that the model is not full rank, because there are only four observations for brand 5 with a positive market share. Direct-effect parameters, LPiDi's, for brands 1 through 4 are negative and statistically significant; the others are non-significant. Cross-effect parameters are mostly positive and/or statistically non-significant, but one of them, LP7D6, is negative and significant. Although we should refrain from making generalizations from this one set of data, it is perhaps justified to say that, as we move toward more complex models, the limitations of the test data set become obvious. The number of observations is too small to provide stable parameter estimates. Furthermore, there seem to be factors other than price which affect the market shares of margarine in this store. It is desirable, then, to obtain more data, especially from more than one store, along with information on marketing variables other than price.
Table 5.11: Regression Results for Cross-Effects Model (MCI)
Model: MODEL1 | |||||
Note : no intercept in model. R-square is redefined. | |||||
Dep Variable: LSHARE | |||||
Analysis of Variance | |||||
Sum of | Mean | ||||
Source | DF | Squares | Square | F Value | Prob > F |
Model | 52 | 92.58365 | 1.78045 | 8.451 | 0.0001 |
Error | 32 | 6.74182 | 0.21068 | ||
U Total | 84 | 99.32547 | |||
Root MSE | 0.45900 | R-Square | 0.9321 | ||
Dep Mean | -0.00000 | Adj R-Sq | 0.8218 | ||
C.V. | -7.89278E+17 | ||||
NOTE : | Model is not full rank. Least-squares solutions for the | ||||
parameters are not unique. Some statistics will be | |||||
misleading. A reported DF of 0 or B means that the | |||||
estimate is biased. The following parameters have been | |||||
set to 0, since the variables are a linear combination | |||||
of other variables as shown. | |||||
LP4D5 = +2.9223*D5 + 0.9531*LP1D5 + 0.7226*LP2D5 - 0.2564*LP3D5 | |||||
LP5D5 = +5.8144*D5 - 0.2300*LP1D5 | |||||
LP6D5 = +6.3121*D5 - 0.2992*LP1D5 - 0.7777*LP2D5 + 0.7946*LP3D5 | |||||
LP7D5 = +5.2704*D5 + 0.1566*LP1D5 - 0.0181*LP2D5 - 0.2200*LP3D5 | |||||
Parameter Estimates | |||||
Parameter | Standard | T for H0 | |||
Variable | DF | Estimate | Error | Parm=0 | Prob > |T| |
D1 | 1 | 73.924694 | 46.10070856 | 1.604 | 0.1186 |
D2 | 1 | -38.091548 | 46.10070856 | -0.826 | 0.4148 |
D3 | 1 | 38.942488 | 46.10070856 | 0.845 | 0.4045 |
D4 | 1 | -79.689368 | 64.73512217 | -1.231 | 0.2273 |
D5 | B | -52.022706 | 14.10815828 | -3.687 | 0.0008 |
D6 | 1 | 59.130760 | 69.49275660 | 0.851 | 0.4012 |
D7 | 1 | -17.793812 | 46.10070856 | -0.386 | 0.7021 |
LP1D1 | 1 | -5.096306 | 1.97465203 | -2.581 | 0.0146 |
LP2D1 | 1 | -0.365029 | 2.43911085 | -0.150 | 0.8820 |
LP3D1 | 1 | 1.507052 | 2.27537629 | 0.662 | 0.5125 |
LP4D1 | 1 | -2.353595 | 1.89200824 | -1.244 | 0.2225 |
LP5D1 | 1 | 0.503063 | 0.70624068 | 0.712 | 0.4814 |
LP6D1 | 1 | -6.100657 | 3.98554609 | -1.531 | 0.1357 |
LP7D1 | 1 | -2.894448 | 4.25472197 | -0.680 | 0.5012 |
LP1D2 | 1 | -0.252472 | 1.97465203 | -0.128 | 0.8991 |
Parameter Estimates | |||||
Parameter | Standard | T for H0 | |||
Variable | DF | Estimate | Error | Parm=0 | Prob > |T| |
LP2D2 | 1 | -8.625451 | 2.43911085 | -3.536 | 0.0013 |
LP3D2 | 1 | 2.107563 | 2.27537629 | 0.926 | 0.3613 |
LP4D2 | 1 | 3.041118 | 1.89200824 | 1.607 | 0.1178 |
LP5D2 | 1 | 0.800421 | 0.70624068 | 1.133 | 0.2655 |
LP6D2 | 1 | 1.336924 | 3.98554609 | 0.335 | 0.7395 |
LP7D2 | 1 | 9.615896 | 4.25472197 | 2.260 | 0.0308 |
LP1D3 | 1 | -0.128008 | 1.97465203 | -0.065 | 0.9487 |
LP2D3 | 1 | 1.150772 | 2.43911085 | 0.472 | 0.6403 |
LP3D3 | 1 | -6.671369 | 2.27537629 | -2.932 | 0.0062 |
LP4D3 | 1 | -0.446255 | 1.89200824 | -0.236 | 0.8150 |
LP5D3 | 1 | 0.378551 | 0.70624068 | 0.536 | 0.5957 |
LP6D3 | 1 | -0.622859 | 3.98554609 | -0.156 | 0.8768 |
LP7D3 | 1 | -1.518813 | 4.25472197 | -0.357 | 0.7235 |
LP1D4 | 1 | -1.081137 | 2.35763232 | -0.459 | 0.6496 |
LP2D4 | 1 | 6.997517 | 3.37627497 | 2.073 | 0.0463 |
LP3D4 | 1 | -3.559763 | 3.99419070 | -0.891 | 0.3795 |
LP4D4 | 1 | -6.089339 | 1.89510791 | -3.213 | 0.0030 |
LP5D4 | 1 | 0.194514 | 0.74768568 | 0.260 | 0.7964 |
LP6D4 | 1 | 12.210535 | 7.30809600 | 1.671 | 0.1045 |
LP7D4 | 1 | 7.680597 | 4.90866219 | 1.565 | 0.1275 |
LP1D5 | B | 6.205448 | 2.34288476 | 2.649 | 0.0124 |
LP2D5 | B | 3.608572 | 3.17368514 | 1.137 | 0.2640 |
LP3D5 | B | 0.569965 | 3.76236952 | 0.151 | 0.8805 |
LP4D5 | 0 | 0 | 0.00000000 | . | . |
LP5D5 | 0 | 0 | 0.00000000 | . | . |
LP6D5 | 0 | 0 | 0.00000000 | . | . |
LP7D5 | 0 | 0 | 0.00000000 | . | . |
LP1D6 | 1 | 3.523658 | 2.50575576 | 1.406 | 0.1693 |
LP2D6 | 1 | 0.112065 | 3.39656633 | 0.033 | 0.9739 |
LP3D6 | 1 | -0.322265 | 4.07062686 | -0.079 | 0.9374 |
LP4D6 | 1 | 1.837399 | 2.12068866 | 0.866 | 0.3927 |
LP5D6 | 1 | 1.098908 | 0.83529373 | 1.316 | 0.1977 |
LP6D6 | 1 | -1.221249 | 7.39266721 | -0.165 | 0.8698 |
LP7D6 | 1 | -17.414894 | 5.82079344 | -2.992 | 0.0053 |
LP1D7 | 1 | 0.104280 | 1.97465203 | 0.053 | 0.9582 |
LP2D7 | 1 | 1.630654 | 2.43911085 | 0.669 | 0.5086 |
LP3D7 | 1 | 2.086093 | 2.27537629 | 0.917 | 0.3661 |
LP4D7 | 1 | 1.615467 | 1.89200824 | 0.854 | 0.3995 |
LP5D7 | 1 | -0.313301 | 0.70624068 | -0.444 | 0.6603 |
LP6D7 | 1 | -2.643566 | 3.98554609 | -0.663 | 0.5119 |
LP7D7 | 1 | 1.060157 | 4.25472197 | 0.249 | 0.8048 |
It should be pointed out that the parameter estimates of Table 5.11 may be obtained by applying a simple regression model of the following form to the data for each brand separately.
$$s^*_{it} = a_i + \sum_{j=1}^{m} b^*_{ij}\,\log(P_{jt}) + \varepsilon_{it}$$
(5.29)
In the above equation, ai is simply the intercept term for brand i. The parameters thus estimated are identical to those in Table 5.11, although the significance level of each parameter is usually different from the one in Table 5.11, because the t-statistic and the associated degrees of freedom are not the same. If one wishes only parameter estimates, model (5.26) is simpler to calibrate than model (5.13). (If we replace log(Pjt) with Pjt, the corresponding MNL model can be estimated.)
The fact that (5.29) may be used to estimate the parameters of (5.26) has an extremely important implication. Note that, in estimating (5.29), the data for every brand involve the same set of independent variables, log(P1t), log(P2t), …, log(Pmt), plus an intercept term. One may summarize model (5.29) for the m brands in the following multivariate regression model.
$$Y = XB + E$$
(5.30)
where Y is the T × m matrix whose columns are the log-centered shares of the m brands, X is the T × (m+1) matrix whose rows contain 1, log(P1t), …, log(Pmt), B is the (m+1) × m matrix of parameters, and E is the T × m matrix of error terms.
Recall that our assumptions for the specification-error term are still applicable to the error term, eit, in the above model. It is well known that under our assumptions on the error term, the OLS procedure, applied to each column of Y in (5.30) separately, yields the best linear-unbiased estimates (BLUE) of the parameters in B (see, for example, Finn, Jeremy D. [1974], A General Model for Multivariate Analysis, New York: Holt, Rinehart & Winston). In other words, it is not necessary to resort to the GLS procedure to obtain minimum-variance estimates of a cross-effects model such as (5.27) or (5.28).
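The equivalence of column-by-column OLS and the joint multivariate solution can be confirmed directly. A sketch with simulated data (all numbers invented; the design matrix contains an intercept and the log prices of all brands, as in the text):

```python
import numpy as np

# OLS applied to each column of Y separately coincides with the multivariate
# least-squares solution of Y = XB + E, so the cross-effects model can be
# calibrated brand by brand.
rng = np.random.default_rng(6)
T, m = 52, 4
X = np.hstack([np.ones((T, 1)),                           # intercept column
               np.log(rng.uniform(1.0, 3.0, (T, m)))])    # log prices of all brands
B_true = rng.normal(size=(m + 1, m))
Y = X @ B_true + 0.1 * rng.normal(size=(T, m))            # one column per brand

B_by_brand = np.column_stack(
    [np.linalg.lstsq(X, Y[:, i], rcond=None)[0] for i in range(m)])
B_joint = np.linalg.lstsq(X, Y, rcond=None)[0]
print(np.allclose(B_by_brand, B_joint))                   # True
```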
This fact, combined with the availability of equation (5.29) for brand-by-brand estimation, reduces the task of estimating the parameters of a cross-effects model and increases its usefulness as a market-diagnostic tool. When one has a sufficient number of observations (that is, T > 1 + m × K), it is perhaps best to estimate a cross-effects model first, and then, after examining the pattern of estimated coefficients, determine whether a simpler model, such as the simple attraction model or a differential-effects model, is adequate. When the number of observations is barely sufficient for a cross-effects model, one may adopt the strategy of estimating a full cross-effects model first and then restricting some elements of the B matrix (the parameter matrix) to be zero (cf. Carpenter, Cooper, Hanssens, & Midgley [1988]). In this case, however, the OLS procedure is not applicable and a GLS procedure will have to be used.
So far we have considered the various techniques which may be used to estimate the parameters of market-share models, but the forecasting of brand sales volumes requires more than the knowledge of market shares. Because the sales volume of a given brand in a period is the product of the brand's share and the (total) category sales volume for the period, one also needs forecasts of category sales volumes. (Hereafter we will use category volume instead of industry sales volume, since the former fits better in the context of stores and market shares.)
In this section we deal with the estimation of the parameters of category-volume models. Compared with the market-share estimation, the modeling for category sales volumes is a more straightforward application of econometric techniques. The illustrative data in Table 5.1 include the average daily sales volumes of margarine for this store. We will use these data to show some examples of category-volume models.
In this particular data set, brand price is the only marketing variable. We hypothesize that if the overall price level is low, the total volume will be high. We also hypothesize that if sales are extremely high in one week, the sales in the following weeks should be low because the store customers have not used up their stock. In order to represent those two hypotheses, we propose the following model.
$$Q_t = a + b\,\log \tilde{P}_t + c\,Q_{t-1} + \varepsilon_t$$
(5.31)
where Qt is the category sales volume (average daily sales) in week t, and a, b, and c are parameters.
We let the geometric mean of prices in a period be P̃t. The following is the estimation result.
|
T-values are in the parentheses directly below the corresponding parameter estimates. The fit of the model is acceptable, judging from the R2-value of 0.58 . The estimated parameters and their t-values bear out our initial guess that the average price level in the week and the sales volume in the preceding week are influential in determining the category volume.
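A calibration of this kind can be sketched as follows, assuming, per the hypotheses above, a model that is linear in the log of the geometric-mean price and in the previous week's volume. All numbers are simulated, not the Table 5.1 margarine data:

```python
import numpy as np

rng = np.random.default_rng(7)
m, T = 5, 52
P = rng.uniform(1.0, 2.0, (m, T))              # invented brand prices by week
Pg = np.exp(np.log(P).mean(axis=0))            # geometric-mean price per week

# Simulate volumes from the hypothesized structure: a low overall price level
# raises volume, and a big week depresses the next week (stock-up effect).
Q = np.empty(T)
Q[0] = 100.0
for t in range(1, T):
    Q[t] = (150.0 - 60.0 * np.log(Pg[t])
            - 0.3 * Q[t - 1] + rng.normal(0.0, 2.0))

# OLS for Q_t = a + b log(Pg_t) + c Q_{t-1} + e_t
X = np.column_stack([np.ones(T - 1), np.log(Pg[1:]), Q[:-1]])
a, b, c = np.linalg.lstsq(X, Q[1:], rcond=None)[0]
print(b < 0, c < 0)                            # both hypotheses borne out
```

Note the caveat discussed later in this section: with a lagged dependent variable on the right-hand side, OLS estimates are biased, so in practice time-series procedures are preferable when enough observations are available.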
There is another line of thought concerning the effect of price on category volume: the prices of different brands may have differential effects on category volume. A brand's price reduction may increase its share but not affect category volume, while another brand's price reduction may increase both its share and category volume. To incorporate differential effects of brand price, we propose the following model.
$$Q_t = a + \sum_{i=1}^{m} c_i\,\log P_{it} + b\,Q_{t-1} + \varepsilon_t$$
(5.32)
where the ci 's are the differential price-effect parameters. The estimation results for this model are given below.
|
R-Square = 0.9581
The fit of the model is much improved. Brands 2 and 5 have significant effects on category volume, indicating that when those brands cut prices the customers of this store purchase more than their usual amounts, and that the following week's total volume suffers as a consequence. Note that the brand sales elasticity with respect to price, which measures the overall impact of brand i's price on its sales volume, is decomposed into two components:
$$e_{Q_i \cdot P_i} = e_{s_i \cdot P_i} + e_{Q \cdot P_i} \qquad (\text{since } Q_{it} = s_{it}\,Q_t)$$
For example, if we assume the differential-effects model, then
$$e_{Q_i \cdot P_i} = b_{p_i}\,(1 - s_{it}) + \frac{c_i}{Q_t}$$
where ci is the differential price-effect parameter in model (5.32) and bpi is estimated by one of the models (5.14 - 5.17).
With the R2 -value of 0.96, equation (5.32) should give reasonably good estimates of category volumes. The positive sign of the estimated parameter for logP7t poses a theoretical problem, but it probably reflects the effects of some marketing activities within the store which are not included in the model. As a forecasting model for category volume, this model should be used as it is.
Model (5.32) is in the form of a distributed-lag model. It is known that the ordinary least-squares procedure applied to (5.32) yields biased estimates of the model parameters. If an adequate number of observations is available, it is recommended that time-series analysis procedures be used for parameter estimation. Weekly data produce a sufficient number of observations in two years for a time-series analysis model. If the number of observations is less than 50, however, it is perhaps best to use the OLS procedure.
These simplest category-volume models are linear in the effects of previous category volume while being linear in the logs of prices. As we incorporate marketing variables other than price, it is advisable to postulate more general, fully interactive models such as:
$$\log Q_t = a + \sum_{i=1}^{m} c_i\,\log P_{it} + \sum_{i=1}^{m}\sum_{k=2}^{K} d_{ki}\,X_{kit} + g\,\log Q_{t-1} + \varepsilon_t$$
(5.33)
The reduced form of such a model may be characterized as being a log-log model in the effects of price and previous category volume, and log-linear in the other marketing variables (such as newspaper features, in-store displays and other marketing instruments which may be binary variables). This general form will be used with the coffee-market example developed in section 5.12.
In Chapter 6 we deal with the market-structure analysis based on the factor analysis of market-share elasticities. The reader may recall that there are two types of market-share elasticities, namely, point- and arc-share elasticities. Since the elasticities obtainable in practice are arc elasticities, one may think of factor-analyzing arc elasticities to investigate the structure of the market and competition. Unfortunately, this is not at all feasible.
Recall the definition of an arc elasticity for variable Xk .
$$e_{s_i} = \frac{\Delta s_i}{\Delta X_{ki}} \cdot \frac{X_{ki}}{s_i}$$
Δsi in the above definition is not the total change in si, but the change corresponding to the change in Xki, ΔXki. We have no means of separating the effects of various marketing variables on market shares unless, of course, we apply some model to observed market shares. Indeed, it is the main purpose of the models discussed in this book to identify the effects of marketing variables. Thus, in order to estimate share elasticities specific to a marketing variable, we propose first to estimate the parameters of a market-share model from a data set (i.e., brand shares and marketing variables), and then to use the theoretical expressions for point elasticities (see Chapter 3) for the relevant model to obtain elasticity estimates.
A numerical example may clarify this procedure. When we applied the raw-score attraction model to the margarine data in Table 5.1, we obtained a price-parameter estimate of -8.337. If a brand's share is 0.2, the point-elasticity estimate is given by -8.337 × (1 - 0.2) = -6.67. Although we are unable to estimate arc elasticities in this manner, point-elasticity estimates will serve as approximations for arc elasticities.
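The point-elasticity calculation above is simple enough to script; the formula e = b(1 - s) for the simple-effects MCI model comes from Chapter 3, and the function name here is ours:

```python
# Point-elasticity formula for a simple-effects MCI model: e = b * (1 - s),
# where b is the estimated parameter (e.g., for price) and s the brand's share.
def mci_point_elasticity(b, share):
    return b * (1.0 - share)

# Using the margarine estimate quoted above:
e = mci_point_elasticity(-8.337, 0.2)
print(round(e, 2))   # -6.67
```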
Since the dependent variable in log-linear regression is the logarithm of either market shares or the number of units sold, the dependent variable cannot be computed when an observed market share or unit count is zero. In any data-collection procedure one may observe a zero market share or zero units sold for some brand-period combination. There are two procedures for handling data sets which contain zero market shares.
The first is to assign some arbitrarily small value (0.001, say) to zero market shares. But this procedure amounts to assigning a large negative value to log 0, and tends to bias the estimated parameter values. (The smaller the assigned value, the greater the absolute values of the estimated parameters.)
The second procedure is to delete from the data set those brand-period combinations for which observed market shares are zero.Young, Kan H. & Lin Y. Young [1975], ``Estimation of Regressions Involving Logarithmic Transformations of Zero Values in the Dependent Variables,'' The American Statistician , 29 (August), 118-20. Though this procedure may seem arbitrary at first glance, it has some logic of its own. First, if a brand were not bought in a certain period, that would be sufficient basis to infer that the brand was not in the consumers' choice set. Second, since one is usually more interested in accurately estimating the behavior of those brands which command large shares, it may be argued that one need not bother with those brands which often take zero market shares. Third, the fact that zero market shares are not usable for estimation is not a problem limited to log-linear regression procedures. Consider, for example, the case in which the share estimate for brand i in period t is based on the number of consumers who purchased that brand, nit (i = 1, 2, …, m). Assuming that the numbers {n1t, n2t, …, nmt} are generated by a multinomial process (see section 5.1.1 on maximum-likelihood estimation), one may wish to use a maximum-likelihood procedure for estimating the parameters of attraction models. Note, however, that those observations for which nit = 0 do not contribute at all to the likelihood function (5.2). In a sense, the maximum-likelihood procedure also ignores all brand-period combinations for which nit = 0 .
There are two drawbacks to the deletion of zero market shares. One is the reduction in degrees of freedom due to the deletion. But this drawback may be compensated for by a proper research design: if the number of brands per period is reduced by the deletion, the number of periods (or areas) may be increased to obtain adequate degrees of freedom. The second drawback is that the estimated parameters are somewhat biased (in the direction of smaller absolute values). But we believe that the biases introduced by this procedure are far smaller than those introduced by replacing zero shares with an arbitrarily small constant. It may be added that we found in our simulation studies that the true parameter values lie between those estimated after deleting zero-share observations and those estimated after replacing zero shares with an arbitrary constant. This finding leads us to consider another somewhat arbitrary, and so far untested, procedure which adds a small constant to all brand-period combinations, regardless of whether they have zero shares or not. In other words, we suggest that the dependent variable, log sit , be replaced by log(sit + c) , where sit is the share of brand i in period t and c is the arbitrary constant. We found that, if one selects the value of c properly, the estimated parameters are free of the biases which the other two procedures tend to create. The appropriate value of c seems to vary from one data set to the next, and so far we have been unable to find a logic for determining the correct value of c for a particular data set. Here we only indicate that a fruitful course of research may lie in the direction of this estimation procedure.
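The three treatments of zero shares can be sketched as follows (the share values and the constants are illustrative):

```python
import math

shares = [0.45, 0.30, 0.25, 0.0]   # one brand with a zero observed share

# Procedure 1: replace zero shares with an arbitrarily small constant.
small = 0.001
logs_replace = [math.log(s if s > 0 else small) for s in shares]

# Procedure 2: delete the zero-share observations outright.
logs_delete = [math.log(s) for s in shares if s > 0]

# Procedure 3 (the untested variant discussed above): add a constant c to
# every share, zero or not, before taking logs.
c = 0.01
logs_shift = [math.log(s + c) for s in shares]

print(len(logs_replace), len(logs_delete), len(logs_shift))   # 4 3 4
```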
Zero market shares create particularly difficult problems for the multivariate regression in (5.30). The missing market share for one brand may cause the observation to be deleted from all the regressions. In cases such as this, when it is particularly important to have all the dependent measures present, the EM algorithm discussed by Malhotra could be useful.Malhotra, Naresh [1987], ``Analyzing Market Research Data with Incomplete Information on the Dependent Variable,'' Journal of Marketing Research , XXIV (February), 74-84.
When imputing values which are missing in the data, one should always ask why the data are missing. The imputation literatureFor an excellent treatment see Little, Roderick J. A. & Donald B. Rubin [1987], Statistical Analysis with Missing Data . New York: John Wiley & Sons, Inc. treats data as missing at random (MAR), missing completely at random (MCAR), or missing by unknown mechanisms (MBUM), but rarely do these conditions fit the zero market shares in POS data. If a brand simply is not distributed in one or more of the retail outlets, neither the MAR, MCAR, nor MBUM assumptions are appropriate. Even if the brand is distributed, it is not always possible to tell whether a zero market share results from an out-of-stock condition or simply from no sales. But in either case these conditions are neither random nor due to unknown mechanisms. One clue comes from the other data associated with a brand. If price and promotional variables are present for the zero-market-share brand, one can assume the brand is distributed, but nothing more; the problem then concerns only imputing the value of the dependent measure. If price and promotional measures are also missing, the imputation problem is more severe. Widely differing patterns of distribution would greatly complicate the multivariate regression in (5.30). In such cases it is probably simpler to delete the missing observations in the market-share model, and use the method discussed in section 5.12 for estimating cross effects.
While simply deleting the observation is an acceptable solution to the problem of differing patterns of distribution in market-share models, it is not an acceptable approach to this problem in category-volume models. Zero market share isn't the issue, since the dependent measure is the (log of) total sales volume. But missing values for prices are particularly worrisome, since we cannot take the log of a missing value. In the market-share model for POS data, there is an observation for each brand in each store in each week. For the corresponding category-volume model there is just an observation for each store in each week. The measures in an observation reflect the influence of each brand's prices and promotional activity on total volume. If we were to delete the whole observation whenever a single brand was not in distribution, widely differing distribution patterns over stores could result in the deletion of all observations. We wish to minimize the influence that the missing value has on the parameter corresponding to that measure, but allow the other measures in the observation to have their normal influence in parameter estimation.
While developing an algorithm to minimize the influence of missing prices is a worthwhile topic for future research, there is a simple approach for achieving a reasonable result in the interim. We merely need to create brand-absence dummy variables, which take a value of one when the brand is absent and a value of zero when it is present. If we then replace the missing (log) price with a zero, the parameter of the brand-absence measure shows the penalty uniquely associated with not distributing the brand. This approach will be illustrated in the next section.
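A minimal sketch of this dummy-variable device (the price values are hypothetical, and None stands for a missing price):

```python
import math

# Hypothetical weekly record for one brand across four stores: price is None
# when the brand is not in distribution.
prices = [2.33, None, 2.29, 2.41]

# Brand-absence dummy: one when absent, zero when present.
absent = [1 if p is None else 0 for p in prices]

# Replace the missing (log) price with a zero, as described above; the
# brand-absence parameter then absorbs the penalty for non-distribution.
log_price = [0.0 if p is None else math.log(p) for p in prices]

print(absent)      # [0, 1, 0, 0]
print(log_price)   # missing log price replaced by zero
```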
To illustrate the use of these estimation techniques on POS data, consider the ground, caffeinated coffee market. Data, provided by Information Resources, Inc., from BehaviorScan stores in two cities, report price, newspaper feature, in-store display and store-coupon activity for all brands. The small-volume, premium brands were aggregated into an ``All Other Branded'' (AOB) category, and the small ``Private Label'' (PL) brands were aggregated into an ``All Other Private Label'' (AOPL) category. Consequently, twelve brands of coffee were analyzed: Folgers, Regular Maxwell House, Maxwell House Master Blend, Hills Bros., Chock Full O'Nuts, Yuban, Chase & Sanborne, AOB, PL 1, PL 2, PL 3, and AOPL. For eighteen months, each week's data for a brand were aggregated over package weights, and over stores-within-grocery chains in the two cities. These are aggregate data from stores, not discrete-choice data from BehaviorScan consumer panels. Price for each brand was aggregated into average price per pound, net of coupons redeemed. Feature, display and coupon were represented as percent of volume sold on promotions of each type to allow for aggregation over stores with slightly differing promotional environments. The data were divided into a year for calibration of the market-share model, and six months for cross-validation. The average price and market share of each brand appear in Table 5.12.
Table 5.12: Coffee Data - Average Prices and Market Shares

Brand                    | Average Price/lb. | Average Share (%)
Folgers                  | $2.33             | 28.5
Maxwell House            | $2.22             | 24.2
Master Blend             | $2.72             |  7.8
Hills Bros.              | $2.13             |  4.3
Chock Full O'Nuts        | $2.02             | 15.3
Yuban                    | $3.11             |  0.2
Chase & Sanborne         | $2.34             |  0.3
All Other Branded        | $2.64             |  2.4
Private Label 1          | $1.99             |  3.9
Private Label 2          | $1.95             |  3.6
Private Label 3          | $1.93             |  3.7
All Other Private Labels | $1.95             |  5.7
With four marketing instruments per brand, the full cross-effects model would have 587 parameters (4 × 12 × 12 + 11). To avoid estimating so many parameters, an asymmetric market-share model was estimated by procedures similar to those discussed in Carpenter, Cooper, Hanssens, and Midgley [1988]. (Carpenter et al. suggest forming dynamically weighted attraction components to deal with the lagged effects of marketing instruments. Chapter 3 discusses alternative methods for specifying the dynamic components, but neither of these approaches was used in this illustration. Store-week data are sufficiently disaggregate that they rarely have the complex time-series properties dealt with in Carpenter et al., so no dynamically weighted attraction components were needed.) The distinctiveness of marketing efforts was incorporated by using exp(z-scores) for each marketing instrument. A differential-effects model was estimated by OLS procedures, with a unique parameter for each brand's price, feature, display, and store coupons, and a brand-specific intercept for the qualitative features of each brand. The brand-specific intercept which was closest to zero (PL 2) was set to zero to avoid singularity. The residuals from this differential-effects model were cross-correlated, brand by brand, with the transformed contemporaneous explanatory variables for all other brands. The cross-competitive effects which were significant in this residual analysis were entered into the model. (The criteria for inclusion of a cross effect were that it had to be based on more than 52 observations and that the correlation had to be significant beyond the .05 level.)
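The exp(z-score) transform of a marketing instrument, standardized over brands within a single store-week, can be sketched as follows (the four prices are the Table 5.12 averages for Folgers, Maxwell House, Master Blend, and Hills Bros., used purely for illustration):

```python
import numpy as np

# Distinctiveness transform: for one instrument in one store-week, standardize
# over brands and exponentiate; exp(z) becomes the brand's attraction
# component for that instrument.
def exp_z(values):
    v = np.asarray(values, dtype=float)
    z = (v - v.mean()) / v.std()     # z-scores over brands (assumes v.std() > 0)
    return np.exp(z)

prices = [2.33, 2.22, 2.72, 2.13]    # one store-week, four brands
comp = exp_z(prices)
print(comp)
# A brand priced above the week's average gets a component > 1; a negative
# price parameter then converts that distinctively high price into lower
# attraction when the component is raised to the parameter's power.
```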
This specification approach leads to a generalized attraction model:
Ait = exp(ai + εi) × ∏k=1K [exp(zkit)]bki × ∏(k*, j*) ∈ Ci [exp(zk*j*t)]bk*ij*
where ai is brand i 's constant component of attraction, εi is specification error, bki is brand i 's market-response parameter for the kth marketing-mix element, exp(zkit) is brand i 's attraction component for the kth marketing-mix element (standardized over brands within a store-week), Ci is the set of cross-competitive effects on brand i , exp(zk*j*t) is the standardized attraction component of the cross-competitive influence of brand j* 's marketing-mix element k* on brand i , (k*, j*) ∈ Ci , and bk*ij* is the cross-effect parameter for the influence of brand j* 's attraction component k* on brand i 's market share.
For the final model the residuals from the OLS estimation were used to estimate the error variances for each brand. The weights for a regression were formed as
wi = 1 / σ̂i

where σ̂i is the estimated standard deviation of the errors for brand i .
These weights compensate for heteroscedasticity of error variances over brands, but do not treat the possibility of nonzero error covariances. The results for the calibration period of 52 weeks appear in Table 5.13.
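A sketch of the weighting step, assuming (as one common WLS choice, consistent with the description above) that each brand's weight is the reciprocal of its estimated residual standard deviation; the residuals here are simulated:

```python
import numpy as np

# WLS weighting sketch: estimate each brand's error standard deviation from
# its OLS residuals, then weight that brand's observations by the reciprocal
# so that all brands contribute with roughly equal error variance.
rng = np.random.default_rng(2)
residuals = {                      # OLS residuals grouped by brand (simulated)
    "Folgers": rng.normal(0.0, 0.5, 52),
    "RMH":     rng.normal(0.0, 1.0, 52),
}
weights = {b: 1.0 / np.std(r, ddof=1) for b, r in residuals.items()}
print(weights)   # noisier brands receive smaller weights
```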
The resulting model has an R² of .93 with 140 parameters estimated and 2,051 residual degrees of freedom (F(140, 2051) = 181.5). Since the model is estimated without an intercept, R² is redefined, as is noted on the regression output. In models estimated without an intercept, R² is like the congruence coefficient discussed in section 5.6. If the mean of the dependent measure is equal to zero, the lack of an intercept does not matter, and R² has the normal interpretation as the proportion of linearly accountable variation in the reduced form of the dependent measure. The dependent measure in the OLS-estimation phase does have a mean of zero (and an R² of .92), but rescaling by the weights affects the mean of the dependent measure. So while it is obvious that the cross-effects model fits extremely well, it is not strictly proper to interpret .93 as the proportion of explained variation. (Because reweighting changes the interpretation of R², it is simpler to assess the incremental contribution of the cross effects by comparing the OLS differential-effects model to the OLS cross-effects model. The OLS differential-effects model has an R² of .82, so the cross effects represent a substantial improvement over the good-fitting differential-effects model.)
We cross-validate these models by combining the parameter values in Table 5.13 with fresh data to form a single composite prediction variable, and then correlating the predicted dependent measure with the actual dependent measure for the new observations; 26 weeks of fresh data were used in cross-validation. The squared cross-validity correlation is .85 using the parameters in Table 5.13. This is an excellent result for a relationship that uses just one composite variable to predict over 1,000 observations (F(1, 1012) = 5808). The OLS differential-effects model has a squared cross-validity correlation of .79, indicating that the cross effects do enhance the model in a stable manner.
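The cross-validation logic can be sketched with simulated data: calibration-period parameters are frozen, combined with fresh regressors into one composite predictor, and correlated with the actual outcomes (all values here are illustrative):

```python
import numpy as np

# Cross-validation sketch: parameters from a calibration fit are combined with
# fresh explanatory data into a single composite prediction variable, which is
# then correlated with the actual dependent measure.
rng = np.random.default_rng(3)
n, p = 200, 5
beta_hat = rng.normal(0.0, 1.0, p)        # parameters from the calibration fit
X_new = rng.normal(0.0, 1.0, (n, p))      # fresh (validation-period) data
y_new = X_new @ beta_hat + rng.normal(0.0, 0.5, n)   # simulated actuals

composite = X_new @ beta_hat              # one composite prediction variable
r = np.corrcoef(composite, y_new)[0, 1]
print(round(r ** 2, 2))                   # squared cross-validity correlation
```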
Table 5.13: Regression Results for Cross-Effects Model (MCI)

Coffee Data Base for Pittsfield and Marion Markets
Ground-Caffeinated Coffee Brands Only
MCI Regression
Model: Coffee
Dep Variable: LCSHARE (Log-Centered Share)

Analysis of Variance

Source  |   DF | Sum of Squares | Mean Square | F Value | Prob > F
Model   |  140 |       11556.72 |       82.55 |  181.54 |     0.01
Error   | 2051 |         932.62 |        0.45 |         |
U Total | 2191 |       12489.34 |             |         |

Root MSE   0.67    R-Square   0.93
Dep Mean   0.18    Adj R-Sq   0.92
C.V.     383.34

Note: No intercept term is used. R-Square is redefined.

Parameter Estimates

Variable | DF | Parm Est | Std Err | T for H0: Parm=0 | Prob > |T|
Folg Intercept | 1 | 2.54 | 0.17 | 15.07 | 0.01 |
Folg Price Z-Score | 1 | -0.96 | 0.07 | -13.07 | 0.01 |
Folg Featv Z-Score | 1 | 0.06 | 0.04 | 1.52 | 0.13 |
Folg Dispv Z-Score | 1 | 0.16 | 0.05 | 3.56 | 0.01 |
Folg Coupv Z-Score | 1 | -0.13 | 0.05 | -2.53 | 0.01 |
RMH Intercept | 1 | 1.92 | 0.12 | 15.50 | 0.01 |
RMH Price Z-Score | 1 | -0.58 | 0.06 | -10.02 | 0.01 |
RMH Featv Z-Score | 1 | 0.00 | 0.03 | 0.12 | 0.91 |
RMH Dispv Z-Score | 1 | 0.06 | 0.03 | 1.74 | 0.08 |
RMH Coupv Z-Score | 1 | 0.11 | 0.04 | 2.92 | 0.01 |
MHMB Intercept | 1 | 1.79 | 0.17 | 10.27 | 0.01 |
MHMB Price Z-Score | 1 | -0.24 | 0.07 | -3.21 | 0.01 |
MHMB Featv Z-Score | 1 | 0.19 | 0.04 | 5.33 | 0.01 |
MHMB Dispv Z-Score | 1 | 0.22 | 0.05 | 4.79 | 0.01 |
MHMB Coupv Z-Score | 1 | -0.08 | 0.06 | -1.43 | 0.15 |
HlBr Intercept | 1 | -0.50 | 0.11 | -4.49 | 0.01 |
HlBr Price Z-Score | 1 | 0.04 | 0.07 | 0.57 | 0.57 |
HlBr Featv Z-Score | 1 | 0.48 | 0.05 | 8.96 | 0.01 |
HlBr Dispv Z-Score | 1 | 0.23 | 0.05 | 4.57 | 0.01 |
HlBr Coupv Z-Score | 1 | 1.52 | 0.19 | 7.97 | 0.01 |
CFON Intercept | 1 | 0.61 | 0.11 | 5.37 | 0.01 |
CFON Price Z-Score | 1 | -1.33 | 0.09 | -14.50 | 0.01 |
CFON Featv Z-Score | 1 | 0.12 | 0.05 | 2.27 | 0.02 |
CFON Dispv Z-Score | 1 | -0.04 | 0.04 | -0.94 | 0.35 |
CFON Coupv Z-Score | 1 | -0.22 | 0.07 | -3.35 | 0.01 |
Yub Intercept | 1 | -0.15 | 0.21 | -0.71 | 0.48 |
Yub Price Z-Score | 1 | -0.77 | 0.09 | -8.70 | 0.01 |
Yub Featv Z-Score | 1 | 0.21 | 0.21 | 0.98 | 0.33 |
Yub Dispv Z-Score | 1 | 0.70 | 0.25 | 2.82 | 0.01 |
Yub Coupv Z-Score | 1 | 0.15 | 0.22 | 0.70 | 0.49 |
C_S Intercept | 1 | -0.42 | 0.17 | -2.48 | 0.01 |
C_S Price Z-Score | 1 | -0.27 | 0.14 | -2.01 | 0.05 |
C_S Featv Z-Score | 1 | -0.07 | 0.31 | -0.22 | 0.83 |
C_S Dispv Z-Score | 1 | 1.19 | 0.33 | 3.65 | 0.01 |
C_S Coupv Z-Score | 1 | 0.78 | 0.24 | 3.21 | 0.01 |
AOB Intercept | 1 | 0.50 | 0.12 | 4.00 | 0.01 |
AOB Price Z-Score | 1 | -0.49 | 0.06 | -8.28 | 0.01 |
AOB Featv Z-Score | 1 | -0.24 | 0.06 | -3.75 | 0.01 |
AOB Dispv Z-Score | 1 | 0.13 | 0.04 | 2.87 | 0.01 |
AOB Coupv Z-Score | 1 | 0.16 | 0.09 | 1.84 | 0.07 |
PL1 Intercept | 1 | 0.28 | 0.16 | 1.75 | 0.08 |
PL1 Price Z-Score | 1 | -1.07 | 0.09 | -11.64 | 0.01 |
PL1 Featv Z-Score | 1 | -0.06 | 0.04 | -1.47 | 0.14 |
PL1 Dispv Z-Score | 1 | -0.06 | 0.04 | -1.62 | 0.10 |
PL1 Coupv Z-Score | 1 | 0.03 | 0.03 | 0.79 | 0.43 |
PL2 Price Z-Score | 1 | -1.11 | 0.17 | -6.68 | 0.01 |
PL2 Featv Z-Score | 1 | 0.06 | 0.14 | 0.43 | 0.67 |
PL2 Dispv Z-Score | 1 | 0.12 | 0.13 | 0.91 | 0.36 |
PL2 Coupv Z-Score | 1 | 0.41 | 0.42 | 0.97 | 0.33 |
PL3 Intercept | 1 | -0.30 | 0.22 | -1.36 | 0.17 |
PL3 Price Z-Score | 1 | -1.00 | 0.15 | -6.53 | 0.01 |
PL3 Featv Z-Score | 1 | 0.02 | 0.06 | 0.28 | 0.78 |
PL3 Dispv Z-Score | 1 | 0.35 | 0.41 | 0.84 | 0.40 |
PL3 Coupv Z-Score | 1 | 0.05 | 0.05 | 0.95 | 0.34 |
AOPL Intercept | 1 | 0.25 | 0.15 | 1.68 | 0.09 |
AOPL Price Z-Score | 1 | -0.21 | 0.06 | -3.47 | 0.01 |
AOPL Featv Z-Score | 1 | 0.07 | 0.03 | 2.62 | 0.01 |
AOPL Dispv Z-Score | 1 | -0.04 | 0.05 | -0.68 | 0.50 |
AOPL Coupv Z-Score | 1 | 0.02 | 0.04 | 0.43 | 0.67 |
Crs Of RMH Price Effect On Folg | 1 | -0.27 | 0.07 | -3.96 | 0.01 |
Crs Of MHMB Price Effect On Folg | 1 | -0.10 | 0.08 | -1.29 | 0.20 |
Crs Of HlBr Price Effect On Folg | 1 | 0.06 | 0.06 | 0.98 | 0.33 |
Crs Of CFON Price Effect On Folg | 1 | 0.05 | 0.06 | 0.92 | 0.36 |
Crs Of Yub Price Effect On Folg | 1 | -0.32 | 0.06 | -5.85 | 0.01 |
Crs Of AOB Price Effect On Folg | 1 | -0.31 | 0.06 | -5.20 | 0.01 |
Crs Of RMH Featv Effect On Folg | 1 | -0.13 | 0.03 | -3.75 | 0.01 |
Crs Of Yub Featv Effect On Folg | 1 | -0.04 | 0.18 | -0.24 | 0.81 |
Crs Of RMH Dispv Effect On Folg | 1 | -0.09 | 0.04 | -2.40 | 0.02 |
Crs Of MHMB Dispv Effect On Folg | 1 | 0.12 | 0.05 | 2.72 | 0.01 |
Crs Of Yub Dispv Effect On Folg | 1 | 0.01 | 0.21 | 0.04 | 0.97 |
Crs Of AOB Dispv Effect On Folg | 1 | 0.02 | 0.04 | 0.44 | 0.66 |
Crs Of RMH Coupv Effect On Folg | 1 | 0.03 | 0.04 | 0.68 | 0.50 |
Crs Of MHMB Coupv Effect On Folg | 1 | 0.03 | 0.05 | 0.47 | 0.64 |
Crs Of HlBR Coupv Effect On Folg | 1 | 1.04 | 0.17 | 6.06 | 0.01 |
Crs Of Yub Coupv Effect On Folg | 1 | 0.30 | 0.18 | 1.66 | 0.10 |
Crs Of AOPL Coupv Effect On Folg | 1 | -0.06 | 0.04 | -1.70 | 0.09 |
Crs Of Folg Price Effect On RMH | 1 | -0.10 | 0.06 | -1.54 | 0.12 |
Crs Of Yub Price Effect On RMH | 1 | -0.05 | 0.04 | -1.31 | 0.19 |
Crs Of AOB Price Effect On RMH | 1 | -0.22 | 0.04 | -4.83 | 0.01 |
Crs Of AOPL Price Effect On RMH | 1 | 0.17 | 0.04 | 4.63 | 0.01 |
Crs Of Folg Featv Effect On RMH | 1 | -0.00 | 0.03 | -0.09 | 0.93 |
Crs Of Yub Featv Effect On RMH | 1 | 0.19 | 0.17 | 1.12 | 0.26 |
Crs Of AOB Featv Effect On RMH | 1 | -0.12 | 0.05 | -2.24 | 0.03 |
Crs Of Folg Dispv Effect On RMH | 1 | -0.04 | 0.04 | -0.92 | 0.36 |
Crs Of HlBr Dispv Effect On RMH | 1 | -0.08 | 0.03 | -2.37 | 0.02 |
Crs Of Yub Dispv Effect On RMH | 1 | -0.49 | 0.20 | -2.46 | 0.01 |
Crs Of HlBr Coupv Effect On RMH | 1 | 0.31 | 0.15 | 2.09 | 0.04 |
Crs Of CFON Coupv Effect On RMH | 1 | -0.05 | 0.05 | -0.87 | 0.39 |
Crs Of Yub Coupv Effect On RMH | 1 | 0.54 | 0.18 | 3.01 | 0.01 |
Crs Of AOB Coupv Effect On RMH | 1 | 0.20 | 0.07 | 2.76 | 0.01 |
Crs Of Yub Price Effect On MHMB | 1 | -0.10 | 0.05 | -2.10 | 0.04 |
Crs Of AOB Price Effect On MHMB | 1 | -0.29 | 0.06 | -4.92 | 0.01 |
Crs Of AOPL Price Effect On MHMB | 1 | 0.38 | 0.04 | 9.73 | 0.01 |
Crs Of RMH Featv Effect On MHMB | 1 | -0.04 | 0.03 | -1.27 | 0.21 |
Crs Of Yub Featv Effect On MHMB | 1 | 0.52 | 0.17 | 3.02 | 0.01 |
Crs Of AOB Featv Effect On MHMB | 1 | -0.12 | 0.06 | -2.19 | 0.03 |
Crs Of HlBr Dispv Effect On MHMB | 1 | -0.09 | 0.03 | -2.69 | 0.01 |
Crs Of Yub Dispv Effect On MHMB | 1 | -0.43 | 0.22 | -2.01 | 0.04 |
Crs Of AOPL Dispv Effect On MHMB | 1 | -0.06 | 0.05 | -1.01 | 0.31 |
Crs Of RMH Coupv Effect On MHMB | 1 | 0.08 | 0.04 | 2.19 | 0.03 |
Crs Of HlBr Coupv Effect On MHMB | 1 | 0.50 | 0.16 | 3.04 | 0.01 |
Crs Of Yub Coupv Effect On MHMB | 1 | 0.42 | 0.16 | 2.56 | 0.01 |
Crs Of AOB Coupv Effect On MHMB | 1 | 0.14 | 0.07 | 1.89 | 0.06 |
Crs Of AOPL Coupv Effect On MHMB | 1 | -0.00 | 0.04 | -0.14 | 0.89 |
Crs Of MHMB Price Effect On HlBr | 1 | 0.19 | 0.07 | 2.71 | 0.01 |
Crs Of AOB Price Effect On HlBr | 1 | 0.29 | 0.05 | 5.82 | 0.01 |
Crs Of MHMB Featv Effect On HlBr | 1 | -0.05 | 0.07 | -0.78 | 0.44 |
Crs Of MHMB Dispv Effect On HlBr | 1 | -0.00 | 0.08 | -0.02 | 0.99 |
Crs Of CFON Dispv Effect On HlBr | 1 | 0.03 | 0.04 | 0.78 | 0.43 |
Crs Of AOB Dispv Effect On HlBr | 1 | -0.04 | 0.05 | -0.76 | 0.44 |
Crs Of RMH Price Effect On CFON | 1 | 0.31 | 0.08 | 3.70 | 0.01 |
Crs Of MHMB Price Effect On CFON | 1 | -0.69 | 0.06 | -10.81 | 0.01 |
Crs Of HlBr Price Effect On CFON | 1 | -0.17 | 0.07 | -2.48 | 0.01 |
Crs Of Folg Featv Effect On CFON | 1 | 0.10 | 0.06 | 1.72 | 0.09 |
Crs Of AOB Featv Effect On CFON | 1 | 0.01 | 0.06 | 0.11 | 0.91 |
Crs Of AOB Dispv Effect On CFON | 1 | -0.03 | 0.05 | -0.70 | 0.49
Crs Of Folg Coupv Effect On CFON | 1 | -0.07 | 0.08 | -0.90 | 0.37
Crs Of MHMB Coupv Effect On CFON | 1 | -0.63 | 0.14 | -4.39 | 0.01 |
Crs Of HlBr Coupv Effect On CFON | 1 | 0.01 | 0.19 | 0.05 | 0.96 |
Crs Of Folg Price Effect On Yub | 1 | 0.10 | 0.06 | 1.58 | 0.12 |
Crs Of Folg Dispv Effect On Yub | 1 | -0.12 | 0.08 | -1.48 | 0.14 |
Crs Of MHMB Dispv Effect On Yub | 1 | 0.49 | 0.10 | 4.92 | 0.01 |
Crs Of Folg Coupv Effect On Yub | 1 | -0.07 | 0.06 | -1.27 | 0.21 |
Crs Of Folg Price Effect On AOB | 1 | 0.52 | 0.10 | 5.43 | 0.01 |
Crs Of RMH Price Effect On AOB | 1 | 0.94 | 0.09 | 10.61 | 0.01 |
Crs Of HlBr Price Effect On AOB | 1 | 0.35 | 0.08 | 4.36 | 0.01 |
Crs Of CFON Price Effect On AOB | 1 | -0.00 | 0.08 | -0.03 | 0.98 |
Crs Of Yub Price Effect On AOB | 1 | 0.33 | 0.05 | 6.06 | 0.01 |
Crs Of AOPL Price Effect On AOB | 1 | 0.91 | 0.07 | 13.87 | 0.01 |
Crs Of Folg Featv Effect On AOB | 1 | 0.01 | 0.04 | 0.25 | 0.80 |
Crs Of Yub Featv Effect On AOB | 1 | 0.09 | 0.05 | 1.74 | 0.08 |
Crs Of Folg Dispv Effect On AOB | 1 | -0.14 | 0.05 | -2.88 | 0.01 |
Crs Of Yub Dispv Effect On AOB | 1 | -0.18 | 0.06 | -2.97 | 0.01 |
Crs Of RMH Coupv Effect On AOB | 1 | 0.15 | 0.04 | 3.32 | 0.01 |
Crs Of CFON Coupv Effect On AOB | 1 | -0.19 | 0.07 | -2.91 | 0.01 |
Crs Of Yub Coupv Effect On AOB | 1 | 0.06 | 0.13 | 0.46 | 0.65 |
Crs Of Folg Price Effect On AOPL | 1 | -0.21 | 0.09 | -2.30 | 0.02 |
Crs Of RMH Price Effect On AOPL | 1 | -0.48 | 0.07 | -6.66 | 0.01 |
Crs Of MHMB Price Effect On AOPL | 1 | 0.09 | 0.03 | 2.66 | 0.01 |
Crs Of AOB Price Effect On AOPL | 1 | -0.08 | 0.06 | -1.33 | 0.18 |
These results differ in minor ways from those previously summarized by Cooper.Cooper, Lee G. [1988b], ``Competitive Maps: The Structure Underlying Asymmetric Cross Elasticities,'' Management Science , 34, 6 (June), 707-23. There are two sources of difference. First, the article is based on the OLS results. Second, the brand-specific effects estimated in that article are based on z-scores, rather than the more traditional brand-specific intercepts adopted in this book. Only the parameter values for the brand-specific effects are substantially affected by the differences between the two approaches. A brand-by-brand summary follows.
Folgers has the largest brand-specific intercept, indicating a relatively high baseline level of attraction. If all brands were at the market average on prices and all other marketing instruments, so that only the differences in brand intercepts were reflected in market shares, Folgers would be predicted to capture 36% of the market. This is what we will call a baseline market share . (Baseline shares can differ substantially from the average shares reported in Table 5.12. Average shares are a straightforward statistical concept, but baseline shares reflect something of a brand's fundamental franchise, all other things being equal. But all other things are rarely equal. Market power can come from the way a brand uses its marketing instruments, i.e., its promotion policy, as well as from its fundamental franchise. Baseline-share figures are reported for each of the brands. These can be usefully compared to the average-share figures, but should not be thought of as predictions of long-run market share.) Folgers has a very strong and significant price parameter. Being priced above the market average will sharply reduce its baseline market share, while price reductions will sharply increase share. There is a positive but insignificant feature effect, and a strong positive effect for in-store displays. The effect of store coupons is negative and statistically extreme. While we would normally expect store-coupon promotions to have a positive effect, we should note two things. First, the average number of pounds per week of Folgers sold on store coupons is 1,175, compared to 2,018 pounds sold on in-store displays and 1,397 pounds sold on newspaper features. So there is some indication in these data that this might not be a spurious coefficient. Second, the price measure is net of coupons redeemed. While this reflects the influence of manufacturers' coupons as well as store coupons, it does mean that some of the benefits of store coupons are folded into the price effect. There are four significant cross-price effects impacting Folgers. Regular Maxwell House, Maxwell House Master Blend, Yuban, and the AOB category all have significantly less price impact on Folgers than is reflected in the differential-effects model. Folgers has significantly more of a price effect on the AOB category and significantly less price impact on the AOPL brands than would otherwise be expected. For features, only the increased competitive impact of Regular Maxwell House is significant. For displays, Regular Maxwell House has more of an effect, while Master Blend has less of an effect, than otherwise expected. Folgers' displays exert more pressure on the AOB category than otherwise expected. Hills Bros.' coupons put significantly less pressure on Folgers than expected from differential effects alone.
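The baseline-share computation described for Folgers can be reproduced directly from the brand-specific intercepts in Table 5.13 (brand names abbreviated; PL 2's intercept was constrained to zero):

```python
import math

# Baseline shares: with every brand at the market average on all instruments,
# the attraction model reduces to exp(a_i) normalized over brands, where a_i
# is the brand-specific intercept from Table 5.13.
intercepts = {
    "Folgers": 2.54, "RMH": 1.92, "MHMB": 1.79, "Hills Bros.": -0.50,
    "CFON": 0.61, "Yuban": -0.15, "C&S": -0.42, "AOB": 0.50,
    "PL1": 0.28, "PL2": 0.00, "PL3": -0.30, "AOPL": 0.25,
}
total = sum(math.exp(a) for a in intercepts.values())
baseline = {b: math.exp(a) / total for b, a in intercepts.items()}
print(round(baseline["Folgers"], 2))   # 0.36, matching the 36% reported
```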
Regular Maxwell House also has a strong, positive brand-specific intercept, which translates into a baseline market share of 19%. It has significant price and coupon effects. Regular Maxwell House has significant competitive price effects on Chock Full O'Nuts and the AOB category, but it exerts significantly less competitive pressure on Folgers and AOPL with its price. AOPL has a significant competitive price effect on RMH, while the AOB category exerts significantly less price pressure. RMH features attack Folgers, and features for the AOB category exert significant pressure on RMH. RMH displays exert significant competitive pressure on Folgers, while Hills Bros. and Yuban attack RMH with their displays. RMH coupons have less competitive effect on Master Blend and the AOB category than would otherwise be expected, and coupons for Hills Bros., Yuban, and the AOB category have significantly less impact on RMH in return.
Maxwell House Master Blend has a significant intercept, which translates into a baseline share of 17%. Price, feature, and display effects are significant in the expected directions; the coupon effect is insignificant and wrong-signed. Master Blend receives more price pressure from AOPL, but less from Yuban and the AOB category, than would otherwise be expected. In return, Master Blend exerts more price pressure on Hills Bros. and AOPL, and less pressure on CFON and Folgers, than the differential-effects model could reflect. AOB features are more competitive, and Yuban features less competitive, given their significant cross effects on Master Blend. Master Blend displays are less competitive with both Folgers and Yuban than otherwise expected, while displays for Hills Bros. and Yuban exert extra pressure on Master Blend. Store coupons for Regular Maxwell House, Hills Bros., and Yuban all have less effect than otherwise expected. Store coupons for Master Blend do exert pressure on Chock Full O'Nuts.
Hills Bros.' intercept translates into a baseline share of 2%. It shows strong effects for features, displays, and coupons. The self-price effect is not significant, but Hills Bros. does have a significant competitive price effect on the AOB category, and less price effect on CFON than otherwise expected. Master Blend and the AOB category exert stable competitive price effects on Hills Bros. There are no feature cross effects, but Hills Bros. has significant competitive display effects on Regular Maxwell House and Maxwell House Master Blend (as already noted).
Chock Full O'Nuts has a small baseline share (5%), but strong price and feature effects. Its use of these instruments helps it maintain the third-largest average market share (15%). Regular Maxwell House has a strong competitive price effect on Chock Full O'Nuts, but both Master Blend and Hills Bros. exert significantly less price pressure on CFON. There are no significant feature or display cross effects, but CFON's store coupons exert extra pressure on the AOB category, and Master Blend's store coupons exert extra pressure on CFON.
Yuban has a baseline share of 2%, but its high price results in a much smaller average share. It has significant price and display effects. Yuban exerts less price pressure on Folgers and Master Blend, but more pressure on the AOB category, than otherwise expected. Features for Yuban have less impact on Master Blend than simpler models would reflect. Yuban displays have a significant competitive effect on both Maxwell House brands and the AOB category, while Master Blend's displays are less competitive in return. The display pattern of both Maxwell House brands is reversed in the only two coupon effects concerning Yuban. This is such a small brand in these markets that it probably should have been folded into the AOB category; its stronger position on the West Coast may have led the authors astray.
Chase & Sanborne also has a baseline share of 2%. Its average share is even less, due to its high price and the infrequency of promotions. Its price, display, and coupon effects are statistically significant. There are no cross effects involving Chase & Sanborne.
The premium brands in the AOB category collectively have a baseline share of 5%. There are strong price and display effects, but the feature effect is statistically extreme in the unexpected direction. With aggregates of brands such as AOB, it may be hard to get a clear signal from all the parameters. AOB exerts additional competitive price pressure on Hills Bros., but seems to complement Folgers and both Maxwell House brands. The AOB category receives extra price pressure from Folgers, Regular Maxwell House, Hills Bros., Yuban, and AOPL. Features for the AOB category have an extra competitive effect on both Maxwell House brands. Store coupons for AOB and Regular Maxwell House have less effect on each other than otherwise expected, but store coupons for CFON do hurt the AOB category.
The private-label brands (PL 1, PL 2, PL 3, and AOPL) collectively have a baseline share of 13%. All four have significant price effects, and AOPL has a significant feature effect. AOPL exerts price pressure on both Maxwell House brands and the AOB category. While Master Blend returns the pressure, both Folgers and Regular Maxwell House are less price competitive than otherwise expected. There are no cross effects for features, displays, or store coupons for the private-label brands. This was in part dictated by the criterion requiring a minimum of 53 observations before a significant residual correlation could qualify as a cross effect, which excluded all but the AOPL brand. In the category-volume model presented later in this chapter, and in the brand-planning exercise in Chapter 7, all the private-label brands are aggregated together. If this had been done in the market-share model, more cross effects involving these brands might have been identified. If market-share analysis is done as an iterative process (as was discussed earlier in this book), this refinement could be undertaken.
That price is a major instrument in this market is reflected in 11 of the 12 self-price effects being significant. Four self-feature effects, six self-display effects, three self-coupon effects, and seven brand-specific intercepts were significant.
Residual analysis seems to be a practical means for identifying cross effects. The criterion identified 29 cross-price effects, of which 22 were statistically significant in the final model; 12 cross-feature effects, of which 4 were significant; and 18 cross-display effects, of which half were significant. Of the 20 cross-coupon effects identified in the residuals from the differential-effects model, 10 were significant in the final model.
Reading through a regression output like this is a tedious but useful step in developing an initial understanding of market and competitive structure. But two more elements are needed before responsible brand planning can take place. First, parameters have to be converted to elasticities before an overall picture of the structure can be achieved (see Chapter 6). And second, a category-volume model must be calibrated before a market simulator can be developed. This is the topic of the next section.
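The parameter-to-elasticity conversion mentioned above follows the standard point-elasticity results for attraction models that Chapter 6 develops: for the MCI form, the share elasticity of brand i with respect to its own variable is the parameter times (1 - s_i); for the MNL form, the variable's current value enters as well. A minimal sketch, using illustrative values rather than estimates from the tables:

```python
# Sketch: converting simple-effects attraction-model parameters to point
# share elasticities (the conversion Chapter 6 develops in detail).
#   MCI:  e_i = beta_k * (1 - s_i)
#   MNL:  e_i = beta_k * x_ki * (1 - s_i)
# The numbers below are illustrative, not taken from Table 5.13 or 5.14.

def mci_elasticity(beta_k: float, share: float) -> float:
    """Point elasticity of brand i's share w.r.t. its own variable k (MCI form)."""
    return beta_k * (1.0 - share)

def mnl_elasticity(beta_k: float, x_ki: float, share: float) -> float:
    """Point elasticity for the MNL form, where the variable enters exponentially."""
    return beta_k * x_ki * (1.0 - share)

# A brand with a 20% share and a price parameter of -2.0 (illustrative):
print(mci_elasticity(-2.0, 0.20))        # about -1.6
print(mnl_elasticity(-0.8, 2.50, 0.20))  # about -1.6
```

Note that the elasticity shrinks as the brand's share grows: the same parameter implies a weaker percentage response for a dominant brand than for a small one.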
A category-volume model of the style in equation (5.33) is reported in Table 5.14. Only data from grocery chains 1 - 3 are used in this model so that the results correspond to the competitive maps developed in Chapter 6 and the market simulator developed in Chapter 7. The private-label brands were aggregated into a single
Table 5.14: Regression Results for Category-Volume Model
Dep Variable: LTWVOL

Analysis of Variance

Source      DF    Sum of Squares    Mean Square    F Value    Prob > F
Model       31    42.88             1.38           38.29      0.01
Error      124     4.48             0.04
C Total    155    47.36

Root MSE    0.19    R-Square    0.91
Dep Mean    7.55    Adj R-Sq    0.88
C.V.        2.52

Parameter Estimates

Variable     DF    Parm Est    Std Err    T for H0: Parm=0    Prob > |T|
INTERCEP      1     6.73        0.84        7.98               0.01
BA4-HLBR      1    -0.13        0.35       -0.36               0.72
LPR1-Folg     1    -0.74        0.38       -1.96               0.05
LPR2-RMH      1    -0.73        0.40       -1.83               0.07
LPR3-MHMB     1     0.56        0.51        1.09               0.28
LPR4-HLBR     1    -0.13        0.40       -0.33               0.74
LPR5-CFON     1    -2.09        0.42       -4.97               0.01
LPR6-Yub      1    -0.32        0.73       -0.43               0.67
LPR7-CAS      1     0.77        1.02        0.75               0.45
LPR8-AOB      1     3.25        0.25       13.08               0.01
LPRPL-APL     1    -0.67        0.45       -1.50               0.14
D1-Folg       1     0.62        0.14        4.47               0.01
D2-RMH        1     0.50        0.10        4.79               0.01
D3-MHMB       1     0.29        0.12        2.53               0.01
D4-HLBR       1     0.13        0.06        1.98               0.05
D5-CFON       1    -0.05        0.09       -0.50               0.62
D8-AOB        1     0.38        0.12        3.13               0.01
DPL-APL       1     0.05        0.10        0.48               0.63
C1-Folg       1    -0.13        0.18       -0.70               0.49
C2-RMH        1     0.08        0.10        0.81               0.42
C3-MHMB       1     0.04        0.40        0.10               0.92
C4-HLBR       1    -2.06        0.95       -2.16               0.03
C5-CFON       1     0.30        0.23        1.28               0.20
C8-AOB        1    -0.68        0.59       -1.14               0.26
CPL-APL       1     0.17        0.12        1.44               0.15
F1-Folg       1    -0.08        0.12       -0.67               0.50
F2-RMH        1     0.01        0.09        0.07               0.95
F3-MHMB       1     0.03        0.08        0.39               0.70
F4-HLBR       1    -0.01        0.10       -0.14               0.89
F5-CFON       1     0.06        0.09        0.63               0.53
F8-AOB        1     0.56        0.12        4.68               0.01
FPL-APL       1     0.01        0.10        0.06               0.95
PL brand. A preliminary model showed that lagged volume had no significant effect (t = -.96) and that there were no features, displays, or coupons in Chains 1 - 3 for either Yuban or Chase & Sanborn (so these effects were deleted). Only Hills Bros. had a distribution pattern that required a brand-absence coefficient (BA4).
The overall fit of the model is quite good (R2 = .91). This would be boosted to .99 by the inclusion of chain-specific intercepts, but this category-volume model is destined for use in the market simulator of Chapter 7, and we feel that the generality of the planning frame used in that chapter is enhanced by predicting volume for a generic chain rather than chain by chain. The strongest price influences on total volume come from discounts for Folgers, Maxwell House, and Chock Full O'Nuts; discounts for these brands clearly expand the weekly volume. As prices for the aggregate AOB category increase, total volume increases - perhaps reflecting supply conditions or prestige effects for these premium brands. Displays for Folgers, both Maxwell House brands, Hills Bros., and AOB drive up category volume. Hills Bros. store coupons seem to contract total volume, reflecting the infrequent (and apparently counter-cyclical) store-couponing policy for this brand. The only significant feature effect is associated with the AOB category.
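The arithmetic behind these interpretations can be sketched directly from the log-log form of the model. In a model ln V = a + b ln(P) + c D + ..., a price coefficient b implies a volume multiplier of (P_new/P_old)^b, and a display dummy c implies a multiplier of exp(c). The sketch below uses two coefficients from Table 5.14 to illustrate the conversion; it holds every other term constant and is not a full model prediction:

```python
import math

# Sketch: interpreting log-log category-volume coefficients.
# ln V = a + b*ln(P) + c*D + ...  implies multiplicative effects on volume.

def price_multiplier(b: float, price_ratio: float) -> float:
    """Volume multiplier from changing price by the given ratio (P_new/P_old)."""
    return price_ratio ** b

def dummy_multiplier(c: float) -> float:
    """Volume multiplier from switching a 0/1 dummy (e.g., display) on."""
    return math.exp(c)

# CFON price coefficient (-2.09): a 10% price cut expands volume by about 25%.
print(price_multiplier(-2.09, 0.90))  # about 1.25
# Folgers display coefficient (0.62): a display week nearly doubles volume.
print(dummy_multiplier(0.62))         # about 1.86
```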
The choice of measures incorporated into both the market-share and category-volume models was dictated in large part by the need for a diagnostically useful market simulator. To the extent that the variables inside these models can explain market behavior, we obtain a way of translating market history into elasticities. Chapter 6 develops methods for mapping the market and competitive structure implied by the elasticities - as well as methods for visualizing the sources driving changes in competitive structure. In Chapter 7 the market-share and category-volume models are combined into a market simulator for evaluating the consequences of marketing actions for all brands.
This section addresses two questions. The first concerns whether or not market-share analysis can be done on a large enough scale to be practical. Simply stated, the issue is: how large is too large? The second centers on the fixation managers seem to have concerning the signs of parameters developed using best linear-unbiased estimation. Simply stated, the issue is: is BLUE always best? Both topics will be discussed using experience from the implementation of market-share models on optical-scanner (POS) records of weekly store sales from Nielsen Micro-Scantrack databases and IRI store-level databases.
Fifteen steps have been integrated into a SAS(R) macro program that performs the analytical tasks in estimating asymmetric market-share models.
The size implications of two applications are summarized in Table 5.15. The two applications reported there involve data from IRI and A.C. Nielsen. The IRI data are those just summarized for the ground, caffeinated coffee market. The Micro-Scantrack data involve a mature category of a frequently purchased, branded good. There were around 30 brands which were represented at the brand-size level - leading to 66 competitors in the model. The IRI data tracked four marketing instruments: prices, newspaper features, store coupons, and in-store displays. These data predate the size grading of newspaper features now standard with IRI data. The Nielsen data tracked five marketing instruments: prices, major ads, line ads, coupon ads, and in-store displays. Including the brand-specific intercepts, the Step 3 differential-effects file for the IRI example has 60 variables, while the Nielsen application contains 396 differential-effect variables. With seven grocery chains reporting 52 weeks of sales, the IRI example has about 2200 observations in the calibration data set. The Nielsen example has up to 155 stores reporting each week, which translates to about 113,000 observations in 26 weeks.
Step 10 involves a user-controlled, statistical criterion by which residual correlations are translated into cross-competitive effects. In the IRI application any correlation with more than 52 observations and a significance level more extreme than .05 was selected. This produced 81 cross effects involving all marketing instruments and leading to a Step 12 covariance matrix around 140 × 140. Using the same criterion on the Nielsen example led to the identification of around 4,000 potential cross-competitive effects. This would require the computation of a 4,400 × 4,400 covariance matrix, which is too large to compute in SAS(R) on an IBM 3083. Making the required number of observations much larger and the required significance level wildly extreme still led to around 700 potential cross-competitive effects. Finally, only the 200 statistically most extreme cross-competitive effects were selected. These most-extreme effects all involved prices.
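The Step 10 screening rule can be sketched as follows. The function name, the dictionary-of-residuals data layout, and the use of a large-sample normal cutoff for the .05 level are assumptions for illustration - this is not the SAS(R) macro itself, only the logic of the criterion (more than 52 common observations and a significant residual correlation):

```python
import math
from itertools import combinations

import numpy as np

def screen_cross_effects(resid, min_obs=52, t_cutoff=1.96):
    """Translate residual correlations into candidate cross-competitive effects.

    resid maps a brand name to its residual series from the differential-effects
    model, with np.nan wherever the brand was absent. A pair qualifies when it
    shares more than `min_obs` observations and its correlation is significant
    (|t| beyond the ~.05 two-sided normal cutoff, adequate for n > 52).
    """
    candidates = []
    for i, j in combinations(sorted(resid), 2):
        a = np.asarray(resid[i], dtype=float)
        b = np.asarray(resid[j], dtype=float)
        ok = ~np.isnan(a) & ~np.isnan(b)      # weeks where both brands appear
        n = int(ok.sum())
        if n <= min_obs:
            continue
        r = float(np.corrcoef(a[ok], b[ok])[0, 1])
        t = r * math.sqrt((n - 2) / (1.0 - r * r))  # t-stat for H0: rho = 0
        if abs(t) > t_cutoff:
            candidates.append((i, j, r))
    return candidates

# Illustration: residuals for brand "B" track brand "A" closely.
rng = np.random.default_rng(0)
a = rng.normal(size=100)
resid = {"A": a, "B": a + 0.5 * rng.normal(size=100)}
print(screen_cross_effects(resid))  # the ("A", "B") pair is flagged
```

Each flagged pair then becomes a candidate cross effect to be carried into the extended specification, exactly as the residual correlations were carried into Step 12 above.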
The comparison of timing results is somewhat exaggerated by the differences in the mainframes involved. The IBM 3090 model 200 on which the smaller example was run is an enormously capable computer.
Table 5.15: Computer Resources for Two Applications
                 IRI                          Nielsen
                 Chain-Level Data             Micro-Scantrack Data

                 12 Brands                    66 Brand-Sizes
                 4 Instruments                5 Instruments
                   Price                        Price
                   Features                     Major Ads
                                                Line Ads
                   Store Coupons                Coupon Ads
                   Displays                     Displays
                 60 Differential Effects      396 Differential Effects
                 7 Chains/Week                Up to 155 Stores/Week
                 52 Weeks, ~2200 Obs.         26 Weeks, ~113,000 Obs.

Cross Effects    Obs > 50, p < .05            Obs > 50, p < .05
                 79 Cross Effects             ~4000 Cross Effects
                                              Pick 200 Most Extreme

Timing           On IBM 3090                  On IBM 3083
                 Steps 1 - 15: ~32 CPU Sec.   Steps 1 - 10: ~120 CPU Min.
                                              Steps 11 - 12: ~120 CPU Min.
                                              Step 13: ~10 CPU Min.
While neither the vector nor the parallel capabilities of this machine were really involved in this illustration, the size of the problem did not tax the resources of the 3090. All 15 steps in the analysis took around 32 CPU seconds. The IBM 3083 used in the large application is an extended-architecture (XA) machine, but the time and space required still reflected a substantial strain on the machine's resources. The first ten steps required two hours of CPU time, most of which was spent forming the large (~400 × 400) covariance matrix. Forming the extended covariance matrix, including 200 cross effects, required another two hours of CPU time. Once the covariance matrix was stored, however, trying out different specifications in search of a final model only took about 10 CPU minutes per run. The estimation step was not run on the large example.
The huge number of initial cross effects in the 66-competitor example makes it clear that we can get too large unless careful judgment is exercised. The size of the analysis is quite sensitive to the number of competitors for which a full differential-effects specification is attempted. This application would have been more manageable if the 30 brands had been considered the basis of the differential-effects specification and size had been treated as a simple variable in most cases.
The 66-competitor illustration is near the limit of practicality using the system of models employed here. For comparison, however, it is useful to assess the resources needed to estimate an illustration of this size using the analytical methods developed by Shugan for data such as these.See Shugan, Steven M. [1987], ``Estimating Brand Positioning Maps from Supermarket Scanning Data,'' Journal of Marketing Research, XXIV (February), 1-18. Shugan's method requires the computation of many simple regressions. If a very fast machine required only 40 nanoseconds to compute a regression, it would take 2 × 10^83 CPU seconds to complete the 66-competitor illustration. This means that if a supercomputer had begun at the moment of the creation of the universe, it would still not be done. In fact, the age of the universe could be taken to the seventh power and the computation would still be incomplete.
Best linear-unbiased estimation provides the robust foundation on which the competitive-analysis system relies for its parameter estimates. But, as every analyst knows, some parameters can turn up with the ``wrong signs.'' Price parameters which are positive are difficult to explain except perhaps in prestige product classes. Negative parameters for promotions or advertising are difficult to explain - particularly to the managers running the promotions.
It seems to be left to the analyst to explain such events, as managers seem to presume that they are consequences of quirks of the models. Analysts assume that the explanation is in the data, yet the managers typically know the market conditions reflected in the data far better than the analysts do.
There are several basic problems with this scenario. First is a problem of salience - are wrong-signed parameters more salient than they should be? The second problem concerns orientation. In simple constant-elasticity models the parameters are the elasticities. But complex market-response models recognize that elasticities vary as market conditions change. Management needs to know how markets respond to a firm's marketing efforts, but that knowledge is reflected far better in elasticities than in parameters. Third, there is an organizational problem. In the tension between management science and management, analysts should be more responsible for the models and managers more responsible for the data and how results are interpreted. But what one side does not understand should be the responsibility of both sides to figure out. Management scientists must develop and apply techniques across a number of managerial domains. They should not be expected to know the data of a domain with the kind of intimacy needed to manage. The second and third problems are addressed in more depth in Chapter 7, so that only the first is considered further here.
The problem of salience asks if wrong-signed parameter estimates get more attention than their frequency should command. Tables 5.16 and 5.17 summarize the parameter estimates for the two illustrations.
Table 5.16: Summary of BLUE Parameters - IRI Data
                 Differential-Effects Model           Cross-Effects Model
                 R2 = .83, F(59, 2184) = 180          R2 = .93, F(140, 2051) = 181

Marketing        Right    No.       Wrong Sign        Right    No.       Wrong Sign
Instruments      Sign     Signif.   p < .05           Sign     Signif.   p < .05
Prices           11/12     9/12     0/12              11/12    11/12     0/12
Features          7/12     3/12     1/12               9/12     4/12     1*/12
Displays          9/12     8/12     0/12               9/12     6/12     0/12
Coupons           8/12     1/12     1/12               9/12     3/12     2/12
Totals           35/48    21/48     2/48              38/48    24/48     3*/48

* One aggregate brand.
Table 5.17: Summary of BLUE Parameters - Nielsen Data
Cross-Effects Model: R2 = .67, F(446, 113000) = 503

Marketing        Right      No.            Wrong Sign
Instruments      Sign       Significant    p < .05
Prices           62/66      55/66           4/66
Major Ads        50/66      35/66           1/66
Line Ads         57/66      29/66           1/66
Coupon Ads       43/66      21/66           7/66
Displays         55/66      47/66           2/66
Totals          267/330    187/330         15/330
In Table 5.16 we see that in the differential-effects model 21 of 48 parameters are statistically significant in the expected direction, while only 2 of 48 are statistically extreme with the wrong sign. Moving to the cross-effects model, 24 of 48 differential effects are statistically significant in the expected direction, in spite of the inclusion of 81 cross effects. In the cross-effects model 3 of 48 differential-effect parameters are statistically extreme in the unexpected direction, and one of these relates to a brand aggregate. Since brand aggregates are not expected to behave as regularly as brands, these parameters probably present no problems for the management scientist or the manager. This is certainly no different from what one might expect by random chance. Yet it is very likely that these parameters will be the ones questioned by managers. The analyst is forced to track the stability of the pattern of coefficients between the differential-effects model and the cross-effects model, as well as to check possible sources of collinearity among the variables or lack of variability in the instruments in question. But because of the strong prior hypotheses of managers about the directions of marketing effects, the focus is often on the two unusual parameters, rather than on the 24 significant differential effects or the 45 significant cross effects which seem to be driving the market. The burden of explanation is on the analysts, who may know little about the market data from which these parameters arise.
The problem is perhaps tractable when only a few parameters require special explanation. But with large-scale applications the number of parameters to follow can grow quite large. Table 5.17 summarizes the cross-effects model for the 66-competitor example. While 187 of 330 differential effects are significant in the expected direction, 15 of 330 have the wrong sign with p < .05. Fifteen of 330 beyond the .05 level is well within expectation, but explaining the source of these potentially anomalous effects is at the least time consuming, diverting attention from the main task of understanding market response.
Given the strong prior hypotheses of managers, there is another approach to parameter estimation which merits study. Quadratic programming would allow us to specify a set of inequality constraints on the parameters which would correspond to the prior hypotheses of managers. Consider an estimation scheme in which the differential-effect parameters estimated in Steps 5 and 7 would be bounded by a quadratic program to conform to the prior hypotheses. The residual analysis in Steps 8 - 11 would proceed as before. But at Step 13 the cross-competitive effect parameters would be estimated against the full set of residuals, rather than recombined with the differential effects in a BLUE scheme for overall recalibration against market shares. This approach gives primacy to the explanatory power of the differential effects. Whatever they can explain which is consistent with prior hypotheses is given to them. The cross-competitive effects are used to explain the systematic part of whatever is left over.
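A sign-constrained scheme of this kind can be sketched as box-constrained least squares, which is a simple quadratic program. The sketch below uses simulated data; the two-column design, the variable names, and the particular bounds are hypothetical, and SciPy's `lsq_linear` stands in for a general QP solver:

```python
import numpy as np
from scipy.optimize import lsq_linear

# Sketch: bound each differential-effect parameter to the direction managers
# hypothesize (price effect <= 0, display effect >= 0) and solve the
# resulting box-constrained least-squares problem - a quadratic program.
# Simulated data; columns: [log price, display dummy].
rng = np.random.default_rng(1)
n = 200
X = np.column_stack([rng.normal(size=n), rng.integers(0, 2, size=n)])
beta_true = np.array([-1.5, 0.4])
y = X @ beta_true + rng.normal(scale=0.3, size=n)

lower = np.array([-np.inf, 0.0])   # display effect must be non-negative
upper = np.array([0.0, np.inf])    # price effect must be non-positive
fit = lsq_linear(X, y, bounds=(lower, upper))
print(fit.x)   # estimates are forced to respect the hypothesized signs
```

When the unconstrained estimates already carry the expected signs, the bounds are inactive and the solution coincides with BLUE; the constraints bind only for the parameters that would otherwise come out wrong-signed.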
Whenever one considers moving away from BLUE schemes, caution and study are advised. But given the strong priors regarding the effects of marketing instruments, this avenue of research should be pursued.
Nakanishi and Cooper [1974] showed that the total covariance matrix of errors, $\Sigma_{\epsilon}$, is approximately the sum of the variance-covariance matrix among sampling errors, $\Sigma_{\epsilon_2}$, and the variance-covariance matrix among specification errors, $\Sigma_{\epsilon_1}$. For the simplified estimation procedures the estimate of $\Sigma_{\epsilon_2 t}$ comes from
$$\hat{\Sigma}_{\epsilon_2 t} = \frac{1}{n_t}\left(\hat{P}_t^{-1} - J\right) \tag{5.34}$$
where $n_t$ is the number of individuals (purchases) in time period $t$, $\hat{P}_t^{-1}$ is an $(m_t \times m_t)$ diagonal matrix with entries equal to the inverses of the market shares estimated by the OLS procedure for the $m_t$ brands in this period, and $J$ is a conformable matrix of ones.
The variance-covariance matrix of specification errors, $\Sigma_{\epsilon_1}$, is assumed to be constant in each time period and is estimated by $\hat{\sigma}_{\epsilon_1}^2 I$, where
$$\hat{\sigma}_{\epsilon_1}^2 = \frac{Q - \sum_{t=1}^{T} \operatorname{tr}\left[\left(I - Z_t (Z'Z)^{-1} Z_t'\right)\hat{\Sigma}_{\epsilon_2 t}\right]}{\sum_{t=1}^{T} m_t - (K+T)} \tag{5.35}$$
where $Q$ is the sum of squares of the OLS errors, and $Z_t$ is an $(m_t \times [K+T])$ matrix containing the logs of the $K$ explanatory variables with the $T$ time-period dummy variables concatenated to it. This formula for $\hat{\sigma}_{\epsilon_1}^2$ is considerably simpler than the one in Nakanishi and Cooper [1974, p. 308] and also corrects a typographical error in that equation.
The total variance-covariance matrix $\hat{\Sigma}_{\epsilon}$ is a block-diagonal matrix in which each block is the sum $\hat{\Sigma}_{\epsilon_2 t} + \hat{\Sigma}_{\epsilon_1}$.
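Equation (5.34) and the block-diagonal assembly can be sketched numerically; the function names and the illustrative shares and sample sizes below are hypothetical:

```python
import numpy as np

# Sketch of equation (5.34): the sampling-error covariance for period t is
# (1/n_t) * (P_t^{-1} - J), with P_t^{-1} diagonal in the inverse estimated
# shares and J a conformable matrix of ones. The total covariance is block
# diagonal, each block being Sigma_eps2_t + sigma1^2 * I.

def sigma_eps2(shares_t: np.ndarray, n_t: int) -> np.ndarray:
    """Equation (5.34) for one time period."""
    m = len(shares_t)
    return (np.diag(1.0 / shares_t) - np.ones((m, m))) / n_t

def total_sigma(share_blocks, n_obs, sigma1_sq):
    """Assemble the block-diagonal total covariance over all periods."""
    blocks = [sigma_eps2(s, n) + sigma1_sq * np.eye(len(s))
              for s, n in zip(share_blocks, n_obs)]
    dim = sum(b.shape[0] for b in blocks)
    S = np.zeros((dim, dim))
    pos = 0
    for b in blocks:
        m = b.shape[0]
        S[pos:pos + m, pos:pos + m] = b   # off-block entries stay zero
        pos += m
    return S

# Two periods, three brands each (illustrative shares and sample sizes):
S = total_sigma([np.array([0.5, 0.3, 0.2]), np.array([0.6, 0.25, 0.15])],
                [100, 120], sigma1_sq=0.01)
print(S.shape)  # (6, 6)
```

Note how the structure mirrors the multinomial sampling result: diagonal entries grow as a brand's share shrinks, within-period covariances are a constant $-1/n_t$, and across periods the errors are uncorrelated.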