Chapter 8
A Research Agenda

A number of research issues have come up in the writing of this book which will be recapitulated here. The basic topics include problems arising in estimation, decision support, integration of panel data, and trans-category research.

8.1  Estimation Problems

There are three issues here which should be high on a research agenda: methods for dealing with missing data, the issue of constrained parameter estimation, and the understanding of long-run effects.

8.1.1  Missing Data

For a category-volume model, different patterns of distribution can lead to a great deal of missing data. The approach taken in Chapter 5 was simply to create brand-absence variables which have a zero whenever the brand is present and a one whenever it is absent. The log(price) variable corresponding to the missing brand is given a value of zero whenever the brand-absence variable has a value of one. This is a practical remedy for the missing explanatory variables in the category-volume model. The issue which needs to be addressed is how much influence these missing values have on the parameters of a category-volume model. It is possible to derive estimates of the missing values which are specifically designed to minimize the influence of missing explanatory variables (see Cooper [1987b]). It would be very worthwhile to compare the parameters estimated by such a scheme with those in Table 5.14.
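The coding scheme just described can be sketched in a few lines. This is only a minimal illustration under assumed conventions: the input layout (one dictionary of brand prices per period, with None marking a brand that is not distributed), the function name, and the column names are all hypothetical and are not taken from Chapter 5.

```python
import math

def absence_design(prices):
    """Build brand-absence dummies and log(price) columns.

    `prices` is a list of dicts mapping brand -> price, one dict per period,
    with None when the brand is not distributed (hypothetical input layout).
    """
    brands = sorted({b for row in prices for b in row})
    design = []
    for row in prices:
        out = {}
        for b in brands:
            p = row.get(b)
            absent = p is None
            # Dummy is 1 when the brand is absent, 0 when it is present.
            out[f"absent_{b}"] = 1 if absent else 0
            # log(price) is set to zero whenever the brand is absent.
            out[f"log_price_{b}"] = 0.0 if absent else math.log(p)
        design.append(out)
    return design

weeks = [
    {"A": 2.0, "B": 1.0},   # both brands distributed
    {"A": 2.5, "B": None},  # brand B absent
]
rows = absence_design(weeks)
```

Note that brand B has log(price) equal to zero in both periods of this toy example (a price of 1.0 in the first, absence in the second); only the absence dummy tells the two cases apart.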

Note that for the category-volume model the missing explanatory variables are the problem, not missing dependent variables. If a brand is not distributed it simply does not increment total category volume. If exp(z-scores) or zeta-scores are used in the corresponding market-share model, the missing explanatory variables can simply be given a value at the mean for that occasion. Missing dependent measures in the market-share model only affect the full cross-effects model (see section 5.7). In the other versions of the market-share model we can drop the observation corresponding to a missing dependent variable. In the full cross-effects model, the ease of parameter estimation is dependent on having a complete matrix of dependent variables. Practical methods along the lines of Malhotra's [1987] developments need to be investigated.
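The remedy for missing explanatory variables in the market-share model can be illustrated for exp(z-scores). The sketch below assumes that standardization is done across the brands present on a single occasion and that the population standard deviation (dividing by n) is used; both choices, like the function name, are assumptions made for illustration. A missing value placed at the occasion mean has a z-score of zero and therefore an exp(z-score) of one.

```python
import math

def exp_z_scores(values):
    """exp(z-score) across brands for one occasion; None marks a missing value.

    Missing entries are placed at the occasion mean (z = 0), so they map to 1.0.
    """
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    # Population standard deviation over the brands that are present (assumed).
    sd = math.sqrt(sum((v - mean) ** 2 for v in present) / len(present))
    result = []
    for v in values:
        if v is None or sd == 0.0:
            result.append(1.0)
        else:
            result.append(math.exp((v - mean) / sd))
    return result

# Three brands observed, one brand missing on this occasion.
scores = exp_z_scores([1.0, 2.0, 3.0, None])
```

Because the multiplicative model is neutral to a value of one, the missing brand's variable neither helps nor hurts it on that occasion.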

8.1.2  Constrained Parameter Estimation

We have the means to guarantee that the differential-effect parameters which are expected to be positive (or expected to be negative) turn out that way, or turn out to be zero. The residuals from the differential-effects model may be used to identify cross-competitive effects. The question is whether we should recombine all effects and re-estimate parameters in a best linear unbiased fashion. Alternatively we could use a two-step procedure which would estimate the parameters of the cross effects from the residuals of the constrained differential-effect parameterization.
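The two-step alternative can be made concrete with a deliberately small sketch. It assumes one own-effect regressor whose coefficient is expected to be negative or zero and one cross-effect regressor; the function names and the data are hypothetical. With a single regressor and an intercept, the sign-constrained least-squares slope is simply the unconstrained slope truncated at zero.

```python
def ols_slope(x, y):
    """Slope and intercept of a one-regressor least-squares fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def two_step(y, own, cross):
    """Step 1: own-effect slope constrained to be <= 0.
    Step 2: cross effect estimated from the step-1 residuals."""
    b_own, _ = ols_slope(own, y)
    b_own = min(b_own, 0.0)  # sign constraint: negative or zero
    n = len(y)
    # Intercept re-fitted given the constrained slope.
    a = sum(y) / n - b_own * sum(own) / n
    resid = [yi - a - b_own * oi for yi, oi in zip(y, own)]
    b_cross, _ = ols_slope(cross, resid)
    return b_own, b_cross

# Toy data generated as y = 1 - 2*own + 0.5*cross, with uncorrelated regressors.
own = [0.0, 1.0, 0.0, 1.0]
cross = [0.0, 0.0, 1.0, 1.0]
y = [1.0, -1.0, 1.5, -0.5]
b_own, b_cross = two_step(y, own, cross)

# A wrongly signed own effect is truncated to zero.
b_own_wrong, _ = two_step([0.0, 1.0, 0.0, 1.0], own, cross)
```

In this toy case the regressors are uncorrelated, so the two-step estimates coincide with a joint fit. When own and cross variables are correlated the two approaches generally differ, which is exactly the comparison that needs study.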

This is a straightforward problem which simply requires thorough study before recommendations can be made.

8.1.3  Long-Run Effects

Hanssens [1987] asks if marketing efforts have a permanent impact on sales; 106 tests on scanner data (instant coffees) showed the series to be stationary (non-stationarity is a necessary, though not sufficient, condition for showing a permanent component to marketing efforts). While this means that the kinds of models developed here for the coffee market are free of the time-series issues discussed in Chapter 3, we believe the issue will require more study in the future. (Hanssens, Dominique M. [1987], "Marketing and the Long Run," Center for Marketing Studies Working Paper No. 164, Anderson Graduate School of Management, UCLA, September.)

The brand franchise or market power is summarized, in a sense, in the brand-specific parameter a_i. These parameters are used in Chapter 5 to develop estimates of baseline market shares. Since the parameter estimate is a constant it will, of course, have no correlation with variations in marketing efforts. But it seems reasonable to ask if national advertising affects these brand intercepts. It may be possible to address the question of whether differences in these intercepts across brands have a permanent component which can be related to differences in advertising, distribution, or other longer-term influences across brands.

8.2  Issues in Decision Support

8.2.1  Game Theory

The basic message in the brief discussion of game theory in Chapter 7 is that more development is needed of games which reflect the conditions we see in the data. What we see is that manufacturers try to push their products through the channels by offers of trade deals as well as pull the products through by national advertising. While advertising budgets used to greatly exceed promotion budgets, the balance has shifted in many firms. The optimal balance between push and pull strategies is an area of study which might well help manufacturers.

Retailers in general know that the trade-promotion offers they receive from manufacturers are also received by other retailers. But they do not know which offers will be accepted or when performance on those offers will occur. While the focus of analysis here has been on a grocery chain, the game among the retailers is at the level of a trading area (which may be conveniently specified by the local newspapers for grocery categories). At this level it is obvious that the onset of promotions must be random, for if one retailer knew another was scheduled for a major promotion there would be substantial advantage in a preemptive sale. The size of the advantage would, of course, be influenced by the frequency of promotions in the category, the length of time retailers have to perform on an offer, the length of the interpurchase period, and the household inventory policy for the category. Developing games which reflect all these influences will not be an easy task.

8.2.2  Expert Systems

While we recognize the desire on the part of managers and students of management to look for an expert - a guru to provide the answers would greatly simplify the tasks at hand - we are not very sanguine about the possibility of finding such expertise. This is a very young field and we are reminded of a passage from Henry Adams (Adams, Henry [1918], The Education of Henry Adams, New York: Random House, 1931):

What one knows is, in youth, of little moment; they know enough who know how to learn.

Thus our emphasis is on how one can learn from the vast amounts of data and the extensive development of models. Is there a way to structure the process of inquiry into these markets? Can we develop expert inquiry systems?

8.3  The Integration of Panel Data

The growing availability of consumer panels tied to store-level scanner data creates the opportunity to study many complex processes.

Disentangling the effects of forward buying, stockpiling, and increased consumption will ultimately be done using panel data. Totten and colleagues (Totten, John C. & Martin P. Block [1987], Analyzing Sales Promotion: Text and Cases, Chicago: Commerce Communications, Inc.; McAlister & Totten [1985]) have identified the consumer groups which contribute to these effects, but more research is needed to model the interrelations among these groups and how they are influenced by market forces.

This raises the fundamental issue of integrating panel data and store-level data. Connecting panel measurements and the corresponding store-level conditions to the sales results is the important step, as is emphasized in Moore and Winer [1987]. When subgroups have to be incorporated, as implied by Totten's research, the problem becomes more complex. We know the market conditions for each group since these conditions are homogeneous over groups within a week. If the identifying characteristics of the groups are assessable from the panel data, then a market-share model can be estimated at the group level. Oh yes, there is one thing missing. At this point we don't have a dependent measure. We don't know the market shares or sales by subgroup.

If we simply aggregate the panelists' choices by group, all the advantages of the store-level data are lost. Besides, experimentation with such an approach when IRI's Academic Database was first released convinced us that even with 1,000 families per city, aggregated panel-based market shares are too sparse to allow estimation of cross effects. The real opportunity is in integrating the two data sources, not in using aggregated panel data as if they were store-level data.

Using market shares computed from the panel data for each group, deriving group-level estimates of the corresponding store-level market shares should be a solvable problem. For example, one of the most likely segmentations would be to minimize within-group heterogeneity in mean purchase frequency. In Chapter 2 we emphasized that forecasting of brand sales and market shares would become more accurate if market shares are forecast for each segment and weighted by the mean purchase frequency for the segment to obtain the estimate of overall market shares (p. 44 above). Totten and Block [1987] showed that the heavy-user group was the most likely to increase consumption due to a promotion. Chapter 4 (see Table 4.1 and section 4.2) warns that if subgroups have heterogeneous market shares, as implied by Totten's finding, then aggregation can diminish our ability to reflect the underlying process. In this example our problem can be simplified by thinking that heavy and light users form two mutually exclusive and exhaustive groups. We know that a weighted sum of the group market shares must equal the store-level market shares. Such a constraint should make the solution more useful and may make the problem easier to solve.
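The adding-up constraint can be illustrated with a short sketch. It is not a method proposed in this book: it substitutes iterative proportional fitting (raking), a standard technique, to adjust panel-based group shares until their weighted sum matches the store-level shares. The function name, the group weights (here taken as each group's fraction of category volume, which might be built from segment size and mean purchase frequency), and all numbers are hypothetical.

```python
def rake_group_shares(panel_shares, weights, store_shares, iters=200, tol=1e-12):
    """Adjust panel-based group shares to be consistent with store-level shares.

    panel_shares[g][i]: share of brand i in group g from the panel (rows sum to 1)
    weights[g]:         group g's fraction of category volume (sums to 1)
    store_shares[i]:    store-level share of brand i (sums to 1)
    """
    # Start from the group-by-brand volume table implied by the panel.
    table = [[w * s for s in row] for w, row in zip(weights, panel_shares)]
    n_brands = len(store_shares)
    for _ in range(iters):
        # Scale each column to the store-level brand share.
        for i in range(n_brands):
            col = sum(row[i] for row in table)
            if col > 0:
                for row in table:
                    row[i] *= store_shares[i] / col
        # Scale each row back to the group's volume weight.
        worst = 0.0
        for g, row in enumerate(table):
            total = sum(row)
            worst = max(worst, abs(total - weights[g]))
            if total > 0:
                for i in range(n_brands):
                    row[i] *= weights[g] / total
        if worst < tol:
            break
    # Convert volumes back to within-group market shares.
    return [[v / weights[g] for v in row] for g, row in enumerate(table)]

panel = [[0.5, 0.3, 0.2],   # heavy users
         [0.3, 0.3, 0.4]]   # light users
weights = [0.6, 0.4]
store = [0.45, 0.30, 0.25]
adjusted = rake_group_shares(panel, weights, store)
implied = [sum(weights[g] * adjusted[g][i] for g in range(2)) for i in range(3)]
```

After adjustment, each group's shares still sum to one and the weighted sum over groups reproduces the store-level shares. Raking preserves the panel's pattern of relative differences between groups, which is only one of several defensible ways to impose the constraint.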

The constrained estimation of subgroup market shares allows us to calibrate asymmetric market-share models reflecting how the groups respond differently to market conditions. The estimation method could be used in the analysis of split-cable experiments. Market shares could be estimated based on the partitions in the experimental design. The analysis-of-covariance model from section 5.2.3 could be useful here. It may be that elaborations on the segmentation scheme could help integrate more and more of the television-viewing data now being collected in parts of panels.

Along with segment-level characteristics and records of viewing behavior, survey techniques can be used for subsamples of the panels. If the resulting interval-scale measures are transformed to zeta-scores or exp(z-scores), these measures can be used like any other variable in the market-share model. It is through this connection that we may assess if brand image or perceptual positioning can affect market shares.
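As a minimal sketch of this transformation, assume the zeta-score takes the form sqrt(1 + z^2) when the standardized score z is non-negative and 1/sqrt(1 + z^2) when it is negative, with standardization across brands using the population standard deviation. The function name and the ratings are hypothetical.

```python
import math

def zeta_scores(ratings):
    """Zeta-scores across brands for one interval-scale survey measure.

    Assumed form: sqrt(1 + z^2) if z >= 0, else 1 / sqrt(1 + z^2).
    """
    n = len(ratings)
    mean = sum(ratings) / n
    sd = math.sqrt(sum((r - mean) ** 2 for r in ratings) / n)
    out = []
    for r in ratings:
        z = 0.0 if sd == 0.0 else (r - mean) / sd
        root = math.sqrt(1.0 + z * z)
        out.append(root if z >= 0 else 1.0 / root)
    return out

# Mean image ratings for three brands on a hypothetical 1-to-9 scale.
zeta = zeta_scores([3.0, 5.0, 7.0])
```

The result is a positive, ratio-like variable equal to one at the mean, which is what allows an interval-scale survey measure to enter a multiplicative market-share model.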

With survey data connected in we can begin to ask questions about how consumer judgments of value or importance relate to purchase behavior. Cooper and Finkbeiner [1984] propose some basic models for the integration of consumers' judgments of importance into MCI models. This should make a much more realistic platform for asking if statistically estimated importance weights are superior to subjective estimation, or if the combination of the two is better still. (Cooper, Lee G. & Carl T. Finkbeiner [1984], "A Composite MCI Model for Integrating Attribute and Importance Information," in Thomas C. Kinnear (editor), Advances in Consumer Research, Volume XI, Provo, UT: Association for Consumer Research, 109-13.)

8.4  Market-Basket Models

If we have data on the total transactions of each consumer, someone is going to try to model them. Extrapolating the approach in this book to that task, we would divide the market basket into categories, model the total expenditures as we would a category-volume model, and model the shares among categories as we would a market-share model. Within each category we would have a nested pair of models for category volume and brand shares.

This illustrates two levels of what might turn out to be a more finely leveled scheme. But the principles are the same. If the category label is beverages, the subcategories might cross hot and cold with carbonated and noncarbonated. The hot, noncarbonated subcategory might be divided into coffee and tea, and one coffee sub-subcategory might be caffeinated, ground coffee. At some point we will get to the brand level we have illustrated in this book, and we will know the principles involved in modeling each level of categorization and connecting the results with their siblings and parents.
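The arithmetic of such a leveled scheme is simple once the volume at the top and the shares at each level are in hand. The sketch below assumes the shares at every level have already been estimated and merely pushes a total volume down the tree; the tree layout, the function name, the brand names, and all numbers are hypothetical.

```python
def leaf_sales(node, volume):
    """Distribute `volume` down a share tree; returns {leaf name: sales}."""
    children = node.get("children")
    if not children:
        return {node["name"]: volume}
    # Normalize so sibling shares sum to one at each level.
    total_share = sum(c["share"] for c in children)
    sales = {}
    for child in children:
        sales.update(leaf_sales(child, volume * child["share"] / total_share))
    return sales

beverages = {
    "name": "beverages",
    "children": [
        {"name": "cold", "share": 0.6},
        {"name": "hot", "share": 0.4, "children": [
            {"name": "tea", "share": 0.25},
            {"name": "coffee", "share": 0.75, "children": [
                {"name": "Brand X", "share": 0.5},
                {"name": "Brand Y", "share": 0.5},
            ]},
        ]},
    ],
}
sales = leaf_sales(beverages, 1000.0)
```

Each leaf's sales are the top-level volume multiplied by the shares along its path, so the leaves always add back up to the parent volume. In a full model each set of sibling shares would itself come from a market-share model and the top-level volume from a category-volume model.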

Such an analysis would provide the elasticities needed to drive shelf-space allocation models. While such an undertaking may be unreasonable at the present time, we believe such efforts will become more feasible as experience with these models grows and the diffusion of computing technology continues.