Conditions under which index models are useful: Reply to bio-index commentaries
with Scott Armstrong
Journal of Business Research. (Online first)
Graefe, A. & Armstrong, J. S. (2010). Conditions under which index models are useful, Journal of Business Research (forthcoming).
The commentators raise important issues that were not adequately addressed in our paper. Voss (2010) argues that (1) many of the variables used in the bio-index might have no causal relationship with the election outcome, and (2) the bio-index might yield a poor selection of candidates, as it does not consider candidates’ personalities, their stands on political issues, or their likely performance. More generally, Cote (2010) is concerned with the suitability of the index method for other problems such as business decision-making.
The commentaries help to explain the conditions under which one can expect the bio-index in particular, and the index method in general, to be valuable for forecasting. Here, we provide a more complete summary of the key conditions that favor the index method and expand on the procedures one should use when developing index models.
We also address the concern of selecting inferior candidates when using the bio-index as a nomination helper. Political decision-makers should not use the bio-index as a stand-alone method but should combine forecasts from a variety of different methods. In addition, we suggest the use of a new bio-index based on variables that relate to the question of how a candidate will perform as a leader rather than how he or she will emerge as a leader. Such a model could help voters and political parties select the best candidate.
Key conditions of the index method
The index method is an alternative to multiple regression models in situations with small samples and many variables. The usefulness of multiple regression models for prediction depends on the availability of valid and reliable quantitative data relative to the number of causal variables. Multiple regression analyzes historical data to determine the variable weights that provide the best fit. As noted in Armstrong and Graefe (2010), given non-experimental data, the literature recommends a ratio of 100 observations per predictor variable for using multiple regression models to draw conclusions and make forecasts about human behavior.
In contrast to regression, the index method is suitable for situations in which a large number of causal variables are important and for which one can, at least subjectively, assess the directional effect of each variable on the outcome. Rather than estimating relationships from the data, it uses prior knowledge about the problem to develop the forecasting model.
The index method is based partly on the idea of unit weighting. That is, all variables are assumed to be equally important until proven otherwise. This avoids the problem of spurious effects that occur with non-experimental data. When providing ex ante forecasts, unit weights have often been found to be more accurate than weights estimated by regression using exactly the same data. This is especially likely when weights are estimated from non-experimental data.
Using simulated data, Einhorn and Hogarth (1975) found unit weighting to be more accurate than regression when the sample was small and the number of, and inter-correlation among, predictor variables was high. Empirical studies support this finding. In analyzing published data in psychology, Schmidt (1971) found unit weighting to be more accurate than regression weights. A review of the literature (Armstrong 1985, p. 230) found unit weights to be slightly less accurate in three studies (for academic performance, personnel selection, and medicine) but more accurate in five (three on academic performance, and one each on personnel selection and psychology).
Czerlinski et al. (1999) compared the methods for 20 selection problems (including psychological, economic, environmental, biological, and health problems), for which the number of predictor variables varied between 3 and 19 (average: 7.7). Most of these examples were taken from statistical textbooks where they were being used to demonstrate the application of multiple regression analysis. Not surprisingly, when calculating in-sample forecasts, multiple regression forecasts were more accurate (77% correct) than forecasts from a unit-weight model (73% correct). However, when making out-of-sample predictions, the unit-weight model forecasts were slightly more accurate (69% correct) than the multiple regression forecasts (68% correct).
The index method differs from the models in the unit weights literature in that index models are not limited to using only the set of variables and data that are available for regression analysis. Rather, the method draws upon the cumulative knowledge of a problem, which might come from experts’ domain knowledge or from prior empirical studies. Index models can easily accommodate new knowledge. In effect, index models are “knowledge models.”
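The unit-weighting idea behind an index score can be illustrated with a minimal sketch. The variable names, the favorable codings, and the two candidates below are hypothetical and are not taken from the bio-index; the sketch only assumes that prior knowledge gives a favorable direction for each variable and that every variable counts equally.

```python
# Hypothetical illustration of unit weighting; variables and codings are
# made up for this sketch and are not the bio-index variables.

def index_score(candidate, directions):
    """Count the variables on which the candidate has the favorable value.

    Every variable gets a weight of one (unit weighting). A variable that
    is missing for the candidate contributes nothing to the score.
    """
    score = 0
    for variable, favorable_value in directions.items():
        if candidate.get(variable) == favorable_value:
            score += 1
    return score


# Prior knowledge: for each variable, the value expected to help a candidate.
directions = {
    "has_written_book": True,
    "held_prior_office": True,
    "involved_in_scandal": False,
}

candidate_a = {
    "has_written_book": True,
    "held_prior_office": True,
    "involved_in_scandal": True,
}
candidate_b = {
    "has_written_book": False,
    "held_prior_office": False,
    "involved_in_scandal": False,
}

scores = {
    "A": index_score(candidate_a, directions),
    "B": index_score(candidate_b, directions),
}
# The candidate with the higher index score is the predicted winner.
predicted_winner = max(scores, key=scores.get)
```

Because the model is only a list of variables with stated directions, new knowledge can be accommodated by adding an entry to `directions`; no weights need to be re-estimated.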
Types of problems
The conditions favoring the index method are common in selection problems such as choosing political and job candidates, choosing sites for retail outlets, choosing between potential marriage partners, or choosing between contending advertising proposals. As we showed with the bio-index model (Armstrong & Graefe 2010), forecasts can also be made about the relative performance of alternatives. Where sufficient historical data are available on a quantitative dependent variable and causal variable values can be assessed, a model estimated by simple linear regression against index scores can be used to produce quantitative forecasts such as the percentage vote-share of candidates in an election.
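A simple linear regression of a quantitative outcome on index scores can be sketched as follows. The data are invented for illustration (they are not historical election results), and the choice of predictor, a hypothetical index-score difference between two candidates, is an assumption of this sketch rather than the specification used in our paper.

```python
# Hypothetical data: index-score difference (candidate minus opponent) and
# the candidate's vote share in percent. These numbers are made up.
score_differences = [-2, -1, 0, 1, 2]
vote_shares = [45.5, 48.5, 50.0, 52.0, 54.0]


def fit_simple_regression(xs, ys):
    """Ordinary least squares for one predictor: y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_simple_regression(score_differences, vote_shares)

# Forecast the vote share for a new (hypothetical) index-score difference.
new_difference = 3
forecast = intercept + slope * new_difference
```

Only two parameters are estimated here, whatever the number of variables that enter the index, which is why this step remains feasible with few historical observations.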
Selection of variables
When building an index model, use prior knowledge to prepare a list of predictor variables and to state each variable’s directional influence on the outcome. This prior knowledge can come from empirical evidence or expert domain knowledge. Results from experiments are especially useful. If possible, draw on findings from meta-analyses of experimental studies.
If no data or prior studies are available, judgment can be used to assess the variables. In such cases, use structured approaches such as the Delphi method to combine judgments from several experts. If prior knowledge is ambiguous or contradictory and thus does not allow for estimating a variable’s directional influence on the outcome, do not include the variable in the model.
Index models as decision aids
The primary advantage of index models is that they can be designed so that decision-makers can act on the forecasts. In our case, political parties can use the bio-index to inform their decision about whom to nominate. By comparison, traditional econometric models provide limited or no advice with respect to questions such as what type of candidate a party should nominate or what issues should be stressed in the campaign.
Validity of variables
Voss was concerned about the validity of some of the variables included in the bio-index. The composite of variables is indeed easy to criticize as, for many variables, we had little prior evidence to draw upon. Thus, we sometimes had to rely on weak domain knowledge for selecting and coding the variables. There is certainly much more that might be learned about the validity of some of the variables in our model.
The index method is especially valuable in environments with many compensatory variables. In such an environment, there is no single variable that is more important than any combination of other variables. Thus, candidates can compensate for disadvantages on one variable by scoring favorably on another. For example, the fact that a candidate has not written a book will not, on its own, cost that candidate the election.
In contrast, the index method is less useful for constructing forecasting models that include non-compensatory variables. In some situations the importance of one variable might be greater than the importance of all other variables put together. In such an environment, the take-the-best heuristic (Gigerenzer & Goldstein 1996), which makes predictions based on a single piece of information, can be used. For example, we used the take-the-best heuristic to develop a forecasting model that predicts election outcomes based on how voters expect the candidates to handle the single most important issue facing the country. The forecasts from the take-the-best model were almost as accurate as an index model based on all the issues (Graefe & Armstrong 2010a).
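The general form of the take-the-best heuristic can be sketched as below. This is a generic illustration of the heuristic as described by Gigerenzer and Goldstein (1996), not the election model referred to above; the cue names, their ordering, and the 0/1 codings are hypothetical.

```python
def take_the_best(option_a, option_b, cues_by_validity):
    """Compare two options cue by cue, starting with the most valid cue.

    The first cue on which the options differ decides the prediction; all
    remaining cues are ignored (non-compensatory). Returns "a", "b", or
    None if no cue discriminates between the options.
    """
    for cue in cues_by_validity:
        value_a, value_b = option_a[cue], option_b[cue]
        if value_a != value_b:
            return "a" if value_a > value_b else "b"
    return None


# Hypothetical cues, ordered from most to least important.
# 1 = voters favor this candidate on the issue, 0 = they do not.
cues_by_validity = ["most_important_issue", "second_issue", "third_issue"]

candidate_a = {"most_important_issue": 1, "second_issue": 0, "third_issue": 0}
candidate_b = {"most_important_issue": 0, "second_issue": 1, "third_issue": 1}

prediction = take_the_best(candidate_a, candidate_b, cues_by_validity)
```

In this made-up example, candidate B is favored on two of the three cues and would win under unit weighting, yet take-the-best predicts candidate A because the first cue already discriminates.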
Combining forecasts
We do not recommend using the bio-index as a stand-alone method for forecasting elections. The approach evaluates only one dimension of a candidate. Other factors clearly matter. For example, we developed two models that provide accurate forecasts of the winner in U.S. presidential elections based on information about how voters expect the candidates to handle the issues facing the country (Graefe & Armstrong 2010a, 2010b). Similarly, we expect further advances by developing forecasting models that incorporate information about candidates’ personalities. Forecasts from different models that draw on different information can be combined to get a more complete picture of the future. Such a combined forecast would also conform to Simonton’s inferential framework for judging presidential candidates. Simonton (1993) argues that voters evaluate candidates along three dimensions when deciding whom to vote for: performance, policy, and personality.
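One simple way to combine forecasts from models that draw on different information is an equal-weight average, sketched below. Equal weighting is an assumption of this sketch, and the model names and forecast values are hypothetical.

```python
def combine_forecasts(forecasts):
    """Equal-weight average of the forecasts from several models."""
    return sum(forecasts.values()) / len(forecasts)


# Hypothetical vote-share forecasts (in percent) from three different models.
forecasts = {
    "biographical_model": 52.0,
    "issues_model": 49.0,
    "personality_model": 51.0,
}

combined = combine_forecasts(forecasts)
```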
Selecting the best candidate
Bio-indexes aim to answer the question of who will win, not who should win. Thus, we selected variables that were expected to have an influence on leader emergence but that were not necessarily related to leader performance. For example, while being clean-shaven may improve how voters perceive a candidate, a beard would probably not harm the performance of the candidate once elected.
Voss raises the concern that candidates themselves might take action to improve their biographical score, for example by wearing glasses or getting plastic surgery. We do not expect candidates’ attempts to do well on each variable to pose a serious problem, as there will still be variables on which the candidates differ. In addition, the bio-index model can use relative measures (e.g., who published the most popular book). Even if the candidates were able to gain equality with one another, it would be beneficial to reduce the impact of these variables, most of which have nothing to do with the competency of a president.
As noted by Voss, because bio-indexes do not provide evidence on which candidate would be best for the country, they could lead to the selection of an inferior president. This could be addressed by developing an index model based solely on variables that are known to have an impact on performance (e.g., intelligence that exceeds a certain cut-off level). Such a model could be a useful decision tool for voters, as it could aid in deciding which candidate to vote for.
References
Armstrong, J. Scott. Long-range forecasting: From crystal ball to computer. New York: John Wiley, 1985.
Armstrong, J. Scott, Graefe, Andreas. Predicting elections from biographical information about candidates, Journal of Business Research 2010; XX.
Cote, Joseph A. Predicting elections from biographical information about candidates: A Commentary Essay, Journal of Business Research 2010; XX.
Czerlinski, Jean, Gigerenzer, Gerd, Goldstein, Daniel G. How good are simple heuristics? In: Gigerenzer, Gerd, Todd, Peter M. (Eds.), Simple heuristics that make us smart, New York: Oxford University Press, pp. 97-118, 1999.
Einhorn, Hillel J., Hogarth, Robin M. Unit weighting schemes for decision-making, Organizational Behavior & Human Performance 1975; 13(2): 171-192.
Gigerenzer, Gerd, Goldstein, Daniel G. Reasoning the fast and frugal way: Models of bounded rationality, Psychological Review 1996; 103(4): 650-669.
Graefe, Andreas, Armstrong, J. Scott. Predicting elections from the most important issue: A test of the take-the-best heuristic, Journal of Behavioral Decision Making 2010a; forthcoming.
Graefe, Andreas, Armstrong, J. Scott. Forecasting elections from voters’ perceptions of candidates’ ability to handle issues, Under review 2010b. Available at: http://dl.dropbox.com/u/3662406/Articles/PollyIssues.pdf
Schmidt, Frank L. The relative efficiency of regression and simple unit predictor weights in applied differential psychology, Educational and Psychological Measurement 1971; 31(3): 699-714.
Simonton, Dean K. Putting the best leaders in the White House: Personality, policy, and performance, Political Psychology 1993; 14(3): 537-548.
Voss, K. E. Voss wins the presidency! A commentary essay on “Predicting elections from biographical information about candidates: A test of the index method”, Journal of Business Research 2010; XX.



