Predicting death from behavioral test performance.

Journal of Gerontology, 1978, Vol. 33, No. 5, 755-762

Predicting Death from Behavioral Test Performance¹

Jack Botwinick, PhD,² Robin West, BA,³ and Martha Storandt, PhD²

This study described several brief behavioral measures which, with further validation, could be useful in predicting the deaths of older adults within a five-year period following testing. Such tests can be used in routine biomedical examinations, alerting the physician to possible problems in the future. The study was based on a battery of 18 tasks as well as 8 measures of health, social activity, and demographic characteristics administered to 380 healthy men and women aged 60 to 89 years. Five years later, the scores of those who had subsequently died (N = 83) were compared with a matched sample of those still living. Thirteen of the 18 task performances significantly distinguished between those still living and those who died. Discriminative analyses were carried out, and the discriminative score cut-offs correctly classified 66% of the subjects as to survival status.

¹This research was supported in part by NIA program project grant, AG 00535.
²Dept. of Psychology, Washington Univ., St. Louis 63130.
³Dept. of Psychology, Vanderbilt Univ., Nashville 37240.

A growing body of research since the early 1960s indicates that psychological test performance of elderly people may be used to predict ensuing death: poor test performances may signal death within the next five years. If this be true, the research is of obvious importance: preventive and remedial efforts can be compared, life-styles may be adjusted.

Much of this literature has been reviewed in some detail in several sources, for example, Palmore and Jeffers (1971), Botwinick (1973), and Siegler (1975). Thus, further review is not necessary except to emphasize some important factors. To begin with, the data of Riegel and Riegel (1972) suggest that the prediction of survival is better with adults younger than 65 years. While it is not clear how generalizable these data are, they do indicate the possibility that the goal of accurate prediction based on older samples may not be obtainable.

As important, perhaps, are four methodological considerations. First, when predicting death on the basis of test results, two different concepts apply: One concept relates to a person's relative standing within his group at a particular time. If Person A makes a very poor score and Person B a very good one, Person A may be more likely to die than B within the next five years. The other concept is relative decline in test scores over a number of years. If Person A scores poorly at Test 1 but scores much the same years later at Test 2, and Person B scores well at Test 1 but poorly at Test 2, Person B may have the poorer potential for survival. Assuming the validity of the test measures for the subject samples investigated, these two concepts are not necessarily incompatible. In the first instance (relative standing on Test 1) it is possible to view the poor score of Person A as reflecting a decline that had already occurred. This decline, however, is one which the test-giver could not observe. This view becomes more tenable as the subjects for study are recruited to be homogeneous in educational, occupational, and social status, and thus expected to be relatively homogeneous in test scores. In the present study, scores at Test Time 1 were used to predict death approximately five years afterwards, and in addition, the predictions were made on the basis of changes in test scores over three testing sessions in the course of one year. Thus, the present study examined both relative standing on Test 1 and relative change in test scores in the prediction of life and death.

The second fact to recognize in the death-prediction literature relates to the type of subjects often studied. Many of the studies have dealt with institutionalized subjects or



others whose medical problems and custodial care needs are left unspecified. This is particularly so of the earlier studies (e.g., Berkowitz, 1965; Kleemeier, 1962; Lieberman, 1965). It is clear that medical status is crucial in predicting life and death and it is unfortunate that it has so often been given only superficial attention. The current study is seen as an improvement in that all subjects had a medical examination at the start and were found healthy for independent living. The subjects were not institutionalized, although at the time of testing most of them were planning to move into an apartment complex which provided a wider range of social service attention than is common in the typical apartment house. The moves to the apartment complexes had no negative bearing on survival (Wittels & Botwinick, 1974).

The third fact is that all too frequently prediction of death is made on the basis of test scores without reference to the age of the subject. If a 75-year-old scores more poorly or declines more than a 65-year-old, a prediction of earlier death for the 75-year-old does not necessarily reveal much about the predictive value of the test scores. While most studies are not as blatant as this in ignoring the subject's age, many fail to give sufficient weighting to age in the prediction equation (the study by Palmore & Cleveland, 1976, is a notable exception). The analyses of score-death relationships in the present study took into account the age of the subject and also were carried out independent of age. The justification for the latter analysis rests on the idea that functional ability, irrespective of calendar years, is potentially the more useful predictor.

The fourth fact relating to the death-prediction literature is that most studies seem to be retrospective in design. They tend to follow this pattern: (1) older adults were tested for reasons other than prediction of mortality; (2) some time afterwards, several subjects died; (3) only then, the idea of comparing the living with the dead on the basis of test scores was conceived, and statistical comparisons made. Even if such studies were carried out perfectly, there is the need to replicate them with "prospective prediction," rather than this type of "retrospective prediction." The present study may be seen as a step toward prospective prediction. This study was conceived for the purpose of making group predictions of life and death and not carried out as an afterthought. Nevertheless, cut-off or "danger-point" scores were not available for the individual tests so prospective predictions could not be made with regard to particular subjects. The prediction was in terms of people, not single individuals: those adults at the lower ends of the score distributions would be the ones who die soonest, as would be those whose scores dropped the most.

METHOD

Subjects. There were 732 applicants (187 men and 545 women) to two senior citizen apartment complexes (built with FHA 236 funds) located in the greater St. Louis area. The applicants had to meet government requirements for residence in a federally underwritten building (i.e., be 62 years of age or over [unless 100% handicapped or a relative of a person who did meet the age requirement] and have an income under $5200 for a single person or $5900 for two people). Applicants to one complex were primarily Jews, many of whom were immigrants to this country. Applicants to the other building were largely native-born, Christian Americans.

Although an effort was made to test all applicants, because of nonavailability and other uncontrollable factors, it was possible to administer individually a battery of tests and questionnaires to only 380. Those tested ranged in age from 60 to 89 years. Subsequently 290 of these tested applicants moved into the apartments; 90 did not. Earlier reports on a subset of these individuals revealed no statistically significant differences between movers and nonmovers in terms of age, education, socioeconomic status (SES), a medical status rating based on reports from each applicant's attending physician, or death rates (Storandt & Wittels, 1975; Wittels & Botwinick, 1974). Further, the number of deaths in the present sample of 380 was comparable for movers and nonmovers (χ² = 2.00, df = 1, p > .05). Thus, for the purposes of the present study, movers and nonmovers were pooled.

Although individuals from the two apartment complexes were comparable in terms of age and SES (Hollingshead, 1957), they were very different from each other in their socioreligious backgrounds, as described


above, different in education levels, F (1/378) = 27.37, p < .0001, and in medical status rating, F (1/378) = 5.59, p < .025. Those in Building 1 had less education (mean = 7.96) and poorer medical status (mean = 11.32) than did those in Building 2 (mean education = 10.03; mean medical status rating = 12.50). Thus, the two groups to whom the tests were given are differentiated by the reference, Building 1 (N = 231) and Building 2 (N = 149), in keeping with our previous reports.

Of these 380 persons, 83 died within five years of initial testing; 52 were from Building 1 (15 men, 37 women) and 31 were from Building 2 (14 men and 17 women). Hierarchical analyses of variance were conducted on the demographic variables of age, education and SES. The two factors in the analysis were building (ordered first) and status (living vs deceased). Although the living and deceased were comparable in terms of education and SES, the living subjects (N = 297) were significantly younger (mean age = 72.3) than those who had died (mean age = 74.9) within five years of testing, F (1/376) = 11.95, p < .001. The interaction between building and status was not significant in any of these analyses.

As a result of these analyses, a sample of subjects still living was matched with those deceased for age, sex and building (N = 83 pairs). These 166 subjects were examined in the major data analyses which follow.

Table 1. Tasks and Other Measures of the Study.

                                          Sample Size (N)
                                        Living
Tasks and Other Measures             Total   Matched   Died(a)

Demographic
 (1) Age (60-89 years)                297      83        83
 (2) Education level                  297      83        83
 (3) Socioeconomic status             296      82        83
Cognitive
 (4) WAIS Comprehension               291      83        81
 (5) WAIS Similarities                291      83        81
 (6) WMS(b) Paired Associates         260      82        69
 (7) WMS Visual Reproduction          258      81        70
 (8) Following Instructions           230      74        56
Perceptual
 (9) Bender-Gestalt                   266      81        72
(10) Hooper Visual Organization       267      82        74
Psychomotor
(11) WAIS Digit Symbol                271      80        72
(12) Crossing-off                     270      82        75
(13) Trailmaking A                    266      81        69
Personality
(14) Hostility Inventory              277      83        74
(15) Neuroticism Scale                273      83        68
(16) Zung Depression Scale            270      82        71
(17) Change Questionnaire             288      83        78
(18) Slow Writing                     256      80        67
(19) Life Satisfaction                277      83        76
(20) Control Rating                   275      83        75
Health
(21) Quantified Physician's Report    258      78        71
(22) Clinical Rating(c)               119      43        23
(23) Self-health Rating               278      83        75
Social Activities
(24) Pastimes                         288      83        79
(25) Clubs                            288   [illegible] [illegible]
(26) Offices Held                     288      83        79

(a) Died within 5 years after assessment
(b) Wechsler Memory Scale (WMS)
(c) Made 6-14 mo after other assessments on Building 1 residents only
Note: Missing scores were marked 0; the N of subjects with nonzero scores is indicated above.

Procedure. The tests and questionnaires used in this study may be seen in Table 1. Their description and brief reasons for choosing them may be found in two earlier publications (Storandt et al., 1975; Storandt & Wittels, 1975). Table 1 shows that a wide array of functions were assessed; these include demographic factors, cognitive, perceptual and psychomotor abilities, personality and morale factors, and health and social activities. It was not possible to test all subjects with each of the procedures. The actual number (N) tested with each procedure is shown in Table 1 for the total living sample, those still living and matched with the deceased on the basis of age, sex and building, and those who were deceased within five years after testing.

Many of the subjects were tested three times; some were tested twice. Of the 83 subjects who died by the time the data analyses were begun, 37 had been tested three times, 10 twice, and the remaining 36 only once. On the average, the second test was carried out 4 mo after the first one, and the third test was carried out about 8 mo after the second one. Thus, approximately one year elapsed between Test 1 and Test 3.

Data analyses were of two main types. One set compared Test 1 scores of those still living and those who had died. The other set compared change scores among those who had died. That is, it examined the changes in performance scores between Test 1, 2, and 3 among those with three testings.
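The sample bookkeeping in the Subjects section can be checked with a few lines of arithmetic. The sketch below is only an illustration: the counts are taken from the text, while the helper name `error_df` and the reading of the reported degrees of freedom (a two-group comparison for F (1/378), a 2 x 2 building-by-status design for F (1/376)) are assumptions, not something the authors state.

```python
# Counts as reported in the Subjects section.
N_TESTED = 380
BUILDING_N = {"Building 1": 231, "Building 2": 149}
DEATHS = {"Building 1": 52, "Building 2": 31}


def error_df(n_subjects: int, n_cells: int) -> int:
    """Error degrees of freedom for a between-subjects ANOVA (assumed: N minus cells)."""
    return n_subjects - n_cells


total_deaths = sum(DEATHS.values())   # 52 + 31
living = N_TESTED - total_deaths      # survivors at five years
death_rate = total_deaths / N_TESTED  # observed five-year mortality

print(sum(BUILDING_N.values()))       # 380, matches the number tested
print(total_deaths, living)           # 83 297
print(error_df(N_TESTED, 2))          # 378, as in F (1/378) for building differences
print(error_df(N_TESTED, 4))          # 376, as in F (1/376) for building x status
print(f"{death_rate:.3f}")            # observed proportion of deaths in the sample
```

Under these assumptions the reported degrees of freedom are consistent with all 380 tested subjects entering the demographic analyses.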



RESULTS

Initial scores. Pairs of living and deceased subjects were matched in terms of age, sex, and building in order to determine which procedures are useful in signalling death in comparable groups. A two-factor multivariate analysis of variance was employed on those procedures in Table 1 which required the subject's response or performance (since the purpose of this study was to determine the predictive value of such behavior). These measures included procedures numbered 4 through 20, and 23 in Table 1. The two factors included in the design were building (1 vs 2) and status (living vs deceased). Building differences were significant (Mult. F = 3.19, df = 18,145, p < .0001), as were Status group differences (Mult. F = 2.38, df = 18,145, p = .0025), with the interaction between building and status groups not statistically significant (p > .05). The building effect represents the disparate performance of the two study populations on these behavioral procedures. However, since building and status did not interact, the significant differences among the living and deceased may be interpreted without regard to the population from which the subjects were obtained, thus allowing greater generalizability of the predictive nature of poor performance on selected procedures.

Univariate analyses of variance revealed that 13 of the 18 performance measures distinguished between the living and dead: Digit Symbol (p = .0010), Bender-Gestalt (p = .0034), Trailmaking (p = .0004), Crossing-off (p < .0001), Following Instructions (p = .0014), Visual Reproduction (p = .0011), Paired Associates (p = .0004), Hooper (p = .0032), Neuroticism Scale (p = .0010), Zung Depression Scale (p = .0002), Self-Health Rating (p < .0001), Control Rating (p = .0043), Life Satisfaction (p = .0463). Those still living scored higher than those now deceased with respect to each of these performance measures. A similar analysis of the remaining, nonperformance variables in Table 1 indicated that the two status groups were different only with respect to Medical Rating (p = .0213) and Clinical Rating (p = .0002). Those who subsequently died were seen as less healthy and less vital.

The following are the ω²'s relating performance to the Status effect in the above analysis: Digit Symbol, 6%; Trailmaking, 7%; Bender-Gestalt, 5%; Crossing-off, 9%; Following Instructions, 5%; Visual Reproduction, 6%; Paired Associates, 7%; Hooper, 5%; Neuroticism Scale, 6%; Zung Depression Scale, 8%; Self-Health Rating, 9%; Control Rating, 4%; Life Satisfaction, 2%; Medical Rating, 3%; Clinical Rating, 8%. Clearly, while these procedures were effective in statistically differentiating status groups, prediction based on any one of these procedures would be poor. However, using these 13 measures in combination as predictors of death, a multiple correlation indicated potential usefulness (R = .44). A discriminant analysis of these data indicated that if appropriate coefficients were applied to the subjects' scores on these 13 procedures, 70% of the living and 63% of the deceased subjects in the sample would be correctly classified (Wilks' lambda = .8023, χ² = 34.687, df = 13, p < .001).

It should be noted that these analyses included subjects with missing scores and these were given a value of 0 to represent poor performance (i.e., those who could not complete the procedure were seen as performing poorly). A more conservative, although, perhaps, a less meaningful method would include only pairs of subjects with no missing data; this was done also. The multiple R was .45, with both 69% of the living and 69% of the deceased correctly classified by discriminant analysis (N = 102; Wilks' lambda = .7982, χ² = 21.074, df = 13, p = .071). Thus, essentially the same degree of prediction was obtained by both methods of analysis. However, the use of 0 scores to represent missing data is seen as the more useful practice since the number of individuals who could be screened by the battery is maximized.

As indicated, these analyses involved matched pairs of living and deceased subjects. A potentially more useful application of the predictive measures of this study involves discrimination of individuals of both sexes and of varying ages. Thus, analyses comparable to those described above conducted with the matched sample were also carried out for the total unmatched sample. Here the Status effect (i.e., living vs dead), approached significance (Mult. F = 1.56, df = 18,359, p = .0673), with the other effects comparable to those obtained in the analysis of matched pairs.
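The chi-square values reported with Wilks' lambda for the matched-sample discriminant analyses can be approximately recovered from lambda itself. The sketch below assumes Bartlett's chi-square approximation, chi-square = -(N - 1 - (p + g)/2) ln(lambda), with p predictors and g = 2 status groups; the article does not say which approximation was used, so this is a consistency check under that assumption. The function names are illustrative. For two groups, 1 - lambda equals the squared canonical correlation, which should agree with the reported multiple R.

```python
import math


def bartlett_chi_square(wilks_lambda: float, n_subjects: int,
                        n_predictors: int, n_groups: int = 2) -> float:
    """Bartlett's chi-square approximation for Wilks' lambda (assumed method)."""
    multiplier = n_subjects - 1 - (n_predictors + n_groups) / 2
    return -multiplier * math.log(wilks_lambda)


def implied_multiple_r(wilks_lambda: float) -> float:
    """For two groups, 1 - lambda is the squared canonical correlation."""
    return math.sqrt(1 - wilks_lambda)


# Matched sample, missing scores set to 0: 83 pairs = 166 subjects, 13 procedures.
chi_all = bartlett_chi_square(0.8023, n_subjects=166, n_predictors=13)
# Matched pairs with no missing data: N = 102, 13 procedures.
chi_complete = bartlett_chi_square(0.7982, n_subjects=102, n_predictors=13)

print(round(chi_all, 2))                     # 34.69 (reported: 34.687)
print(round(chi_complete, 2))                # 21.07 (reported: 21.074)
print(round(implied_multiple_r(0.8023), 2))  # 0.44 (reported R = .44)
print(round(implied_multiple_r(0.7982), 2))  # 0.45 (reported R = .45)
```

The small differences from the published chi-square values are what one would expect from lambda being rounded to four decimals.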


Examination of the univariate F-ratios⁴ revealed that only 8 of the performance measures listed in Table 1 differentiated the two unmatched status groups: WAIS Digit Symbol (p = .0014), Trailmaking (p = .0005), Visual Reproduction (p = .0180), Paired Associates (p = .0166), Crossing-off (p = .0043), Zung Depression Scale (p = .0239), Control Rating (p = .0458), and Self-Health Rating (p = .0007). When F-tests were carried out with the remaining variables in Table 1, the following also significantly differentiated the two status groups: Age (p = .0006) and Clinical Rating (p = .0033). The older subjects and those seen as less vital subsequently died at a higher rate.

Using these eight procedures in combination as predictors of death, a multiple correlation of .47 was obtained. This correlation was based on an equal number (N = 83) of deceased and living subjects, with the living subjects randomly selected from the pool of 297. This was to avoid attenuation of the correlation with unequal N of the two status groups. A discriminant analysis of these data correctly classified 71% of the living subjects and 64% of the deceased (Wilks' lambda = .7813, χ² = 39.486, df = 8, p < .0001). Thus, it was possible to differentiate to essentially the same extent those individuals who survived and those who did not, even when they were of varying ages.

⁴The "purist" correctly maintains that nonsignificant multivariate Fs should not be followed by univariate analyses unless there were specific hypotheses regarding the individual tests. The fact is, most (but not all) of the tests were selected because there was reason to think that they would discriminate between status groups. More important, however, is the fact that there were many more survivors than deceased, a skewed sample which could well have biased the results of the analyses. Given this, and given the generally exploratory nature of this five-year study, it might be argued that it is justified, that there is more to gain in carrying out univariate analyses following the multivariate F associated with a p value of .0673, than by possibly making an error of concluding "no effect."

Change scores. A set of variance analyses was carried out with only the deceased subjects to test for "terminal drop" or the relationship between mortality and test score decline. It compared subjects who completed three testings and subsequently died. Separate variance analyses were carried out for each test in Table 1, comparing performance scores among the three test sessions (the N associated with each test varied with a maximum of 37). In the analysis of the three testing sessions, only two procedures showed significant changes in performance from Test Session 1, to 2, to 3: Bender-Gestalt (p = .0060) and Hooper (p = .0197). However, both of these differences were opposite to the hypothesis that test score decline predicts death. The scores increased over the testing period. Thus, there was no evidence of "terminal drop" or decline within the first year of measurement as an indicant of ensuing death during the subsequent four years. (While most studies examine terminal drop over periods longer than examined here [e.g., five years], practical usefulness makes the one-year period of examination important. Be it practice or other reasons, the drop was not seen.)

DISCUSSION

One of the purposes of this study was to develop a battery of behavioral procedures which might be predictors of ensuing death among the elderly. It is believed that such a battery might one day be useful as part of routine biomedical assessments of the elderly patient. Just as a measurement of high blood pressure, for example, puts the physician on the alert, so might a behavioral indicator, or even more so, a behavioral indicator in combination with physiological ones. In part, it was for this potential practical utility that most of the procedures chosen were very brief, easy to administer by untrained personnel, and easy for the subject to take. The long, arduous, perhaps more reliable tests, were avoided. The 13 most promising in the matched sample procedure take about VA hours to administer, with the Hooper Visual Organization Test and the Neuroticism Scale taking about 20 min each. Perhaps future studies will show that these two procedures can be discarded without appreciable loss of prediction power. The eight procedures which were most promising in the unmatched sample take about 30 min to administer and very little time to score.

Two different multiple correlation and discriminant function analyses were made, each based on different assumptions. The less conservative analysis takes into consideration the fact that less complete data were available for those subjects who died. With this analysis, the multiple correlation based on the 13 procedures was .44; 66% of the matched sample




could be correctly classified (discriminant analysis) into those who would be living or dead within five years. With the eight procedures, 68% of the unmatched sample in this study could be correctly classified; the multiple correlation between the eight procedures and status was .47. This seems a promising beginning to the prediction of death among the elderly for the purpose of placing them under medical alert. This is especially so when it is recognized that by actuarial statistics, only 24% of those in the age group tested (60 and over) would be expected to die within five years. (This actuarial information was provided by the St. Louis County Health Department.) Twenty percent of the present sample died within the five-year period.

This study holds promise but it is only a beginning. Validation studies must follow. It is to be noted that individually the procedures do not predict well; they do so only in combination. Further, the present data suggest that prediction can be more accurate if limited to homogeneous populations. For example, status group differences in the unmatched sample accounted for only 1% of the variance in Crossing-off test performances; but when the two status groups were matched for age, sex, and building, Crossing-off test accountability increased from 1 to 9%.

Three of the eight procedures in the unmatched sample were simple psychomotor speed tasks with varying cognitive or perceptual demands (Digit Symbol, Crossing-off, Trailmaking A), two were learning and memory tasks of the Wechsler Memory Scale (Paired Associate Learning and Visual Reproduction), two were personality measures (Zung Depression Scale and Control Rating), and one was a health rating (Self-Health Rating). It is to be noted that the latter measure, self-rating of health status, was predictive of future life or death while a quantitative index of a physician's medical report was not. It is also to be noted that all of the subjects were diagnosed as well and capable of independent living at the time the study began.

Another rating was also predictive of death: clinical rating. This rating was made by evaluators, not the subjects themselves. It was a rating based on the degree of vitality the subject displayed during a very brief interview. The interview and rating were made 6-14 mo following initial testing, thus the period of prediction was shorter than for the performance procedures.

The three psychomotor tasks (Digit Symbol, Trailmaking A and Crossing-off) form a logical trinity. Each of these tasks is made up of a single sheet of paper with items of information on it. As the subject completes one item he moves to the next. The more quickly he works, the more rapid the functional sequencing of information (Botwinick & Storandt, 1974). It is possible that these three simple speed tasks differentiate the status groups by reflecting the functioning of the neural structure. For example, Hicks and Birren (1970), after a review of the literature, concluded "Damage or dysfunction of the basal ganglia may be the basis for the psychomotor slowness" of "aged and brain damaged subjects."

The Trailmaking task is one part of a test of brain damage in common use in many clinics in this country. While brain damage was not presumed to be present in the subject samples studied, it is obvious that people do vary on some dimension of brain function. The WAIS Digit Symbol task is used even more often. This task measures more than just speed. Storandt (1976) showed that the coding or cognitive dimension of this task differentiated two age groups at least as well as the copying or simple motor dimension. The Crossing-off test has now twice displayed a surprising sensitivity. In addition to the present study, performance on this task was related to the clinical ratings made some months later (Storandt et al., 1975). Thus, the Crossing-off test is proving to be most promising; 9% of its variance was accounted for by status when age, sex and building were matched. Forerunners of this simple procedure differentiated various age groups and also differentiated "normal" elderly from "senile" elderly (Birren & Botwinick, 1951).

The sparse data available in the established literature suggest that it is primarily performance on tests of verbal skills, and not speed, that is related to ensuing death. For example, in examining change scores (the decline in scores, not initial test values), Blum et al. (1973) concluded that decline in speed "may represent a general concomitant of aging and is not a predictor of mortality" but "decline on certain tests of cognitive functioning is correlated with mortality." Two verbal skills tasks and one psychomotor task in that study were


particularly predictive of death: Similarities, Vocabulary, and the Digit Symbol task. In similar fashion, Birren (1968) compared single test session scores between elderly men who survived five years after testing and those who did not. He suggested that loss of speed is an age function, but that diminished stores of verbal information are disease-related and thus death-related. The present study suggests otherwise. It pointed to only one verbal task as important: Paired Associates Learning. It also pointed to a nonverbal recall task (Visual Reproduction). The WAIS Similarities and Comprehension procedures, however, were not predictive of death.

Two personality scales also differentiated status. The Zung Depression Scale pointed to ensuing death, and this is compatible with the relocation literature of nursing home residents. Miller and Lieberman (1965) showed that depressed residents had lower survival potential. The other personality scale was a self-rating of "What degree of control do you feel you have over things?" The rationale was simply that people who feel in control might be those who feel well, who are able, and who can look forward to the future with contentment and possibly productivity. Whatever the basis, this procedure also differentiated status groups.

The last differentiating procedure was also a self-rating (of health). Like the Control Rating, the scale was made of a SV^-inch horizontal line drawn with a zero at one end (poor health) and a 10 at the other (excellent health), with slash marks dividing equal units in between. The subject simply circled the number "showing how you rate yourself in health." Not only did this self-rating differentiate the two status groups, but as indicated, it did so better than a quantified physician's report. It is not surprising that health is a factor in survival; it may be surprising that such a simple self-rating could be relatively correct in indicating this. Perhaps the data of Maddox and Douglass (1973) are relevant here. They found persistent positive congruence between self-ratings of health and physician's ratings of health in longitudinal investigations. Moreover, the self-ratings had prediction potential for future ratings by physicians.

Five additional procedures were predictive of death when subjects were matched for age and sex. Two were tests of brain damage (Hooper and Bender-Gestalt), one was designed to assess concentration ability (Following Instructions), and two were personality scales (Life Satisfaction and Neuroticism Scale). It is of interest here that the physician's report was also predictive of death. It was predictive with the matched sample but not with the unmatched. Perhaps the physicians took into account the age and sex of the subject when making their judgments regarding health. Perhaps they judged the subject as healthy when old, when they might have judged otherwise with a younger adult.

No terminal decline in scores was found within one year of initial testing. Only the initial score was useful in prediction. In terms of future research, the initial test scores discussed here require further validation before they can be recommended for routine use as guides in assessment. At least eight differentiating tasks are quick and easy to administer and worthy of such validation efforts.

SUMMARY

A battery of 18 tasks was given to 380 healthy men and women aged 60 to 89 years, and in addition, eight health, social activity and demographic assessments were made. Five years later, the scores of all these were compared between those still living and those who had subsequently died. These two groups, living and dead, were referred to as status groups. Eight of the 18 task performances significantly distinguished between status groups, and when the subjects of the two groups were matched for age and sex, 13 task performances were significantly distinguishing. The multiple correlation between the eight task performances and status was .47; the multiple correlation between the 13 task performances and status was .44. Discriminative analyses were carried out with the eight task performances and with the 13. Based on discriminative score cut-offs with the eight tasks, 68% of the subjects were correctly classified; based on the 13, 66% were correctly classified. Most all these tasks are very brief, easy to administer by even untrained personnel and easy for the subject to take. Accordingly, the tasks warrant further validation efforts for



the purpose of their practical use as adjuncts in routine biomedical assessments of elderly adults. These behavioral measures are seen as potentially valuable in alerting the physician to trouble ahead.
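The overall classification rates quoted in the Summary (66% and 68%) follow from the per-group rates given in the Results, since both discriminant analyses used 83 living and 83 deceased subjects. The sketch below simply recombines those figures; the function name is illustrative, and the 50% baseline (guessing with two equal-sized groups) is an added reference point, not a figure from the article.

```python
def overall_hit_rate(rate_living: float, rate_deceased: float,
                     n_living: int, n_deceased: int) -> float:
    """Pool per-group classification rates into one overall proportion correct."""
    correct = rate_living * n_living + rate_deceased * n_deceased
    return correct / (n_living + n_deceased)


# 13 procedures, matched sample: 70% of living and 63% of deceased correct.
matched = overall_hit_rate(0.70, 0.63, n_living=83, n_deceased=83)
# 8 procedures, unmatched sample (83 living drawn at random): 71% and 64%.
unmatched = overall_hit_rate(0.71, 0.64, n_living=83, n_deceased=83)

CHANCE = 0.5  # expected hit rate from guessing when the two groups are equal in size

print(f"{matched:.3f}")    # about 0.665, reported as 66%
print(f"{unmatched:.3f}")  # about 0.675, reported as 68%
print(f"{matched - CHANCE:.3f}", f"{unmatched - CHANCE:.3f}")  # gain over guessing
```

Because the published per-group percentages are already rounded, the pooled values land within a point of the reported 66% and 68% rather than matching them exactly.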

REFERENCES


Berkowitz, B. Changes in intellect with age: IV. Changes in achievement and survival in older people. Journal of Genetic Psychology, 1965, 107, 3-14.
Birren, J. E. Increment and decrement in the intellectual status of the aged. Psychiatric Research Report, 1968, 25, 207-214.
Birren, J. E., & Botwinick, J. The relation of writing speed and age and the senile psychoses. Journal of Consulting Psychology, 1951, 15, 243-249.
Blum, J. E., Clark, E. T., & Jarvik, L. F. The New York State Psychiatric Institute study of aging twins. In L. F. Jarvik, C. Eisdorfer, and J. E. Blum (Eds.), Intellectual functioning in adults. Springer Publ. Co., New York, 1973.
Botwinick, J. Aging and behavior. Springer Publ. Co., New York, 1973.
Botwinick, J., & Storandt, M. Memory, related functions, and age. Charles C Thomas, Springfield, IL, 1974.
Hicks, L. H., & Birren, J. E. Aging, brain damage and psychomotor slowing. Psychological Bulletin, 1970, 74, 377-396.
Hollingshead, A. B. Two-factor index of social position. 1965 Yale Station, New Haven, CT, 1957. (Mimeo)
Kleemeier, R. W. Intellectual change in the senium. Proceedings of the social statistics section of the American Statistical Assoc., 1962, 290-295.
Lieberman, M. A. Psychological correlates of impending death: Some preliminary observations. Journal of Gerontology, 1965, 20, 181-190.
Maddox, G. L., & Douglass, E. Self-assessment of health: A longitudinal study of elderly subjects. Journal of Health and Social Behavior, 1973, 14, 87-93.
Miller, D., & Lieberman, M. A. The relationship of affect state and adaptive capacity to reactions to stress. Journal of Gerontology, 1965, 20, 492-497.
Palmore, E., & Cleveland, W. Aging, terminal decline and terminal drop. Journal of Gerontology, 1976, 31, 76-81.
Palmore, E., & Jeffers, F. C. (Eds.). Prediction of life span. D. C. Heath Co., Lexington, MA, 1971.
Riegel, K. F., & Riegel, R. M. Development, drop, and death. Developmental Psychology, 1972, 6, 306-319.
Siegler, I. C. The terminal drop hypothesis: Fact or artifact? Experimental Aging Research, 1975, 1, 169-185.
Storandt, M. Speed and coding effects in relation to age and ability level. Developmental Psychology, 1976, 12, 177-178.
Storandt, M., & Wittels, I. Maintenance of function in relocation of community-dwelling older adults. Journal of Gerontology, 1975, 30, 608-612.
Storandt, M., Wittels, I., & Botwinick, J. Predictors of a dimension of well-being in the relocated healthy aged. Journal of Gerontology, 1975, 30, 97-102.
Wittels, I., & Botwinick, J. Survival in relocation. Journal of Gerontology, 1974, 29, 440-443.
