Psychiatry Research, 44:93-106
Elsevier

Differential Validity of Psychometric Tests in Dementia of the Alzheimer Type

Ralf Ihl, Lutz Frölich, Thomas Dierks, Eva-Maria Martin, and Konrad Maurer

Received February 25, 1992; revised version received June 24, 1992; accepted September 5, 1992.

Abstract. Forty-nine patients with a clinical diagnosis of probable dementia of the Alzheimer type underwent an extensive test battery designed to evaluate cognitive deficits according to NINCDS/ADRDA criteria. All patients demonstrated signs of impairment on this test battery. One day later, they were administered a second test battery that consisted of the Mini-Mental State Examination (MMSE), the Alzheimer’s Disease Assessment Scale (ADAS), the SKT test (SKT), and the Brief Cognitive Rating Scale (BCRS) to assess the construct validity, sensitivity, and possible shortcomings of these tests. A control group of 47 age-matched persons was administered the same test battery to allow a comparison with reference values from other studies. Due to the design of the study, values of controls and patients did not overlap. Intercorrelations in patients were above 0.65 (p < 0.05 after Bonferroni correction) for all four cognitive tests. The ADAS and BCRS appeared to document the whole course of the disease in the patients studied. The best differentiation with the SKT test could be obtained in mild to moderate dementia; however, due to the test’s construction, a floor effect demonstrated its limitations in the case of severe dementia. Results obtained with the MMSE indicated the contrary: a ceiling effect showed its lack of differentiation in mild dementia. Therefore, a combination of tests should be used in the evaluation of cognitive deficits in the course of dementia of the Alzheimer type.

Key Words. Alzheimer’s disease, cognition, psychology, clinical evaluation, diagnosis.

Numerous psychometric tests, rating scales, and self-assessment batteries are used in the evaluation of dementia of the Alzheimer type (DAT) and other dementias. Although many tests are available, only a few of them are well validated (Siu, 1991). Furthermore, it is unclear whether the new tests that are continually being introduced offer specific improvements relative to standard instruments such as those recommended by Yesavage et al. (1988). Before more new tests are constructed or combined into test batteries, more data should be collected on existing tests to evaluate their advantages and shortcomings. Little is known, for example, about their validity in different stages of dementia. Such information would be helpful in the comparison of test results for “early” versus “late” dementia.

Ralf Ihl, M.D., is Staff Psychiatrist and Psychologist; Lutz Frölich, M.D., is Staff Psychiatrist; Thomas Dierks, M.D., is Research Assistant; Eva-Maria Martin is Staff Psychologist; and Konrad Maurer, M.D., is Professor of Psychiatry, Department of Psychiatry, University of Würzburg. (Reprint requests to Dr. R. Ihl, Department of Psychiatry, University of Würzburg, Füchsleinstr. 15, D-8700 Germany.) 0165-1781/92/$05.00 © 1992 Elsevier Scientific Publishers Ireland Ltd

Some of the most frequently used tests are believed to measure the severity of the disease or to divide the course of the disease into severity stages. Many studies that have attempted to provide external validation for the DAT diagnosis through the use of physiological measures such as positron emission tomography, computed tomography, and quantitative electroencephalography have been carried out on the assumption that existing psychometric tests are valid. However, the tests may not be equally useful in the evaluation of different stages of the disease. Ceiling and floor effects could be one type of methodological factor limiting the usefulness of some tests.

The present study was designed to evaluate the applicability of various standard tests to the assessment of DAT in different stages. To identify a group of DAT patients for study, it was necessary to use independent tests to ensure that interference with results would be kept to a minimum. Therefore, we used an independent test battery that provided the information needed to make NINCDS/ADRDA diagnoses of DAT (McKhann et al., 1984), but deliberately did not include in the initial assessment those tests whose construct validity was to be the focus of study. The tests investigated were the Mini-Mental State Examination (Folstein et al., 1975), the Alzheimer’s Disease Assessment Scale (Mohs et al., 1983; Rosen et al., 1984), the SKT test (Erzigkeit, 1989), and the Brief Cognitive Rating Scale (Reisberg et al., 1983a, 1983b). The objectives of the study were (1) to determine whether there are floor and ceiling effects and (2) to compare the sensitivity of these tests in detecting early dementia, clinically diagnosed as described above. We also attempted to evaluate the score structure of the above tests as related to the severity of the disease. Such information would be useful in improving measures designed to describe the course of the disease and to monitor the results of clinical drug trials.

Methods

Subjects. Forty-nine patients with dementia (30 women, 19 men; see Table 1) from the Department of Psychiatry, University of Würzburg, were selected for study. They had a mean age of 70.4 years (SD = 10.5, range = 45-84). For descriptive purposes, 49 age-matched controls were included (Table 1). They had a mean age of 68.9 years (SD = 9.9, range = 51-87). Control subjects were required to be free of centrally acting drugs and free of illnesses affecting the central nervous system. Controls were recruited from a day house center in Würzburg and a dance circle for elderly persons. The patients fulfilled the NINCDS/ADRDA criteria for probable DAT (McKhann et al., 1984). They were hospitalized for a period of 2-3 weeks for diagnostic evaluation. Diagnostic assessment included medical history; physical, neurological, and psychiatric examinations; and routine laboratory tests (including thyroid hormone levels, vitamin B12, and folate). The cognitive function of the patients was investigated with a group of neuropsychological tests primarily used to provide confirmatory evidence for the diagnosis of dementia. The tests that were the subject of the present investigation were not included to avoid confounding study results. The test battery consisted of the Pictures and Objects Test (POT; McCarthy et al., 1981), the Shopping List Task (SLT; McCarthy et al., 1981), subtests of the Wechsler Adult Intelligence Scale (Wechsler, 1964), the Names and Figures Test (NFT; Crook et al., 1986), and the Sandoz Clinical Assessment Geriatric (SCAG; Shader et al., 1979). These tests are either specified in the article describing the NINCDS/ADRDA criteria or assess the functions tapped by the criteria (McKhann et al., 1984). All patients showed signs of impairment on this test battery, and all were included in the study. The modified Hachinski ischemic score (score < 4) was used to exclude patients with multi-infarct dementia (Rosen et al., 1980). The differential diagnosis between DAT and multi-infarct dementia was further validated by computed tomographic examinations of all patients, which showed only cerebral atrophy, ventricular dilatation, and no more than one lacunar infarction. In no case were there territorial infarctions. Furthermore, topographic maps of electroencephalographic activity and acoustically evoked potentials were consistent with a diagnosis of DAT (Ihl et al., 1989b; Maurer et al., 1988). The electroencephalographic measures showed a slowing and an anteriorization of the peak frequency, an increase in slow wave activity, and typically an anteriorization and decreased amplitude of the P300 wave. In most patients, single photon emission computed tomography with technetium-99m hexamethyl-propyleneamineoxime (HMPAO-SPECT) was performed and did not reveal signs of multifocal flow deficits. Typically, temporoparietal and frontal flow deficits were found (Frölich et al., 1989; Ihl et al., 1989a).

Table 1. Characteristics of patients with dementia of the Alzheimer type and of the age-matched control group

Parameter                                  Patients       Controls
n                                          49             47
Age (yr), mean ± SD                        70.4 ± 10.5    68.3 ± 9.9
Range (yr)                                 45-84          51-87
Sex (M/F)                                  19/30          14/33
Duration of the disease (yr), mean ± SD    3.2 ± 1.2      0
Range (yr)                                 1-6            0

Patients’ scores on inclusion tests        Mean    SD      Minimum   Maximum
Shopping List Task                         16.7    9.2     4         43
Pictures and Objects Test                  15.4    8.1     4         38
Names and Figures Test                     3.2     2.1     0         10
Block Design                               82.3    11.0    70        106
Digit Span                                 5.5     2.3     0         12
Sandoz Clinical Assessment Geriatric       58.0    10.1    31        76

Note. BCRS = Brief Cognitive Rating Scale. SKT = SKT Test. MMSE = Mini-Mental State Examination. ADAS-C = Alzheimer’s Disease Assessment Scale-Cognitive.

Neuropsychological Assessment. The Brief Cognitive Rating Scale (BCRS; Reisberg et al., 1983a, 1983b, 1985; Reisberg and Ferris, 1988) was used in staging of the disease. It subdivides the disease process into seven stages from normal mental capacities (stage 1) to very severe deficit (stage 7). The first version of the BCRS included five subscales or axes (concentration, short-term memory, long-term memory, orientation, and functioning and self-care; Reisberg et al., 1983b). For this study, a version with eight subscales was used because it was believed to offer improved reliability; the additional subscales were speech and language, motoric functioning, and mood and behavior (Reisberg et al., 1983a). In a clinical interview, every subscale has to be rated as representing stages 1 through 7 according to the judgment of a clinical psychiatrist. A mean score, calculated on the basis of all subscale scores, is used for staging.
Its reliability and validity have been extensively studied: Intercorrelations between subscales ranged from 0.58 to 0.79 (Yesavage et al., 1988) or higher (Reisberg and Ferris, 1988). External validity was provided by comparing the test scores with electroencephalographic (Ihl et al., 1989b; Dierks et al., 1991) and SPECT (Frölich et al., 1989; Ihl et al., 1989a) examinations. Interrater reliability was between 0.92 and 0.97 for psychiatrists, and between 0.76 and 0.93 for a group comprising research assistants (psychiatric nurse, clinical psychologist, and clinical psychology graduate student) (Foster et al., 1988). Test-retest reliability was between 0.82 and 0.86 for the subscales (Reisberg et al., 1989). In the definition of the stages of the BCRS given by Reisberg et al. (1985), impairment is very mild in stage 2 and mild in stage 3. To receive a score of 3, the patient’s impairment must not only be reported by the patient but must also be obvious to others. Thus, a mean score < 2 indicated very mild impairment. For a diagnosis of mild DAT to be given, the patient had to have scores of 3 on at least three subscales. A score of 3 was required, in addition, on the “Memory” and “Functional and Self-Care” subscales to satisfy the NINCDS/ADRDA criteria.

The Mini-Mental State Examination (MMSE) is one of the most frequently used screening tests to establish a profile of deficits in dementia and is not timed. It requires 5-10 minutes for administration. A trained technician can administer it to patients during brief periods of cooperation. MMSE scores range from 0 to 30 (30 = no deficits, < 23 = dementia and delirium; Cockrell and Folstein, 1988). Precise instructions for administration are provided (Spencer and Folstein, 1985). The test is extensively validated. Test-retest reliability over a 24-hour period is 0.89; interrater reliability is 0.82 (Cockrell and Folstein, 1988) and above (Foster et al., 1988). Test-retest reliability over 28 days is said to be 0.98 (Folstein et al., 1975). External validity was demonstrated by correlating the test results with computed tomography (Tsai and Tsuang, 1979), electroencephalography (Tune and Folstein, 1986; Celsis et al., 1990; Primavera et al., 1990; Lopez et al., 1991), positron emission tomography (Kessler et al., 1991), and the density of synapses in the third frontal cortical layer (DeKosky and Scheff, 1990). As a measure of construct validity, the MMSE was also correlated with other tests in this field of research (Yesavage et al., 1988).
Sensitivity to detect dementia or delirium was 0.87; specificity was 0.82 (Anthony et al., 1982). No data are available on the MMSE’s detection of disease stages that represent various levels of severity.

The SKT test (SKT; Erzigkeit, 1989) is a timed test to examine patients with organic brain syndrome. A detailed description of the test is given by Erzigkeit (1989). Nine subtests assess naming of objects, short- and long-term remembering of objects, ability to put numbers into ascending order, “interference,” and counting ability. It is scored from 0 to 27, with 27 as the lowest score. Five parallel forms (Forms A-E) are available. In contrast to the tests described above, age-specific norms are provided in the SKT manual (Erzigkeit, 1991). Reliability and validity data are also included in the manual and are also reported by Overall and Schaltenbrand (in press). The best validated cutoff values were determined from the norms given in the SKT manual. A score ≤ 4 points is normal, and scores of 5 to 9 points signify mild or questionable organic brain syndrome.

The 21 subscales of the Alzheimer’s Disease Assessment Scale (ADAS; Mohs et al., 1983; Rosen et al., 1984) cover the spectrum of symptoms of Alzheimer’s disease. The lowest score is 120. The items represent symptoms found in a group of patients with Alzheimer’s disease who were diagnosed by brain biopsy (Mohs and Davis, 1982). It consists of cognitive and noncognitive sections. Most cognitive items are evaluated by tests; noncognitive items are rated in a clinical interview. The scale was not constructed for differential diagnosis but divides demented persons from age-matched controls (Mohs and Cohen, 1988). In the course of the disease, a 7- to 8-point worsening is observed over a 1-year period of illness (Kramer-Ginsberg et al., 1988). Interrater reliability is between 0.87 and 0.98 (Mohs et al., 1983).
Test-retest reliability is 0.91 for the cognitive section, 0.58 for the noncognitive section, and 0.83 for the total score in patients with Alzheimer’s disease; the values for controls are 0.47 for the cognitive, 0.52 for the noncognitive, and 0.58 for the total score. Construct validation was done by correlating the ADAS scores with MMSE scores and BCRS scores (Burch and Andrews, 1987a, 1987b; Yesavage et al., 1988). External validation was provided through a correlation with optic nerve head and nerve fiber layer changes in Alzheimer’s disease (Tsai et al., 1991). For the ADAS, no cutoff score is given, but Mohs et al. (1983) reported a mean score of 1.7 (SD = 1.7) points for a geriatric control group on the cognitive section of the ADAS (ADAS-C). If the SD is multiplied by 2.5 and added to the mean, the significance interval will end at 5.5 and values above 6 would differ significantly from those for this geriatric control group.

Procedures. Initially, the patients were clinically diagnosed as suffering from “probable DAT.” In all patients, the clinical diagnosis was corroborated by deficits on the neuropsychological tests. This judgment had to be compared with performance on a second psychometric test battery. Therefore, patients had to be tested for a second time with the battery studied here (BCRS, SKT, MMSE, and ADAS). The tests were administered in a Latin square design to avoid sequence effects. Both the inclusion test battery on the first day and the experimental battery on the second day were administered over 60-90 minutes. The time between both testing sessions was 1 day. The psychometric assessment was performed by a clinical psychologist. Each test was used as a control for the three other tests in this group of DAT patients to permit conclusions to be drawn about the convergent validity of the tests. Although a group of geriatric controls cannot contribute much to this issue, we decided to include such a control group for descriptive purposes to document the range of normal values. It should be noted that this study was not designed to differentiate controls from patients but, rather, to investigate one aspect of the validity of the tests by comparing their outcome in a defined group of patients.

Statistics. To evaluate interrelations between the five tests (the cognitive and noncognitive sections of the ADAS were considered as separate tests), intercorrelations of sum scores were calculated for all tests using the Spearman rank-order correlation. The significance level was set at p < 0.05 after Bonferroni correction for multiple tests. In addition, a descriptive data analysis was performed to detect floor and ceiling effects. For each test, scores were segmented into six ranges. The number of categories was determined by the ranges of the SKT. For the other tests, the first range included the values up to the cutoff score (MMSE), the scores of normal controls (ADAS), or the previously defined categories of the BCRS. The other ranges were calculated by dividing the remainder of scores (number depending on each test) by 5. The following ranges resulted: for the SKT, 0-4, 5-9, 10-13, 14-18, 19-23, and 24-27; for the BCRS, 1.0-1.9, 2.0-2.9, 3.0-3.9, 4.0-4.9, 5.0-5.9, and 6.0-7.0; for the cognitive section of the ADAS, 0-5, 6-18, 19-31, 32-44, 45-57, and 58-70 (the noncognitive section was not used for this comparison due to its evident differences from the other tests); and for the MMSE, 30-24, 23-19, 18-15, 14-10, 9-5, and 0-4. The distributions of values of sum scores of the MMSE, the SKT, the BCRS, and the ADAS were calculated on the basis of the ranges defined by the other tests. Finally, the distributions of the various scores were compared using a two-tailed Wilcoxon test for dependent variables. The significance level was set at p < 0.05 after Bonferroni correction. Values of the geriatric controls were descriptively attached to the ranges.
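The segmentation into six ranges described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `range_segment` and the lookup tables are hypothetical, the boundaries are copied from the ranges listed in the Statistics paragraph, and the study's own scoring procedure is not documented beyond those ranges.

```python
from bisect import bisect_right

# Lower bounds of range segments 2-6, copied from the ranges listed in the text.
# For these three tests a higher score means more impairment.
ASCENDING_BOUNDS = {
    "SKT": [5, 10, 14, 19, 24],
    "BCRS": [2.0, 3.0, 4.0, 5.0, 6.0],
    "ADAS-C": [6, 19, 32, 45, 58],
}

# The MMSE runs the other way: 30-24 is segment 1 and 0-4 is segment 6.
MMSE_BOUNDS = [5, 10, 15, 19, 24]

# Scale limits implied by the listed ranges (assumption: no scores outside them).
SCALE_LIMITS = {
    "SKT": (0, 27),
    "BCRS": (1.0, 7.0),
    "ADAS-C": (0, 70),
    "MMSE": (0, 30),
}


def range_segment(test, score):
    """Return the range segment (1-6) of a sum score for the given test."""
    low, high = SCALE_LIMITS[test]
    if not low <= score <= high:
        raise ValueError(f"{test} score {score} is outside {low}-{high}")
    if test == "MMSE":
        # Count how many bounds the score reaches, then invert the order.
        return 6 - bisect_right(MMSE_BOUNDS, score)
    return 1 + bisect_right(ASCENDING_BOUNDS[test], score)


if __name__ == "__main__":
    # Hypothetical patient, not taken from the study data.
    patient = {"SKT": 16, "BCRS": 4.2, "ADAS-C": 35, "MMSE": 25}
    for test, score in patient.items():
        print(test, score, "-> segment", range_segment(test, score))
```

In this hypothetical case the SKT, BCRS, and ADAS-C scores all fall in segment 4 while the MMSE score falls in segment 1, which is the kind of disagreement between scales that the descriptive analysis is designed to expose.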

Results

Confirmatory Analysis. Correlations. The correlational part of the study revealed significant correlations with r_s > 0.65 (p < 0.05 after Bonferroni correction) and therefore common variance between all tests with the exception of the noncognitive section of the ADAS, which only correlated with the sum score of the ADAS (Table 2). The percentage of explained variance was between 43.6% and 88.4% for the relationship between the four cognitive tests. Only a few of the patients were scored on the noncognitive subtests of the ADAS, and none were scored on all subtests. Depending on the grade of severity indicated by the SKT, the BCRS, and the MMSE, the means and SDs of the noncognitive scores were nearly the same for all categories of the cognitive test scores.

Table 2. Spearman rank order correlations for the total scores of the tests in this study

Test       BCRS    SKT     MMSE    ADAS-C   ADAS-NC   ADAS
BCRS       1.00    0.66    -0.70   0.80     0.21      0.77
SKT                1.00    -0.80   0.82     0.01      0.71
MMSE                       1.00    -0.81    -0.03     -0.71
ADAS-C                             1.00     0.20      0.94
ADAS-NC                                     1.00      0.50
ADAS                                                  1.00

Note. BCRS = Brief Cognitive Rating Scale. SKT = SKT Test. MMSE = Mini-Mental State Examination. ADAS-C = Alzheimer’s Disease Assessment Scale-Cognitive. ADAS-NC = Alzheimer’s Disease Assessment Scale-Noncognitive. ADAS = Alzheimer’s Disease Assessment Scale-summed score.

Distribution of test scores. In the ADAS-C (cognitive section of the ADAS) and the BCRS, the group of patients showed a normal distribution of values with most patients being in medium-severe stages of the disease. In the SKT, more than half of the patients scored in the two highest range segments, indicating more impairment (Fig. 1). In the MMSE, more patients scored in the highest range segments, indicating less severe or no impairment. The differences between the distributions over the categories were significant for all comparisons (p < 0.002). When the BCRS values were used as a criterion, none of the patients scored between 1.0 and 1.9, and there were only four patients with a very mild level of impairment (mean = 2.0-2.9). None of the patients would have been classified as normal and only two as suffering from a mild organic brain syndrome on the basis of the SKT test. With respect to the MMSE, four patients would not have been classified as demented with a cutoff score of more than 23 points as proposed by Cockrell and Folstein (1988). All patients who could not be classified by the MMSE as demented had low scores on the other tests, but none of the other test scores were in range 1. The cognitive subscale of the ADAS clearly defined all patients as impaired, and even the lowest score of 12 points was far above the limit of 6 for normal controls.

Fig. 1. Distribution of the frequency of patients in test score range segments (number of patients by range segment for the MMSE, BCRS, ADAS-C, and SKT). Range segments in the order 1-6: for the SKT, 0-4, 5-9, 10-13, 14-18, 19-23, and 24-27; for the BCRS, 1.0-1.9, 2.0-2.9, 3.0-3.9, 4.0-4.9, 5.0-5.9, and 6.0-7.0; for the cognitive section of the ADAS (ADAS-C), 0-5, 6-18, 19-31, 32-44, 45-57, and 58-70; and for the MMSE, 30-24, 23-19, 18-15, 14-10, 9-5, and 0-4.

Table 3a. Test values of the ADAS-C (cognitive section), SKT, and MMSE depending on BCRS values

BCRS range          1          2          3          4          5          6
BCRS scores         1.0-1.9    2.0-2.9    3.0-3.9    4.0-4.9    5.0-5.9    6.0-7.0
n                   0          4          15         18         9          3

ADAS-C   Mean       -          20.8       32.8       44.2       54.2       63.0
         SD         -          8.9        10.5       8.1        7.9        3.6
         Minimum    -          15.0       12.0       35.0       40.0       59.0
         Maximum    -          30.0       50.0       53.0       63.0       66.0

SKT      Mean       -          9.5        17.8       19.8       23.7       27.0
         SD         -          3.7        3.6        4.0        5.3        0.0
         Minimum    -          6.0        10.0       13.0       11.0       27.0
         Maximum    -          14.0       24.0       26.0       27.0       27.0

MMSE     Mean       -          24.8       17.1       14.4       11.2       4.7
         SD         -          2.1        4.5        4.6        2.3        6.9
         Minimum    -          22.0       10.0       1.0        1.0        1.0
         Maximum    -          27.0       27.0       22.0       21.0       8.0

Note. BCRS = Brief Cognitive Rating Scale. SKT = SKT Test. MMSE = Mini-Mental State Examination. ADAS-C = Alzheimer’s Disease Assessment Scale-Cognitive.

Descriptive Analysis. The span of scores over the range segments was highest for the MMSE with respect to a single range segment. The highest span was 3.0 or 49% of the scale for the BCRS (in comparison with the SKT range segment 3; Table 3b), 42 or 60% for the ADAS-C (in comparison with the MMSE range segment 4; Table 3c), 17 or 63% for the SKT (in comparison with the BCRS range segment 5; Table 3a), and 22 or 73% for the MMSE (in comparison with BCRS range segment 4; Table 3a). Evaluating more than one range segment reveals an overlap of test scores, so that it is difficult to compare the scores of one test to the scores of another. For instance, the span common to range segments 3, 4, and 5 of the BCRS is 10 to 21 points for the MMSE or 11 to 24 points for the SKT, which makes it impossible to draw a conclusion about the stage represented by these range segments of the BCRS (Table 3a). For the ADAS-C, the comparable values are 40 to 50 (Table 3a). The lowest number of common scores in ranges 3 to 5 is seen when ADAS-C serves as a reference (Table 3d). Comparing other ranges results in smaller spans of the common scores (Table 3a-d). Nearly all control values were allocated to range segment 1. There were only two controls for the BCRS (4.1%) and five for the SKT (10.2%) in range segment 2.

However, the two controls in range segment 2 of the BCRS did not fulfill the criteria for patients outlined above, and there was also no overlap between scores of controls and patients with respect to the SKT results.

Table 3b. Test values of the ADAS-C (cognitive section), BCRS, and MMSE depending on SKT values

SKT range           1          2          3          4          5          6
SKT scores          0-4        5-9        10-13      14-18      19-23      24-27
n                   0          2          5          13         15         14

ADAS-C   Mean       -          23.0       28.2       34.4       42.1       55.9
         SD         -          9.9        13.8       9.3        0.4        7.2
         Minimum    -          16.0       12.0       19.0       25.0       44.0
         Maximum    -          30.0       42.0       46.0       53.0       66.0

BCRS     Mean       -          2.3        3.9        3.9        4.4        5.2
         SD         -          0.3        1.6        0.5        0.6        0.8
         Minimum    -          2.1        2.5        2.9        3.5        3.9
         Maximum    -          2.5        5.4        4.0        5.4        6.0

MMSE     Mean       -          25.0       19.8       18.2       13.9       9.7
         SD         -          0.0        4.9        4.8        4.6        3.4
         Minimum    -          25.0       15.0       12.0       1.0        1.0
         Maximum    -          25.0       27.0       27.0       21.0       13.0

Note. BCRS = Brief Cognitive Rating Scale. SKT = SKT Test. MMSE = Mini-Mental State Examination. ADAS-C = Alzheimer’s Disease Assessment Scale-Cognitive.

Table 3c. Test values of the ADAS-C (cognitive section), BCRS, and SKT depending on MMSE values

MMSE range          1          2          3          4          5          6
MMSE scores         30-24      23-19      18-15      14-10      9-5        0-4
n                   4          10         9          19         5          2

ADAS-C   Mean       20.0       31.8       37.6       47.6       60.2       53.0
         SD         7.8        10.0       4.5        8.9        2.4        18.4
         Minimum    12.0       15.0       30.0       22.0       58.0       40.0
         Maximum    30.0       44.0       43.0       63.0       64.0       66.0

SKT      Mean       9.3        15.8       17.8       22.1       27.0       23.0
         SD         3.6        2.8        3.5        3.6        0.0        5.7
         Minimum    6.0        11.0       11.0       15.0       27.0       19.0
         Maximum    14.0       20.0       22.0       27.0       27.0       27.0

BCRS     Mean       2.7        3.9        4.1        4.5        5.7        5.2
         SD         0.4        0.8        0.5        0.6        0.4        1.1
         Minimum    2.1        2.5        3.6        3.6        5.0        4.4
         Maximum    3.1        4.9        5.4        5.5        6.0        6.0

Note. BCRS = Brief Cognitive Rating Scale. SKT = SKT Test. MMSE = Mini-Mental State Examination. ADAS-C = Alzheimer’s Disease Assessment Scale-Cognitive.

Table 3d. Test values of BCRS, SKT, and MMSE depending on ADAS-C (cognitive part) values

ADAS-C range        1          2          3          4          5          6
ADAS-C scores       0-5        6-18       19-31      32-44      45-57      58-70
n                   0          3          7          20         11         8

BCRS     Mean       -          2.7        3.3        4.2        4.5        5.7
         SD         -          0.4        0.6        0.5        0.6        0.4
         Minimum    -          2.5        2.1        3.4        3.7        5.0
         Maximum    -          3.1        3.8        5.4        5.4        6.0

SKT      Mean       -          9.3        15.1       17.9       22.5       27.0
         SD         -          2.1        4.6        3.5        2.6        0.0
         Minimum    -          7.0        6.0        11.0       17.0       27.0
         Maximum    -          11.0       20.0       24.0       26.0       27.0

MMSE     Mean       -          24.7       21.0       15.5       12.3       7.9
         SD         -          2.5        4.6        4.4        1.3        3.5
         Minimum    -          22.0       14.0       1.0        10.0       1.0
         Maximum    -          27.0       27.0       22.0       14.0       12.0

Note. BCRS = Brief Cognitive Rating Scale. SKT = SKT Test. MMSE = Mini-Mental State Examination. ADAS-C = Alzheimer’s Disease Assessment Scale-Cognitive.
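Three of the span figures quoted in the Descriptive Analysis can be reproduced from the minimum and maximum values in Tables 3a and 3c. The sketch below rests on two assumptions that are not stated in the text: that the span counts both endpoints (maximum minus minimum plus 1) and that the percentage is taken relative to the maximum score of the scale. Both assumptions match the reported 73%, 63%, and 60%; the BCRS figure of 49% is left out because its denominator is not recoverable from the text. The function name `span_percent` is hypothetical.

```python
def span_percent(minimum, maximum, scale_max):
    """Return (span, percent of scale) for integer scores.

    Assumptions (not stated in the paper): the span counts both
    endpoints, and the percentage is relative to the scale maximum.
    """
    span = maximum - minimum + 1
    return span, round(100 * span / scale_max)


# (minimum, maximum, scale maximum) read from Tables 3a and 3c.
CASES = {
    "MMSE within BCRS range segment 4": (1, 22, 30),
    "SKT within BCRS range segment 5": (11, 27, 27),
    "ADAS-C within MMSE range segment 4": (22, 63, 70),
}

if __name__ == "__main__":
    for label, (lo, hi, scale_max) in CASES.items():
        span, pct = span_percent(lo, hi, scale_max)
        print(f"{label}: span {span} points, {pct}% of the scale")
```

Under these assumptions the three cases give spans of 22, 17, and 42 points, in line with the values reported for the MMSE, the SKT, and the ADAS-C.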

Discussion

The present study detected different indications for some psychometric tests that are frequently used to evaluate DAT. Moreover, both advantages and shortcomings of the individual tests were demonstrated. In the interpretation of our data, it has to be kept in mind that the basis for inclusion of patients was their performance on another independent test battery. Furthermore, the descriptive analysis describes the scores of one test by relating them to three others, so in this study the tests serve as controls for each other with regard to the ability to detect impairment that other tests detect (convergent validity). Other test features (e.g., coverage of all symptom domains, ease of administration, reliability, carryover effects, and sensitivity to longitudinal changes) are also important but were not studied. Moreover, we did not study many severely impaired patients (BCRS score > 6, n = 3; MMSE score < 5, n = 4; ADAS-C score > 58, n = 8). Thus, we can only draw tentative conclusions concerning very severely impaired patients. However, this study offers one way of revealing effects that cannot be found in other types of external validation studies.

Not every test is sensitive enough in the early detection of dementia, but sensitivity increases with severity. In our data, the MMSE reached a sensitivity of only 25% in stage 2 according to the BCRS. The sensitivity over all stages of 88% was comparable to that reported in earlier studies (Anthony et al., 1982; Cockrell and Folstein, 1988; Mungas, 1991). One could argue that a diagnosis of dementia of the Alzheimer type is usually valid in only 80% of cases (Boller et al., 1989); thus, the results of the MMSE could be congruent with the true diagnosis, but this would be speculative. It could not explain why the other three tests, which are more extensive than the MMSE, label the same patients as demented. Difficulties of the MMSE in detecting “early dementia” were also reported in other studies (Creasey et al., 1990; Fischer et al., 1991; Morris et al., 1991). The reason for this difficulty can be seen in problems with such test items as “Where are we?”, which can be answered more easily when the test is administered at home (O’Neill, 1991), and in correlations with educational level (Anthony et al., 1982; Cockrell and Folstein, 1988; Mungas, 1991) or with race (Fillenbaum et al., 1990). However, none of the scores of the age-matched controls overlapped with the scores of patients. Thus, the MMSE does not classify controls as patients, but our data suggest that it does classify some patients who are in early stages of the disease as controls. Raising the cutoff score could reduce this problem. Compared with the other tests, the MMSE has the highest span in a single range segment (73% compared with range segment 4 of the BCRS). Moreover, our data demonstrate that the MMSE (as well as the SKT) has a wide span of scores in moderate stages and is not able to differentiate between stages 3, 4, and 5 of the BCRS. On the whole, in contrast to the other tests in this study, the MMSE labeled patients as not demented when other tests described cognitive deficits, and its scores have a wide span in mildly to moderately severely demented patients, which indicates low sensitivity for documenting the course of the disease or improvement during therapy. It should be kept in mind, however, that the MMSE was not constructed to address these specific issues (Folstein et al., 1975).

The SKT confirmed cognitive impairment in all patients.
No patient was classified as free from impairment, even in early stages of the disease, but the SKT classified five controls as patients (10.2%) when range segments were used. This critical issue is discussed in the SKT manual, and this range of scores was labeled “mild or questionable organic brain syndrome.” However, we did not see an overlap of actual test scores between controls and patients. The experimental test battery was not used to differentiate between controls and patients. Patients whose diagnoses were unclear were not included in the study. The wide span over stages 3, 4, and 5 of the BCRS is worth noting, but the SKT manual does not claim to measure the whole span of severity ranges of the disease. In contrast, it concentrates on scores in the lower four ranges of this study. The finding of a floor effect with this test therefore underscored statements made in the manual. The highest SKT scores correspond to stage 5 or above according to Reisberg’s BCRS. In stage 6, all patients had the maximal SKT score. Thus, the SKT should only be applied in early and moderate stages of dementia. For patients with a BCRS mean of 5.0 or higher, an MMSE score of 14 or lower, and an ADAS-C score of 50 or higher, the SKT cannot be usefully administered. The span of scores in the lowest four range segments shows no remarkable widening. However, the SKT test was constructed to evaluate the course of early dementia, and our data support this indication.

There are only a few studies validating the ADAS. Normative values are available for a small group (Mohs et al., 1983). The scores of the whole test (noncognitive and cognitive sections) differentiated well between patients and controls. With a mean score of 24.5 on the cognitive section, the DAT patients were only mildly demented. The mean score for nondemented controls in the literature was 1.7 (SD = 1.7) and allowed the cutoff score to be calculated as two times the SD, as was done in this study. Our control group had comparable results, with a mean score of 2.4 (SD = 1.2). Signs of impairment were found in all patients. Thus, the sensitivity to detect the previously classified patients as demented was 100%. On the other hand, there was no overlap between patients and controls. The ADAS-C allocated all controls to range segment 1 and all patients to range segments 2 and above. No clear floor or ceiling effects were visible in this study, although floor effects were found in another study (Mohs and Cohen, 1988). The results reported here suggest the suitability of the ADAS for all stages of DAT that were examined. In the range segment that we studied, the ADAS-C could be useful for the documentation of the course of dementia. It must be kept in mind that the cognitive section of the ADAS is the basis for these conclusions about its possible utility. The noncognitive items were rarely rated and did not correlate with other tests, but changes in the scores of single items of the ADAS-NC over the course of the disease can be documented. Summary scores on the ADAS do not appear to be useful.

The BCRS classified all patients as impaired (sensitivity 100%). The BCRS correlated very well with the other tests in this study, although it is more a clinical rating scale than a test measure per se. Two (4.1%) of the controls were allocated to range segment (and stage) 2. However, neither subject reached the criterion score of the BCRS for dementia as described above. Therefore, the BCRS, like the ADAS, showed no overlap between scores of patients and scores of controls. Compared with findings for the three other tests, the span of scores common to range segments 3, 4, and 5 was small, and there were no floor or ceiling effects detectable.

Taken together, we found a ceiling effect in the MMSE and a floor effect in the SKT.
The SKT test proved to be an instrument for course observation in early stages of dementia. It is not useful in severe dementia. The ADAS cognitive test battery fulfilled all requirements with respect to sensitivity and specificity (as far as our data investigated these parameters) and the ability to document the course of dementia. It was useful over the whole course of the disease in the group of patients that we studied. These statements refer only to the cognitive section of the ADAS. The noncognitive section has shortcomings that render it useless in these areas. No limitations for the use of the BCRS were found. In conclusion, the four tests should be combined in diagnosis, course, and therapy evaluation. For the MMSE, many shortcomings are documented in the literature and again in this study. It is important to note that the MMSE was constructed as a short mental state questionnaire and that its limitations were apparent from the time of initial development. On the basis of our data, tests to be developed in the future should not attempt to cover the whole course of the disease. It could be more useful to concentrate on particular aspects of the disease (e.g., differential diagnosis, course documentation, or symptom description).

References

Anthony, J.C.; LeResche, L.; Niaz, U.; von Korff, M.R.; and Folstein, M.F. Limits of the “Mini-Mental State” as a screening test for dementia and delirium among hospital patients. Psychological Medicine, 12:397-408, 1982.

Boller, F.; Lopez, O.L.; and Moossy, J. Diagnosis of dementia: Clinicopathologic correlations. Neurology, 39:76-79, 1989.
Burch, E.A.J., and Andrews, S.R. Cognitive dysfunction in psychiatric consultation subgroups: Use of two screening tests. Southern Medical Journal, 80:1079-1082, 1987a.
Burch, E.A.J., and Andrews, S.R. Comparison of two cognitive rating scales in medically ill patients. International Journal of Psychiatric Medicine, 17:193-200, 1987b.
Celsis, P.; Agniel, A.; Puel, M.; Le Tinnier, A.; Viallard, G.; Demonet, J.F.; Rascol, A.; and Marc-Vergnes, J.P. Lateral asymmetries in primary degenerative dementia of the Alzheimer type: A correlative study of cognitive, haemodynamic and EEG data, in relation with severity, age of onset and sex. Cortex, 26:585-596, 1990.
Cockrell, J.R., and Folstein, M.F. Mini-Mental State Examination (MMSE). Psychopharmacology Bulletin, 24:689-692, 1988.
Creasey, G.L.; Myers, B.J.; Epperson, M.J.; and Taylor, J. Couples with an elderly parent with Alzheimer’s disease: Perceptions of familial relationships. Psychiatry, 53:44-51, 1990.
Crook, T.; Bartus, R.T.; Ferris, S.H.; Whitehouse, P.; Cohen, G.D.; and Gershon, S. Age associated memory impairment: Proposed diagnostic criteria and measures of clinical change-Report of a National Institute of Mental Health work group. Developmental Neuropsychology, 2:261-276, 1986.
DeKosky, S.T., and Scheff, S.W. Synapse loss in frontal cortex biopsies in Alzheimer’s disease: Correlation with cognitive severity. Annals of Neurology, 27:457-464, 1990.
Dierks, T.; Perisic, I.; Frölich, L.; Ihl, R.; and Maurer, K. Topography of the quantitative electroencephalogram in dementia of the Alzheimer type: Relation to severity of dementia. Psychiatry Research: Neuroimaging, 40:181-194, 1991.
Erzigkeit, H. The SKT-A short cognitive performance test as an instrument for the assessment of clinical efficacy of cognition enhancers. In: Bergener, M., and Reisberg, B., eds. Diagnosis and Treatment of Senile Dementia. Berlin, Heidelberg, New York: Springer-Verlag, 1989. pp. 164-174.
Erzigkeit, H. The development of the SKT project. In: Hindmarch, I.; Hippius, H.; and Wilcock, G.K., eds. Dementia: Molecules, Methods, and Measures. New York: John Wiley, 1991.
Fillenbaum, G.; Heyman, A.; Williams, K.; Prosnitz, B.; and Burchett, B. Sensitivity and specificity of standardized screens of cognitive impairment and dementia among elderly black and white community residents. Journal of Clinical Epidemiology, 43:651-660, 1990.
Fischer, P.; Jellinger, K.; Gatterer, G.; and Danielczyk, W. Prospective neuropathological validation of Hachinski’s Ischaemic Score in dementias. Journal of Neurology, Neurosurgery, and Psychiatry, 54:580-583, 1991.
Folstein, M.F.; Folstein, S.E.; and McHugh, P.R. “Mini-Mental State”: A practical method for grading the cognitive state of patients for the clinician. Journal of Psychiatric Research, 12:189-198, 1975.
Foster, J.R.; Sclan, S.; Welkowith, J.; Boksay, I.; and Seeland, I. Psychiatric assessment in medical long-term care facilities: Reliability of commonly used rating scales. International Journal of Geriatric Psychiatry, 3:229-233, 1988.
Frölich, L.; Eilles, C.; Ihl, R.; Maurer, K.; and Lanczik, M. Stage-dependent reductions of regional cerebral blood flow measured by HMPAO-SPECT in dementia of Alzheimer type. Psychiatry Research, 29:347-350, 1989.
Ihl, R.; Eilles, C.; Frölich, L.; Maurer, K.; Dierks, T.; and Perisic, I. Electrical brain activity and cerebral blood flow in dementia of the Alzheimer type. Psychiatry Research, 29:449-452, 1989a.
Ihl, R.; Maurer, K.; Dierks, T.; Frölich, L.; and Perisic, I. Staging in dementia of the Alzheimer type: Topography of electrical brain activity reflects the severity of the disease. Psychiatry Research, 29:399-401, 1989b.
Kessler, J.; Herholz, K.; Grond, M.; and Heiss, W.D. Impaired metabolic activation in Alzheimer’s disease: A PET study during continuous visual recognition. Neuropsychologia, 29:229-243, 1991.

Kramer-Ginsberg, E.; Mohs, R.C.; Aryan, M.; Lobel, D.; Silverman, J.; Davidson, M.; and Davis, K.L. Clinical predictors of course for Alzheimer patients in a longitudinal study: A preliminary report. Psychopharmacology Bulletin, 24:458-461, 1988.
Lopez, O.L.; Becker, J.T.; Brenner, R.P.; Rosen, J.; Bajulaiye, O.I.; and Reynolds, C.F. Alzheimer’s disease with delusions and hallucinations: Neuropsychological and electroencephalographic correlates. Neurology, 41:906-912, 1991.
Maurer, K.; Ihl, R.; and Dierks, T. [Topography of P300 in psychiatry-II. Cognitive P300 fields in dementia.] Zeitschrift für EEG und EMG, 19:26-29, 1988.
McCarthy, M.; Ferris, S.H.; Clark, E.; and Crook, T. Acquisition and retention of categorized material in normal aging and senile dementia. Experimental Aging Research, 7:127-135, 1981.
McKhann, G.; Drachman, D.; Folstein, M.; Katzman, R.; Price, D.; and Stadlan, E.M. Clinical diagnosis of Alzheimer’s disease: Report of the NINCDS-ADRDA Work Group under the auspices of Department of Health and Human Services Task Force on Alzheimer’s Disease. Neurology, 34:939-944, 1984.
Mohs, R.C.; Rosen, W.G.; and Davis, K.L. The Alzheimer’s Disease Assessment Scale: An instrument for assessing treatment efficacy. Psychopharmacology Bulletin, 19:448-450, 1983.
Mohs, R.C., and Cohen, L. Alzheimer’s Disease Assessment Scale. Psychopharmacology Bulletin, 24:627-628, 1988.
Mohs, R.C., and Davis, K.L. A signal detectability analysis of the effect of physostigmine on memory in patients with Alzheimer’s disease. Neurobiology of Aging, 3:105-110, 1982.
Morris, J.C.; McKeel, D.W.J.; Storandt, M.; Rubin, E.H.; Price, J.L.; Grant, E.A.; Ball, M.J.; and Berg, L. Very mild Alzheimer’s disease: Informant-based clinical, psychometric, and pathologic distinction from normal aging. Neurology, 41:469-478, 1991.
Mungas, D. In-office mental status testing: A practical guide. Geriatrics, 46:54-8, 63, 66, 1991.
O’Neill, D. The Mini-Mental Status Examination. Journal of the American Geriatric Society, 39:733, 1991.
Overall, J.E., and Schaltenbrand, R. The SKT neuropsychological test battery. Journal of Geriatric Psychiatry and Neurology, in press.
Primavera, A.; Novello, P.; Finocchi, C.; Canevari, E.; and Corsello, L. Correlation between mini-mental state examination and quantitative electroencephalography in senile dementia of Alzheimer type. Neuropsychobiology, 23:74-78, 1990.
Reisberg, B., and Ferris, S.H. Brief Cognitive Rating Scale (BCRS). Psychopharmacology Bulletin, 24:629-636, 1988.
Reisberg, B.; Ferris, S.H.; and de Leon, M.J. Senile dementia of the Alzheimer type: Diagnostic and differential diagnostic features with special reference to functional assessment staging. In: Traber, J., and Gispen, W.H., eds. Advances in Applied Neurological Sciences. Vol. 2. Berlin: Springer-Verlag, 1985. pp. 18-37.
Reisberg, B.; Ferris, S.H.; Steinberg, G.; Shulman, E.; de Leon, M.J.; and Sinaiko, E. Longitudinal study of dementia patients and aged controls. In: Lawton, M.P., and Herzog, A.R., eds. Special Research Methods for Gerontology. Amityville, NY: Baywood, 1989. pp. 195-231.
Reisberg, B.; London, E.; Ferris, S.H.; Borenstein, J.; Scheier, L.; and de Leon, M.J. The Brief Cognitive Rating Scale: Language, motoric, and mood concomitants in primary degenerative dementia. Psychopharmacology Bulletin, 19:702-708, 1983a.
Reisberg, B.; Schneck, M.K.; Ferris, S.H.; Schwartz, G.E.; and de Leon, M.J. The Brief Cognitive Rating Scale (BCRS): Findings in primary degenerative dementia (PDD). Psychopharmacology Bulletin, 19:47-50, 1983b.
Rosen, W.G.; Terry, R.D.; Fuld, P.A.; Katzman, R.; and Peck, A. Pathological verification of ischemic score in differentiation of dementias. Annals of Neurology, 7:486-488, 1980.
Rosen, W.G.; Mohs, R.C.; and Davis, K.L. A new rating scale for Alzheimer’s disease. American Journal of Psychiatry, 141:1356-1364, 1984.

Shader, R.L.; Harmatz, J.S.; and Salzman, C. A new scale for clinical assessment in geriatric populations: Sandoz Clinical Assessment Geriatric (SCAG). Journal of the American Geriatric Society, 27:80-82, 1979.
Siu, A.L. Screening for dementia and investigating its causes. Annals of Internal Medicine, 115:122-132, 1991.
Spencer, M.P., and Folstein, M.F. The Mini-Mental State Examination. In: Keller, P.A., and Ritt, L.C., eds. Innovations in Clinical Practice: A Source Book. Sarasota, FL: Professional Resource Exchange, 1985. pp. 305-310.
Tsai, C.S.; Ritch, R.; Schwartz, B.; Lee, S.S.; Miller, N.R.; Chi, T.; and Hsieh, F.Y. Optic nerve head and nerve fiber layer in Alzheimer’s disease. Archives of Ophthalmology, 109:199-204, 1991.
Tsai, L., and Tsuang, M.T. The Mini-Mental State test and computerized tomography. American Journal of Psychiatry, 136:436-439, 1979.
Tune, L., and Folstein, M.F. Post-operative delirium. Advances in Psychosomatic Medicine, 15:51-68, 1986.
Wechsler, D. Die Messung der Intelligenz Erwachsener. Stuttgart: Huber, 1964.
Yesavage, J.A.; Poulsen, S.L.; Sheikh, J.; and Tanke, E. Rates of change of common measures of impairment in senile dementia of the Alzheimer’s type. Psychopharmacology Bulletin, 24:531-534, 1988.
