Psychological Medicine, 1991, 21, 785-790 Printed in Great Britain

Performance of the Informant Questionnaire on Cognitive Decline in the Elderly (IQCODE) as a screening test for dementia

A. F. JORM,1 R. SCOTT, J. S. CULLEN AND A. J. MACKINNON

From the NH & MRC Social Psychiatry Research Unit, The Australian National University, Canberra, Australia

SYNOPSIS

A 26-item informant questionnaire (IQCODE) and the Mini-Mental State Examination (MMSE) were compared as screening tests for dementia in a sample of 69 patients. Dementia diagnoses were made by both a clinician and a research interview using a computer algorithm to meet DSM-III-R and ICD-10 (Draft) criteria. The IQCODE was found to perform at least as well as the MMSE against all diagnoses and significantly better when judged against the algorithmic ICD-10 diagnoses. Also, the IQCODE was found to be uncontaminated by pre-morbid ability as estimated from the National Adult Reading Test and to have very high test-retest reliability after a delay of a day or more.

1 Address for correspondence: Dr A. F. Jorm, NH & MRC Social Psychiatry Research Unit, The Australian National University, GPO Box 4, Canberra, ACT 2601, Australia.

INTRODUCTION

The IQCODE is a 26-item questionnaire which asks an informant to rate degree of change over a ten-year period in various aspects of an elderly person's memory and intelligence. The IQCODE differs in two respects from other brief instruments for the assessment of cognitive function. Firstly, it relies on the reports of informants rather than on cognitive testing of the subject, making it a highly acceptable way to assess cognitive function. Secondly, it obtains historical material, allowing an assessment of cognitive decline rather than just current cognitive impairment. Because the questionnaire discounts the influence of pre-morbid ability, it is suitable for use with subjects who are educationally disadvantaged.

Jorm & Jacomb (1989) have reported data supporting the reliability and validity of the IQCODE. It was found to have high internal consistency in a general population sample (alpha = 0.95) and reasonably high test-retest reliability over one year in a dementing sample (r = 0.75). A principal components analysis showed that it measured a general factor of cognitive decline, and both the total score and each of the 26 items were found to discriminate between general population and dementing samples. The correlation with elderly subjects' education was quite small, indicating that scores were not contaminated by differences in pre-morbid ability.

This paper reports further data on the validity and reliability of the IQCODE and compares it with the Mini-Mental State Examination (MMSE) (Folstein et al. 1975), which is one of the most widely used screening tests for dementia. Validity was assessed using diagnostic judgements as the criteria. Diagnoses were made in two ways: (a) by a clinician using the DSM-III-R (American Psychiatric Association, 1987) and draft ICD-10 (World Health Organization, 1990) criteria for dementia, and (b) by a standardized interview, the Canberra Interview for the Elderly (CIE), which is administered by lay interviewers and scored by a computer algorithm to also yield diagnoses according to the DSM-III-R and ICD-10 systems.

The performance of the IQCODE and MMSE as screening tests for dementia was compared using receiver-operating characteristic analysis (Murphy et al. 1987). This form of analysis involves the calculation of sensitivity and specificity for all possible cut-points and allows a comparison of test performance over the full range of scores.

To assess contamination by pre-morbid ability, the National Adult Reading Test (NART) (Nelson, 1982) was used. The NART is generally considered to be the best available indicator of pre-morbid ability in dementing subjects. If the IQCODE is uncontaminated by pre-morbid ability it should have only a small correlation with the NART. The IQCODE was expected to differ in this respect from the MMSE, which measures a subject's current functioning without regard to pre-morbid ability. Finally, test-retest reliability was assessed, using a much shorter time lag than in the previous study.

METHOD

Subjects

The subjects were recruited as part of a study of the reliability and validity of the CIE, a standardized interview administered by lay interviewers which can be analysed by computer algorithm to yield diagnoses of dementia and depression. The original intention in the study was to assess a consecutive series of patients seen by a geriatrician as in-patients, day-hospital attenders or out-patient clinic attenders, excluding those who were too young, had no informant, did not speak English or had major health problems or other reasons precluding participation. A total of 72 suitable patients fulfilling these criteria were identified, and 64 of these agreed to participate. This achieved sample contained an adequate number of patients with cognitive impairment, but too few with depressive disorders. To have a better representation of depressed cases, an additional 12 patients were recruited from other sources, mainly psychogeriatricians. These latter patients were not a consecutive series.

Of the total of 76 subjects recruited from all sources, 4 had missing IQCODE data and 3 had missing MMSE data. The 69 remaining subjects with complete data are the focus of the present paper. Of these subjects, 44 were females and 25 males. The mean age was 80 years, with a range from 63 to 96. Of their informants, 51 were females and 18 males. There were 21 spouses, 36 children or children-in-law, 5 other relatives and 7 non-relatives.

Procedure

Subjects were diagnosed by their treating clinician according to the draft ICD-10 criteria for dementia and the DSM-III-R criteria for dementia, delirium and amnestic syndrome. Diagnoses of depressive disorders were also made, but are not of relevance here. The clinicians were provided with a checklist for each set of criteria and were asked, for each subject in the study, to indicate whether each criterion for a diagnosis was met. This procedure was designed to ensure adherence to the published criteria.

Subsequently, subjects and their informants were interviewed by lay interviewers trained in the administration of the CIE. The subjects were interviewed mainly in their own homes (30 subjects) or in hospital (33 subjects), with most of the remainder being interviewed in nursing homes or hostels. The CIE involves cognitive testing of the subject as well as reports from an informant. The cognitive testing incorporates the items of the MMSE, allowing an MMSE score to be calculated. The CIE also incorporates the NART. The IQCODE is not part of the CIE and was given to each informant at the end of their interview.

To assess test-retest reliability, the CIE (and the IQCODE) were readministered to as many subjects and informants as possible, using a different lay interviewer from the first test. The delay between initial test and retest averaged 2.8 days for the subjects (range 1-14) and 3.2 days for the informants (range 1-15). Test-retest data for both the MMSE and IQCODE were obtained for 57 subjects and informants.

RESULTS

Scoring of instruments

The IQCODE was scored by averaging ratings from the 26 items. Up to 25% of items were allowed to be omitted before a score was regarded as missing. The MMSE was scored out of 30. Subjects were given both the 'serial sevens' and 'WORLD backwards' items and credited with the larger score. Items refused were counted as errors, but items missed for other reasons (e.g. sensory or motor impairment) were handled by pro-rating the scores, provided no more than 25% of the test points were missing.

FIG. 1. Receiver-operating characteristics (ROCs) for the IQCODE and MMSE as screening tests for dementia: DSM-III-R diagnosed by (a) clinician, (b) algorithm; ICD-10 diagnosed by (c) clinician, (d) algorithm. Horizontal axis: specificity. Areas under the ROC (S.E.) shown in the panels: (a) IQCODE 0.77 (0.07), MMSE 0.75 (0.07); (b) IQCODE 0.85 (0.05), MMSE 0.74 (0.07); (c) IQCODE 0.87 (0.05), MMSE 0.80 (0.06); (d) IQCODE 0.87 (0.05), MMSE 0.66 (0.10).

In this sample, the mean score for the IQCODE was 3.62 (S.D. = 0.58) and for the MMSE 23.82 (S.D. = 4.61). Reversing the MMSE score so that, like the IQCODE, a higher score represents greater impairment, the correlation between the two tests was 0.54. The CIE was analysed by computer algorithm to give diagnoses of ICD-10 dementia and DSM-III-R dementia and delirium. Although many MMSE items were used in the diagnostic algorithm, the MMSE total score played no role.
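The scoring rules under 'Scoring of instruments' can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names, the use of `None` for omitted or untestable items, and the linear pro-rating formula (rescaling the points obtained to the full test total) are hypothetical, since the paper does not give the exact formula used.

```python
from typing import Optional, Sequence, Tuple


def iqcode_score(ratings: Sequence[Optional[float]],
                 max_missing_fraction: float = 0.25) -> Optional[float]:
    """Mean of the answered items, or None if too many items were omitted."""
    answered = [r for r in ratings if r is not None]
    n_missing = len(ratings) - len(answered)
    if not ratings or n_missing > max_missing_fraction * len(ratings):
        return None  # score regarded as missing
    return sum(answered) / len(answered)


def mmse_score(items: Sequence[Tuple[Optional[int], int]],
               serial_sevens: int,
               world_backwards: int,
               attention_max: int = 5,
               max_missing_fraction: float = 0.25) -> Optional[float]:
    """Pro-rated MMSE total.

    `items` holds (points_obtained, points_possible) for every item other
    than the attention item. Refused items should be passed as 0 points;
    items that could not be given (e.g. sensory impairment) as None.
    """
    # Credit the larger of the two attention scores, as in the paper.
    obtained = max(serial_sevens, world_backwards)
    possible = attention_max
    missing = 0
    for points, maximum in items:
        if points is None:
            missing += maximum
        else:
            obtained += points
            possible += maximum
    total_points = possible + missing
    if missing > max_missing_fraction * total_points:
        return None  # too many test points missing to pro-rate
    # Assumed pro-rating: rescale to the full test total (30 for the MMSE).
    return obtained * total_points / possible
```

With 26 IQCODE items, the 25% rule in this sketch tolerates up to six omitted items; a seventh makes the score missing.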

Receiver-Operating Characteristic (ROC) analysis

The IQCODE and MMSE were evaluated as screening tests using the clinical and algorithmic diagnoses of dementia as the criterion. Clinical diagnosis yielded 24 cases of DSM-III-R dementia and 22 cases of ICD-10 dementia, while algorithmic diagnosis yielded 15 cases of DSM-III-R dementia and 8 cases of ICD-10 dementia. It is evident from these numbers that the algorithmic diagnoses were stricter than the clinical diagnoses. However, not all cases identified by the algorithm were also identified by the clinicians. The two groups were, in part, made up of different individuals.

Sensitivity and specificity were calculated using all possible cut-points and plotted as shown in Fig. 1 to yield ROCs (Hanley & McNeil, 1982). The area under the ROC can be used as an index of the overall performance of a screening test. An area of 0.5 represents chance performance, while 1.0 represents perfect performance. The areas and their standard errors were calculated using the Wilcoxon method described by Hanley & McNeil (1982) and are given in Fig. 1. Overall, the IQCODE performed better than the MMSE for all diagnoses. However, when the differences were tested for statistical significance using the method of Hanley & McNeil (1983) the IQCODE was reliably better only with the algorithmic diagnosis of ICD-10 dementia (Z = 2.46, P = 0.01, 2-tailed).

Because the IQCODE and MMSE may have been sensitive to cases of delirium as well as dementia, the ROCs were recomputed, adding in cases of DSM-III-R delirium to the dementia diagnoses (ICD-10 delirium was not assessed because of practical difficulties in implementing the criteria). There were five cases of delirium according to the clinicians and two cases according to the algorithm. When these delirium cases were added, areas under the ROC increased in all but one instance, but the pattern of results was essentially the same.

It is possible that the IQCODE and MMSE could be effectively harnessed together to make a better screening test. One method of evaluating this possibility is to carry out a discriminant analysis which would calculate the optimal weights for combining the two tests. However, this approach was not used here because optimal weights are unlikely to be useful in clinical practice due to the computation involved. Optimal weights are also unstable because they capitalize on chance to optimize the discrimination between groups.
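The ROC computations described earlier in this section can be sketched as follows. The sketch computes sensitivity and specificity at every cut-point, the area A as the Wilcoxon statistic, and a standard error of the form given by Hanley & McNeil (1982), SE = sqrt([A(1 - A) + (n_A - 1)(Q1 - A^2) + (n_N - 1)(Q2 - A^2)] / (n_A n_N)). The function names are hypothetical, and estimating Q1 and Q2 directly from the data, with tied scores treated as evenly spread within a score value, is an assumption about how the original computation was carried out. Scores must be oriented so that higher means more impaired, so MMSE totals would need to be reversed first.

```python
import math
from collections import Counter


def sensitivity_specificity(case_scores, noncase_scores, cutoff):
    """Treat 'score >= cutoff' as a positive screen."""
    sens = sum(s >= cutoff for s in case_scores) / len(case_scores)
    spec = sum(s < cutoff for s in noncase_scores) / len(noncase_scores)
    return sens, spec


def roc_points(case_scores, noncase_scores):
    """(cutoff, sensitivity, specificity) for every observed cut-point."""
    cuts = sorted(set(case_scores) | set(noncase_scores))
    return [(c, *sensitivity_specificity(case_scores, noncase_scores, c))
            for c in cuts]


def auc_wilcoxon(case_scores, noncase_scores):
    """Area under the ROC (Wilcoxon statistic) and its standard error."""
    n_a, n_n = len(case_scores), len(noncase_scores)
    cases, noncases = Counter(case_scores), Counter(noncase_scores)
    wins = q1_sum = q2_sum = 0.0
    for x in set(cases) | set(noncases):
        a_x, n_x = cases[x], noncases[x]          # cases / non-cases scoring x
        a_above = sum(k for v, k in cases.items() if v > x)
        n_below = sum(k for v, k in noncases.items() if v < x)
        wins += n_x * (a_above + 0.5 * a_x)       # ties count one half
        q1_sum += n_x * (a_above ** 2 + a_above * a_x + a_x ** 2 / 3)
        q2_sum += a_x * (n_below ** 2 + n_below * n_x + n_x ** 2 / 3)
    area = wins / (n_a * n_n)
    q1 = q1_sum / (n_n * n_a ** 2)   # two random cases both outrank a non-case
    q2 = q2_sum / (n_a * n_n ** 2)   # one random case outranks two non-cases
    variance = (area * (1 - area)
                + (n_a - 1) * (q1 - area ** 2)
                + (n_n - 1) * (q2 - area ** 2)) / (n_a * n_n)
    return area, math.sqrt(max(variance, 0.0))
```

Calling `sensitivity_specificity(cases, noncases, 3.6)` on lists of IQCODE scores gives the pair of values that a fixed cut-off such as 3.60+ would yield; the significance test for the difference between two correlated areas (Hanley & McNeil, 1983) is not covered by this sketch.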
Instead of optimal weights, IQCODE and MMSE scores were each divided into ten equally spaced bands and added to give a score out of 20. The combined scores were then used to derive ROCs. However, using clinical diagnoses as the criterion, the areas under the ROCs were found to be only marginally better than for the IQCODE alone, while they were marginally worse with algorithmic diagnoses as the criterion.

The IQCODE is probably better used as a continuous measure of cognitive decline, but some users may want to dichotomize it at an optimal cut-point for detecting dementia and delirium. The optimal score for maximizing sensitivity and specificity varies, depending on the criterion used. However, taking a score of 3.60+ as indicating a case seems to be satisfactory. When judged against the clinicians' diagnoses, this cut-off yielded a sensitivity of 80% and a specificity of 82% with the ICD-10 diagnoses, and a sensitivity of 69% and a specificity of 80% with the DSM-III-R diagnoses. For the MMSE, using the conventional 23/24 cut-off, the corresponding sensitivity and specificity were 64% and 75% for ICD-10 and 76% and 73% for DSM-III-R. Using the algorithmic diagnoses as the criterion, the IQCODE had a sensitivity of 80% and a specificity of 66% for ICD-10 diagnoses and a sensitivity of 82% and specificity of 73% for DSM-III-R diagnoses. For the MMSE, the corresponding sensitivities and specificities were 80% and 68% for ICD-10 and 76% and 73% for DSM-III-R. The cut-off of 3.60+ for the IQCODE is a reasonable choice if false positives and false negatives are equally undesirable. However, a higher score might be chosen if it is desirable to maximize specificity, and a lower score if it is desirable to maximize sensitivity.

Influence of informant's age

It is possible that elderly informants provide less valid information than younger ones. Unfortunately, the sample size was too small to produce ROCs separately for older and younger informants. However, it was possible to correlate the IQCODE with the MMSE separately for informants aged 65+ compared to those who were younger. The correlation for the 36 older informants was 0.64 compared to 0.40 for the 33 younger informants. Thus, there is no evidence in these data that older informants give less valid ratings.

Relationship to pre-morbid ability

The cognitive testing in the CIE also included the NART, which it was possible to administer to 42 subjects. Data were missing for other subjects because of visual impairment or refusal. For these 42 subjects the correlation between the NART and IQCODE was -0.10; between the NART and the MMSE it was 0.43.

Test-retest reliability
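The retest statistics reported in this section are a correlation between the two occasions and a test of whether the mean score shifted. A minimal sketch of both follows. It assumes Pearson's r and a paired t statistic; the paper does not name the significance test it used, and the function names are hypothetical.

```python
import math


def pearson_r(first, second):
    """Pearson correlation between scores on the first and second occasion."""
    n = len(first)
    mean_x, mean_y = sum(first) / n, sum(second) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(first, second))
    sxx = sum((x - mean_x) ** 2 for x in first)
    syy = sum((y - mean_y) ** 2 for y in second)
    return sxy / math.sqrt(sxx * syy)


def paired_t(first, second):
    """Paired t statistic for the mean change (second minus first)."""
    diffs = [y - x for x, y in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    return mean_diff / math.sqrt(var_diff / n)
```

The t statistic would be referred to a t distribution with n - 1 degrees of freedom. A high retest correlation alongside a significant mean change is possible, because the correlation ignores a uniform shift in scores.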

Data from the 57 subjects who had IQCODE and MMSE scores for both occasions showed a retest correlation of 0.96 for the IQCODE and 0.79 for the MMSE. However, the mean score for both tests changed towards better cognitive function on retest. The IQCODE mean went from 3.62 (S.D. = 0.58) to 3.56 (S.D. = 0.59), while the MMSE mean went from 23.82 (S.D. = 4.61) to 24.88 (S.D. = 5.00). Both changes were significant at the P < 0.05 level.

Remission of delirium might have been responsible for this improvement. To see whether this was the case, the 57 subjects for whom retest data were available were split into two groups: a group of four subjects diagnosed as delirious by a clinician and a group comprising the remaining 53 subjects. The delirious subjects showed significantly greater change than the non-delirious for the IQCODE (P < 0.01), but not for the MMSE. The mean change score on the IQCODE was 0.29 for the delirious subjects, compared to 0.04 for the others.

DISCUSSION

The present study examined the validity of the IQCODE as a screening test for dementia when judged against diagnoses made in accordance with the DSM-III-R and draft ICD-10 criteria. The diagnoses were made by a clinician, as well as by a computer algorithm using data from the CIE. For both methods of diagnosis and both sets of diagnostic criteria, the IQCODE was found to perform well as a screening test. When compared with the MMSE, a widely used screening test for dementia, the IQCODE performed at least as well and, with ICD-10 dementia diagnosed by algorithm, it was significantly better.

A possible reason for the greater agreement between the IQCODE and the ICD-10 diagnoses is that both involve the assessment of decline. The IQCODE assesses change in aspects of memory and intelligence over a ten-year period, while the ICD-10 criteria for dementia require the presence of a decline in both memory and intellectual abilities. The ICD-10 criteria differ in this respect from the DSM-III-R criteria for dementia, which require only the presence of memory impairment rather than memory decline.

The computer algorithm for ICD-10 dementia relies, in part, on informant reports to assess decline. Although the IQCODE itself was not used in making the diagnoses, the CIE asks a number of questions of informants which are very similar to some found in the IQCODE. This common assessment methodology might have contributed to the greater agreement between the IQCODE and the ICD-10 diagnoses made by algorithm. On the other hand, some of the MMSE items contributed directly to the ICD-10 and DSM-III-R diagnostic algorithms, leading to a bias in favour of agreement with the MMSE.

A major reason for developing the IQCODE was to overcome the problem of contamination with pre-morbid ability which has plagued screening tests based on cognitive testing. In the present study, pre-morbid ability was estimated using the NART. The IQCODE was found to have little correlation with this estimator of pre-morbid ability and differed in this respect from the MMSE. The present results therefore confirm earlier findings that contamination by pre-morbid ability is not a problem for the IQCODE (Jorm & Jacomb, 1989).

The IQCODE was also found to have excellent test-retest reliability after a delay of a day or more. Again, the reliability was higher than for the MMSE. However, a contributing factor to this high reliability may have been informants' memory for their responses from one occasion to the next. Nevertheless, memory for responses is unlikely to be wholly responsible because reasonably high test-retest reliability has been previously reported over a period of a year (Jorm & Jacomb, 1989).

It is notable that both the IQCODE and the MMSE indicated improved average performance on retest. For the MMSE, a practice effect may have been involved, but this is not a possibility with the IQCODE. For the IQCODE, the improvement was due to the small number of subjects diagnosed as delirious and is therefore likely to reflect a true change in cognitive function as their delirium improved.

Overall, the IQCODE performs very well as a screening test for dementia and may have advantages over traditional screening tests such as the MMSE. It appears to overcome the problem of educational bias and contamination with pre-morbid ability, but further work is needed on its performance with groups having very different educational and cultural backgrounds. The main disadvantage of the IQCODE is that an informant who has known the subject for 10 years may not always be available. In such cases there is no alternative to screening tests based on testing of the subject.

The authors wish to thank the following people for their help with the study: Professor S. Henderson, Ms A. Korten and Drs H. Christensen and C. Phillips for help with the design of the study and comments on the manuscript; Professors G. A. Broe, J. Snowdon, S. Williams and Dr A. Duncan for assistance with recruiting subjects; Ms J. Groth for supervision of the interviewing; Mrs M. Smith, J. Crabb, N. Little and M. Shepherd for carrying out the interviews; Ms P. Jacomb for help with graphics; Ms P. Evans for typing the manuscript; and the subjects and informants for their generous cooperation.

REFERENCES

American Psychiatric Association (1987). Diagnostic and Statistical Manual of Mental Disorders (3rd ed. revised). American Psychiatric Association: Washington DC.

Folstein, M. F., Folstein, S. E. & McHugh, P. R. (1975). 'Mini-Mental State': a practical method for grading the cognitive state of patients for the clinician. Journal of Psychiatric Research 12, 189-198.

Hanley, J. A. & McNeil, B. J. (1982). The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143, 29-36.

Hanley, J. A. & McNeil, B. J. (1983). A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology 148, 839-843.

Jorm, A. F. & Jacomb, P. A. (1989). The Informant Questionnaire on Cognitive Decline in the Elderly (IQCODE): socio-demographic correlates, reliability, validity and some norms. Psychological Medicine 19, 1015-1022.

Murphy, J. M., Berwick, D. M., Weinstein, M. C., Borus, J. F., Budman, S. H. & Klerman, G. L. (1987). Performance of screening and diagnostic tests. Application of receiver operating characteristic analysis. Archives of General Psychiatry 44, 550-555.

Nelson, H. E. (1982). National Adult Reading Test (NART). NFER-Nelson: Windsor.

World Health Organization (1990). ICD-10 Draft of Chapter V: Categories F00-F99, Mental and Behavioural Disorders (including Disorders of Psychological Development). Diagnostic Criteria for Research. (WHO/MNH/MEP/87.1, Rev 4). World Health Organization, Division of Mental Health: Geneva.
