Journal of the Royal Society of Medicine Volume 85 July 1992


Computerized assessments of psychiatric disorder using PROQSY: discussion paper

G Lewis PhD MRCPsych, Section of Epidemiology and General Practice, Institute of Psychiatry, De Crespigny Park, London SE5 8AF

Keywords: computerized assessments; psychiatric disorder; validity

Introduction

Computers and other aspects of information technology have had a profound influence on many aspects of our lives. However, the clinical examination of patients has so far resisted these influences and has remained almost unchanged for several generations of doctors. According to many enthusiasts for information technology, this will all soon change. Lieff¹ writes 'when computers become common office and waiting room equipment (just as telephones are today), a lengthy structured computer interview could be accomplished in the waiting room before and after clinical visits with no time lost to the clinician'. Such predictions seem apposite when one examines the results of De Dombal² on the computerized diagnosis of abdominal pain. He and his colleagues found that the computerized expert system was more accurate in diagnosing abdominal pain than both the junior and the more senior surgeons. Such results, published 20 years ago, seem to herald a brave new world of computerized medicine.

Paper read to Section of Psychiatry, 8 January 1991

Computerized assessments for psychiatric disorder

One might suppose that psychiatrists, whose training puts emphasis on interviewing technique and empathy, would be amongst the last to develop an interest in computerized assessments. In fact there has been a considerable amount of work in this area and more recently this has resulted in the development of self-administered computerized assessments for psychiatric disorder³. These developments started in the 1960s with the work of Slack. Since that time many workers have shown that computerized assessments are easy to use, with subjects requiring little supervision, and are acceptable to a wide variety of patients including psychiatric inpatients⁴⁻⁶.

Over the past few years the General Practice Research Unit at the Institute of Psychiatry has developed a self-administered computerized assessment for minor psychiatric disorder. This system will be described as an example of the sort of technology that is currently available⁷. The program has been christened PROQSY (Programmable Questionnaire System). There are two parts to the system: first, a program, the interview driver; second, a questionnaire file containing the questions and the branching and scoring commands, which is prepared separately using a word-processor. A specialized knowledge of computing is not required to prepare the questionnaire file. This has now become a fairly standard arrangement for such devices. The questionnaire is prepared in a series of frames and an example is given in Figure 1. The subject sees the question inside the inverted commas and the answers preceded by the numbers. The command lines at the bottom of the frame are instructions for the computer program that determine branching and scoring. The subject presses the number keys of a standard keyboard. It does not appear to be necessary to provide a specialized layout of keys, and in PROQSY it was decided to limit the responses of subjects to individual key strokes in order to ensure ease of use.

DEPRESSION-WEEPY
'Have you felt like crying in the PAST WEEK?'
1 'No'
2 'Yes, but I don't actually cry'
3 'Yes, and I cry sometimes'
if answer >= 2 then { DEPR:=2; goto DEPRESSIONFREQ; }

Figure 1. An example of a frame for PROQSY

There is an important conclusion to be drawn from these technical matters. At present computerized assessments are elegant multiple choice questionnaires. The term 'computerized interviews', though occasionally heard, is quite misleading. Interviewers do a lot more than passively collect information, though some social survey interviewers might act in this way in large scale research studies. Certainly, in standardized interview assessments for psychiatric disorder the interviewer is usually, though not always, expected to be a clinical psychiatrist.

There is a further important limitation to the use of self-administered computerized assessments for psychiatric disorder. It is difficult to conceive of a self-administered questionnaire for schizophrenia, though it is more appropriate to use this method for assessing neurotic disorders. The General Health Questionnaire⁸, for example, has an established place in measuring neurotic or minor psychiatric disorders in general practice and community settings. Needless to say, one of the crucial questions about such computerized questionnaires concerns the comparability between them and the usual means of assessing psychiatric disorder, standardized interviews. Before discussing some of the results comparing computerized assessments with standardized interviews it is therefore helpful to think about the differences between these two methods of measuring psychiatric disorder.
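The frame in Figure 1 suggests how an interview driver might process a questionnaire file. The sketch below is a minimal illustration in Python and is not the PROQSY program itself. The names Rule, Frame and run_interview, the wording of the DEPRESSIONFREQ frame and the DEPR_FREQ score are all assumptions made for the example; only the DEPRESSION-WEEPY frame is taken from Figure 1.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional


@dataclass
class Rule:
    """One command line of a frame: a condition, score assignments and a jump."""
    condition: Callable[[int], bool]
    assign: Dict[str, int] = field(default_factory=dict)
    goto: Optional[str] = None


@dataclass
class Frame:
    """A question, its numbered answers and the rules applied to the reply."""
    question: str
    answers: Dict[int, str]
    rules: List[Rule] = field(default_factory=list)
    default_next: Optional[str] = None  # frame to visit when no rule jumps


def run_interview(frames: Dict[str, Frame], start: str,
                  keystrokes: Iterable[int]) -> Dict[str, int]:
    """Walk the frames, consuming one valid numeric keystroke per question."""
    keys = iter(keystrokes)
    scores: Dict[str, int] = {}
    current: Optional[str] = start
    while current is not None:
        frame = frames[current]
        answer = next(keys)
        while answer not in frame.answers:  # ignore keys that are not offered
            answer = next(keys)
        next_frame = frame.default_next
        for rule in frame.rules:
            if rule.condition(answer):
                scores.update(rule.assign)       # scoring
                if rule.goto is not None:
                    next_frame = rule.goto       # branching
        current = next_frame
    return scores


FRAMES = {
    # Taken from Figure 1.
    "DEPRESSION-WEEPY": Frame(
        question="Have you felt like crying in the PAST WEEK?",
        answers={1: "No",
                 2: "Yes, but I don't actually cry",
                 3: "Yes, and I cry sometimes"},
        rules=[Rule(lambda a: a >= 2, {"DEPR": 2}, "DEPRESSIONFREQ")],
    ),
    # Hypothetical follow-up: only the name of this frame appears in Figure 1.
    "DEPRESSIONFREQ": Frame(
        question="On how many days in the PAST WEEK?",
        answers={1: "1 to 3 days", 2: "4 days or more"},
        rules=[Rule(lambda a: a == 2, {"DEPR_FREQ": 1})],
    ),
}
```

Scripted keystrokes stand in for the subject here. A call such as run_interview(FRAMES, "DEPRESSION-WEEPY", [3, 2]) sets DEPR to 2 and then visits the follow-up frame, whereas a first answer of 1 skips the follow-up and leaves no score. Keeping the questions and rules as data, separate from the driver, mirrors the two-part arrangement described above.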

What is clinical judgement?

One of the main differences between standardized assessment by interview and computerized assessments is that the latter cannot make the clinical judgements which are expected of those who administer most currently used standardized interviews⁹,¹⁰. For instance, Kendell et al.¹¹ wrote that the Present State Examination¹² (PSE) 'places considerable reliance on the interviewer's clinical judgement both in the conduct of the interview and in decisions about the presence or absence of individual items of the psychopathology'. Wing and colleagues¹² go further in stating that the PSE 'requires expert clinicians'. Similar statements are also made about the Schedule for Affective Disorders and Schizophrenia¹³ (SADS) and the Structured Clinical Interview for DSM-III¹⁴ (SCID).

If computerized assessments eliminate clinical judgement then it is necessary to ask 'what is clinical judgement?'. There are two main approaches. The first holds that clinical judgement indicates a failure to standardize medical diagnosis, and that all clinical judgements could be standardized given enough thought and study. This is the position taken by Feinstein¹⁵. The second approach, perhaps more akin to the traditional clinician's view, is that clinical judgements are necessary when assessing something which is impossible to describe in a standardized fashion.

In medical research the advantages of standardization are to reduce observer bias and to allow comparability within and between studies. The arguments for standardization in medical research are overwhelming, though there is not the space to discuss them fully here. In devising a computerized assessment for psychiatric disorder the researcher is forced to standardize the assessment more fully than has previously been done in many of the standardized interviews used in psychiatry. Though colleagues may argue about the exact questions and rules that are included, these can always be changed until a consensus is reached. Indeed, having such explicit rules is a stimulus for constructive debate about exactly how to assess psychiatric disorder.

© 1992 The Royal Society of Medicine

Validity or utility?

In many circumstances when the measurement of psychiatric disorder has been discussed, clinical judgement on the part of the interviewer in deciding on ratings has almost been taken to be synonymous with validity. However, validity is a complex concept. Validity is easy to conceive when there is a gold standard measure. For example, in the De Dombal study mentioned above the authors compared preoperative diagnosis of acute abdominal pain with the operative findings and histology. These latter measures would satisfy most doctors' notion of a gold standard. It is then easy to assess the validity of the computerized assessment; De Dombal² merely had to compare the results of the computerized diagnosis with the post-operative findings. However, Feinstein observed that 'in many (if not most) medical circumstances, however, a definitive standard is not available'. There are certainly no gold standards in psychiatry and in the absence of these it seems more helpful to talk about whether computerized assessments are useful rather than whether they are valid.

As always it is also important to ask: useful for what? At the very least, research and clinical uses must be distinguished if utility is to be realistically assessed. Clinicians often have quite different priorities from those of a researcher. For instance, accuracy in the individual is paramount in clinical work, yet for a researcher the main priority is the elimination of systematic error or bias between the groups for whom comparisons are being made. Also, highly specific diagnoses are needed in research so that doubtful cases are not included within the case definition. However, in clinical work much broader definitions are often more appropriate to ensure that the maximum number of patients receive treatment.

For research, computerized assessments eliminate observer bias and increase standardization. In addition, computers save interviewing costs and automatically code and enter data, an error-prone and tedious job. Ideally, then, computerized assessments should combine the thoroughness of a standardized interview (the ability to branch and avoid redundant questions) with the economy, lack of bias and full standardization of a self-administered questionnaire.

Comparison between computerized assessments and interviews

The PROQSY assessment developed in the General Practice Research Unit was compared with the Clinical Interview Schedule¹⁷ (CIS). The study was conducted on 87 attenders at the King's College Hospital Dermatology Clinic; PROQSY and the CIS were both given to subjects on the same day, separated by a few minutes. The results of this study are compared with the results of other studies of British standardized interviews in Table 1. It should be mentioned, however, that these comparisons are very difficult to make. Reliability is a property of a particular measurement method in a particular setting. Furthermore, many of the other studies had longer periods between the interviews and this may have reduced agreement. Though the kappas (a measure of agreement that corrects for chance) for the computerized assessment PROQSY appear quite low, they in fact compare quite favourably with results of standardized interviews.

Table 1. Comparing the present results with other studies using standardized interviews. The results given are the mean weighted kappas (or other agreement index) across symptoms. The figures in parentheses are the kappas for agreement on 'cases'

Column headings: Study; Interview; Observer rating; Re-interview design
Studies: †Goldberg et al.¹⁷; †Wing et al.; Cooper et al.; †Sturt et al.; ‡Rodgers and Mann; Lewis et al.
Interviews: PSE; CIS; CIS-R; PROQSY
Agreement values as printed: 0.73, 0.71, 0.52*, 0.67; 0.41, 0.34 (0.37), 0.38; 0.60 (0.43)*; 0.71 (0.74)*; 0.56 (0.66), 0.48 (0.58)

*Audiotapes were used
†Psychiatric hospital based studies
‡'Lay' interviewers were used

Greist and colleagues¹⁸ in North America have also compared their computerized Diagnostic Interview Schedule¹⁹ (DIS) with the interviewer-administered version. Again the results suggest that the agreement between the computerized diagnoses is of the same order of magnitude as that


between two administrations of the interviewer-administered DIS. However, there are rather large question marks over the relationship of the interviewer-administered DIS to clinical diagnoses²⁰,²¹. For research, though, there appear to be definite advantages in using computerized assessments and it would seem that there is a promising future for these methods.

The clinical applications of computerized assessments have been less well studied. Furthermore, some workers in North America have made rather extravagant claims for their instruments without proper evaluation. In contrast, Greist and colleagues have avoided commercially minded claims and compared the computerized DIS with clinical diagnoses, and the agreement on those occasions is rather variable. For instance, Mathisen et al.²² found a kappa of only 0.24 for affective disorder. Many proponents of computerized assessments in clinical work are surprised and disappointed that the systems are not more widely used in clinical practice. Though one might accuse doctors of a Luddite attitude towards information technology, there is also an absence of convincing evaluative data that the systems are of value in clinical work. Certainly, for psychiatrists dealing with difficult and insightless patients, self-administered assessments are probably of little value. However, in general practice, where general practitioners deal with the bulk of minor psychiatric disorder, computerized assessments may be of some clinical utility.

The medical consultation in all medical specialties is more than a passive means of gathering information. The relationship between doctor and patient is important for patients' satisfaction and compliance²³ and for the non-specific aspects of treatment. Asking patients to complete computerized questionnaires, especially if this is done instead of seeing the doctor, may damage this important relationship.
As Edwards²⁴ has written in his book about the treatment of alcohol problems, 'taking a history from a patient should not be a matter only for obtaining facts to be written down in the casenotes. It is an interaction between two people and ought to be as meaningful for the person who answers the questions as for the questioner'. If computers are used to shorten the time spent with a patient they may do more harm than good. However, they could be used to collect routine information, freeing the doctor to use time more constructively.
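The agreement figures quoted in this paper are kappas, which correct the observed agreement between two assessments for the agreement expected by chance. As a hedged illustration only, the Python sketch below computes kappa from two sets of ratings. The function name kappa, the linear weighting scheme and the example ratings are assumptions made for the illustration: the weighting used in the studies cited is not stated here, and the example data are invented, not taken from the PROQSY study.

```python
from typing import Sequence


def kappa(rater_a: Sequence[int], rater_b: Sequence[int],
          n_categories: int, weighted: bool = False) -> float:
    """Chance-corrected agreement for ratings coded 0 .. n_categories - 1.

    With weighted=True, disagreements are penalised in proportion to their
    distance apart (linear weights, an assumption of this sketch).
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must rate the same, non-empty set of subjects")
    if n_categories < 2:
        raise ValueError("at least two rating categories are needed")
    n = len(rater_a)
    k = n_categories

    # Cross-tabulate the two sets of ratings.
    counts = [[0] * k for _ in range(k)]
    for x, y in zip(rater_a, rater_b):
        counts[x][y] += 1
    row_totals = [sum(counts[i]) for i in range(k)]
    col_totals = [sum(counts[i][j] for i in range(k)) for j in range(k)]

    def penalty(i: int, j: int) -> float:
        if weighted:
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed = sum(penalty(i, j) * counts[i][j] for i, j in cells) / n
    expected = sum(penalty(i, j) * row_totals[i] * col_totals[j]
                   for i, j in cells) / (n * n)
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use one identical category")
    return 1.0 - observed / expected


# Invented example: 50 subjects rated case (1) or non-case (0) by an
# interview and by a computerized assessment.
interview_case = [1] * 20 + [1] * 5 + [0] * 10 + [0] * 15
computer_case = [1] * 20 + [0] * 5 + [1] * 10 + [0] * 15
case_kappa = kappa(interview_case, computer_case, n_categories=2)
```

In this invented example the two methods agree on 35 of 50 subjects (0.70), while the marginal totals alone would produce agreement of 0.50, so kappa is (0.70 - 0.50) / (1 - 0.50) = 0.40. Raw agreement of 70% therefore corresponds to a kappa of only 0.40, which is one reason kappas can look low at first sight.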
