DIFFERENCES BETWEEN REPORTS FROM CHILDREN, PARENTS AND TEACHERS: IMPLICATIONS FOR EPIDEMIOLOGICAL STUDIES

Michael Gifford Sawyer, Peter Baghurst, Jennifer Clark

This study describes the different prevalences obtained when varying combinations of informants were used to identify emotional and behavioural disorders in a representative sample of 336 children living in two-parent families in the community of Adelaide, South Australia. When different informants were used to identify children with disorders, the estimated prevalences ranged from 3.3±1.6% to 17.9±4.1% for younger children, and 6.0±2.9% to 19.9±4.9% for older children. Results from the study highlight potential methodological problems which arise in epidemiological studies due to differences between reports from children, parents, and teachers describing childhood emotional and behavioural problems.

Australian and New Zealand Journal of Psychiatry 1992; 26:652-660

Evaluation Unit, Adelaide Children’s Hospital, North Adelaide, South Australia
Michael Gifford Sawyer MBBS, Dip Child Psych, PhD, FRCPC, FRANZCP, Director

Division of Human Nutrition, CSIRO, Adelaide, South Australia
Peter Baghurst BAgSc, PhD, MSc, BSc, Principal Research Scientist

Evaluation Unit, Adelaide Children’s Hospital, North Adelaide, South Australia
Jennifer Clark BA, Dip Ed, Project Officer

Correspond with Dr Sawyer

Differences between reports obtained from parents, teachers and children are a critical problem for both research and the accurate clinical assessment and treatment of childhood emotional and behavioural disorders [1]. Currently there are no guidelines to advise clinicians or researchers as to how they should weight discrepant reports in order to best evaluate childhood disorders. Thus, assessments by clinicians and researchers will vary greatly, depending on the information sources used in an assessment, the extent of agreement between different information sources, and the weighting given to information from different sources when disagreement exists.

A range of approaches has been used in an attempt to integrate reports obtained from parents, teachers and children describing the emotional and behavioural problems of children in the community. For example, Anderson et al. [2] identified four levels of agreement between the informants, “representing different degrees of certainty of identifying a case” (p. 71). The levels ranged from the situation where diagnostic criteria employed in the study were met by more than one source independently, to that where the diagnostic criteria were only achieved if the symptoms reported by all three informants were combined. Ferguson and Horwood [3-5], on the other hand, used sophisticated statistical techniques to estimate the contributions of trait, method, and error variance to maternal, teacher and child ratings of conduct disorders. This approach


enabled the investigators to examine the relationship between the trait component of conduct disorder and other variables of interest, for example, maternal depression, family life events and maternal neuroticism. The approach is particularly useful for studies involving large numbers of children but is less helpful for clinicians assessing individual children. In addition, a full understanding of the approach requires a considerable knowledge of statistical techniques.

Different approaches have also been used to establish a standard against which the clinical significance of scores on behaviour questionnaires completed by parents, teachers, and children can be tested. For example, Rutter et al. [6] used the results of a standardised parental interview and an interview with the child’s teacher to establish a cutoff score on the questionnaire completed by teachers in the Isle of Wight study. The cutoff score was then used in the study to identify those children “likely to have a psychiatric problem” (p. 374) from amongst the total population of children participating in the study. More recently, “referral to a mental health clinic” has been used as another criterion against which the significance of scores on behaviour checklists describing problems amongst children in the community can be evaluated [7-9]. Some of the limitations of this latter approach have been highlighted in a recent study by Sawyer et al. [10].

Epidemiological studies investigating the nature and prevalence of childhood emotional and behavioural disorders generally employ a two-stage procedure in order to identify children with disorders [11]. In the first stage, behaviour checklists are used as an economical method to survey large numbers of children. By employing established cutoff scores [7,12], children in a study are categorised as potential “cases” or “non-cases” [13, p. 42] on the basis of their scores on behaviour checklists, which are typically completed by parents or teachers. Subsequently, all the children identified as potential cases, and a random sample of the non-cases, are assessed in greater detail using more expensive interview-based assessment techniques. Prevalences of childhood disorders are then reported as the number of children per 100 defined as having emotional or behavioural disorders.

Differences between informants’ reports can affect epidemiological studies at each stage during the identification of childhood disorders. During the initial screening stage, different cases may be identified, depending on which informant is used to complete the


behaviour checklists. For example, in the Isle of Wight study, Graham [14] noted that mothers and teachers agreed on the presence of behaviour disorders in only 7% of the disturbed children identified by the parent and teacher questionnaires. Graham went on to point out that “Our survey would have missed large numbers of children if we had relied only on parents or only on teachers ...” (p. 33).

During the second stage of epidemiological studies, more detailed information is generally collected from several different sources including teachers, parents and the children themselves. This information is used to complete a diagnostic assessment of children identified as potential cases during the first stage of a study. Lobovits and Handal [15] have pointed out that a weakness of many epidemiological studies is that they fail to describe how the information obtained from different informants was integrated to form a single prevalence estimate: “As a result, it is unclear whether such discrepancies may differentially affect reported prevalence rates, and if so, in what manner” (p. 47).

To date, most studies which have investigated the level of agreement between different informants have employed a dimensional approach in which results focus on the size of the difference between mean scores and the correlation between scores reported by different informants [1]. Epidemiological studies, however, generally employ a categorical rather than dimensional approach when reporting results, and Hulbert et al. [16] have suggested that the results of comparisons based on a categorical approach may not reflect those which will be found when a dimensional approach is used to describe childhood behaviour problems.

Most studies which have investigated the level of agreement between reports from different informants have also focused on clinic-referred populations [17]. However, Sawyer et al. [10] have highlighted differences in the pattern of parent and child reports in clinic and community populations. In light of this, it is possible that results obtained in clinic-referred populations may differ from those found in community populations where epidemiological studies are conducted.

The aim of this study was to compare the level of agreement between mothers, fathers, teachers and children when describing emotional and behavioural disorders in children living in the community in Adelaide, South Australia. This paper describes the level of agreement between informants when both categorical and dimensional approaches were used to


Table 1. Demographic information

                         10-11 year old children     14-15 year old children
                         Males        Females        Males        Females
                         (n=86)       (n=105)        (n=71)       (n=74)
Age (years)
  Mean                   11.6         11.4           15.6         15.5
  SD                     0.9          0.9            1.0          0.9
Occupational class*
  High (1-2.9) (%)       10.5         5.7            7.0          13.5
  Middle (3-4.9) (%)     51.2         52.4           54.9         59.5
  Low (5-6.9) (%)        25.6         21.9           18.3         20.3
  Other (%)              8.1          8.6            8.5          4.1
  Missing (%)            4.7          11.4           11.3         2.7

* Daniel A. Power, privilege and prestige: occupations in Australia. Melbourne: Longman Cheshire, 1983.

describe childhood emotional and behavioural problems. The unique features of the study are the use of four different informants to describe the problems of the children, the inclusion of children in an age range where the reliability of the children’s reports is acceptable [18] and the use of similar measures to obtain reports from the different informants.

Subjects

The complete details of the sampling procedure employed in the study are provided in Sawyer et al. [19]. Briefly, the subjects consisted of two cohorts of children aged 10-11 years and 14-15 years on July 1st 1987. These cohorts were selected from all children enrolled in schools in metropolitan Adelaide using a stratified sampling procedure. Families were contacted by letter and data collected by mail survey using the protocol devised by Dillman [20]. A response rate of 77% was obtained from families with a 10-11 year old child and 71% from families with a 14-15 year old. In general, the best response rates were obtained from informants describing the children attending higher socio-economic schools and the worst response rates were from informants describing children attending lower socio-economic schools. This suggests that the study population may under-represent children from lower socio-economic class schools.

The subjects for the comparisons reported in this paper were the 336 children for whom completed questionnaires were available from two parents, a teacher and the child. This represented 75% of the children living in two-parent families who participated in the study. The loss of subjects was primarily due to the absence of a teacher questionnaire for 95 children, and a father questionnaire for 55 children in the two-parent families. Background information describing the children and their families is shown in Table 1.

Measures

All participating children were assessed by means of Child Behaviour Checklists [7-9] completed by mothers, fathers, teachers and the children themselves. The checklists provide measures of behavioural problems and the social competences of children in a range of areas. Common to each checklist is a Total Behaviour Problem Scale which comprises all the behaviour items on the checklist. In addition, each checklist includes an Externalising Scale which rates antisocial/undercontrolled behaviour, and an Internalising Scale which rates inhibited/overcontrolled behaviour [7, p. 31]. The checklists have been widely used in studies of both clinic and community populations and extensive information is available about their reliability and validity [7-9].

Cutoff scores recommended for use with the Total Behaviour Problem Scale, Externalising Scale and the Internalising Scale taken from the parent, teacher and youth checklists were utilised to define children as cases or non-cases [7-9]. Achenbach and Edelbrock [7] state that these cutoff scores enable users of the checklists “to discriminate in a more categorical fashion between children likely to resemble our clinical sample and those more likely to resemble our nonclinical sample” (p. 62). Separate cutoff scores are provided for each scale for children of different age and sex on the three checklists. Although Hensley [21] has suggested tentative higher cutoff scores for use with Australian populations, these are still to be validated and the cutoff scores recommended by Achenbach and Edelbrock have been employed in this study.

Table 2. Prevalences (per 100) of cases identified by different informants

                       Mother       Father       Child        Teacher
                       report       report       report       report
10-11 year olds
  Males      Tot       17.9±4.1     11.5±3.5     4.7±2.3      12.3±3.6
             Ext       11.3±3.5     7.3±2.9      3.6±2.0      11.9±3.6
             Int       20.0±4.3     9.8±3.1      4.2±2.0      9.8±3.3
  Females    Tot       16.7±3.7     16.1±3.7     3.3±1.6      5.5±2.2
             Ext       14.1±3.5     13.8±3.5     11.2±1.1     4.4±2.0
             Int       18.2±3.8     16.1±3.7     3.7±1.9      9.4±3.0
14-15 year olds
  Males      Tot       19.9±4.9     16.5±4.6     6.0±2.9      8.8±3.5
             Ext       17.7±4.6     19.9±4.9     5.6±2.8      6.?±3.2
             Int       19.3±4.8     19.5±4.9     11.9±4.0     9.4±3.7
  Females    Tot       15.9±4.0     11.7±3.5     9.2±3.3      9.1±3.1
             Ext       7.9±3.1      7.8±3.0      6.6±2.9      7.8±2.9
             Int       9.4±3.3      5.2±2.4      12.0±3.7     9.2±3.3

Tot = Total behaviour problem scale; Ext = Externalising scale; Int = Internalising scale

Despite their similarity, there are two differences between the CBCL, TRF and YSR which are important when the continuous scores from the checklist scales are being compared. First, there are differences in the item content of the checklists. Sixteen items in the CBCL which “were deemed inappropriate to ask adolescents, mostly because they are characteristic of younger ages” [9, p. 7] were not included in the YSR. These items were replaced with socially desirable items which are not scored on the YSR profile. A further thirteen items on the CBCL are not present on the TRF and are replaced with items describing school related behaviour [8]. Second, on different checklists the Externalising and Internalising Scales comprise different items for children of different age and sex.

For the purpose of the comparisons based on the continuous scores, different approaches were used to ensure that the Total Behaviour Problem scores, and the Externalising and Internalising scores obtained from different informants were always based on the same set of items. For the Total Behaviour Problem scores (consisting of the sum of the scores from all items on a checklist), items not present on all the


checklists were omitted from the calculation of the Total Behaviour Problem score. For the Externalising and Internalising scores, regardless of the age or sex of the child or the informants being compared, the scores were all based on the items that make up the Externalising and Internalising Scales on the CBCL for 12-16 year old males. Once again, only items which were present on all the checklists were included in the calculation of scores on the two scales. These modifications to the scoring procedures ensured that when comparisons focused on the number of problems reported by different informants, the scores compared were always based on the same set of items.
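The item-matching rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the scoring program used in the study: the item identifiers and responses are hypothetical, and each item is assumed to carry a non-negative integer score.

```python
from typing import Dict, Iterable, Set


def common_items(checklists: Iterable[Iterable[str]]) -> Set[str]:
    """Return the items that appear on every checklist."""
    item_sets = [set(items) for items in checklists]
    if not item_sets:
        return set()
    return set.intersection(*item_sets)


def total_problem_score(responses: Dict[str, int], items: Set[str]) -> int:
    """Sum one informant's item scores, restricted to the shared items."""
    return sum(score for item, score in responses.items() if item in items)


# Hypothetical item identifiers and scores, for illustration only.
parent = {"item_a": 2, "item_b": 1, "parent_only": 2}
teacher = {"item_a": 1, "item_b": 0, "school_only": 2}
youth = {"item_a": 0, "item_b": 2, "socially_desirable": 1}

# Only items present on all three checklists contribute to the scores.
shared = common_items([parent, teacher, youth])
scores = {
    name: total_problem_score(resp, shared)
    for name, resp in (("parent", parent), ("teacher", teacher), ("youth", youth))
}
```

Restricting every informant to the same item set means that a difference in scores reflects a difference in reported problems rather than a difference in checklist content.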

Data analysis

Two different statistical approaches were used to investigate the level of agreement between informants when scores from their respective child behaviour checklists were used to classify children as cases or non-cases. In the first approach, the percentages of children actually identified by one, two, or three informants were determined. These were then compared with the percentages of children it was estimated that two or three informants would identify, assuming that identification by one informant was totally independent (in the strict probabilistic sense) of identification by any other informant. These estimated percentages (PE) were based on the percentages identified by individual informants as follows:

$P_{E.2} = P_1 + P_2 - (P_1 \times P_2)$

$P_{E.3} = P_1 + P_2 + P_3 - (P_1 \times P_2) - (P_1 \times P_3) - (P_2 \times P_3) + (P_1 \times P_2 \times P_3)$

$P_1$ = Actual percentage from informant 1
$P_2$ = Actual percentage from informant 2
$P_3$ = Actual percentage from informant 3
$P_E$ (PE) = Estimated percentages from two ($P_{E.2}$) or three ($P_{E.3}$) informants
$P_A$ (PA) = Actual percentage from two or three informants
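The independence estimate can be checked numerically with a small sketch. It assumes the percentages are first converted to proportions between 0 and 1 (otherwise the product terms would be on the wrong scale), and the function names and input values are illustrative only. The inclusion-exclusion expressions are equivalent to one minus the probability that no informant identifies the child.

```python
from math import prod
from typing import Sequence


def expected_union(proportions: Sequence[float]) -> float:
    """Proportion identified by at least one informant, assuming
    identification by each informant is independent of the others."""
    if not proportions:
        raise ValueError("at least one proportion is required")
    for p in proportions:
        if not 0.0 <= p <= 1.0:
            raise ValueError("proportions must lie between 0 and 1")
    return 1.0 - prod(1.0 - p for p in proportions)


def expected_two(p1: float, p2: float) -> float:
    """Two-informant estimate written out by inclusion-exclusion."""
    return p1 + p2 - p1 * p2


def expected_three(p1: float, p2: float, p3: float) -> float:
    """Three-informant estimate written out by inclusion-exclusion."""
    return p1 + p2 + p3 - p1 * p2 - p1 * p3 - p2 * p3 + p1 * p2 * p3


# Illustrative inputs: 66%, 54% and 33% expressed as proportions.
pe_2 = expected_union([0.66, 0.54])
pe_3 = expected_union([0.66, 0.54, 0.33])
```

Multiplying the result by 100 returns the estimate to the percentage scale used in the text.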

It would be expected that PE would be greater than PA if there was a high level of agreement between informants about whether or not a child was a case. The second approach employed the kappa statistic [22] and the Yule’s Y statistic [23] to measure the level of agreement between different informants. The kappa statistic varies from negative values for less than chance agreement, through 0 for chance agreement, to 1.0 for perfect agreement. Herjanic and Reich [24] have suggested that kappas of higher than 0.50 represent a “high” level of agreement, 0.30 to 0.49 a “middle” level of agreement, and less than 0.29 “low” agreement.

With all the measurements of agreement, it was found necessary to restrict the analyses to children identified by at least one informant as a case. Inclusion of the children not identified as a case by any informant (the vast majority) always artificially inflated the level of agreement.

Table 3. Cases identified on the total behaviour problem scale, externalising scale, or internalising scale, by at least one informant

                       Males              Females
                       n     (%)          n     (%)
10-11 year olds        (n=86)             (n=105)
  Tot                  21    (24)         26    (25)
  Ext                  16    (19)         22    (21)
  Int                  21    (24)         31    (30)
14-15 year olds        (n=71)             (n=74)
  Tot                  20    (27)         18    (24)
  Ext                  18    (25)         12    (16)
  Int                  24    (34)         18    (24)

Tot = Total behaviour problem scale; Ext = Externalising scale; Int = Internalising scale
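The two agreement statistics named in the data analysis can be sketched for a single pair of informants as follows. The sketch assumes that the kappa in [22] is the usual two-rater Cohen's kappa and that Yule's Y is the coefficient of colligation; the cell counts are hypothetical. Cells follow the usual 2x2 layout: a = both informants identify the child, b and c = only one does, d = neither does.

```python
from math import nan, sqrt


def cohen_kappa(a: int, b: int, c: int, d: int) -> float:
    """Chance-corrected agreement between two informants."""
    n = a + b + c + d
    if n == 0:
        raise ValueError("the table is empty")
    observed = (a + d) / n
    # Agreement expected by chance, from the marginal totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    if expected == 1.0:
        return nan  # undefined when both informants give one constant rating
    return (observed - expected) / (1.0 - expected)


def yules_y(a: int, b: int, c: int, d: int) -> float:
    """Yule's Y (coefficient of colligation) for a 2x2 table."""
    concordant = sqrt(a * d)
    discordant = sqrt(b * c)
    if concordant + discordant == 0.0:
        return nan  # undefined when both products are zero
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical counts: the same disagreements, with few or many
# children whom neither informant identifies (cell d).
kappa_restricted = cohen_kappa(10, 5, 5, 10)
kappa_with_noncases = cohen_kappa(10, 5, 5, 1000)
y_restricted = yules_y(10, 5, 5, 10)
```

With these counts, enlarging cell d raises kappa even though the informants disagree about exactly the same children, which is the inflation the restriction to identified cases is meant to avoid.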

Results

The initial analyses focused on three questions. How do prevalence estimates vary with the choice of informants? What percentage of the children, identified as falling in the clinical range of the child behaviour checklists when information was obtained from all four informants, would be identified if information was obtained from one, two, or three of the informants? What percentage of children was reported by only one informant as falling in the clinical range on the child behaviour checklists?

1. What prevalences were identified when reports were obtained from different informants?

Estimates of prevalences based on reports from the different informants are shown in Table 2. These estimates were calculated with the weighting appropriate to the stratified sampling scheme employed in the study. Large differences can be seen between the prevalence estimates found when reports were obtained from different informants. For example, when cutoff scores on the Total Behaviour Problem Scale were used, prevalence estimates for male 10-11 year olds ranged from 4.7±2.3 to 17.9±4.1. In general, the highest estimates were obtained when reports were obtained from mothers and the lowest estimates when reports were obtained from children.

2. What percentage of cases was identified if one, two or three informants were used to collect information about the children?

The total number of children identified as cases by at least one informant from scores on the Total Behaviour Problem Scale, Externalising Scale and the Internalising Scale is shown in Table 3. It should be noted that, as it was not essential for the purpose of comparisons described in this section that cases be classified as exclusively Internalisers or Externalisers, the more stringent criteria suggested by Achenbach and Edelbrock [7, p. 34] to achieve this level of classification were not employed.

The percentages of the total number of cases identified by different informants (PA) using scores from the Total Behaviour Problem Scale, Externalising Scale and Internalising Scale are shown in the left hand columns in Table 4. The percentages of cases which it was estimated that two or three informants would have identified in each area (PE) are shown in the right hand columns in Table 4. In this estimation it was assumed that for any one child, identification as a case by one informant was totally independent of identification by any other informant. It might be suspected that if a child was identified as a case by one informant, there would be a much higher chance of the child being similarly classified by a second informant. However, on all three scales the similarity of the estimated percentages (PE) and the percentages actually found (PA) suggests that the degree of overlap between the children identified as cases by two or more informants was no larger than might be expected by chance.

These results were supported by those obtained using the kappa statistic and the Yule’s Y statistic. All the kappas measuring agreement between pairs of informants on the three scales were in the range defined by Herjanic and Reich [24] as that representing “low” agreement. The only comparison which achieved a


Table 4. The percentage of cases identified when one informant, two informants or three informants were employed for case identification

                               Total problems       Ext. problems        Int. problems
                               (n=85)               (n=68)               (n=94)
Informants                     Actual  Estimated    Actual  Estimated    Actual  Estimated
                               (PA)    (PE)         (PA)    (PE)         (PA)    (PE)

Mother                         66                   59
Father                         54                   56
Teacher                        33                   37
Child                          22                   19

Mother and father              85      84           78      82           74      76
Mother and teacher             77      77           74      74           76      71
Mother and child               74      74           66      67           72      69
Father and teacher             72      69           79      72           65      63
Father and child               64      64           69      64           63      60
Teacher and child              48      48           47      49           53      50

Mother, father and teacher     96      90           93      89           91      84
Mother, teacher and child      86      82           81      79           88      79
Mother, father and child       88      88           85      85           86      83
Father, child and teacher      80      76           88      78           81      73

p<0.05 level of significance with the Yule’s Y statistic was that which described the level of agreement between the fathers and teachers when identifying children who exceeded the cutoff score on the externalising scale. The latter analysis involved eighteen comparisons and, using a p
