Six-year developmental course of internalizing and externalizing problem behaviors.

FRANK C. VERHULST, M.D., AND JAN VAN DER ENDE, M.S.

Abstract. The 6-year developmental course of parent-reported problem behavior in an epidemiological sample of 936 children assessed with the Child Behavior Checklist at 2-year intervals was determined. Children who were scored in the deviant range of the total problem score at time 1 were nine times more likely to be scored deviant 6 years later than were children who were not deviant at time 1 (odds ratio 9.0). Of the deviant children at time 1, 33% were deviant at time 4. There was no difference in the persistence of externalizing versus internalizing problems. This underscores the notion that internalizing problems should not be disregarded. Although this study demonstrated moderate stability of problem behaviors across a 6-year interval, children's problem behaviors should not be regarded as static. Many children showed changes in their level of functioning across time. However, extreme changes were the exception rather than the rule. J. Am. Acad. Child Adolesc. Psychiatry, 1992, 31, 5:924-931. Key Words: epidemiology, follow-up, child psychopathology, longitudinal.

Accepted December 2, 1991. Dr. Verhulst is professor and director of child and adolescent psychiatry at Sophia Children's Hospital, Erasmus University, Rotterdam, The Netherlands, where Mr. Van der Ende is research psychologist. This research was financially supported by grants from the Sophia Foundation for Medical Research and by the Dutch National Programme for Stimulation of Health Research. Portions of this study were presented at the Society for Research in Child and Adolescent Psychopathology, Zandvoort, Holland, June 20, 1991. We wish to thank Dr. Thomas Achenbach for his critique of this manuscript. Reprint requests to Dr. Verhulst, Erasmus University, Rotterdam, Sophia Children's Hospital, Gondelweg 160, 3038 GE Rotterdam, The Netherlands. 0890-8567/92/3105-0924$03.00/0 © 1992 by the American Academy of Child and Adolescent Psychiatry.

The developmental approach to questions of child psychopathology is characterized by two basic qualities. The first is that development involves change. Children undergo rapid changes in biological, cognitive, emotional, and social functioning. Without children's ability to show developmental changes, questions concerning the etiology and course of maladaptive behaviors would not be relevant. The second quality of developmental processes is that although children show many changes in their functioning, there must be some stability or connectedness. The developmental theories of Freud and Piaget, for example, rely on the assumption that successive stages are causally and coherently linked.

Information on the developmental stability of children's problem behaviors is important from a practical as well as a theoretical point of view. If problem behaviors in children fail to show persistence across time, this would cast doubt on the necessity of treatment. However, if stability of serious problem behaviors proves to be substantial, this would argue against a wait-and-see policy and for adequate intervention.

Often, longitudinal research in the field of child psychopathology is handicapped by a lack of standardized assessment procedures and a lack of operational definitions for diagnostic categories. In previous studies, the 2- and 4-year stability of parent-reported problem behaviors and the 4-year stability of teacher-reported problem behaviors in an epidemiological sample of children were assessed (Verhulst and Althaus, 1988; Verhulst et al., 1990; Verhulst and van der Ende, 1991). Problem behaviors were assessed with the Child Behavior Checklist (CBCL) (Achenbach and Edelbrock, 1983) and the teacher version of this instrument, the Teacher's Report Form (TRF) (Achenbach and Edelbrock, 1986). Using these standardized assessment procedures, high stability was found for the level of parent-reported problem behaviors (r = 0.66 for CBCL total problem scores across a 4-year interval), whereas the stability for teacher-reported problem behaviors was medium (r = 0.37 for TRF total problem scores across the 4-year interval). Stability of both teacher- and parent-reported problem behavior in children aged 4 to 12 years did not differ significantly by age. Stability was equal for both sexes in parent-reported problems, whereas teacher ratings showed more stability for girls than boys.

In a previous report on the 4-year follow-up of parent-reported problem behaviors (Verhulst et al., 1990), an overview was given of prospective studies published before 1988 in which the stability of problem behaviors was assessed in children from the general population. The studies by Richman et al. (1982) and Graham and Rutter (1973) are most relevant for the study to be reported here. In both these prospective longitudinal studies of epidemiological samples, caseness was defined in terms of categorical criteria. Richman et al. (1982) reported the prevalence of psychiatric disorder in a sample of 705 3-year-old children from a London borough. For a subsample of 185 children, information was obtained at ages 4 and 8. It was found that 61% of the problematic children at age 3 still showed considerable difficulties on a clinical rating 5 years later; 38% were labeled antisocial and 23% neurotic by clinical ratings of the type of deviance.
J. Am. Acad. Child Adolesc. Psychiatry, 31:5, September 1992

In the 4-year follow-up of a general population sample of 10- to 11-year-olds on the Isle of Wight, Graham and Rutter (1973) found that of children with time 1 clinical diagnoses of conduct disorder, 75% remained deviant at time 2. Of the children with time 1 diagnoses of emotional disorder, 46% remained deviant at time 2.

Two recent studies reported on the stability of problem behaviors in general population samples. Esser et al. (1990) studied the persistence of psychiatric disorder in a general population sample of 399 children assessed at ages 8 and 13. Of the 71 children initially diagnosed as having a psychiatric disorder, approximately 50% remained disturbed across the 5-year interval. The prognosis for emotional disorders was found to be more favorable than that for conduct disorders. Berden et al. (1990) studied the relationship between life events and problem behavior in Dutch children from the general population assessed longitudinally across a 2-year interval, using the CBCL. A significant but small effect, accounting for 2.9% of variance, was found between total scores for life events and CBCL total problem scores at the second time of assessment, with initial CBCL scores partialled out.

In general, most studies showed modest stability of problem behaviors. Firm conclusions cannot be drawn, however, because of several limitations of the studies, such as the restriction of the sample to a single locality, the use of selected samples, the use of different assessment instruments at different times, and the use of very broad categories of functioning.

A number of studies compared the prognosis of externalizing versus internalizing disorders. These studies showed conflicting results, however. In the Isle of Wight 4-year longitudinal study (Graham and Rutter, 1973), emotional disorders tended to be somewhat less persistent than conduct disorders. However, Verhulst and Althaus (1988) found little difference in the 2-year persistence of deviance among children from the general population who scored above the 90th percentile on the CBCL internalizing scale or the externalizing scale. Of the children who were initially deviant on the externalizing scale, 28% remained deviant on this scale, whereas 24% of the children who were initially deviant on the internalizing scale remained so. McConaughy et al. (1992) also reported little difference in the persistence of parent-reported externalizing (52%) versus internalizing (44%) problems in a general population sample across a 3-year interval.

A number of studies investigated the persistence of depressive disorders in clinic samples and reported considerable continuity across time (Harrington et al., 1990; Kovacs et al., 1984a, b). Although it has often been assumed that internalizing disorders have a rather good prognosis, few studies have tested this over a reasonably long period. The present study enabled us to test the difference in persistence of parent-reported externalizing versus internalizing disorders across a 6-year period in children from the general population assessed at 2-year intervals with the same instrument.

The main aims of the present study were: (1) to determine the course of parent-reported problem behavior in individual children from the general population assessed with the CBCL at 2-year intervals across a 6-year period; and (2) to identify differences in the stability of problem behaviors in individual children according to the type of problem and the sex and age of the child.

Method

Instrument

The CBCL (Achenbach and Edelbrock, 1983) was used to obtain standardized parents' reports of children's problem behaviors. It consists of 20 competence items and 120 problem items. Only the findings from the problem section will be reported. Parents are requested to circle a 0 if the problem item is not true of the child, a 1 if the item is somewhat or sometimes true, and a 2 if it is very true or often true. A total problem score is computed by summing all 0s, 1s, and 2s. The CBCL was translated into Dutch with the help of a linguist. The good reliability and discriminative validity established by Achenbach and Edelbrock (1983) were confirmed by the authors' studies using this translation (Verhulst et al., 1985a, b). A correlation of 0.70 has been obtained between CBCL total problem scores and problem scores based on clinical interviews with parents 313 (SD = 66) days after they completed the CBCL (Verhulst and Van der Ende, 1991).

Achenbach (1991) has constructed eight so-called cross-informant syndromes that were similar across the parent, teacher, and self-report instruments (the CBCL, the TRF, and the Youth Self-Report, respectively). The syndromes were empirically derived via factor analyses. The eight cross-informant syndromes described by Achenbach (1991) were used. Two broad-band groups of syndromes, designated as externalizing and internalizing, were also used in the analyses. Externalizing problems reflect conflicts with other people and their expectations of the child, whereas internalizing problems reflect internal distress. The internalizing group consists of the anxious/depressed, somatic complaints, and withdrawn syndromes. The externalizing group consists of the aggressive and delinquent behavior syndromes.

Description of the Sample

The original sample consisted of 4- to 16-year-olds from the Dutch province of Zuid-Holland in 1983 (time 1). For the present study, only those children who were 4 to 11 years old in 1983 were included. The original target sample consisted of 1,498 children aged 4 to 11 years. At time 1, usable CBCLs were obtained on 1,315 (87.8%) of the children. Parents were interviewed by trained interviewers who recorded the parents' answers to each CBCL question. The same procedure was followed at time 3 (1987) and time 4 (1989). At time 2 (1985), because of financial constraints, CBCLs were obtained by mail. Usable CBCLs were obtained for 936 children on all four occasions. This is 71.2% of the children in the time 1 sample, and 62.5% of the original target population. The sample included 238 boys and 240 girls who were initially aged 4 to 7, plus 221 boys and 237 girls who were initially aged 8 to 11. For a more detailed description of the original time 1 sample, see Verhulst et al. (1985a).

A problem with longitudinal studies is the attrition of the sample during the study. The present study was not an exception, because for nearly 29% of the children on whom there was time 1 information, the follow-up data for the three subsequent assessments were not complete. These children were excluded from the present study. Dropouts (N = 375) and remainers (N = 936) were compared with respect to sex, age, socioeconomic status (SES), and CBCL total problem scores. Dropouts and remainers did not differ significantly in the level of total problem scores (24.5 for dropouts; 22.5 for remainers; t = 1.8, NS), indicating that the dropouts did not form a group with especially problematic children. Dropouts also did not differ from remainers with respect to age or sex. However, parents of lower SES were somewhat overrepresented in the group of dropouts (40.6% of lower SES in the group of dropouts, and 32.2% in the group of remainers), whereas in the group of remainers, parents from higher SES were overrepresented (28.6% of higher SES in the group of dropouts, and 34.3% in the group of remainers) (chi square = 21.1; df = 5; p < 0.01). Because lower SES children may be subjected to somewhat poorer environmental influences, there is a possibility that the present study's findings slightly underestimate the level of persistence of children's problem behavior.

Results

Total Problem Scores

To assess the persistence of problem behaviors across the 6-year time interval, children who could be classified as deviant at time 1 were first identified. The 90th percentile (P90) of the cumulative frequency distribution of total problem scores was chosen as the clinical cutoff above which children can be considered deviant. Because attrition of the sample across the 6-year interval may have caused selective loss of subjects, the P90 was based on the original time 1 Dutch normative sample for both sexes and age-groups 4 to 11 and 12 to 16 separately (Verhulst et al., 1985a). This normative sample consisted of a representative sample from the general population, excluding those children who had received mental health services within the preceding 12 months (N = 2033). In this way, a normative sample of children considered to be healthy was obtained. Using the score associated with the P90 of the normative sample as the clinical cutoff, 92 children could be identified who were scored above this cutoff at time 1.

Because movements of children scoring above the cutoff toward scores just below the cutoff can hardly be regarded as significant improvement in children's functioning, the 50th percentile (P50) was chosen as the border below which children were regarded as functioning well. The P90 and P50 cutoff scores make it possible to detect significant improvement or worsening in children's functioning across time.

Although it was also possible to determine cutoffs for each time of assessment separately to take account of possible retest and method effects, it was decided to use the cutoff scores based on the Dutch normative sample, rather than different cutoffs for each specific time. The main reason for this choice was that the general population sample of 936 children that was followed up was not truly normative, because of attrition and because the sample also contained children who had been referred for mental health services. By applying the normative cutoff scores, it was likely that children scoring above the P90 showed behaviors that were deviant enough to be regarded as clinically relevant.
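The banding described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' procedure: the cutoff values are invented, the label `M` for the middle band and the function names `band` and `course_category` are introduced here, and a single pair of cutoffs stands in for the sex- and age-specific cutoffs taken from the normative sample. The category labels (DDDD, LLLL, D-L, L-D, R) follow the note to Table 1.

```python
from typing import Sequence


def band(score: float, p50: float, p90: float) -> str:
    """Return 'D' (above the P90), 'L' (below the P50), or 'M' (in between).

    'M' is a hypothetical label for the middle band; the paper does not name it.
    """
    if score > p90:
        return "D"
    if score < p50:
        return "L"
    return "M"


def course_category(scores: Sequence[float], p50: float, p90: float) -> str:
    """Classify four successive total problem scores into a Table 1 category."""
    if len(scores) != 4:
        raise ValueError("expected one score for each of the four assessments")
    bands = "".join(band(s, p50, p90) for s in scores)
    if bands in ("DDDD", "LLLL"):
        return bands
    if bands[0] == "D" and bands[-1] == "L":
        return "D-L"  # deviant at time 1, low at time 4
    if bands[0] == "L" and bands[-1] == "D":
        return "L-D"  # low at time 1, deviant at time 4
    return "R"  # remainder


# Hypothetical cutoffs and scores, for illustration only.
P50, P90 = 18, 45
print(course_category([50, 47, 52, 49], P50, P90))  # DDDD
print(course_category([50, 30, 20, 10], P50, P90))  # D-L
print(course_category([30, 30, 30, 30], P50, P90))  # R
```

Keeping the cutoffs as arguments rather than constants leaves room for passing a different pair for each sex and age-group.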


Figure 1 shows the pathways of the 92 children who were scored in the deviant range of total problem scores by their parents at time 1 (upper left of the figure). The four times of assessment are indicated at the top of the figure. The figure should be read from the left to the right, from time 1 to time 4. Entries indicate numbers of children. Figure 1 shows the movements across the borders of the 90th and the 50th percentile, indicated as P90 and P50. Following the total problem scores of the 92 children who could be regarded as deviant at time 1, 39 were still scored in the deviant range at time 2 (above the P90 line), whereas 46 children improved somewhat (scoring between the P90 and P50 at time 2), and seven could be regarded as considerably improved (scoring below the P50 at time 2). Next, it is possible to track the children's scores from time 2 to time 3, and from time 3 to time 4.

Fifteen children (16% of the children who were scored deviant at time 1) were scored in the deviant range at all four assessments (upper right corner). Thirty children (33%) of those who were scored in the deviant range at time 1 were scored in the deviant range again 6 years later at time 4. This percentage is indicated on the right side at the top. Fifteen children (16%) made a temporary detour with scores at one or two times of assessment below the P90 and then back again to the deviant range at time 4. Only one child had a total problem score below the P50 (at times 2 and 3) and again above the P90 at time 4. Although 67% of the deviant children at time 1 showed improvement, for the majority this improvement was only slight; 52% were scored below the P90 and above the P50 (percentage indicated in the figure on the right side in the middle). Only 15% of the deviant children at time 1 showed a significant improvement, that is, they were scored below the P50 at time 4 (indicated on the right side at the bottom).

FIG. 1. Pathways of the 92 children with initial total problem scores in the clinical range across the 6-year follow-up period. The four times of assessment are indicated at the top. The figure should be read from the left to the right. Dotted lines indicate the P90 border (top) or the P50 border (bottom). Entries indicate numbers of children, except where percentages are given on the right side indicating the percentage of children scoring in the deviant range at time 1 who are respectively (from top to bottom) scored above the P90, between the P90 and the P50, and below the P50.

[Pathway diagram not reproduced. Percentages on the right side of the figure: 33% above the P90, 52% between the P90 and the P50, and 15% below the P50 at time 4.]

Table 1 shows the percentages of children from the total sample falling into four groups representing different patterns of longitudinal course that are of interest. These groups consisted of the following: (1) children who were scored in the deviant (D) range (above the P90) at each of the four assessments; (2) children who scored low (L; below the P50) at each of the four assessments; (3) children who changed from low to deviant scores (L-D); and (4) children who changed from deviant to low scores (D-L) during the 6-year time interval. The table shows that the percentages of children in each group, except the LLLL group, were very small. The percentages of children who made significant changes in their scores across the 6-year time interval were the smallest. The majority of children showed less extreme changes from time 1 to time 4 in their scores.

TABLE 1. Percentage of Children in Each Category Representing Different Patterns of Longitudinal Course

Category           Percentage
DDDD                 1.6
D-L                  1.5
LLLL                26.8
L-D                  1.1
R                   69
Total (N = 936)    100

Note: DDDD = scores above the P90 at all four assessments (D = deviant), LLLL = scores below the P50 at all four assessments (L = low), D-L = scores above the P90 at time 1 and below the P50 at time 4, L-D = scores below the P50 at time 1 and above the P90 at time 4, R = the remainder of children.

To determine the possible effects of sex and age on the level of persistence, Table 2 shows the percentage of children scored deviant at time 1 who were also scored deviant at each of the three subsequent assessments. The percentages in Table 2 indicate that the persistence for girls was somewhat greater than that for boys. However, this difference was not significant (chi square = 1.73, df = 1, NS). Older children seemed to show somewhat more persistence in their problem behavior than younger children, but this difference was not significant (chi square = 1.51, df = 1, NS).

TABLE 2. Distribution by Sex and Age-Group of Children Scored Deviant at Time 1 and Children Who Were Scored Deviant at All Four Assessments

                                                   Sex                  Age-Group (years)
                                                   Boys      Girls      4-7       8-11
                                                   N (%)     N (%)      N (%)     N (%)
Children scored deviant at time 1                  51 (55)   41 (45)    44 (48)   48 (52)
Children scored deviant at all four assessments     6 (40)    9 (60)     5 (33)   10 (67)

Figure 2 shows what happened to the children in the sample who at time 1 were scored below the P50. These children can be regarded as functioning well. The left corner at the bottom of Figure 2 shows that this was the case for 450 children at time 1. Of these children, only six showed a considerable increase in problem behavior from below the P50 at time 1 to above the P90 at time 2. For 75 children, a somewhat smaller increase in problem behavior was found (from below the P50 at time 1 to the scoring range between the P50 and the P90 at time 2). The majority of children (56%) who at time 1 could be regarded as functioning well continued to be scored below the P50 at each of the three subsequent assessments. Of the children who scored below the P50 at time 1, 67% were scored below the P50 at time 4 (indicated on the right side at the bottom); 51 (11%) were scored above the P50 at time 2 and/or time 3, but returned below the P50 at time 4. The majority of children scored below the P50 at time 1 who showed an increase in problem behavior (scores above the P50) at subsequent assessments did not obtain scores above the P90. Only 2% (indicated in the figure on the right side at the top) of the well-functioning children at time 1 were deviant at time 4.

FIG. 2. Pathways of the 450 children with initial total problem scores below the P50 across the 6-year follow-up period. The four times of assessment are indicated at the top. The figure should be read from the left to the right. Dotted lines indicate the P90 border (top) or the P50 border (bottom). Entries indicate numbers of children, except where percentages are given on the right side indicating the percentage of children scoring below the P50 at time 1 who are respectively (from top to bottom) scored above the P90, between the P90 and the P50, and below the P50.

[Pathway diagram not reproduced. Percentages on the right side of the figure: 2% above the P90, 31% between the P90 and the P50, and 67% below the P50 at time 4.]

To determine the degree to which deviance at time 1 predicted deviance 6 years later, odds ratios were computed. The risk of deviance at time 4 was computed for children who were scored in the deviant range of total problem scores at time 1 relative to the risk for children who were not deviant at time 1. The odds ratio was 9.0 (95% confidence interval 5.7 to 14.3). In other words, children who had been deviant at time 1 were nine times more likely to be deviant on the total problem scale at time 4 than were children who had not been deviant on the total problem scale at time 1. The odds ratio for girls of 14.2 (7.4 to 27.3) was greater than that for boys, which was 6.1 (3.2 to 11.8). However, this difference was not significant (Breslow-Day test for homogeneity of the odds ratios; chi square = 2.4; df = 1; NS). The odds ratio for the 4- to 7-year-old cohort of 9.1 (4.9 to 17.0) was very similar to that for 8- to 11-year-olds, which was 9.8 (4.9 to 19.7).
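For readers who want to see how an odds ratio and its confidence interval arise from a 2x2 table, the following Python sketch may help. It assumes Woolf's logit method for the interval, which is a common choice but is not stated in the text as the method used in this study. The function name `odds_ratio_ci` is hypothetical. The first two cell counts follow the text (30 of the 92 children deviant at time 1 were deviant again at time 4); the other two are placeholders, because the split of the remaining 844 children is not reported in this section.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Woolf (logit) confidence interval for a 2x2 table.

    a: deviant at time 1 and deviant at time 4
    b: deviant at time 1, not deviant at time 4
    c: not deviant at time 1, deviant at time 4
    d: not deviant at time 1, not deviant at time 4
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under Woolf's approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# a and b follow the text. c and d are HYPOTHETICAL placeholders for the
# 844 children not deviant at time 1; their real split is not given here.
or_, lower, upper = odds_ratio_ci(a=30, b=62, c=43, d=801)
print(f"OR = {or_:.1f}, 95% CI {lower:.1f} to {upper:.1f}")
```

Because two of the four cells are placeholders and the interval method is an assumption, the printed interval should not be expected to match the reported 5.7 to 14.3.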

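[Figure panel "Internalizing": pathway diagram across Time 1 to Time 4 for the 104 children scored above the P90 at time 1; 36% were scored above the P90 at time 4. The remaining entries are not recoverable from the extracted text.]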

Patterns of Longitudinal Course for Different Syndromes

The patterns of movements across time for children's externalizing and internalizing scores were determined. Figures 3 and 4 show what happened to the children who at time 1 were scored above the P90 for the externalizing and internalizing scores, respectively. The figures can be read in a similar way as Figure 1. As can be seen from these figures, the patterns of persistence across time for both sets of scores were very similar

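[Figure panel "Externalizing": pathway diagram across Time 1 to Time 4 for the 104 children scored above the P90 at time 1; at time 4, 32% were scored above the P90, 58% between the P90 and the P50, and 11% below the P50. The remaining entries are not recoverable from the extracted text.]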
