Reliability and validity of methods for measuring the duration of untreated psychosis: a quantitative review and meta-analysis.

Schizophrenia Research 160 (2014) 20–26


Kelly Register-Brown a,⁎, L. Elliot Hong b

a University of Maryland/Sheppard Pratt Psychiatry Residency Training Program, University of Maryland. 701 W. Pratt St., 4th Floor, Baltimore, MD 21201, USA
b Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, MD. Tawes Ct., Catonsville, MD 21228, USA

Article info

Article history: Received 2 July 2014; received in revised form 9 October 2014; accepted 14 October 2014; available online 11 November 2014
Keywords: Duration of untreated psychosis; Schizophrenia; Meta-analysis

Abstract
Background: The duration of untreated psychosis (DUP) has been associated with a wide range of clinical outcomes, and is considered to be one of the key parameters in managing clinical high risk and first episode psychosis patients. However, considerable discrepancies exist in the way that DUP is estimated in different studies. There is no standard or consensus on which method is most reliable and valid for assessing DUP.
Methods: This review aimed to quantitatively assess different DUP measurement instruments and definitions by comparing their inter-rater reliability, and their strength of validity in predicting biological and clinical outcomes.
Results: Nine instruments designed for measuring DUP were found. Their inter-rater reliability was found to be adequate to excellent, although quite varied. This analysis did not show that any instrument was clearly outstanding compared to the others, although the limited available data do not exclude this possibility. DUP was also significantly associated with a range of outcomes, although mostly with small effect sizes. However, non-instrument based, ad hoc clinical interviews remained the most common way of measuring DUP. Definitions of onset of psychosis and onset of treatment were inconsistent among studies.
Conclusions: This review did not find quantitative evidence to support the use of one instrument over another. DUP remains a promising modifiable risk factor for a range of long-term clinical outcomes. Future research should quantify and improve the reliability and validity of the structured instruments for DUP measurement.
© 2014 Elsevier B.V. All rights reserved.

1. Introduction The duration of untreated psychosis (DUP), or the time between the onset of psychosis and initiation of treatment, is a growing research and clinical focus in early psychosis. Research first identified DUP as a prognostic factor in the 1980s (McGlashan, 1999; Singh, 2007). More recent systematic reviews and meta-analyses (Marshall et al., 2005; Norman et al., 2005; Perkins et al., 2005; Large et al., 2008; MacBeth and Gumley, 2008; Farooq et al., 2009; Large and Nielssen, 2011; Boonstra et al., 2012) have examined the relationship of DUP to outcomes including positive and negative symptom severity at follow-up, likelihood of remission of psychosis, overall functioning, and quality of life. These efforts generally found that patients with shorter DUP fare better even after adjustment for premorbid functioning. Additional studies have generally found that numerous outcomes including neuropsychological test performance, changes on brain imaging, risk of violence, and risk of suicide are also associated with DUP (Large and Nielssen, 2011; Chang et al., 2013; Chou et al., 2014), although these findings were

⁎ Corresponding author. Tel.: +1 410 328 6325; fax: +1 410 328 1212. E-mail address: [email protected] (K. Register-Brown).

http://dx.doi.org/10.1016/j.schres.2014.10.025 0920-9964/© 2014 Elsevier B.V. All rights reserved.

not consistent. For example, a recent study reported that neuropsychological test performance is not related to DUP at initial hospitalization (Broussard et al., 2013). These findings support the increasing research and public health efforts aimed at reducing DUP by identifying and addressing barriers to the early identification and treatment of psychosis (Killackey and Yung, 2007; McGorry et al., 2007; Bird et al., 2010; Lloyd-Evans et al., 2011; Broussard et al., 2013). Despite these positive results, considerable discrepancies exist in the way that DUP is estimated in different studies. Measuring DUP requires establishing both the date on which psychosis first presented and the date on which treatment commenced. However, as has been discussed extensively in the literature, these measurements are fraught with practical difficulties, making reliable assessment difficult (Norman and Malla, 2001). The majority of patients with psychotic disorders as defined by established diagnostic criteria experience a prodromal period, which can include attenuated psychotic symptoms and/or brief limited intermittent psychotic episodes (Fusar-Poli et al., 2013). Even with a valid and reliable measure of symptom severity, the point at which attenuated or brief symptoms cross the diagnostic duration and severity thresholds is often unclear, making the dating of psychosis onset necessarily somewhat arbitrary (Norman and Malla, 2001; Compton et al., 2007 and 2011; Singh, 2007). In addition, psychosis encompasses

K. Register-Brown, L.E. Hong / Schizophrenia Research 160 (2014) 20–26

multiple symptom categories, but studies are inconsistent as to which categories are used to define onset (Larsen et al., 2001; Norman and Malla, 2001; Compton et al., 2007). Dating the onset of treatment is quite inconsistent across studies as well, as multiple definitions of “treatment” have been used for both theoretical and practical reasons. The date on which the patient first sought mental health care for psychotic symptoms, and the date on which the first antipsychotic prescription was written, are appealingly concrete but do not necessarily imply a patient has received effective treatment (Breitborde et al., 2009). The date of the first hospitalization is similarly practical, but may falsely lengthen DUP by disregarding any prior period of outpatient treatment (Large et al., 2008), or falsely shorten DUP by disregarding periods of ineffective inpatient treatment caused for example by nonadherence or treatment resistance. Attempting to discern the first clinically effective antipsychotic trial, although intuitively logical, introduces the problem of reliably defining “effective” (Norman and Malla, 2001; Compton et al., 2007; Breitborde et al., 2009; Dell’Osso et al., 2013). Determining the first “effective” psychosocial treatment, like supported employment or cognitive-behavioral therapy for psychosis has not, to our knowledge, been attempted in the literature, but would likely encounter similar methodological difficulties. Further complicating these measures is the fact that patients typically enroll in DUP studies after they first seek treatment for their psychosis. The dating of both symptom onset and treatment seeking is therefore usually retrospective and potentially done at a time when the patient is still experiencing psychotic and/or cognitive symptoms (Maurer and Häfner, 1995; Norman and Malla, 2001). 
Most studies gather collateral data from family, mental health providers, or medical records to address this difficulty; researchers must methodically reconcile these sources when they disagree (Norman et al., 2005). Because no previous studies that we are aware of have actually quantitatively compared specific methods for measuring DUP in the same patient populations, the method to achieve the best reliability and validity is still unclear. At least five excellent reviews have qualitatively discussed the above issues in measuring DUP (Norman and Malla, 2001; Compton et al., 2007; Singh, 2007; Breitborde et al., 2009; Dell’Osso et al., 2013), and at least seven have systematically examined DUP as a predictor of clinical outcomes (Marshall et al., 2005; Norman et al., 2005; Perkins et al., 2005; MacBeth and Gumley, 2008; Farooq et al., 2009; Large and Nielssen, 2011; Boonstra et al., 2012). However, these reviews were not aimed to establish quantitative comparisons on the reliability and validity of DUP assessment tools. The aim of the present paper is to summarize and, to the extent possible, quantitatively evaluate the quality of the methods that have been used in the peer-reviewed published literature for measuring DUP. Here, we consider the quality of DUP assessments in the context of reliability and validity. For reliability, because very few included studies have reported test-retest reliability, our evaluation is based on interrater reliability. For validity, although construct and face validity are seemingly simple constructs, how to quantify the validity of DUP is not straightforward, especially since construct and criterion validity were not determined during the development of many of the DUP instruments as one would expect were they derived from a strict empirical measurement perspective. Therefore, we were required to operationally define what the field intends to achieve by measuring DUP. 
DUP is an important research concept in part because it may predict outcomes of disease and treatment. In this review we evaluated predictive validity by exploring which DUP measurement method best predicts outcomes of disease and treatment. Specifically, we used two categories of quantitative data to assess and compare DUP measurement tools: 1) their inter-rater reliability for psychosis onset, treatment onset, and/or DUP measurements; and 2) their effect size for the relationship between DUP and biological and clinical outcomes.


2. Methods PubMed was searched for keyword “duration of untreated psychosis” OR “duration” AND “untreated” AND “psychosis” on December 5, 2013. The PRISMA flow diagram detailing the reviewed citations is in Fig. 1. Inclusion criteria were publication in a peer-reviewed journal, containing measurement of DUP, and assessment of the relationship of DUP to one or more patient outcome measurements. A total of 141 studies were included. The following information was extracted from these papers: number of included individuals with psychosis, instrument or method used to measure DUP, definition of onset of psychosis, definition of onset of treatment, definition of outcome, reported association of DUP with this outcome, and inter-rater reliability for the measurement of psychosis onset, treatment onset and/or DUP. Authors were contacted with requests for copies of the instruments that are not available online. The reported associations of DUP with biological and clinical outcomes were entered into the software package Comprehensive Meta Analysis version 2.2 (CMA, Biostat, Inc., Englewood, NJ, 2011). Nine papers were excluded from the analysis due to insufficient data compatible for this meta-analysis package, bringing the total included papers to 132. Univariate analysis based results were available from 121 and only multivariate analysis based results were available from the remaining 11 papers; results were almost identical when these 11 papers were excluded, so they were included in the final analysis. When multiple statistical results were presented in the same paper, we used the result with the lowest p value; this choice was based on the assumption that papers reporting just one p value may have had other non-significant results they did not report. For three papers, p value was given as “not significant” rather than as a specific number, so a p value of 0.45 was used. This was arbitrary, to avoid representing “nonsignificant” with p = 0.06 or p = 0.99. 
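The handling of reported p values described above can be illustrated with a short sketch. It assumes one common conversion (standard normal deviate from a two-tailed p value, then r ≈ z / √N, then Fisher's Z = atanh r); the source does not state which formulas the meta-analysis package applied internally, so the function names and the conversion itself are illustrative assumptions, not the authors' procedure.

```python
from math import atanh, sqrt
from statistics import NormalDist


def p_to_fisher_z(p_two_tailed: float, n: int) -> float:
    """Approximate Fisher's Z from a two-tailed p value and a sample size.

    Assumption (not taken from the paper): the association is treated as a
    correlation, with r ~= z_normal / sqrt(n). The sign of the effect is
    ignored, so the result is always non-negative.
    """
    if not 0.0 < p_two_tailed < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if n < 4:
        raise ValueError("n must be at least 4")
    z_normal = NormalDist().inv_cdf(1.0 - p_two_tailed / 2.0)
    r = min(z_normal / sqrt(n), 0.999)  # guard against r >= 1 in tiny samples
    return atanh(r)


def fisher_z_variance(n: int) -> float:
    """Sampling variance of Fisher's Z for a correlation based on n subjects."""
    return 1.0 / (n - 3)


if __name__ == "__main__":
    # Hypothetical study with 100 subjects, comparing three stand-in p values
    # for a result reported only as "not significant".
    for p in (0.06, 0.45, 0.99):
        print(p, round(p_to_fisher_z(p, 100), 3))
```

Under these assumptions, the choice of stand-in p value changes the imputed effect size considerably, which is why a mid-range value such as 0.45 is a compromise rather than a neutral choice.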
Following Perkins et al. (2005), for studies with results from assessments done at multiple time points, the result from the longest time point was used. Although the relative importance of DUP throughout the course of illness has not been established for most outcomes, in their meta-analysis of the relationship of DUP to negative symptoms Boonstra et al. (2012) found no evidence for attenuation of the strength of association at long-term (5–8 year) versus short-term (1–2 year) follow-up or baseline. Meta-analyses were performed using random effects models, synthesizing outcome data with Fisher’s Z transformation into a single Fisher’s Z statistic with 95% confidence interval. The heterogeneity of the sample populations was evaluated with the I² statistic calculated in CMA. Following Perkins et al. (2005), clinical outcomes were grouped as positive symptoms, negative symptoms, overall functioning, and relapse risk. In addition, categories were created for “outcomes” in treatment adherence, neuroimaging, neurocognitive changes, and suicidality and violence.
3. Results
3.1. Reliability of DUP assessments The 132 studies were from 94 research groups, accounting for a total of 17 135 subjects. A summary of the reliability for DUP, psychosis onset, and/or treatment onset measurements is in Table 1, categorized by different types of instruments. More comprehensive information on each of the 132 studies is tabulated in the Online Supplement Table S1. Twenty-seven of these research groups used eight instruments developed specifically to assess DUP. The most commonly used instrument was the Interview for the Retrospective Assessment of the Onset of Schizophrenia (IRAOS), used by eight groups. IRAOS's inter-rater reliability was 73 to 97% by pairwise agreement (Häfner et al., 1992, 1994). The Beiser scale (Beiser et al., 1993) was used in seven groups, with reported ICC of 0.79 to 0.98 (Clarke et al., 2006).
The Symptom Onset in Schizophrenia Inventory (SOS; Perkins et al., 2000) was used by five groups, with κ = 0.49 to 1.0 for individual items, and κ = 0.8-0.98


Fig. 1. PRISMA Flow Diagram, after Moher et al., 2009.
Identification: Records identified through database searching (n = 486); additional records identified through other sources (n = 19).
Screening: Records after duplicates removed (n = 476); records screened (n = 476); records excluded (n = 310).
Eligibility: Full-text articles assessed for eligibility (n = 166); full-text articles excluded, with reasons (n = 25): measured the onset of prodromal symptoms; did not measure outcome.
Included: Studies included in qualitative synthesis (n = 141); studies included in quantitative synthesis (meta-analysis) (n = 132).

Table 1 Reported reliabilities for DUP, psychosis onset, and treatment onset by DUP measurement method.

| Instrument | Estimated time to administer and score | Reported reliability (a): DUP | Reported reliability (a): psychosis onset | Reported reliability (a): treatment onset | Number of studies (b) | Number of research groups (c) | Total number of subjects |
|---|---|---|---|---|---|---|---|
| Basel Interview for Psychosis | (e) | NI | NI | NI | 1 (1 + 0) | 1 | 60 |
| Beiser Scale | (e) | ICC = 0.79–0.98 | ICC = 0.94–0.98 | 0.95 | 11 (9 + 2) | 7 | 786 |
| Comprehensive Assessment of Symptoms and History | (e) | ICC = 0.87–1.00 | ICC = 0.96 | ICC = 0.96–1.00 | 4 (3 + 1) | 2 | 337 |
| Circumstances of Onset and Relapse Schedule | 1.5 hr (d) | ICC = 0.71–0.98 | NI | NI | 7 (7 + 0) | 1 | 259 |
| Interview for the Retrospective Assessment of the Onset of Schizophrenia | 1.5–2 hr | κ = 0.6–0.95 | PA = 77% | PA = 80–100% | 11 (8 + 3) | 8 | 1089 |
| Nottingham Onset Schedule | 0.25–0.75 hr | ICC = 0.95–0.99 | PA = 70% | NI | 2 (2 + 0) | 2 | 174 |
| Positive and Negative Syndrome Scale for Schizophrenia (modified) | 0.5 hr | ICC = 0.9–0.99 | NI | NI | 18 (17 + 1) | 8 | 1969 |
| Psychiatric and Personal History Schedule | 0.5–1 hr | ICC = 0.90 | NI | NI | 4 (4 + 0) | 2 | 277 |
| Royal Park Multidiagnostic Instrument for Psychosis | 4–7 hr (d) | κ = 0.79 | κ = 0.79 | NI | 6 (6 + 0) | 1 | 661 |
| Symptom Onset in Schizophrenia Inventory | 0.5 hr | ICC = 0.99 | ICC = 1.0 | NI | 7 (5 + 2) | 5 | 937 |
| Clinical Interview | | ICC = 0.7–1.0 | NI | NI | 55 (55 + 0) | 52 | 10 089 |
| Chart Review | | ICC = 0.73 | NI | NI | 6 (6 + 0) | 5 | 497 |
| Totals | | | | | 132 | 94 | 17 135 |

DUP, duration of untreated psychosis. ICC, intraclass correlation. NI, none identified from the published report. PA, pairwise agreement.
a Some reports included multiple reliability calculations. For those cases, we included the best reliability reported.
b Total number of included studies. Some studies may not use the full, original version. Numbers in parentheses indicate the number of studies using the unmodified instrument plus the number of studies using the modified versions of the instrument.
c Numbers of research groups may not match the numbers of studies when the instrument was used in more than one report by a research group. The numbers of research groups were an approximation based on authorship and reported affiliations.
d Time required to administer the full instrument; an abbreviated version targeting DUP measurement is shorter.
e Two administration times, 0.5 hr and 2 hr, are given across these three rows; which row each belongs to is unclear.
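Table 1 mixes three reliability statistics (ICC, κ, and pairwise agreement), which are not directly comparable. As a hedged illustration of why, the sketch below computes pairwise agreement and Cohen's κ for two raters on the same invented ratings. The rating categories and data are hypothetical and are not drawn from any included study.

```python
from collections import Counter
from typing import Hashable, Sequence


def pairwise_agreement(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Proportion of cases on which two raters give the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of cases")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: agreement corrected for the agreement expected by chance."""
    n = len(rater_a)
    p_observed = pairwise_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal category frequencies.
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n**2
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical DUP categories assigned by two raters to eight patients.
rater_1 = ["<1mo", "1-6mo", "1-6mo", ">6mo", ">6mo", ">6mo", "<1mo", "1-6mo"]
rater_2 = ["<1mo", "1-6mo", ">6mo", ">6mo", ">6mo", ">6mo", "<1mo", "<1mo"]

if __name__ == "__main__":
    print(pairwise_agreement(rater_1, rater_2))  # 0.75
    print(round(cohens_kappa(rater_1, rater_2), 3))  # 0.628
```

In this invented example the same ratings give 75% pairwise agreement but κ of about 0.63, so a PA of 77% and a κ of 0.79 reported for different instruments should not be read on one scale.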

Table 2 Frequencies with which different definitions of treatment onset were used, ranked by the number of studies. Note that some studies used more than one definition and are included in more than one row.

| Definition of Onset of Treatment | Number of Publications (%) | Number of Research Groups | Total Number of Subjects (%) |
|---|---|---|---|
| First Psychiatric Hospitalization | 37 (28) | 22 | 4616 (27) |
| First Antipsychotic Treatment | 37 (28) | 28 | 2948 (17) |
| First Adequate Treatment | 36 (27) | 23 | 6141 (36) |
| Enrollment in Study | 14 (11) | 7 | 1997 (12) |
| First Treatment for Psychotic Symptoms | 15 (11) | 15 | 2256 (13) |
| Undefined | 14 (11) | 14 | 2037 (12) |
| Total | 132 | 94 | 17 135 |
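To make the consequence of these competing definitions concrete, the following sketch computes DUP for a single hypothetical patient under four of the treatment-onset definitions in Table 2. All dates are invented for illustration, and the six-week "adequate treatment" window is only one of the thresholds reported in the literature.

```python
from datetime import date, timedelta


def dup_days(psychosis_onset: date, treatment_onset: date) -> int:
    """DUP in days: time from psychosis onset to treatment onset."""
    if treatment_onset < psychosis_onset:
        raise ValueError("treatment onset precedes psychosis onset")
    return (treatment_onset - psychosis_onset).days


# Hypothetical patient: every date below is an illustrative assumption.
psychosis_onset = date(2012, 3, 1)
first_antipsychotic = date(2012, 10, 1)

treatment_onset_by_definition = {
    "first treatment for psychotic symptoms": date(2012, 9, 10),
    "first antipsychotic treatment": first_antipsychotic,
    # "Adequate" treatment assumed here to mean six weeks on medication.
    "first adequate treatment": first_antipsychotic + timedelta(weeks=6),
    "first psychiatric hospitalization": date(2013, 1, 15),
}

dup_by_definition = {
    name: dup_days(psychosis_onset, onset)
    for name, onset in treatment_onset_by_definition.items()
}

if __name__ == "__main__":
    for name, days in dup_by_definition.items():
        print(f"{name}: {days} days")
```

For this invented patient the estimated DUP ranges from 193 to 320 days depending solely on the definition chosen, which is the kind of spread that makes cross-study comparison difficult.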

overall (Cuesta et al., 2011). The Comprehensive Assessment of Symptoms and History (CASH) and Nottingham Onset Schedule (NOS) were each used by two research groups, and the Circumstances of Onset and Relapse Schedule (CORS), Royal Park Multidiagnostic Instrument for Psychosis (RPMIP), and Basel Interview for Psychosis each by one research group (McGorry et al., 1990; Andreasen et al., 1992; Ho et al., 2004; Norman et al., 2004; Singh et al., 2005; Fridgen et al., 2013). Overall, these instruments were reported to have good to excellent reliability in their original publications. Ten research groups used more generic psychosis assessment instruments to measure DUP. The PANSS was used by eight research groups (total N = 1969), who typically defined psychosis onset by a score of 4 or higher on the PANSS positive subscale with a duration criterion (Larsen et al., 1996). Of these eight, four groups (total N = 740) reported the inter-rater reliability for DUP measurement (ICC = 0.73–0.99). The Psychiatric and Personal History Schedule was used by two research groups (total N = 277), with a maximum reported ICC of 0.90 (Janca and Chandrashekar, 1993). The most common way of determining DUP, however, was through clinical interviews, which were used by 52 research groups with a total N of 10 089 participants. Only 8% of the clinical interview studies (N = 1094 participants) reported the inter-rater reliability of their technique for measuring DUP, with ICC ranging from 0.7 to 1.0 (Drake et al., 2000; Takahashi et al., 2007; Chang et al., 2012; Lihong et al., 2012). However, because 92% of these studies did not report reliability, this ICC range may be biased. Some studies incorporated instruments including the Structured Clinical Interview for DSM-IV Axis I Disorders (Craig et al., 2000), Present State Examination (Madsen et al., 1999), Scale for the Assessment of Positive Symptoms (González-Blanch et al., 2008), Brief Psychiatric Rating Scale (Alvarez-Jimenez et al., 2009), and Association for Methodology and Documentation in Psychiatry (Bottlender et al., 2003) into their clinical interviews for determining DUP. However, the reliability and validity for using many of these instruments to retrospectively measure the onset of psychosis are unclear. The definitions of treatment onset used in measuring DUP fell into six main categories (Table 2). The most common definition (total N = 6141 subjects) was the first time psychosis was “adequately” treated. Few studies attempted to set quantitative criteria for this definition; among those that did, the required medication trial duration ranged from 2 to 6 weeks (Malla et al., 2002; de Haan et al., 2003; Diaz et al., 2013; Lopez-Morinigo et al., 2013; Winsper et al., 2013), with most groups requiring around 4 weeks. Lopez-Morinigo et al. (2013) required 75% adherence for one month, and Larsen et al. (1996 and 2000) required an “antipsychotic given in sufficient time and amount that it would lead to clinical response in the average non-chronic schizophrenia patient.” The first psychiatric hospitalization (total N = 4616) and the first time antipsychotic medication was prescribed, regardless of trial duration, dose, or compliance (total N = 2948), were also commonly used definitions. About 10% of studies used more than one definition, for example using either the first adequate treatment or the first psychiatric hospitalization.
3.2. Predictive validity The effect sizes of DUP for predicting outcomes are summarized in Table 3. “Predictive validity” here refers to either the correlation with other meaningful clinical and biological measures, or predictive value for treatment response and clinical and functional improvement or deterioration over time. Fisher’s Z statistics were generally in the 0.1 to 0.3 range, corresponding to small effect sizes.
I² for individual analyses was 43.2 to 86.7, indicating moderate to high heterogeneity, and supporting the use of the random effects model. A funnel plot of the overall sample was roughly symmetrical, and Rosenthal’s classic fail-safe N was 21724, indicating lack of significant publication bias. Overall, without considering DUP measurement method, DUP numerically had the highest predictive validity for imaging outcomes in terms of Z score (Z = 0.25, p < 0.001) as compared with the Z scores for other outcome measures. Note that this is a summary score of all imaging data because the current review does not analyze more granular data on specific imaging modality, anatomic regions, or technical differences in imaging. DUP had less robust predictive validity for treatment adherence (Z = 0.14) and suicidality/violence prediction (Z = 0.084), although these were still statistically significant in the meta-analysis.
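The quantities reported in this paragraph (a pooled Fisher's Z under a random effects model, I², and Rosenthal's fail-safe N) can be sketched as follows. This is a minimal illustration that assumes the DerSimonian-Laird estimator of between-study variance; the source does not say which estimator the software used, and the function names and input values are hypothetical.

```python
from math import sqrt, tanh
from statistics import NormalDist
from typing import Dict, Sequence


def random_effects_pool(z_values: Sequence[float], n_values: Sequence[int]) -> Dict[str, float]:
    """Pool Fisher's Z values with DerSimonian-Laird random-effects weights."""
    k = len(z_values)
    if k < 2 or k != len(n_values):
        raise ValueError("need at least two studies with matching sample sizes")
    variances = [1.0 / (n - 3) for n in n_values]  # variance of Fisher's Z
    w_fixed = [1.0 / v for v in variances]
    fixed = sum(w * z for w, z in zip(w_fixed, z_values)) / sum(w_fixed)
    q = sum(w * (z - fixed) ** 2 for w, z in zip(w_fixed, z_values))  # Cochran's Q
    df = k - 1
    c = sum(w_fixed) - sum(w * w for w in w_fixed) / sum(w_fixed)
    tau2 = max(0.0, (q - df) / c)  # between-study variance
    w_random = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * z for w, z in zip(w_random, z_values)) / sum(w_random)
    se = sqrt(1.0 / sum(w_random))
    crit = NormalDist().inv_cdf(0.975)
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {
        "pooled_z": pooled,
        "ci_low": pooled - crit * se,
        "ci_high": pooled + crit * se,
        "pooled_r": tanh(pooled),  # back-transformed correlation
        "tau2": tau2,
        "i2": i2,
    }


def rosenthal_fail_safe_n(study_z_scores: Sequence[float], alpha_one_tailed: float = 0.05) -> float:
    """Rosenthal's fail-safe N from per-study standard normal deviates.

    Note: these are each study's test z scores, not Fisher's Z effect sizes.
    """
    z_crit = NormalDist().inv_cdf(1.0 - alpha_one_tailed)
    return sum(study_z_scores) ** 2 / z_crit**2 - len(study_z_scores)


if __name__ == "__main__":
    # Three hypothetical studies with heterogeneous effects.
    print(random_effects_pool([0.05, 0.25, 0.45], [200, 200, 200]))
    print(rosenthal_fail_safe_n([2.0, 2.5, 3.0]))
```

Because Fisher's Z is close to r for small values, a pooled Z of 0.1 to 0.3 back-transforms to a correlation in roughly the same range, which is why these effects are described as small.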

Table 3 A meta-analysis comparing the predictive values of different DUP measurement methods for clinical and biological outcomes. Rows are instruments (Basel Interview, Beiser Scale, CASH, CORS, IRAOS, NOS, PANSS (modified), PPHS, RPMIP, SOS, Clinical Interview, Chart Review, Overall); columns are outcome categories (Treatment Adherence, Overall Function, Imaging, Negative Symptoms, Neurocognition, Positive Symptoms, Relapse Risk, Suicidality/Violence, Overall), each reported as Fisher's Z and number of studies (N).

[Only the Overall column and the Overall row are reproduced here; the row alignment of the individual instrument-by-outcome cells could not be recovered.]

Overall column, by instrument:

| Instrument | Z | N |
|---|---|---|
| Basel Interview | 0.19 | 1 |
| Beiser Scale | 0.20⁎⁎ | 11 |
| CASH | 0.12⁎ | 4 |
| CORS | 0.006⁎ | 7 |
| IRAOS | 0.17⁎⁎ | 11 |
| NOS | 0.12 | 2 |
| PANSS (modified) | 0.16⁎⁎ | 18 |
| PPHS | 0.23⁎ | 4 |
| RPMIP | 0.20⁎⁎ | 6 |
| SOS | 0.16⁎⁎ | 7 |
| Clinical Interview | 0.17⁎⁎ | 55 |
| Chart Review | 0.32⁎⁎ | 6 |
| Overall | 0.18⁎⁎ | 132 |

Overall row, by outcome category:

| Outcome | Z | N |
|---|---|---|
| Treatment Adherence | 0.14⁎⁎ | 6 |
| Overall Function | 0.22⁎⁎ | 49 |
| Imaging | 0.25⁎⁎ | 14 |
| Negative Symptoms | 0.21⁎⁎ | 32 |
| Neurocognition | 0.20⁎⁎ | 19 |
| Positive Symptoms | 0.22⁎⁎ | 27 |
| Relapse Risk | 0.21⁎⁎ | 29 |
| Suicidality/Violence | 0.084⁎⁎ | 15 |

⁎ p < 0.05, ⁎⁎ p < 0.001. Z, Fisher’s Z. N, number of studies. Overall: weighted average of Fisher’s Z, weighted by the number of studies. CASH, Comprehensive Assessment of Symptoms and History. CORS, Circumstances of Onset and Relapse Schedule. IRAOS, Interview for the Retrospective Assessment of the Onset of Schizophrenia. NOS, Nottingham Onset Schedule. PANSS, Positive and Negative Syndrome Scale for Schizophrenia. PPHS, Psychiatric and Personal History Schedule. RPMIP, Royal Park Multi-diagnostic Instrument for Psychosis. SOS, Symptom Onset in Schizophrenia Inventory. Studies may report more than one outcome. Blank cells indicate no studies meeting inclusion criteria were found.


Table 4 A meta-analysis comparing the predictive value of DUP for clinical and biological outcomes for studies in which DUP is measured by specialized instruments versus by generic clinical interviews. Studies may report more than one outcome.

| Outcome | DUP instrument: Z | DUP instrument: N | Clinical interview: Z | Clinical interview: N |
|---|---|---|---|---|
| Treatment Adherence | 0.13⁎ | 3 | 0.15⁎ | 3 |
| Overall Function | 0.21⁎⁎ | 31 | 0.22⁎⁎ | 16 |
| Imaging | 0.25⁎⁎ | 5 | 0.32⁎⁎ | 8 |
| Negative Symptoms | 0.20⁎⁎ | 21 | 0.19⁎⁎ | 10 |
| Neurocognition | 0.15⁎ | 9 | 0.27⁎⁎ | 7 |
| Positive Symptoms | 0.19⁎⁎ | 18 | 0.24⁎⁎ | 9 |
| Relapse Risk | 0.16⁎⁎ | 15 | 0.19⁎⁎ | 11 |
| Suicidality/Violence | 0.06 | 7 | -0.05 | 3 |
| Overall | 0.17⁎⁎ | 71 | 0.17⁎⁎ | 55 |

⁎ p < 0.05, ⁎⁎ p < 0.001. Z, Fisher’s Z. N, number of studies.

The effect sizes on how well psychosis onset as measured by different instruments can independently predict outcomes (regardless of how treatment onset was determined) are also summarized in Table 3. There were some variations in how each method performed for different outcomes: the Beiser Scale performed best for overall functioning, the PANSS and clinical interviews for negative symptoms, and clinical interviews for positive symptoms and relapse risk. However, no instrument clearly had larger effect sizes across different categories of outcomes or when all outcomes were grouped together. In fact, when all studies using specialized DUP instruments were grouped together and compared to all studies using clinical interviews, effect sizes were roughly equivalent (Table 4). Finally, the effect sizes on how well treatment onset definitions can independently predict outcomes (regardless of how psychosis onset was determined) are summarized in Table 5. First antipsychotic medication treatment and first hospitalization trended toward having larger, statistically significant effect sizes in predicting outcomes (Table 5).
4. Discussion This quantitative review showed some clear and surprising characteristics of current DUP assessment methods. We found that the inter-rater reliabilities of the instruments used in DUP assessment were generally good, and they did not differ substantially from one another. Surprisingly, based on reliability, we did not find clear quantitative evidence to support the use of one DUP measurement instrument over another, or even the use of instruments rather than ad hoc clinical interviews. However, given the lack of studies directly comparing DUP measurement methods, we cannot exclude the possibility that some in fact have greater reliability and validity. Furthermore, many of these instruments did not report reliability; it is possible that some of the instruments are better than others, but this cannot be demonstrated with the existing limited data. This review confirmed the significant associations of DUP with clinical and biological measures as has been reported previously in greater detail (Marshall et al., 2005; Norman et al., 2005; Perkins et al., 2005; Boonstra et al., 2012; Cascio et al., 2012). Regardless of measurement method, DUP was a statistically significant predictor, but with generally small effect sizes in the 0.2 to 0.3 range. We also found that the majority of instruments have been used by just one or two research groups in published studies, and that instruments and individual studies had varying definitions of the onset of psychosis and the onset of treatment. Our analysis did, however, find that the two most clearly operationalized definitions of treatment onset, the first-ever antipsychotic prescription and the first-ever psychiatric hospitalization for psychosis, trended toward having a greater magnitude of association with some outcome measures, and therefore may have greater validity. Our findings are consistent with those of previous reviews of the measurement of DUP. In a systematic qualitative review, Compton et al. (2007) found that the definition of DUP as time from the onset of psychosis to the onset of treatment was quite consistent; however, the translation and quantification of these time points varied significantly across studies. The meta-analysis of Large et al. (2008) found that method of measuring DUP did not significantly influence mean or median DUP. Multiple other authors have noted the conceptual and practical difficulties with operationalizing DUP (Norman and Malla, 2001; Norman et al., 2005; Perkins et al., 2005; Singh, 2007; Breitborde et al., 2009; Farooq et al., 2009; Dell'Osso et al., 2013). Our quantitatively based review largely confirmed these challenges, suggesting that future research should focus on validating one or several DUP measurement methods. Based on meta-analysis, imaging outcomes have the strongest relationship with DUP compared with other categories of clinical outcome measures. This finding could be interpreted as supporting a biological correlate of the DUP concept. The included imaging papers here mostly utilized structural imaging, and may have had higher predictive power in part because structural imaging typically has higher test-retest reliability than clinical measures. Similarly, patients may be able to recall the date of their first hospitalization or first antipsychotic medication with some precision, and this may have contributed to these definitions’ greater predictive power. To our knowledge, this is the first review aiming to quantitatively compare the methods used to measure DUP in the published literature. However, there are several important limitations to this review. Given

Table 5 A meta-analysis comparing different definitions of treatment onset for DUP determination and their impact on the predictive value for clinical outcomes. For blank cells, no studies meeting inclusion criteria were found. Studies may report more than one outcome and/or definition.

Rows (definition of onset of treatment): First treatment for psychotic symptoms; Enrollment in study; First antipsychotic treatment; First adequate treatment; First psychiatric hospitalization; Undefined.
Columns (each reported as Fisher's Z and number of studies, N): Treatment adherence; Overall function; Imaging; Negative symptoms; Neurocognition; Positive symptoms; Relapse risk; Suicidality/violence; Overall.

[Table body not reproduced: the row and column alignment of the individual cells could not be recovered.]

⁎ p < 0.05, ⁎⁎ p < 0.001. Z, Fisher’s Z. N, number of studies.


that some outcomes are likely more closely associated with DUP than others, and that many DUP measurement instruments have so far been used with a limited number of outcomes, conclusions about the reliability and validity of individual instruments should be drawn with caution. The time at which the outcomes were assessed varied widely in included studies; if associations with DUP vary systematically with length of follow-up, this may be a source of bias given the small number of studies for many DUP instruments. Our effort to include multiple outcomes likely has led to some oversimplification during our meta-analysis, as the underlying mechanisms for these associations likely vary and include such disparate factors as family dynamics and biological differences. These variations likely contributed to the moderate to high heterogeneity found in our meta-analyses. Also, the patients recruited into instrument-based studies may have systematically differed from those recruited into clinical interview-based studies. For example, patients in research clinics versus community clinics may differ in DUP length, diagnosis, or selection bias (Friis et al., 2004). We did not focus on such potential confounders as this paper intended to compare the reliability and validity of different DUP measurement methods. Finally, the small number of studies for several DUP measurement instruments limited the power of the meta-analysis. Although we have used weighted scores during meta-analysis (Table 3), these limitations should be considered when reviewing the comparative results. Improving the overall reliability and validity of DUP measurement in future research will be facilitated by separately addressing the distinct challenges of measuring psychosis onset and treatment onset. DUP remains a promising modifiable risk factor for a range of long-term clinical and biological outcomes.
To improve the ability to compare and interpret results, consistent inclusion of reliability and validity assessment in the DUP research methodology should be a priority.

Role of funding source
Support was received from NIH grants MH085646 and MH103222.

Contributors
Drs. Hong and Register-Brown designed the study. Dr. Register-Brown managed the literature searches and analyses, and wrote the first draft of the manuscript. Both authors have approved the final manuscript.

Conflict of interest
Both authors declare that they have no conflicts of interest.

Acknowledgements
None.

Appendix A. Supplementary data
Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.schres.2014.10.025.

References
Alvarez-Jimenez, M., Gleeson, J.F., Cotton, S., Wade, D., Gee, D., Pearce, T., Crisp, K., Spiliotacopoulos, D., Newman, B., McGorry, P.D., 2009. Predictors of adherence to cognitive-behavioural therapy in first-episode psychosis. Can. J. Psychiatry 54 (10), 710–718.
Andreasen, N.C., Flaum, M., Arndt, S., 1992. The Comprehensive Assessment of Symptoms and History (CASH). An instrument for assessing diagnosis and psychopathology. Arch. Gen. Psychiatry 49 (8), 615–623.
Beiser, M., Erickson, D., Fleming, J.A., Iacono, W.G., 1993. Establishing the onset of psychotic illness. Am. J. Psychiatry 150 (9), 1349–1354.
Bird, V., Premkumar, P., Kendall, T., Whittington, C., Mitchell, J., Kuipers, E., 2010. Early intervention services, cognitive-behavioural therapy and family intervention in early psychosis: systematic review. Br. J. Psychiatry 197 (5), 350–356.
Boonstra, N., Klaassen, R., Sytema, S., Marshall, M., De Haan, L., Wunderink, L., Wiersma, D., 2012. Duration of untreated psychosis and negative symptoms–a systematic review and meta-analysis of individual patient data. Schizophr. Res. 142 (1–3), 12–19.
Bottlender, R., Sato, T., Jager, M., Wegener, U., Wittmann, J., Strauss, A., Moller, H.J., 2003. The impact of the duration of untreated psychosis prior to first psychiatric admission on the 15-year outcome in schizophrenia. Schizophr. Res. 62 (1–2), 37–44.
Breitborde, N.J., Srihari, V.H., Woods, S.W., 2009. Review of the operational definition for first-episode psychosis. Early Interv. Psychiatry 3 (4), 259–265.

Broussard, B., Kelley, M.E., Wan, C.R., Cristofaro, S.L., Crisafio, A., Haggard, P.J., Myers, N.L., Reed, T., Compton, M.T., 2013. Demographic, socio-environmental, and substance-related predictors of duration of untreated psychosis (DUP). Schizophr. Res. 148 (1–3), 93–98.
Cascio, M.T., Cella, M., Preti, A., Meneghelli, A., Cocchi, A., 2012. Gender and duration of untreated psychosis: a systematic review and meta-analysis. Early Interv. Psychiatry 6 (2), 115–127.
Chang, W.C., Tang, J.Y., Hui, C.L., Lam, M.M., Chan, S.K., Wong, G.H., Chiu, C.P., Chen, E.Y., 2012. Prediction of remission and recovery in young people presenting with first-episode psychosis in Hong Kong: a 3-year follow-up study. Aust. N. Z. J. Psychiatry 46 (2), 100–108.
Chang, W.C., Hui, C.L., Tang, J.Y., Wong, G.H., Chan, S.K., Lee, E.H., Chen, E.Y., 2013. Impacts of duration of untreated psychosis on cognition and negative symptoms in first-episode schizophrenia: a 3-year prospective follow-up study. Psychol. Med. 43 (9), 1883–1893.
Chou, P.H., Koike, S., Nishimura, Y., Kawasaki, S., Satomura, Y., Kinoshita, A., Takizawa, R., Kasai, K., 2014. Distinct effects of duration of untreated psychosis on brain cortical activities in different treatment phases of schizophrenia: a multi-channel near-infrared spectroscopy study. Prog. Neuropsychopharmacol. Biol. Psychiatry 49, 63–69.
Clarke, M., Whitty, P., Browne, S., Mc Tigue, O., Kinsella, A., Waddington, J.L., Larkin, C., O'Callaghan, E., 2006. Suicidality in first episode psychosis. Schizophr. Res. 86 (1–3), 221–225.
Compton, M.T., Carter, T., Bergner, E., Franz, L., Stewart, T., Trotman, H., McGlashan, T.H., McGorry, P., 2007. Defining, operationalizing, and measuring the duration of untreated psychosis: advances, limitations and future directions. Early Interv. Psychiatry 1, 236–250.
Compton, M.T., Gordon, T.L., Weiss, P.S., Walker, E.F., 2011. The "doses" of initial, untreated hallucinations and delusions: a proof-of-concept study of enhanced predictors of first-episode symptomatology and functioning relative to duration of untreated psychosis. J. Clin. Psychiatry 72 (11), 1487–1493.
Craig, T.J., Bromet, E.J., Fennig, S., Tanenberg-Karant, M., Lavelle, J., Galambos, N., 2000. Is there an association between duration of untreated psychosis and 24-month clinical outcome in a first-admission series? Am. J. Psychiatry 157 (1), 60–66.
Cuesta, M.J., Peralta, V., Campos, M.S., Garcia-Jalon, E., 2011. Can insight be predicted in first-episode psychosis patients? A longitudinal and hierarchical analysis of predictors in a drug-naive sample. Schizophr. Res. 130 (1–3), 148–156.
de Haan, L., Linszen, D.H., Lenior, M.E., de Win, E.D., Gorsira, R., 2003. Duration of untreated psychosis and outcome of schizophrenia: delay in intensive psychosocial treatment versus delay in treatment with antipsychotic medication. Schizophr. Bull. 29 (2), 341–348.
Dell'Osso, B., Glick, I.D., Baldwin, D.S., Altamura, A.C., 2013. Can long-term outcomes be improved by shortening the duration of untreated illness in psychiatric disorders? A conceptual framework. Psychopathology 46 (1), 14–21.
Diaz, I., Pelayo-Teran, J.M., Perez-Iglesias, R., Mata, I., Tabares-Seisdedos, R., Suarez-Pinilla, P., Vazquez-Barquero, J.L., Crespo-Facorro, B., 2013. Predictors of clinical remission following a first episode of non-affective psychosis: sociodemographics, premorbid and clinical variables. Psychiatry Res. 206 (2–3), 181–187.
Drake, R.J., Haley, C.J., Akhtar, S., Lewis, S.W., 2000. Causes and consequences of duration of untreated psychosis in schizophrenia. Br. J. Psychiatry 177, 511–515.
Farooq, S., Large, M., Nielssen, O., Waheed, W., 2009. The relationship between the duration of untreated psychosis and outcome in low-and-middle income countries: a systematic review and meta analysis. Schizophr. Res. 109 (1–3), 15–23.
Fridgen, G.J., Aston, J., Gschwandtner, U., Pflueger, M., Zimmermann, R., Studerus, E., Stieglitz, R.D., Riecher-Rössler, A., 2013. Help-seeking and pathways to care in the early stages of psychosis. Soc. Psychiatry Psychiatr. Epidemiol. 48 (7), 1033–1043.
Friis, S., Melle, I., Larsen, T.K., Haahr, U., Johannessen, J.O., Simonsen, E., Opjordsmoen, S., Vaglum, P., McGlashan, T.H., 2004. Does duration of untreated psychosis bias study samples of first-episode psychosis? Acta Psychiatr. Scand. 110 (4), 286–291.
Fusar-Poli, P., Borgwardt, S., Bechdolf, A., Addington, J., Riecher-Rossler, A., Schultze-Lutter, F., Keshavan, M., Wood, S., Ruhrmann, S., Seidman, L.J., Valmaggia, L., Cannon, T., Velthorst, E., De Haan, L., Cornblatt, B., Bonoldi, I., Birchwood, M., McGlashan, T., Carpenter, W., McGorry, P., Klosterkotter, J., McGuire, P., Yung, A., 2013. The psychosis high-risk state: a comprehensive state-of-the-art review. JAMA Psychiatry 70 (1), 107–120.
Gonzalez-Blanch, C., Crespo-Facorro, B., Alvarez-Jimenez, M., Rodriguez-Sanchez, J.M., Pelayo-Teran, J.M., Perez-Iglesias, R., Vazquez-Barquero, J.L., 2008. Pretreatment predictors of cognitive deficits in early psychosis. Psychol. Med. 38 (5), 737–746.
Hafner, H., Riecher-Rossler, A., Hambrecht, M., Maurer, K., Meissner, S., Schmidtke, A., Fatkenheuer, B., Loffler, W., van der Heiden, W., 1992. IRAOS: an instrument for the assessment of onset and early course of schizophrenia. Schizophr. Res. 6 (3), 209–223.
Hafner, H., Maurer, K., Loffler, W., Fatkenheuer, B., an der Heiden, W., Riecher-Rossler, A., Behrens, S., Gattaz, W.F., 1994. The epidemiology of early schizophrenia. Influence of age and gender on onset and early course. Br. J. Psychiatry Suppl. (23), 29–38.
Ho, B.C., Flaum, M., Hubbard, W., Arndt, S., Andreasen, N.C., 2004. Validity of symptom assessment in psychotic disorders: information variance across different sources of history. Schizophr. Res. 68 (2–3), 299–307.
Janca, A., Chandrashekar, C., 1993. Catalogue of Assessment Instruments Used in the Studies Coordinated by the WHO Mental Health Programme, Division of Mental Health. World Health Organization, Geneva.
Killackey, E., Yung, A.R., 2007. Effectiveness of early intervention in psychosis. Curr. Opin. Psychiatry 20 (2), 121–125.
Large, M.M., Nielssen, O., 2011. Violence in first-episode psychosis: a systematic review and meta-analysis. Schizophr. Res. 125 (2–3), 209–220.
Large, M., Nielssen, O., Slade, T., Harris, A., 2008. Measurement and reporting of the duration of untreated psychosis. Early Interv. Psychiatry 2 (4), 201–211.

Larsen, T.K., McGlashan, T.H., Moe, L.C., 1996. First-episode schizophrenia: I. Early course parameters. Schizophr. Bull. 22 (2), 241–256.
Larsen, T.K., Friis, S., Haahr, U., Joa, I., Johannessen, J.O., Melle, I., Opjordsmoen, S., Simonsen, E., Vaglum, P., 2001. Early detection and intervention in first-episode schizophrenia: a critical review. Acta Psychiatr. Scand. 103 (5), 323–334.
Lihong, Q., Shimodera, S., Fujita, H., Morokuma, I., Nishida, A., Kamimura, N., Mizuno, M., Furukawa, T.A., Inoue, S., 2012. Duration of untreated psychosis in a rural/suburban region of Japan. Early Interv. Psychiatry 6 (3), 239–246.
Lloyd-Evans, B., Crosby, M., Stockton, S., Pilling, S., Hobbs, L., Hinton, M., Johnson, S., 2011. Initiatives to shorten duration of untreated psychosis: systematic review. Br. J. Psychiatry 198 (4), 256–263.
Lopez-Morinigo, J.D., Wiffen, B., O'Connor, J., Dutta, R., Di Forti, M., Murray, R.M., David, A.S., 2013. Insight and suicidality in first-episode psychosis: understanding the influence of suicidal history on insight dimensions at first presentation. Early Interv. Psychiatry 8, 113–121.
MacBeth, A., Gumley, A., 2008. Premorbid adjustment, symptom development and quality of life in first episode psychosis: a systematic review and critical reappraisal. Acta Psychiatr. Scand. 117 (2), 85–99.
Madsen, A.L., Karle, A., Rubin, P., Cortsen, M., Andersen, H.S., Hemmingsen, R., 1999. Progressive atrophy of the frontal lobes in first-episode schizophrenia: interaction with clinical course and neuroleptic treatment. Acta Psychiatr. Scand. 100 (5), 367–374.
Malla, A.K., Norman, R.M., Manchanda, R., Ahmed, M.R., Scholten, D., Harricharan, R., Cortese, L., Takhar, J., 2002. One year outcome in first episode psychosis: influence of DUP and other predictors. Schizophr. Res. 54 (3), 231–242.
Marshall, M., Lewis, S., Lockwood, A., Drake, R., Jones, P., Croudace, T., 2005. Association between duration of untreated psychosis and outcome in cohorts of first-episode patients: a systematic review. Arch. Gen. Psychiatry 62 (9), 975–983.
Maurer, K., Hafner, H., 1995. Methodological aspects of onset assessment in schizophrenia. Schizophr. Res. 15 (3), 265–276.
McGlashan, T.H., 1999. Duration of untreated psychosis in first-episode schizophrenia: marker or determinant of course? Biol. Psychiatry 46 (7), 899–907.
McGorry, P.D., Copolov, D.L., Singh, B.S., 1990. Royal Park Multidiagnostic Instrument for Psychosis: Part I. Rationale and review. Schizophr. Bull. 16 (3), 501–515.

McGorry, P.D., Killackey, E., Yung, A.R., 2007. Early intervention in psychotic disorders: detection and treatment of the first episode and the critical early stages. Med. J. Aust. 187 (7 Suppl.), S8–S10.
Moher, D., Liberati, A., Tetzlaff, J., Altman, D.G., The PRISMA Group, 2009. Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement. PLoS Med. 6 (6), e1000097.
Norman, R., Malla, A., 2001. Duration of untreated psychosis: a critical examination of the concept and its importance. Psychol. Med. 31 (3), 381–400.
Norman, R.M., Malla, A.K., Verdi, M.B., Hassall, L.D., Fazekas, C., 2004. Understanding delay in treatment for first-episode psychosis. Psychol. Med. 34 (2), 255–266.
Norman, R.M., Lewis, S.W., Marshall, M., 2005. Duration of untreated psychosis and its relationship to clinical outcome. Br. J. Psychiatry Suppl. 48, s19–s23.
Perkins, D.O., Leserman, J., Jarskog, L.F., Graham, K., Kazmer, J., Lieberman, J.A., 2000. Characterizing and dating the onset of symptoms in psychotic illness: the Symptom Onset in Schizophrenia (SOS) inventory. Schizophr. Res. 44 (1), 1–10.
Perkins, D.O., Gu, H., Boteva, K., Lieberman, J.A., 2005. Relationship between duration of untreated psychosis and outcome in first-episode schizophrenia: a critical review and meta-analysis. Am. J. Psychiatry 162 (10), 1785–1804.
Singh, S.P., 2007. Outcome measures in early psychosis; relevance of duration of untreated psychosis. Br. J. Psychiatry Suppl. 50, s58–s63.
Singh, S.P., Cooper, J.E., Fisher, H.L., Tarrant, C.J., Lloyd, T., Banjo, J., Corfe, S., Jones, P., 2005. Determining the chronology and components of psychosis onset: The Nottingham Onset Schedule (NOS). Schizophr. Res. 80 (1), 117–130.
Takahashi, T., Suzuki, M., Tanino, R., Zhou, S.Y., Hagino, H., Niu, L., Kawasaki, Y., Seto, H., Kurachi, M., 2007. Volume reduction of the left planum temporale gray matter associated with long duration of untreated psychosis in schizophrenia: a preliminary report. Psychiatry Res. 154 (3), 209–219.
Winsper, C., Singh, S.P., Marwaha, S., Amos, T., Lester, H., Everard, L., Jones, P., Fowler, D., Marshall, M., Lewis, S., Sharma, V., Freemantle, N., Birchwood, M., 2013. Pathways to violent behavior during first-episode psychosis: a report from the UK National EDEN Study. JAMA Psychiatry 70 (12), 1287–1293.
