XML Template (2014) [17.5.2014–9:38am] [1–23] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/140059/APPFile/SG-SMMJ140059.3d (SMM) [PREPRINTER stage]

Article

Longitudinal data subject to irregular observation: A review of methods with a focus on visit processes, assumptions, and study design

Statistical Methods in Medical Research 0(0) 1–23 ! The Author(s) 2014 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav DOI: 10.1177/0962280214536537 smm.sagepub.com

Eleanor M Pullenayegum1 and Lily SH Lim2

Abstract When data are collected longitudinally, measurement times often vary among patients. This is of particular concern in clinic-based studies, for example retrospective chart reviews. Here, typically no two patients will share the same set of measurement times and moreover, it is likely that the timing of the measurements is associated with disease course; for example, patients may visit more often when unwell. While there are statistical methods that can help overcome the resulting bias, these make assumptions about the nature of the dependence between visit times and outcome processes, and the assumptions differ across methods. The purpose of this paper is to review the methods available with a particular focus on how the assumptions made line up with visit processes encountered in practice. Through this we show that no one method can handle all plausible visit scenarios and suggest that careful analysis of the visit process should inform the choice of analytic method for the outcomes. Moreover, there are some commonly encountered visit scenarios that are not handled well by any method, and we make recommendations with regard to study design that would minimize the chances of these problematic visit scenarios arising. Keywords longitudinal data, informative observation, observational study, correlated, random effects, inverseintensity weighting

1 Introduction Understanding the evolution of disease processes and the impact of interventions over time requires longitudinal data. Longitudinal data are collected in a variety of settings, ranging from randomized trials to retrospective cohort studies. In some such datasets, both the number and the timings of 1 2

Child Health Evaluative Sciences, Hospital for Sick Children, Dalla Lana School of Public Health, University of Toronto Division of Rheumatology, Department of Paediatrics, Hospital for Sick Children, Toronto, Canada

Corresponding author: Eleanor M Pullenayegum, Child Health Evaluative Sciences, Hospital for Sick Children, Dalla Lana School of Public Health, University of Toronto, Canada. Email: [email protected]

Downloaded from smm.sagepub.com at Northeastern University on November 14, 2015

XML Template (2014) [17.5.2014–9:38am] [1–23] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/140059/APPFile/SG-SMMJ140059.3d (SMM) [PREPRINTER stage]

2

Statistical Methods in Medical Research 0(0)

observations differ from one patient to the next, and often no two patients will share an observation time. An example of this is a chart review, where data are recorded only when the patient comes in for a visit. This paper focuses on longitudinal data subject to irregular and potentially informative observation times. While methods for handling repeated measures data subject to missingness are well established,1 these are generally not applicable to irregularly observed longitudinal data. The key distinguishing feature between repeated measures data and irregularly observed longitudinal data is that with repeated measures data, there is a set of discrete time points at which data should be observed, whereas with irregularly observed longitudinal data no such time points are specified by protocol. The resulting variation in observation timings makes methods such as multiple imputation2,3 infeasible: it is unclear at which time points one should impute the data since typically no two individuals will have a measurement at the same time point. Although methods for handling irregularly observed longitudinal data are less well established than methods for repeated measures subject to dropout, they have received considerable attention in recent years. These include semiparametric methods using inverse-intensity weighting4,5 (an extension of inverse probability weighting) and semiparametric methods using correlated random effects to capture associations between the outcome and visit processes.6–12 Importantly, these methods differ in the assumptions they make about the nature of the association between the visit process and the outcome process. For example, inverse-intensity methods may allow the visit and outcome processes to be correlated through observed auxiliary variables; however, they assume that given the auxiliary variables the two processes are independent. By contrast, methods using correlated random effects often require the dependence to be captured solely through baseline covariates and time-invariant latent variables.6,8,10,12 The precise nature of the assumptions about the association between outcome and visit processes is often determined by statistical necessity and receives only cursory discussion; however, the validity of the methods hinges on the assumptions being met. The tendency to tailor assumptions about the visit process to ensure that a newly proposed method gives valid inferences means that it is not always clear how these assumptions line up with visit processes that are encountered in practice. However, given the variety of methods that are available, the methods do in fact cover many of the plausible visit scenarios. The purpose of this paper is to review the methods available for the analysis of longitudinal data subject to irregular observation so as to provide guidance for the analyst who must choose which analytic method to adopt. For this reason, we focus on the validity of the assumptions in practical settings. To this end, we characterize commonly encountered visit processes and then match these up with the assumptions made by the various methods. We discuss which assumptions are testable and illustrate the methods through an example. Finally we suggest some simple study design strategies that would make the assumptions more plausible.

2 Notation To formalize assumptions about visit and outcome processes, we set up the following notation. Let the outcome for individual i (i ¼ 1, . . . , n) at time t be Y i(t), let Xi(t) be a potentially time-dependent covariate process for the outcomes Y, and let Zi(t) be a vector of auxiliary variables. Methods differ on whether Xi(t) and Zi(t) must be known for all times t or are known just at the visit times, and so this will be discussed for each method separately. Some methods constrain the covariates to be time invariant; when this is the case we will write Xi in place of Xi(t) and Zi in place of Zi(t). Define Tij to be the jth visit time for individual i, and let Ni(t) be the counting process for the visit times. Let Ci be

Downloaded from smm.sagepub.com at Northeastern University on November 14, 2015

XML Template (2014) [17.5.2014–9:38am] [1–23] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/140059/APPFile/SG-SMMJ140059.3d (SMM) [PREPRINTER stage]

Pullenayegum and Lim

3

the censoring time for individual i, and let Ni ðtÞ be the counting process for the potentially counterfactual visit times (i.e. the visit times had there been no censoring). Although one could simply model Ni(t), it can be helpful to model Ni ðtÞ and Ci separately when Ci is an informative  denote the history of the process A up to and censoring time. For any arbitrary process A, let AðtÞ  including time t, i.e. AðtÞ ¼ fAðsÞ : 0  s  tg: We shall use the notation A(t) to denote  AðtÞ  Aðt Þ, where Aðt Þ ¼ lims"t AðsÞ, and Að1Þ to denote the set containing the values A(s) for obs  any s > 0. We will use Y ðtÞ to denote the observed outcomes up to and including time t, i.e. obs obs Yi ðTi1 Þ, . . . , Yi ðTiNi ðtÞ Þ, and similarly X ðtÞ, Z ðtÞ will denote the observed components of the covariates up to time t. Define i(t) as EðYi ðtÞjXi ðtÞÞ. For a link function g, we consider models of the form gði ðtÞÞ ¼ Xi ðtÞb where we take the first element of X to be 1 to allow for an intercept. Some models allow for a nonparametric intercept, i.e. they take gði ðtÞÞ ¼ 0 ðtÞ þ Xi ðtÞb In these cases, the intercept function 0 is treated as a nuisance parameter (i.e. is not estimated). The majority of methods we discuss are semiparametric; however, when a fully parametric model is used it is necessary to model the outcomes Y in addition to the mean .

3 Visit processes Since the appropriateness of analytic methods will depend on the underlying visit process, we suggest a classification system for visit processes, then describe common protocols for visit processes and discuss how these line up with the visit process classifications, and finally consider the impact of deviations from the protocol. We conclude the section with practical examples of visit processes.

3.1

Visit process classification

Visit processes may be regular or irregular, and among irregular processes, there may be covariates that predict visit times. We define the following visit processes: (1) Regular visits, i.e. Tij ¼ tj 8i, j (2) Irregular visits, i.e. Tij 6¼ tj 8i, j, which can be classified into (a) Visiting completely at random (VCAR): Visit times and outcomes are independent, that is EðNi ðtÞjY i ð1Þ, X i ð1Þ, Z i ð1ÞÞ ¼ EðNi ðtÞÞ (b) Visiting at random (VAR): Given data recorded up to time t, visiting at time t is independent of outcome at time t, that is  EðNi ðtÞjX i ðtÞ, Z i ðtÞ, N i ðt Þ, Y obs i ðt Þ, Yi ðtÞÞ ¼ EðNi ðtÞjX obs ðtÞ, Z obs ðtÞ, N i ðt Þ, Y obs ðt ÞÞ i

i

i

Moreover, we require that the model for the visits and the model for the outcomes do not share parameters. This is similar to Hogan’s definition of sequential missingness at random;1

Downloaded from smm.sagepub.com at Northeastern University on November 14, 2015

XML Template (2014) [17.5.2014–9:38am] [1–23] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/140059/APPFile/SG-SMMJ140059.3d (SMM) [PREPRINTER stage]

4

Statistical Methods in Medical Research 0(0) an important distinction is that Hogan was working in the context where missingness is monotone, whereas in our context observation will never be monotone because outcomes are defined in continuous time but patients are observed at discrete time points. (c) Visiting not at random (VNAR): Given data recorded up to time t, visiting at time t is not independent of outcome at time t. That is  EðNi ðtÞjX i ðtÞ, Z i ðtÞ, N i ðt Þ, Y obs i ðt Þ, Yi ðtÞÞ 6¼ EðNi ðtÞjX obs ðtÞ, Z obs ðtÞ, N i ðt Þ, Y obs ðt ÞÞ i

3.2

i

i

Visit process protocols

When analysing a dataset, it is important to consider which of the above categories the visit process falls into. In doing so, it is helpful to go back to the original design of the study. Common study designs are: . Fixed visits: Some study protocols set visit times in advance, i.e. request that Tij ¼ tj 8i, j. . History-dependent protocol visits: Some studies have a protocol that stipulates when visits should be, but allows intervals between visits to depend on the patient’s previously observed history. For example, the protocol may request that visits are more frequent when certain symptoms or treatments are present. . Physician-driven visits: Some studies do not have a formal protocol, but rather the physician recommends when the next visit should be. . Patient-driven visits: Other studies may leave the patient to schedule visits as needed. If the protocol or physician-recommended schedule is adhered to perfectly, then fixed visits will correspond to regular visits and history-dependent protocol visits will correspond to VAR. Physician-driven visits will also lead to VAR, provided that all the information that the physician uses to decide when the next visit should be is recorded in the patient’s chart. Patient-driven visits are likely to result in visiting not at random, since factors that prompt the patient to visit would likely not be reported in advance.

3.3

Deviation from protocol

In practice, recommended visit intervals will not be adhered to perfectly. With fixed visits, it may be possible to specify nonoverlapping intervals ðtj  , tj þ  such that most patients have a visit in any given interval, and those that do not can be treated as missing. Missing visits can then be classified according to Rubin’s taxonomy.13 When visits that are intended to be regular are subject to large deviation from the protocol, the visit process may be best classified as irregular, in which case reasons for deviation need to be considered. More generally, deviation from the recommended visit intervals is relevant for history-dependent protocol visits and physician visits as well as for fixed visits. Deviation that is driven by factors other than the patient’s health will not alter the visit classification. For example, if a visit is scheduled later than it should have been due to the physician having a full schedule, this deviation is unlikely to be related to the patient’s status. Deviation that is due to factors recorded in the patient’s chart would

Downloaded from smm.sagepub.com at Northeastern University on November 14, 2015

XML Template (2014) [17.5.2014–9:38am] [1–23] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/140059/APPFile/SG-SMMJ140059.3d (SMM) [PREPRINTER stage]

Pullenayegum and Lim

5

result in visits that are best at random. For example, if patients with a previous favourable assessment felt reassured and so delayed or cancelled their next visit, this would lead to VAR provided that the visit schedule in the absence of deviation was fixed, completely at random or at random. Deviation due to factors not recorded in the patient’s chart, for example disease flares or medication side effects, will result in visits that are not at random.

3.4

Examples

The following examples illustrate the above visit processes. To begin, consider a study of dialysis patients where blood pressure, serum creatinine, and measures of anxiety and quality of life are to be taken once per week while patients are in the clinic for their usual care. Due to the medical need for dialysis several times per week, these patients are unlikely to be away from the clinic for a whole week, so that one might expect to see regular visits. As an example of irregular visits, consider a study of weight gain in pregnancy. The standard of care for uncomplicated pregnancies in Canada is for an initial visit at 8–11 weeks gestation, subsequent visits every 4–6 weeks up until 30 weeks, every 2–3 weeks from 30 to 36 weeks, and every 1–2 weeks from 36 weeks until delivery; however, more frequent visits are recommended if complications develop (for example, if the mother develops pre-eclampsia, weekly visits are recommended).14 Since the visit frequency depends on diagnosis of complications, this is VAR. Some deviation may occur due to the physician’s availability (e.g. a busy schedule or being away); however, since this is unrelated to the patient’s health it will not alter the visit classification. If visits were missed because the mother was hospitalized (e.g. for pregnancyrelated complications), visiting would remain at random provided these were recorded at the time of the hospitalization. If extra visits were scheduled due to concerns such as dizziness or nausea, this would result in visiting not at random; even if the reason for the extra visit is recorded in the patient’s chart, it still violates the assumptions of VAR because the reason for the visit is not recorded until after the visit has begun.

4 Methods for analysis We now consider approaches to analysis. After a brief overview of the model and inference procedures, we review the assumptions in the light of the visit processes outlined in Section 3. Table 1 summarizes the validity of the methods under each visit process. For example, if the visit intensity depends on past outcomes only and the visit process is independent of the outcome process given the values of outcomes at previous visits, then row 4 of the table indicates that mixed models, inverse-intensity weighted generalized estimating equations (GEE)s, or parametric joint models are suitable methods for analysis. Since the choice of analytic method should also be informed by the desired marginal mean model for the outcomes, these are summarized in Table 2.

4.1

Standard analyses (GEEs, mixed models, and Lin-Ying equations)

The simplest approach to analysis is to ignore the visit process. Lin et al.4 and Lipsitz et al.15 discuss the conditions that must be met for GEE estimators and mixed effects models to yield consistent estimates of regression coefficients, and we shall see that these primarily deal with regular visits or visits that are completely at random. These methods are the building blocks for methods that allow for more general visit processes, and are thus helpful to consider despite their limited applicability to irregularly observed longitudinal data.

Downloaded from smm.sagepub.com at Northeastern University on November 14, 2015











  

 



– No No No No No Yes Yes Yes Yes

Yes Yes No No No Yes No No

Correlated with outcome process?

– N/A

May covariates be time dependent?

Yes Yes Yes Yes No No No Yes

– Yes

May latent covariates be time varying?

Latent covariates for the visit process

3 3 3 5 5 5 5 5 5 5

3 3 3 3 5 5 5 5 5 5

3 5 5 5 5 5 5 5

3 3

3 3 3 3 5 5 5 5

3 3

3 5 3 5 3y 3y 5 5

3 3

3 3 3 3 3 3 3 3z

3 3

Inverseintensity Semiparametric Parametric weighted joint joint Mixed models GEE Lin-Ying GEE models models

Analytic model

6

Measured covariates that are predictive of visit frequency are classified according to whether they must appear in the outcome model, whether they are restricted to consist of past outcomes only, or whether there are no restrictions. These covariates are also classified according to whether they are permitted to be time dependent. Unmeasured (or latent) covariates are classified according to whether they are permitted to be correlated with the outcome process, and whether or not they are permitted to be time dependent. N/A ¼ Not Applicable, y ¼ not all semiparametric joint methods are suitable for this setting (see Table 3 for details), z if correctly specified (has yet to be implemented in practice).

Random-effectdependent visits (special case of visiting not at random)

Regular – Visiting completely  at random Visiting at random

Visit process

Must also No appear in covariates outcome Past No permitted model outcomes restrictions

Measured covariates for the visit process

Table 1. Validity of analytic methods for various visit process models.

XML Template (2014) [17.5.2014–9:38am] [1–23] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/140059/APPFile/SG-SMMJ140059.3d (SMM) [PREPRINTER stage]

Statistical Methods in Medical Research 0(0)

Downloaded from smm.sagepub.com at Northeastern University on November 14, 2015

XML Template (2014) [17.5.2014–9:38am] [1–23] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/140059/APPFile/SG-SMMJ140059.3d (SMM) [PREPRINTER stage]

Pullenayegum and Lim

7

Table 2. Mean models for each estimation method. Model class

Method

Mean model

Standard analyses (Section 4.1) Inverse-intensity weighting (IIW) (Section 4.2) Semiparametric joint models (Section 4.3)

Mixed models GEE Lin-Ying IIW-GEE IIW-Lin-Ying Doubly robust Liang et al.6 Sun et al.12 Sun et al.8 Song et al.11 Mixed models

g1 ðXi ðtÞb þ Wi ðtÞVi Þ g1 ðXi ðtÞbÞ Xi ðtÞb g1 ðXi ðtÞbÞ 0 ðtÞ þ Xi ðtÞb or expð0 ðtÞ þ Xi ðtÞbÞ Xi ðtÞb 0 ðtÞ þ Xi ðtÞb 0 ðtÞ þ Xi b 0 ðtÞ expðXi bÞ 0 ðtÞ þ Xi ðtÞb g1 ðXi ðtÞb þ Wi ðtÞVi Þ

Parametric joint models (Section 4.4)

Note that nonparametric intercepts are not estimated, but rather treated as nuisance parameters. Here, g is an arbitrary link function, X is a row vector of covariates (denoted by X(t) if covariates may be time dependent), Wi(t) is a row vector of covariates corresponding to random effects Vi, b is a vector of regression coefficients, and 0(t) is a nonparametric intercept that is not estimated.

4.1.1 GEEs Lin et al.4 discuss GEEs with a general link function and baseline covariates; however, their results generalize to fully observed time-dependent covariates, i.e. gði ðtÞÞ ¼ Xi ðtÞb. Censoring is assumed to be noninformative in the sense that, given the covariates Xi, censoring is independent of outcomes, i.e. IðCi  tÞ??Yi ðtÞjXi ðtÞ

ð1Þ

where the notation a??bjc means that a and b are independent given c. When using an independent working correlation structure, one must have EðNi ðtÞjXi ðtÞ, Yi ðtÞÞ ¼ EðNi ðtÞjXi ðtÞÞ

ð2Þ

Thus, given covariates Xi(t) and that an individual has not been censored, the visit intensity at time t must be independent of the outcome at time t. Since the outcome process is correlated over time, this also means that the visit times must be independent of past outcomes. While it may be tempting to augment the mean model Xi(t) to make this assumption more plausible, typically Xi(t) is chosen for good scientific reasons and so augmenting may compromise answering the scientific question. As an example, often poor outcomes point to a need for more frequent follow-up. To accommodate this while still meeting assumption (2), the mean model for Y would have to include the last observed value of Y as a covariate. Only in special circumstances would such a mean model be useful. This being the case, the only visit processes from Section 3 that meet this condition are regular visits, irregular visits that are completely at random, or special cases of visits at random (see Table 1). If an alternative working correlation structure (e.g. exchangeable) is used, one requires the assumption that, for all t, the observed visit process up to time t be independent of outcomes

Downloaded from smm.sagepub.com at Northeastern University on November 14, 2015

XML Template (2014) [17.5.2014–9:38am] [1–23] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/140059/APPFile/SG-SMMJ140059.3d (SMM) [PREPRINTER stage]

8

Statistical Methods in Medical Research 0(0)

given the covariates, that is obs

obs

EðNi ðtÞjXi ð1Þ, Xi ðtÞ, Yi ðsÞÞ ¼ EðNi ðtÞjXi ðt Þ, Xi ðtÞÞ 4 8s

ð3Þ

Although this may appear a stronger assumption than equation (2), in practice autocorrelation in the outcomes Y will result in the two assumptions being similar. 4.1.2 Mixed effect models Lipsitz et al.15 consider fully parametric mixed effect models with time-dependent covariates, a linear link function, and Gaussian residuals (i.e. Yi ðtÞ ¼ Xi ðtÞb þ i ðtÞ for a Gaussian process ); however, their results generalize to generalized linear mixed models. Provided that the joint likelihood for the outcome process Y and the visit times T factorizes, inference for b can be based on the observed outcomes alone. Specifically, Lipsitz notes that one may ignore the visit times in estimating b if one has  ik Þ, Yð1Þ,  f ðTikþ1  Tik jNðT Xi ð1Þ, Zi ð1ÞÞ obs  ¼ f ðTikþ1  Tik jY ðTik Þ, Xi ð1Þ, Zi ð1ÞÞ

ð4Þ

where f is the probability density function of the gaps and Zi(t) is a vector of auxiliary covariates satisfying EðYi ðtÞjXi ðtÞ, Zi ð1ÞÞ ¼ EðYi ðtÞjXi ðtÞÞ:

ð5Þ

That is, the distribution of the kth gap is conditionally independent of past visit times and unobserved outcomes (both past and future) given previously observed outcomes, the covariates and the auxiliary variables. Moreover, note that the requirement that the expected value of the outcomes be unaffected by the auxiliary covariates Zi(t) places significant restrictions on the choice of covariates Zi(t). Although not explicitly stated, Xi(t) and Zi(t) need be known only at the visit times Tik. Lipsitz et al. do not consider a censoring time, rather it is incorporated into assumptions about the counting process N (i.e. N simply stops changing after an individual is censored). If these conditions are satisfied, then the likelihood for the full data factorizes into a likelihood for the outcomes and a likelihood for the visits, and inference for b can be based on linear mixed effects models of the outcomes alone provided that the visit and outcome models do not share parameters. An important caveat to this is that the correlation structure of the residual process  must be correctly specified; misspecification will in general result in biased parameter estimates. These conditions are less stringent than the conditions for a GEE to yield valid inferences. For example, the assumption that gaps are independent of current and previously unobserved outcomes given previously observed outcomes and covariates is plausible if the next visit is determined at the current visit in response to the current value of the outcome, and patients rarely deviate from the schedule for health-related reasons. However, the requirement that gaps be independent of past visit times might be violated, for example, in a study collecting data as part of routine prenatal visits in which mothers who believed they were at high risk were more likely to schedule additional visits. Thus, the mixed effects model can handle regular visits, irregular visits that are completely at random, and special cases of VAR (see Table 1).

Downloaded from smm.sagepub.com at Northeastern University on November 14, 2015

XML Template (2014) [17.5.2014–9:38am] [1–23] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/140059/APPFile/SG-SMMJ140059.3d (SMM) [PREPRINTER stage]

Pullenayegum and Lim

9

4.1.3 Lin-Ying estimating equations Lin and Ying proposed estimators16,17 for irregular longitudinal data, and their work has formed the basis for many later methods using semiparametric joint models. If censoring is noninformative in the sense that EðYi ðtÞjCi  t, Xi ðtÞÞ ¼ EðYi ðtÞjXi ðtÞÞ

ð6Þ

then the major advantage of the Lin-Ying approach over traditional GEEs is that the semiparametric model includes a nonparametric intercept, i.e. we take EðYi ðtÞjXi ðtÞÞ ¼ 0 ðtÞ þ Xi ðtÞb, (see Table 2), where Xi is assumed known at all times t. Under assumption (2), inference is accomplished through a modification of the working-independence GEE equations, in which a weighted average of the residuals is subtracted off the pseudo-score function. That is, instead of solving the GEE equations n Z 1 X Xi ðtÞ0 ðYi ðtÞ  Xi ðtÞbÞdNi ðtÞ ¼ 0 i¼1

0

we solve n Z X i¼1

1

   ~ i ðtÞb dNi ðtÞ ¼ 0 Xi ðtÞ0 ðYi ðtÞ  Xi ðtÞbÞ  Y~ i ðtÞ  X

ð7Þ

0

where e Xi ðtÞ ¼

Y~ i ðtÞ ¼

Pn

i¼1 IðCi  P n i¼1 IðCi

Pn

i¼1 IðCi  Pn i¼1 IðCi

V

tÞeXi ðtÞ Xi ðtÞ V

 tÞeXi ðtÞ V

tÞeXi ðtÞ Yi ðtÞ V

 tÞeXi ðtÞ

Yi ðtÞ is the measurement of Yi taken closest to time t V and Ni ðtÞ given Xi ðtÞ has intensity 0 ðtÞ expðXV i ðtÞÞ for covariates Xi  Xi : Later work extended the estimation method to handle an informative drop-out time Di in addition to the noninformative censoring time Ci, but restricted attention to a fixed intercept. The noninformative censoring time is assumed to be conditionally independent of the outcomes given the covariates, that is Ci ??Y i ð1ÞjX i ð1Þ

ð8Þ

Informative dropout Di is handled through artificial censoring, to which end we require that Xi(t) contains a subset of covariates Wi(t) with corresponding coefficients  such that   Z Di expðWi ðuÞgÞdu ð9Þ Yi ðtÞ  Xi ðtÞb, 0

Downloaded from smm.sagepub.com at Northeastern University on November 14, 2015

XML Template (2014) [17.5.2014–9:38am] [1–23] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/140059/APPFile/SG-SMMJ140059.3d (SMM) [PREPRINTER stage]

10

Statistical Methods in Medical Research 0(0)

have a common (although unspecified) distribution for each t. Thus, the Di are assumed to follow an accelerated failure time model.18 Under censoring assumptions (8) and (9), artificial censoring can be used to handle informative dropout. Assuming that assumption (2) holds, one can set up estimating equations that are zero mean and so solve for b (see Lin and Ying16 for more details). 4.1.4 Comments This approach is useful for handling informative dropout. The assumptions about the visit process in the absence of dropout are strict, however, and as for GEEs require that the covariates Xi in the regression model for Y capture any dependence between the visit process and the outcomes. The method is thus useful for regular visits, irregular visits that are completely at random, or special cases of VAR where any covariates that are predictive of the visit process also appear in the outcome model (see Table 1).

4.2

Inverse-intensity weighting

The methods discussed so far handle only special cases of VAR. We now consider estimation procedures that can handle general cases of VAR. For this purpose, weighting observations by the inverse of the visit intensity can be helpful. 4.2.1 Inverse-intensity weighted GEEs (IIW-GEEs) IIW-GEEs were first proposed by Lin et al.4 Later, Buzkova and Lumley19 relaxed some of the conditions, allowing time-dependent covariates in the mean model and a discontinuous hazard for the visit intensity. We present the method in its most general form, and so consider a general link function and time-dependent covariates Xi(t) that are known at the visit times (i.e. gði ðtÞÞ ¼ Xi ðtÞb, see Table 2). The censoring time Ci is assumed to be noninformative in the sense of equation (6). Suppose that there are auxiliary variables Zi(t) (which may include functions of Ni(s), Xi(s), and Y obs i ðsÞ for s < t), such that the visit intensity is independent of the current outcome given the censoring time and the observed auxiliary variables at time t, that is EðNi ðtÞ  Ni ðt  ÞjCi  t, Xi ðtÞ, Zi ðtÞ, Yi ðtÞÞ  EðNi ðtÞ  Ni ðt  ÞjCi  t, Zi ðtÞÞ ¼ ðt, Zi ðtÞÞ ¼ lim #0 

lim #0

ð10Þ

The intensity ðt, Zi ðtÞÞ is assumed to follow a proportional hazards model ðt, Zi ðtÞÞ ¼ 0 ðtÞ expðZi ðtÞÞ. sðtÞ Inference is then through a GEE with weights 0 ðtÞ expðZ , where s(t) is a stabilising function. i ðtÞÞ 19 Buzkova and Lumley note that inference can be simplified by stabilizing by the baseline hazard (i.e. taking s(t) ¼ 0(t)). In this case, the weights become expðZi ðtÞÞ, which avoids the computational expense of estimating the baseline hazard. Besides computational ease, stabilizing by the baseline intensity 0 allows the baseline intensity to be discontinuous, thereby allowing for the case where the visit process is comprised of both scheduled (i.e. protocol-determined, discrete-time) visits and unscheduled (i.e. as-needed, continuous-time) visits. In terms of practical application, Buzkova et al.20 illustrate how the coefficients for the proportional hazards model can differ for scheduled and unscheduled visit times.

Downloaded from smm.sagepub.com at Northeastern University on November 14, 2015

XML Template (2014) [17.5.2014–9:38am] [1–23] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/140059/APPFile/SG-SMMJ140059.3d (SMM) [PREPRINTER stage]

Pullenayegum and Lim

11

4.2.2 Inverse-intensity weighted Lin-Ying equations Buzkova and Lumley5 proposed an alternative use of inverse-intensity weighting, by combining it with the Lin-Ying equations rather than with GEEs. This allows for a nonparametric intercept in the mean model, see Table 2. The time-dependent covariates Xi(t) are assumed to be fully observed for all times t. Censoring is required to satisfy condition (6) and the visit process must satisfy equation (10). Inference for b is through weighting the Lin-Ying estimating equations (7) by the inverse of the visit intensity.17 Buzkova and Lumley5 further generalized the Lin-Ying equations with inverseintensity weighting to handle a log link function, with the same assumptions on the visit process and censoring times as in equation (10). 4.2.3 Doubly robust estimating equations When the link function is linear, augmenting the IIW-GEE equations in the spirit of Seaman and Copas21 yields doubly robust estimating equations.22 This approach uses both inverse-intensity weighting and a model for the increments of the outcome process. The marginal model allows for time-dependent covariates (i.e. i ðtÞ ¼ Xi ðtÞb, see Table 2), and the visit process is assumed to follow equation (10); note that although Pullenayegum and Feldman22 state a stronger assumption, it is straightforward to verify that the estimating equations are zero mean under these less stringent conditions. The covariates Xi(t) are assumed known at the visit times. Augmenting the GEE equations to create doubly robust equations requires an estimate of the conditional expectation of the outcome Y i(t) given the observed history up to time t, which can be derived by considering the increments of Y. Specifically, setting Yi ðs, tÞ ¼ Yi ðtÞ  Yi ðsÞ for s < t to be the increment in Yi between s and t, we assume there exist auxiliary covariates Wi(t) such that regressing the increments Y i(s,t) onto Xi ðtÞ  Xi ðsÞ and Wi ðtÞ  Wi ðsÞ results in residuals whose conditional mean given the observed history prior to time s is zero. The advantage of doubly robust inference is that it gives consistent estimates of the regression coefficients b if either the visit process model or the model for the increments of the outcome is correctly specified; that is, it is robust to misspecification of one (but not both) of the models. 4.2.4 Comments and extensions Reviewing these methods alongside one another suggests several possible extensions. Firstly, informative dropout could be incorporated into Buzkova and Lumley’s extension5 to the LinYing equations; assuming the informative dropout time D follows an accelerated failure time model as in Lin and Ying, artificial censoring can be used in conjunction with the inverseintensity weights. In a similar way, while the inverse-intensity weighted GEEs require that censoring be noninformative, it would be relatively straightforward to incorporate informative censoring provided that predictors of censoring were recorded in the dataset. Secondly, the existing doubly robust estimators do not accommodate a nonparametric intercept. However, if Buzkova and Lumley’s inverse-intensity weighted Lin-Ying equations were augmented using the same increments model as is used to augment the existing doubly robust equations, it may be possible to do doubly robust inference with a nonparametric intercept 0(t). By making use of auxiliary covariates Zi(t) for the visit process, inverse-intensity weighting methods handle general cases of VAR (see Table 1). Note in particular that Zi(t) may include past measurements of outcomes. To fit the proportional hazards model for the recurrent visit times, Zi(t) must be known at all times t rather than just at visit times; in practice this has been dealt with by defining the components of Zi(t) as the last observed values of the covariates in question,4,20,22 which is consistent with visit schemes that are physician driven or protocols that

Downloaded from smm.sagepub.com at Northeastern University on November 14, 2015

XML Template (2014) [17.5.2014–9:38am] [1–23] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/140059/APPFile/SG-SMMJ140059.3d (SMM) [PREPRINTER stage]

12

Statistical Methods in Medical Research 0(0)

are history dependent. However, because the visit and outcome processes must be independent given Zi(t), inverse-intensity weighted methods do not handle unmeasured predictors.

4.3

Semiparametric joint modelling

The models studied so far cover regular visits, VCAR, and VAR, but do not handle visits that are not at random. Semiparametric models that jointly model the outcome and visit processes can be helpful in this situation. Correlation between the outcome and visit processes is captured through shared or correlated random effects; however, parametric models for the visit and outcome processes are not required. Most models discussed in this section generalize the Lin-Ying method and relax the conditions on the dependence between visits and outcomes. Although part of a common family, implementations of semiparametric joint models differ considerably in the types of covariates included in the models (time dependent or baseline), whether covariates in the outcome and visit models must be the same, and the avenues through which random effects enter the models. A comparison of these models is given in Table 3. All the methods use nonparametric intercepts in the mean model. Sun et al.8 use a multiplicative mean model, but doing so requires that the random effects and covariates are common across the outcome and intensity models. Other models are more flexible in this regard, allowing separate (but correlated) random effects. Each of these methods assumes that the outcome and visit processes are conditionally independent given the covariates and random effects, i.e. that the intensity models in Table 3 are the intensities conditional on both Y i ð1Þ and the random effects, in addition to any relevant covariates. Liang et al.6 assume that the random effect in the model for the visit intensity (Vi2) is nonnegative with mean 1 and variance s2. A model for the conditional expectation of the random effects Vi1 in the outcome model given the random effects in the visit model is required; the authors explore EðVi1 jVi2 Þ ¼ hðVi2  1Þ, but other specifications are possible. Note that the distribution of Vi2 must be modelled parametrically, for example using a Gamma distribution. The censoring time Ci is assumed to be noninformative, in the sense that, given covariates Xi(t) and time-invariant auxiliary covariates Zi, Ci must be independent of the observation times and the outcomes, that is Ci ??N i ð1Þ, Y i ð1ÞjXi ð1Þ, Zi 8t

Table 3. Comparison of semiparametric joint modelling approaches: outcome and visit intensity models. Method

Outcome mean model conditional on random effects

Intensity model for Ni conditional on outcomes and random effects

Liang et al.6 Sun et al.12 Sun et al.8 Song et al.11

0 ðtÞ þ Xi ðtÞb þ Wi ðtÞVi1 0 ðt; Vi1 Þ þ Xi b Vi 0 ðtÞ expðXi bÞ 0 ðtÞ þ Xi ðtÞb þ Vi1

Vi2 0 ðtÞ expðZi cÞ 0 ðt; Vi2 Þ expðXi cÞ Vi 0 ðtÞ expðXi cÞ Vi2 0 ðtÞ expðXi ðtÞcÞ

Here Xi is a row vector of covariates, which when denoted by Xi(t) is permitted to be time dependent. Vi, Vi1, and Vi2 are random effects, Wi(t) is a subset of the covariates Xi(t), Zi is a vector of auxiliary baseline covariates, Ci is a censoring time, 0 is a baseline hazard, and 0 ,b,c are regression coefficients. The intensity model for Ni ðtÞ is conditional on Y i ð1Þ as well as any random effects and covariates.

Downloaded from smm.sagepub.com at Northeastern University on November 14, 2015

ð11Þ

XML Template (2014) [17.5.2014–9:38am] [1–23] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/140059/APPFile/SG-SMMJ140059.3d (SMM) [PREPRINTER stage]

Pullenayegum and Lim

13

Advantages of Liang’s model are that it is the only semiparametric joint modelling method that allows covariates for the outcome and visit processes to differ, and it is also the only method that allows random effects for any regression coefficient, rather than just random intercepts. Sun et al.,12 which generalizes Cai et al.,10 allow the random effect to enter the baseline hazard in an arbitrary way (see Table 3). Covariates for the outcome and visit processes must be the same and must be time invariant. Sun et al.12 also allow for a terminal event, e.g. death, to occur at time Di. The terminal event is assumed to have cumulative hazard function  obeying log ðDi Þ ¼ Xi g þ i where the random errors  are assumed to follow an extreme value distribution, while the censoring time is assumed to satisfy Ci ??Vi1 , Vi2 , Di , Y i ð1Þ, N i ð1ÞjXi :

ð12Þ

Sun et al.8 develop methods for multiplicative models. The model requires both random effects and covariates be held in common between the outcome and visit models, and covariates must be time invariant. The censoring assumption is less stringent than for previous models, requiring that censoring be independent of visits and outcomes conditional on both the covariates and the random effects (Vi), i.e.  Ci ??N i ð1Þ, Yð1ÞjX i , Vi

ð13Þ

For identifiability, the mean of the random effect Vi is taken to be 1. Song et al.11 generalize the earlier methods of Sun et al.7,9 and is the only semiparametric joint modelling method that allows time-dependent covariates in the visit model. These covariates must also appear in the outcome model. The censoring time C must be conditionally independent of outcomes and visits given the covariates and the random effects Vi1, Vi2, and moreover, the censoring time must be independent of the random effects given the covariate process, i.e. Ci ??N i ð1Þ, Y i ð1ÞjXi ð1Þ, Vi1 , Vi2 ;

Ci ??Vi1 , Vi2 jXi ð1Þ

ð14Þ

The model can also be generalized to handle time-dependent regression coefficients, i.e. EðYi ðtÞjXi ðtÞ, Vi2 Þ ¼ 0 ðtÞ þ Xi ðtÞbðtÞ þ Vi2

4.3.1 Comments Together, the models offer considerable flexibility in using random effects to capture correlation between visit and outcome models. Random variables themselves do need to be time invariant, and several authors7–9 point out that it would be useful to relax this to allow time-varying random effects. Song et al. note that time-varying random effects may be used with their model provided that the conditional mean of the random effects given the covariates is zero, and the covariance of the random effects at time t does not depend on the values of the observed covariates at time t. Considerable attention has been paid to addressing goodness of fit of the models. The adequacy of the model for the visit intensity is usually assessed through graphical or numerical methods as

Downloaded from smm.sagepub.com at Northeastern University on November 14, 2015

XML Template (2014) [17.5.2014–9:38am] [1–23] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/140059/APPFile/SG-SMMJ140059.3d (SMM) [PREPRINTER stage]

14

Statistical Methods in Medical Research 0(0)

proposed by Lin et al.,23 Zeng and Cai,24 Huang and Wang,25 and Huang et al.26 Formal tests for the adequacy of the model for the outcomes have been developed.7–12 Assessment of the assumption that the outcome and visit processes are independent given the random effects and relevant covariates have received little attention, and given its importance would be an important avenue for future research. These models are suitable for special cases of VAR, and special cases of visiting not at random. They do not cover cases of VAR where visit intensity depends on past outcomes or auxiliary time-dependent covariates (see Tables 1 and 3); however, these cases can be handled using inverse-intensity weighting. Semiparametric joint models also do not handle cases of VAR where the visit process depends on auxiliary time-dependent covariates that are not in the outcome model, or cases of VAR where the random effects are time dependent (see Table 1).

4.4

Parametric joint models

If one is willing to specify a fully parametric model for the outcomes, including distributions for the random effects, then more general dependence between outcomes and visits can be considered. In the context of recurrent visits where there is a period of time following the occurrence of a visit when one is not at risk of another visit (e.g. with hospital admissions, one cannot be re-admitted until one is first discharged), Zhu et al.27 propose a joint model for outcomes and visit times ~ i ðtÞÞ þ i ðtÞ, Yi ðtÞ ¼ 0 þ Xi ðtÞb þ bi ð/ X ~ i ðtÞbi Þ  i ðt Þ, bi Þ ¼ IðNi ðt Þ ¼ Mi ðt ÞÞ0 ðtÞ expðZi ðtÞ þ X i ðtjZi ðtÞ, N i ðt Þ, M

ð15Þ

~ i ðtÞ where i is the intensity for Ni , the notation a b denotes componentwise product of a and b, X contains a column of ones as well as columns for any covariates that are common to both Xi(t) and the auxiliary covariate process Zi(t) used in the event intensity model, and Mi(t) is the counting process for discharges. Implicit in equation (15) is that the visit process at time t is independent of the outcome at time t given the covariates, the history of the visit and discharge processes, and the random effects b. On assuming distributions for i and bi, inference is through maximum likelihood. The covariates Xi(t) and Zi(t) must be known at all times rather than just at visit times. 4.4.1 Comments Zhu et al.27 give one example of a fully parametric joint model for visits and outcomes; however, the maximum likelihood framework allows considerable flexibility in the dependence structures between latent variables, which may be chosen on the basis of likelihood ratio tests. Moreover, a notable advantage of the parametric approach is that it is the only method involving latent variables that also allows the visit process to depend on observed auxiliary time-dependent variables. In principle one could also include time-dependent latent variables (for example, an unobserved symptom indicator variable). The latent variables would need to be modelled parametrically over time, and such models have yet to be explored in the literature. Thus while in theory parametric joint models can handle general cases of visiting not at random, in practice they have been evaluated only for time-invariant latent variables. Moreover, these models can be sensitive to distributional assumptions.28

Downloaded from smm.sagepub.com at Northeastern University on November 14, 2015

XML Template (2014) [17.5.2014–9:38am] [1–23] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/140059/APPFile/SG-SMMJ140059.3d (SMM) [PREPRINTER stage]

Pullenayegum and Lim

15

5 Example We use a study of juvenile dermatomyositis (JDM) and a study of tumour recurrence in bladder cancer to illustrate how examination of the visit process and modelling objectives influence the choice of analytic method. We begin with a brief description of the dataset and then outline the key steps in choosing an analytic method.

5.1

The JDM dataset

JDM is a childhood-onset autoimmune inflammatory myositis. Children with JDM develop weakness such that they may not be able to walk or perform activities of daily living. These children also develop characteristic rashes alongside their weakness. Data were collected from an inception cohort of patients diagnosed with JDM and followed at a dedicated clinic at the Hospital for Sick Children between January 1991 and December 2010. All children followed in this clinic were evaluated using a standard protocol from the first visit. The outcome of interest is a disease activity score (DAS) that includes measures of both the musculoskeletal and cutaneous manifestations of JDM,29 where higher scores indicate worse disease activity. The median follow-up time was 4.5 years (interquartile range 2.4–6.8 years), however we chose to artificially censor follow-up at 2 years. There were three reasons for this. Firstly, 80% of the cohort were followed for 2 years or longer, meaning that we would have a large portion of the sample under observation over the time frame of interest. Secondly, the disease trajectory tends to flatten out after a few years, so that interest lies primarily in the first few years. Thirdly, since data were collected at a paediatric institution, follow-up ceased at age 18. To avoid age being a predictor of censoring time (and hence possibly informative censoring), we decided to restrict the sample to those who would remain under 18 for the duration of interest. Choosing a 2-year follow-up interval meant that we were able to include 96% of the sample (i.e. 92 of 96 patients were aged 16 or younger on diagnosis). Unless otherwise specified, models were fitted using R version 3.0.1.30 5.1.1 Modelling objectives Interest is in disease burden over time as measured through the DAS, i.e. on a marginal model EðYi ðtÞÞ ¼ 0 ðtÞ where Yi(t) is the DAS for patient i at time t. In our example, however, we wish to estimate 0(t) rather than treat it as a nuisance parameter, and so neither the inverse-intensity weighted Lin-Ying equations nor the semiparametric joint models are suitable (see Table 2 and the corresponding footnote). Disease activity varies over time in a nonlinear fashion and will be captured using fractional polynomials.31 5.1.2 Determining the visit process The choice of analytic method should be informed by the visit process, which we now consider.

Distribution of visit times. The standard of care in the clinic was for visits at 0, 2, 4, 6, 10, 14, 22, 30, and 42 weeks, and thereafter every 2 to 3 months; however, those who still have high disease activity will visit more often. The number of visits per patient ranges from 3 to 14, with a mean of 10. Figure 1 shows the visit times for a random sample of 10 subjects. As can be seen, we have irregular visits, and given the visit protocol we should expect status at previous visits to be predictive of future visit intensity.

Downloaded from smm.sagepub.com at Northeastern University on November 14, 2015

XML Template (2014) [17.5.2014–9:38am] [1–23] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/140059/APPFile/SG-SMMJ140059.3d (SMM) [PREPRINTER stage]

16

Statistical Methods in Medical Research 0(0)

0.0

0.5

1.0

1.5

2.0

Time in years

Figure 1. Visit times for a random subset of 10 patients. Each dot represents a visit, and the dotted lines indicate times where the patient was not lost of follow-up but where there was no visit.

Measured predictors of visit intensity. A proportional hazards regression on the recurrent visits revealed that univariately, a number of time-varying patient characteristics and symptoms were associated with visit frequency. The DAS is a summary of many of these, and after accounting for DAS at the last visit, no other covariates were significantly associated with visit frequency. A one unit increase in DAS was associated with a 9% increase in hazard (95% CI 6% to 12%). There was some evidence of nonproportionality in the hazards; since disease activity tends to decrease over time, this can be caused by a nonlinear association between disease activity and the logarithm of visit intensity. We modelled logarithmic transforms of 1 þ DAS in addition to powers of 2, 1, 0.5, 0.5, 1, 2, 3, and found that the logarithmic term fitted best. Adding additional terms did not improve the fit of the model, and a test did not suggest nonproportionality in the hazards (p ¼ 0.67). Thus, letting i(t) be the visit intensity for subject i at time t, we have i ðtÞ ¼ 0 ðtÞ expð0:39 logð1 þ Yi ðTNi ðt Þ ÞÞÞ Visits are thus either at random or not at random, and there are time-dependent visit predictors that are not included as covariates in the outcome model.

Downloaded from smm.sagepub.com at Northeastern University on November 14, 2015

XML Template (2014) [17.5.2014–9:38am] [1–23] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/140059/APPFile/SG-SMMJ140059.3d (SMM) [PREPRINTER stage]

Pullenayegum and Lim

17

Unmeasured predictors of visits. The next step was to ask whether there were time-invariant latent covariates associated with both outcome and visit processes. To this end, we began by modelling the disease activity trajectory. The trajectory of disease activity over time is nonlinear, so it was modelled using fractional polynomials. When using a single polynomial function of time, the power of the polynomial is chosen from among 2, 1, 0.5, 0.5, 1, 2, 3, and the logarithm of time is also considered. Since we had times of zero in the dataset we used an offset to avoid singularities, and so considered fractional polynomial transforms of 1 þ t rather than t, where t indicates time. Taking the offset to be 1 rather than a smaller number avoided instability when time was close to zero. The best-fitting first-order solution, as judged by the Akaike Information Criterion (AIC), was a power of 2, i.e. ð1 þ tÞ2 . We then went on to consider whether including more than one fractional polynomial term improved the fit. Following Royston et al.31 we considered pairs of terms (1 þ t)i, (1 þ t)j, where i and j are chosen from among 2, 1, 0.5, 0, 0.5, 1, 2, 3, with 0 indicating a log transform; for i ¼ j the terms were (1 þ t)i and ð1 þ tÞi logð1 þ tÞ. The best-fitting pair was i ¼ j ¼ 2, i.e. the terms ð1 þ tÞ2 and ð1 þ tÞ2 logð1 þ tÞ. A generalized additive mixed model did not show evidence that additional nonlinear terms could improve the fit (p ¼ 0.31). There remains the possibility that there are unmeasured factors associated with both the outcomes and the visit process. We used a joint random effects model to ask whether, given our distributional assumptions, there were any time-invariant unmeasured factors associated with both the outcomes and the visit process. For the DAS, we fitted the fractional polynomial described earlier with a random intercept and slope, and for the visit times we used a proportional hazards model, regressing onto the logarithm of the last visit’s DAS and a random intercept to allow for correlated visit times. Specifically, we used the model Yi ðtÞ ¼ 0 þ 1 ð1 þ tÞ2 þ 2 ð1 þ tÞ2 logð1 þ tÞ þ ui þ vi t þ i ðtÞ i ðtÞ ¼ 0 ðtÞ expðlogð1 þ Yi ðTiNi ðt Þ ÞÞ þ wi Þ, where 0 1 ui B C @ vi A Multivariate Normal ð0, DÞ wi for an arbitrary covariance matrix D. This model was fitted using a slight adaptation of the code in Liu and Huang32 using Proc NLMIXED in SAS version 9.3. There was no evidence of correlation between the random effect for the proportional hazards model and the other random effects. We note, however, that this result is sensitive to the distributional assumptions made (see discussion of Diggle and Kenward33), and should be taken to indicate that we were unable to find evidence of correlation among the random effects for the outcome and visit models, rather than that no such correlation exists (see Kenward28). 5.1.3 Choosing an analytic model Assuming that there are no unmeasured factors associated with both visit and outcome processes, we have an irregular visit process with visits at random, predicted by past outcomes. As can be seen from the fourth row of Table 1, suitable methods for this case are mixed models, inverse-intensity weighted GEEs, or parametric joint models. Among the inverse-weighted methods, IIW-GEEs or doubly robust methods could be used; note that the inverse-intensity weighted Lin-Ying equations5 do not apply as these are used to estimate b from the mean model EðYi ðtÞjXi ðtÞÞ ¼ 0 ðtÞ þ Xi ðtÞb with 0(t) treated as a nuisance parameter that is not estimated. For the mixed model, we used random

Downloaded from smm.sagepub.com at Northeastern University on November 14, 2015

XML Template (2014) [17.5.2014–9:38am] [1–23] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/140059/APPFile/SG-SMMJ140059.3d (SMM) [PREPRINTER stage]

Statistical Methods in Medical Research 0(0)

8

18

4 0

2

Mean DAS

6

GEE Mixed Model Weighted GEE Doubly Robust

0.0

0.5

1.0

1.5

2.0

Time (years)

Figure 2. Fitted expected disease activity score as a function of time using (a) GEE; (b) a mixed model; (c) inverseintensity weighted GEE; (d) doubly robust estimation.

effects for the intercept and slope, with an exponential correlation structure on the residuals, as this led to the lowest AIC. 5.1.4 Results The resulting trajectories are shown in Figure 2, and the corresponding areas under the curves are presented in Table 4. For comparison, results from a GEE are also shown; this is an inappropriate analysis, but is included because it is so often wrongly used in practice. As can be seen from Table 2, the GEE overestimates the area under the curve; this is expected since patients with a high disease activity visit more often. The remaining methods give smaller areas, and as can be seen from Figure 2, the doubly robust method estimates a slower decline in disease activity in the first few months and a higher decline than the other methods after the first year.

5.2

Bladder cancer data

To illustrate a setting where semiparametric joint models may be helpful, we turn to the bladder cancer data. This consists of data from a randomized controlled trial conducted by the Veterans Administration Cooperative Urological Research Group. Patients with superficial bladder tumours had their tumour removed and were randomized to receive placebo, thiotepa or pyridoxine. Visits were scheduled every 3 months, and any new tumours were removed. The question of interest for

Downloaded from smm.sagepub.com at Northeastern University on November 14, 2015

XML Template (2014) [17.5.2014–9:39am] [1–23] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/140059/APPFile/SG-SMMJ140059.3d (SMM) [PREPRINTER stage]

Pullenayegum and Lim

19

Table 4. Area under the disease activity curve (AUC) and standard errors (SE) using fractional polynomials with estimation using GEEs, mixed effects, inverseintensity weighted GEE, and doubly robust estimation. Estimator

AUC

SE

GEE Mixed effects Weighted GEE Doubly robust

6.47 6.11 6.05 5.54

0.29 0.28 0.25* 0.25*

Standard errors marked with * were estimated through a nonparametric bootstrap with 100 bootstrap samples.

this analysis will be whether pyridoxine is more effective than placebo at reducing the rate of new tumour occurrences. We thus consider data from these two arms only, as reported by Hand et al.34 This contains data from 46 patients in the placebo group and 36 patients in the pyridoxine group, with visits up to and including 36 months; note that this is in contrast to other versions of the dataset that included longer follow-up durations (see for example Sun et al.,8 Cai et al.,10 Song et al.11). 5.2.1 Modelling objectives Since the outcome of interest is a recurrence rate, it is natural to consider a multiplicative model of the form EðYi ðtÞjXi Þ ¼ 0 ðtÞ expðXi Þ where Yi(t) is the number of new tumours for individual i at time t, Xi is a binary treatment group indicator, with Xi ¼ 1 if individual i was randomized to pyridoxine, and 0 if individual i was randomized to placebo. Additive models are also possible, but do not guarantee that the modelled means are positive. Previous applications have modelled the logarithm of Yi(t) (adding 1 to avoid taking the log of zero). 5.2.2 Determining the visit process The protocol requests fixed, quarterly visits. However, there was substantial deviation from the protocol, with intermittent missing visits, loss to follow-up, and death. Figure 3 shows the times at which visits occurred for each patient. Missing visits are most likely to be driven by patient. We did not find evidence that randomized group, the number of new tumours at the previous visit, or whether there were any new tumours at the previous visit was predictive of visit frequency prior to censoring (hazard ratios of 1.07 (p ¼ 0.6), 0.97 (p ¼ 0.27), 0.97 (p ¼ 0.86), respectively). Thus, deviation does not seem to be predicted by observed factors in the data. Deviation is thus either unrelated to the patient’s health or is associated with unobserved factors. The visit process is thus either completely at random or not at random. The version of the dataset we are working with does not contain information on survival status, and thus censoring times include both censoring due to loss to follow-up and censoring due to death. This makes the censoring assumptions (11), (12), (14) implausible; however, assumption (13) is plausible as it requires independence conditional on the random effects and randomized treatment group.

Downloaded from smm.sagepub.com at Northeastern University on November 14, 2015

XML Template (2014) [17.5.2014–9:39am] [1–23] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/140059/APPFile/SG-SMMJ140059.3d (SMM) [PREPRINTER stage]

20

Statistical Methods in Medical Research 0(0)

5

10

15

20

25

30

35

Time in months

Figure 3. Visit times for the bladder cancer data, with each dot representing a visit and each row representing a subject.

5.2.3 Choosing an analytic model Given the preference for a multiplicative model, the fact that unobserved patient characteristics are likely to be predictive of visit frequency prior to censoring, and that censoring is likely associated with outcome, the semiparametric joint model of Sun et al.8 is an attractive choice. For comparison, we also present the results of a GEE with a log link. 5.2.4 Results Sun et al.’s8 semiparametric joint model estimated that treatment led to a 53% reduction in tumour occurrence rate (95% CI 79% reduction to 2% increase). By comparison, the GEE using observed data estimated a 51% reduction (95% CI 76% reduction to 1% increase). Before leaving this example, note that a multistate model subject to interval censoring could also be used, see for example Chen et al.35

Downloaded from smm.sagepub.com at Northeastern University on November 14, 2015

XML Template (2014) [17.5.2014–9:39am] [1–23] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/140059/APPFile/SG-SMMJ140059.3d (SMM) [PREPRINTER stage]

Pullenayegum and Lim

21

6 Suggestions for study design As can be seen from Table 1, options are limited when visits are not at random. Thus, when studies are prospective it would help to design them so as to make VAR more plausible.

6.1

Move towards repeated measures

The simplest strategy statistically is to specify measurement times in the protocol, require that these be common across patients, and apply methods to improve follow-up such as cultivating a positive relationship between participants and study staff, providing small incentives and gifts for ongoing participation, covering participant travel costs, collecting multiple pieces of contact information at study outset as well as contact information for a friend or relative, and sending study newsletters and anniversary or birthday cards.36–40 This will make it more likely that the visit process is regular and can be treated as a missing data problem. However, this may prove infeasible in many cases. Firstly, the above methods for improving follow-up are suitable in the context of well-funded studies, but for many clinic-based study cohorts resources are limited. Secondly, a repeated measures set-up may unnecessarily occupy physicians’ time. For example, it will often not be medically necessary to follow all patients at the same times, since those patients who are well may require less frequent follow-up. If the protocol demands the same follow-up times for all patients, this means that each patient needs to be seen at the same frequency as the sickest patients. This is wasteful and will be unappealing when the aim is to collect reliable data as a part of routine follow-up.

6.2

Specify a protocol for follow-up

In situations where follow-up times cannot be set to be the same, it is desirable nonetheless that the follow-up schedule be determined by protocol. By formalizing the follow-up process one knows by design which factors contribute to visit frequency in the absence of deviation, and it is more likely that the visit process model used in inverse-intensity weighting methods captures all the relevant covariates. When there is no protocol, factors are usually chosen for inclusion in the visit model on the basis of statistical significance in models for the recurrent visit process, which may lead to the omission of important covariates in the model, especially when sample sizes are small. Moreover, when there is a formal protocol one can ensure by design that covariates associated with visit frequency are recorded in patients’ charts.

6.3

Record recommended visit intervals

In practice, there may be many factors that contribute to how often a patient must be seen, to the extent that writing down formal rules is too cumbersome. Moreover, different physicians may have different procedures for follow-up, and imposing a common protocol may compromise usual care. In such cases, follow-up intervals may be best left to the treating physician’s discretion, and so will not follow a formal rule. One problem with modelling visit schemes in the absence of a protocol is that it is difficult to determine based on statistical modelling whether the visit model includes the correct variables. Furthermore, it could be that the decision on follow-up interval is based on factors that are not recorded in the patient’s chart. One simple solution to this is for the physician to note in the chart at what interval the next visit should be. Recording visit interval would ensure that, in the absence of patient deviation, visiting would be at random.

Downloaded from smm.sagepub.com at Northeastern University on November 14, 2015

XML Template (2014) [17.5.2014–9:39am] [1–23] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/140059/APPFile/SG-SMMJ140059.3d (SMM) [PREPRINTER stage]

22

6.4

Statistical Methods in Medical Research 0(0)

Keep track of deviations

In practice, actual intervals between visits will deviate from recommended intervals. Small amounts of extra documentation could help to determine whether deviation is related to the patient’s health. For example, the date at which the next visit appointment is made and the scheduled visit date could be noted. If the visit date is changed, the date at which it is changed together with the new visit date can be recorded. The reason for the change could also be recorded, for example by choosing among the categories ‘schedule conflict unrelated to patient’s health’, ‘patient has another doctor’s appointment at this time’, ‘health concern requires an earlier visit’, ‘previous symptoms resolved’. Cases where visits are changed as a result of the patient’s health and without documentation would result in visits that were not at random; collecting information on re-scheduled visits makes VAR more plausible.

6.5

Specify minimal follow-up for patient-driven visit processes

Patient-driven visit processes will always be problematic. If a prospective study is envisioned, it would be helpful to schedule at least yearly visits, and then note both the date at which an additional visit is requested and the reason, as described in the previous paragraph. Since, faced with the same symptoms, some patients will schedule a visit whereas others will not, the best one can hope for with this strategy is to shift from visiting not at random with time-dependent latent variables to visiting not at random with time-invariant latent variables.

7 Summary This paper has reviewed some of the key methods available for analysing irregularly observed longitudinal data and shown that the choice of method hinges on the underlying visit process. It is thus crucial to determine what the visit process is. We have proposed a way of classifying visit processes based on the recommended intervals between visits and the deviation from these intervals, as this reflects the processes at play in practice. Understanding the context in which the data were collected will help in determining the visit process, but as actual visit processes will always deviate from planned processes, one must determine the nature of the deviation. When considered at the design stage, reasons for deviation can be recorded as the deviations occur. If this is not done, the physicians who see the patients may be able to give an overall impression of reasons for deviation. Statistical analysis can also play a role in distinguishing between deviations that are completely at random and deviations that are at random, but is limited both by what is recorded in the dataset and by sample size. Due to these limitations, and the lack of methods for handling unmeasured time-varying predictors, we have suggested approaches to study design that would enhance the analyst’s ability to study visit patterns, and would also reduce the chances of important unmeasured predictors contributing to visit frequency. An important finding of this work is that together the available methods cover many of the plausible visit scenarios, but that each method alone is insufficient for several important cases. Although it may be tempting for analysts to have a ‘go-to’ method, we have shown that it is important to keep the full array of methods in mind and to consider the visit process carefully before choosing a model. Funding EMP received operating funds from the Natural Sciences and Engineering Research Council and is supported by a New Investigator Award from the Canadian Institutes of Health Research.

Downloaded from smm.sagepub.com at Northeastern University on November 14, 2015

XML Template (2014) [17.5.2014–9:39am] [1–23] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/140059/APPFile/SG-SMMJ140059.3d (SMM) [PREPRINTER stage]

Pullenayegum and Lim

23

References 1. Hogan J, Roy J and Korkontzelou C. Handling drop-out in longitudinal studies. Stat Med 2004; 23: 1455–1497. 2. Rubin D. Multiple imputation for non-response in surveys. New York: John Wiley & Sons, 1987. 3. Schafer J. Multiple imputation: A primer. Stat Methods Med Res 1999; 8: 3–15. 4. Lin H, Scharfstein D and Rosenheck R. Analysis of longitudinal data with irregular, outcome-dependent follow-up. J R Stat Soc Ser B 2004; 66: 791–813. 5. Buzkova P and Lumley T. Semi-parametric modeling of repeated measurements under outcome-dependent followup. Stat Med 2009; 28: 987–1003. 6. Liang Y, Lu W and Ying Z. Joint modeling and analysis of longitudinal data with informative observation times. Biometrics 2009; 65: 377–384. 7. Sun J, Sun L and Liu D. Regression analysis of longitudinal data in the presence of informative observation and censoring times. J Am Stat Assoc 2007; 102: 1397–1406. 8. Sun L, Mu X, Sun Z, et al. Semiparametric analysis of longitudinal data with informative observation times. Acta Math Appl Sin English Ser 2011; 27: 29–42. 9. Sun L, Song X and Zhou J. Regression analysis of longitudinal data with time-dependent covariates in the presence of informative observation and censoring times. J Stat Plan Infer 2011; 141: 2902–2919. 10. Cai N, Lu W and Zhang H. Time-varying latent effect model for longitudinal data with informative observation times. Biometrics 2012; 68: 1093–1102. 11. Song X, Mu X and Sun L. Regression analysis of longitudinal data with time-dependent covariates and informative observation times. Scand J Stat 2012; 39: 248–258. 12. Sun L, Song X, Zhou J, et al. Joint analysis of longitudinal data with informative observation times and a dependent terminal event. J Am Stat Assoc 2012; 107: 688–700. 13. Rubin D. Inference and missing data. Biometrika 1976; 63: 581–592. 14. Society of Obstetricians and Gynaecologists of Canada, Ottawa. Healthy beginnings: Guidelines for care during pregnancy and childbirth, 1998. 15. Lipsitz S, Fitzmaurice G, Ibrahim J, et al. Parameter estimation in longitudinal studies with outcome-dependent follow-up. Biometrics 2002; 58: 621–630. 16. Lin D and Ying Z. Semiparametric regression analysis of longitudinal data with informative dropouts. Biostatistics 2003; 4: 385–398. 17. Lin D and Ying Z. Semiparametric and nonparametric regression analysis of longitudinal data. J Am Stat Assoc 2001; 96: 103–113. 18. Kalbfleisch J and Prentice R. The statistical analysis of failure time data. New York: Wiley, 1980. 19. Buzkova P and Lumley T. Longitudinal data analysis for generalized linear models with follow-up dependent on outcome-related variables. Can J Stat 2007; 35: 485–500. 20. Buzkova P, Brown E and John-Stewart G. Longitudinal data analysis for generalized linear models under participant-driven informative follow-up: An application in maternal health epidemiology. Am J Epidemiol 2010; 171: 189–197. 21. Seaman S and Copas A. Doubly robust generalized estimating equations for longitudinal data. Stat Med 2009; 28: 937–955. 22. Pullenayegum E and Feldman B. Doubly robust estimation, optimally truncated inverse-intensity weighting

23.

24.

25.

26.

27.

28.

29.

30.

31.

32.

33.

34. 35.

36.

37.

38.

39.

40.

and increment-based methods for the analysis of irregularly observed longitudinal data. Stat Med 2013; 32: 1054–1072. Lin D, Wei L, Yang I, et al. Semiparametric regression for the mean and rate functions of recurrent events. J R Stat Soc Ser B 2000; 62: 711–730. Zeng D and Cai J. A semiparametric additive rate model for recurrent events with an informative terminal event. Biometrika 2010; 97: 699–712. Huang C and Wang M. Joint model and estimation for recurrent event processes and failure time data. J Am Stat Assoc 2004; 99: 1153–1165. Huang C, Qin J and Wang MC. Semiparametric analysis for recurrent event data with time dependent covariates and informative censoring. Biometrics 2010; 66: 39–49. Zhu L, Zhao H, Sun J, et al. Joint analysis of longitudinal data and recurrent episodes data with application to medical cost analysis. Biometr J 2013; 55: 5–16. Kenward M. Selection models for repeated measurements with non-random dropout: An illustration of sensitivity. Stat Med 1998; 17: 2723–2732. Bode R, Klein-Gitelman M and Miller M. Disease activity score for children with juvenile dermatomyositis: Reliability and validity evidence. Arthritis Rheumatism 2003; 49: 7–15. R Core Team: R Foundation for Statistical Computing, Vienna, Austria. R: A Language and Environment for Statistical Computing 2013. http://www.R-project.org. Royston P, Ambler G and Sauerbrei W. The use of fractional polynomials to model continuous risk variables in epidemiology. Int Epidemiol Assoc 1999; 28: 964–974. Liu L and Huang X. Joint analysis of correlated repeated measures and recurrent events processes in the presence of death, with application to a study on acquired immune deficiency syndrome. J R Stat Soc Ser C (Appl Stat) 2009; 58: 65–81. Diggle P and Kenward M. Informative drop-out in longitudinal data analysis. J R Stat Soc Ser A (Appl Stat) 1994; 43: 49–93. Hand D, Daly F, McConway K, et al. A handbook of small data sets, volume 1. London: Chapman & Hall, 1993. Chen B, Yi G and Cook R. Progressive multi-state models for informatively incomplete longitudinal data. J Stat Plan Infer 2011; 141: 80–93. Hough R, Tarke H, Renke V, et al. Recruitment and retention of homeless mentally ill participants in research. J Consult Clin Psychol 1996; 64: 881–891. BootsMiller B, Ribisl K, Mowbray C, et al. Methods of ensuring high follow-up rates: Lessons from a longitudinal study of dual diagnosed participants. Substance Use Misuse 1998; 33: 2665–2685. Tansey C, Matte A, Needham D, et al. Review of retention strategies in longitudinal studies and application to followup of ICU survivors. Intern Care Med 2007; 33: 2051–2057. Bonk J. A road map for recruitment and retention of older adult participants in longitudinal studies. J Am Geriatr Soc 2010; 58: S303–S307. Gourash WF, Ebel F, Lancaster K, et al. LABS Consortium Retention Writing Group. Longitudinal Assessment of Bariatric Surgery (LABS): Retention strategy and results at 24 months. Surg Obes Relat Dis 2013; 9: 514–519.

Downloaded from smm.sagepub.com at Northeastern University on November 14, 2015

Longitudinal data subject to irregular observation: A review of methods with a focus on visit processes, assumptions, and study design.

When data are collected longitudinally, measurement times often vary among patients. This is of particular concern in clinic-based studies, for exampl...
296KB Sizes 0 Downloads 0 Views