HHS Public Access Author manuscript Author Manuscript

Biometrika. Author manuscript; available in PMC 2017 August 03. Published in final edited form as: Biometrika. 2012 ; 99(1): 151–165. doi:10.1093/biomet/asr076.

A functional generalized method of moments approach for longitudinal studies with missing responses and covariate measurement error Grace Y. YI, Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada N2L 3G1

Author Manuscript

Yanyuan MA, and Department of Statistics, Texas A&M University, TAMU 3143, College Station, Texas 77843-3143, U.S.A Raymond J. Carroll Department of Statistics, Texas A&M University, TAMU 3143, College Station, Texas 77843-3143, U.S.A

Summary

Author Manuscript

Covariate measurement error and missing responses are typical features in longitudinal data analysis. There has been extensive research on either covariate measurement error or missing responses, but relatively little work has been done to address both simultaneously. In this paper, we propose a simple method for the marginal analysis of longitudinal data with time-varying covariates, some of which are measured with error, while the response is subject to missingness. Our method has a number of appealing properties: assumptions on the model are minimal, with none needed about the distribution of the mismeasured covariate; implementation is straightforward and its applicability is broad. We provide both theoretical justification and numerical results.

Some key words Functional measurement error; Generalized method of moments; Inverse probability weighting; Longitudinal data; Measurement error; Missing response; Structural measurement error

Author Manuscript

1. Introduction Longitudinal data analysis has attracted considerable research interest and a large number of inference methods have been proposed in the literature. Their validity relies on the important requirements that variables are perfectly measured and data are complete. In practice, however, these conditions are commonly violated, and ignoring this could result in seriously biased results. Although there has been discussion of inference methods to address longitudinal data with either missing values or measurement error, relatively little attention has been directed to accounting simultaneously for both features, which is partly attributable to complexity in modelling and computation. In this paper, we propose a simple method for

YI et al.

Page 2

Author Manuscript

the marginal analysis of longitudinal data with time-varying covariates, some of which are measured with error, while the response is subject to missingness.

Author Manuscript

Our method has these important features. First, our model is marginal: we do not specify a full distribution for the complete data, but instead we use estimating equations based on moment restrictions. Secondly, our method does not require the assumption that E(Yij | Xi, Zi) = E(Yij | Xij, Zij), which has been widely used in marginal analysis, especially when using the generalized estimating equation approach, e.g., Pepe & Anderson (1994) and Lai & Small (2007). Here Yij and (Xij, Zij) represent the response and covariates for subject i at time j, and (Xi, Zi) are the vector-valued subject level covariates. This is useful because in applications it is often difficult to assess this assumption. Moreover, this relaxation extends the scope of applicability of our method. Thirdly, we develop a functional measurement error method i.e., nothing is assumed about the distribution of the variables measured with error.

Author Manuscript

In this paper, we explore how covariate measurement error may affect the association structure between the response and the covariates. We show that a conditional independence relationship between Yij and the true covariates (Xik, Zik) for j ≠ k may be distorted when replacing Xik by its observed surrogate Wik. In the presence of missing values, we show that measurement error will usually alter the missing data mechanism. For example, a missingat-random mechanism in error-prone covariates Xi is not retained when Xi is replaced by its surrogate Wi. To adjust for missing responses, we use inverse probability weighting based on the observed data. However, the development of a model for the probability of missingness is tricky. See § 2.4, where we discuss an obvious method that leads to difficulties and then a simple method that is completely transparent. In our development, the generalized method of moments, discussed by Hansen (1982), is used to combine unbiased estimating functions in the observed data. There is considerable work on the measurement error problem in longitudinal data using the structural approach, namely when the distribution of the error-prone covariates is specified. References include Wang et al. (1998), Palta & Lin (1999), Lin & Carroll (2006), Liang (2009), Zhou & Liang (2009) and Xiao et al. (2010). Pan et al. (2009) investigate a transition model and apply the conditional and sufficient score approach of Stefanski & Carroll (1987) to analyse it. Similar uses of the conditional and sufficient score approach are seen in Li et al. (2004, 2007).

Author Manuscript

Papers that address all three facets of the problem, namely missing responses, longitudinal data and mismeasured covariates include Liu & Wu (2007), Wang et al. (2008), Yi (2005, 2008) and Yi et al. (2011). Liu & Wu (2007) and Yi et al. (2011) take a mixed model framework for the response process, and likelihood-based inferential procedures are used for which the missingness probability is modelled as a function of the true covariate and response variables, while Wang et al. (2008) and Yi (2005, 2008) use a marginal model framework. Except for the latter, these methods take a structural approach and assume that there is a model for the distribution of the mismeasured covariate X given error-free covariates. In modelling the missing data process, Yi (2005) and Wang et al. (2008) assume that the missingness probability is covariate free. Yi (2008) takes a functional approach to

Biometrika. Author manuscript; available in PMC 2017 August 03.

YI et al.

Page 3

Author Manuscript

relax the need to model the covariate process, and allows the missingness probability to depend on covariates, but the resultant estimator is not exactly consistent due to the use of the simulation-extrapolation method (Cook & Stefanski, 1995).

2. Response model and its assumptions 2.1. No measurement error Our data set-up is the following. There are i = 1, …, n independent individuals, with the possibility of j = 1, …, m visits. At visit j, the complete data are (Yij, Xij, Zij), where Yij is the response, Zij are covariates observed exactly, and Xij are error-prone covariates whose true values are unobserved. Denote Yi = (Yi1, …, Yim)T,

and

Author Manuscript

. Let μij = E(Yij | Xij, Zij) and υij = var(Yij | Xij, Zij) be the time-specific conditional expectation and variance of Yij, respectively, given Xij and Zij. Consider the regression model

(1)

is the vector of regression where g(·) is a known monotone function, and parameters. If necessary, an intercept may be included in βz by including unity in the covariate vector Zij. Further, assume υij = h(μij, ϕ), where h(·) is a known function and ϕ is a dispersion parameter that is known or may be estimated. For instance, for binary data there is no ϕ and υij = μij(1 – μij). We treat ϕ as known here with emphasis on estimation of ℬ.

Author Manuscript

2.2. Measurement error process Let Wij be the observed version of the covariate Xij. Denote . We assume that Xij and Wij follow a classical additive measurement error model, i.e., given (Xij, Zij, Yij),

(2)

where eij ∼ N(0, Σj) is independent of Xi, Zi and Yi.

Author Manuscript

Here we merely model marginal characteristics of the measurement error process for each time-point, and employ a functional modelling strategy with the distribution of Xij left unspecified. Model (2) accommodates the general scenario of dependent measurement errors. We assume that the Σj are known or estimated from replication experiments (Carroll et al., 2006).

Biometrika. Author manuscript; available in PMC 2017 August 03.

YI et al.

Page 4

2.3. Working independence and evolving covariates

Author Manuscript

In our approach, we build separate unbiased estimating functions for ℬ at each time-point j = 1, …, m, eventually combining them via the generalized method of moments. This section describes why such an approach is practical in the context of measurement error. Model (1) for the response postulates the population mean at each time-point as a function of time-specific covariates. Because of the measurement error, attention must be paid to how the true covariates evolve over time, and how they are related to responses at other timepoints.

Author Manuscript

The conventional assumption when there is no measurement error and no missing data is that E(Yij | Xi, Zi) = E(Yij | Xij, Zij), which is effectively the same thing as stating that given (Xij, Zij), Yij is independent of (Xik, Zik) for j ≠ k. Covariates satisfying this condition are called Type I by Lai & Small (2007). Thus, for example, assume that there is no Z, m = 2, cov(Xi1, Xi2) = Σx, Yij = μ(Xij, ℬ) + ∊ij and cov(∊i1, ∊i2) = Σ∊. Under the Type I assumption, an unbiased estimating function for ℬ is

Author Manuscript

where the subscript ℬ denotes differentiation with respect to ℬ. Accounting for the correlation among the ∊ij leads to more efficient estimation of ℬ than ignoring the correlation. It is an unbiased estimating function because, under the Type I assumption, for any function G(·, ·), E[G(Xik, Xij){Yij – μ(Xij, ℬ)}] = 0 for j ≠ k. However, Pepe & Anderson (1994), Pepe & Couper (1997) and Lai & Small (2007) point out that if Xi2 is not independent of Yi1 given Xi1, then this generalized estimating function is not unbiased i.e., does not have mean zero when evaluated at the model, so the resulting estimates may be inconsistent. This fact has led many authors to assume that Σ∊ is diagonal, the so-called working independence assumption that leads to consistent estimators. This is an interesting debate, and indeed in cases that the Xij can be observed Lai & Small (2007) show how to test the Type I assumption.

Author Manuscript

We are dealing with a very different problem. Suppose that the Type I assumption holds in the underlying data, so that for any j, given (Xij, Zij), Yij is independent of (Xik, Zik) for k ≠ j. We can show that in the presence of measurement error, the observed data, Yij and (Wik, Zik), given (Wij, Zij) for j ≠ k, are generally dependent. Thus, testing the Type I assumption based on observed error-prone data is likely to be difficult in practice. We sum this up in the following result. Lemma 1. Suppose the Type I assumption of Lai & Small (2007) holds in the complete data (Yij, Xij, Zij), and thus that Yij and (Xik, Zik) are independent given (Xij, Zij) for j ≠ k. In general, it is not the case that the Type I assumption holds in the observed data (Yij, Wij, Zij). See Appendix A1 for a proof of Lemma 1. As illustrated there, with dependent errors, the dependence of Yij on the observed Wik may not be fully captured by that of Yij on Wij. Even when the true covariates Xij and Xik are independent, the dependence between the errors eij Biometrika. Author manuscript; available in PMC 2017 August 03.

YI et al.

Page 5

Author Manuscript

and eik induces correlation between the observed Wij and Wik, which may distort the relationship between the response and the true covariates. 2.4. Missing data process First we examine how measurement error may change the missing data mechanism classification in the error-free context. In general, the model structure between the missing data indicator and the covariates is not preserved when the true covariates Xi are replaced with their surrogate Wi. In particular, Appendix A2 sketches an illustration of the following result. Let Rij = 1 if Yij is observed and Rij = 0 otherwise. Let R̃ij = {Ri1, …, Rij–1} be the history of the missing data indicator at time-point j, and let response measurements of Yi.

Author Manuscript

Lemma 2. If

contain the observed

, then it is possible that

. That is, the missingness process could be missing at random in Xi, but not missing at random in Wi. Next, we discuss how and why we model the missing data process. There are two obvious ways to do so. In the first case, one would take a measurement error view, and model the missing data process in terms of the underlying unobserved Xij data, and then adjust this in terms of the observed Wij data. This method is needlessly complex because it involves two stages:

Author Manuscript

Stage I. Fit a possibly complex logistic measurement error model with Rij being the response and with covariates involving the unobserved Xij, forming in the simplest case, for example, a model with probabilities π(Xij, Zij, α) = pr(Rij = 1 | Xij, Zij), where α is a parameter. Stage II. Obtain the observed probabilities by regressing π(Xij, Zij, α) on (Wij, Zij). This almost inevitably requires a model for Xij, which goes against the philosophy of functional measurement error models, and seems an indirect way to estimate probabilities in the observed data space. In the second case, one simply models the missing data process directly in terms of the observed data. The first case is needlessly complex, even if there are no repeated measures, and thus for the reasons described above we use the second approach.

Author Manuscript

Because the problems considered here involve covariate measurement error, it is not sensible to stick to the usual classification of missing data mechanisms that are defined in the errorfree context. Therefore, we abandon the usual modelling scheme of postulating the dependence of the missing data indicator on the true covariates Xi along with other variables, but modulate the missingness probability based on the observed surrogates Wi. This treatment of the missing data process enables us to build a more sensible model and allows more transparent interpretation of model parameters. To reflect the dynamic nature of the observation process over time, we assume that . This assumption is analogous to

Biometrika. Author manuscript; available in PMC 2017 August 03.

YI et al.

Page 6

Author Manuscript

the missing-at-random mechanism in the error-free context, and it says that the missingness probability depends on the observed data. As required in marginal analysis within the errorfree context, inclusion of such an assumption is merely to ensure model identifiability related to the missing data process, a necessary condition for estimating the associated parameters. This assumption is not essential to the development here, and it can be removed when conducting sensitivity analyses is of interest. Because subjects are assessed sequentially over time, it is natural to make a further assumption to reflect inherent ordering in time, i.e.,

Author Manuscript

where S̃ij = {Si1, Si2, …, Si,j–1} represents the history of variable Sij at time-point j for , Wij and Zij. Let

. One may use a logistic regression model to posit the

missing data process, i.e.,

, where uij is the vector consisting of the

information on the history of surrogates Wij, covariates Zij and the observed responses as well as the missing data indicator Rij. Here α is the vector of regression parameters. Estimation of α can proceed using a likelihood-based method. Let be the likelihood contribution from subject i, and Si(α) =

Author Manuscript

(∂/∂α) log Li(α). Then solving

leads to a consistent estimator α̂ of α.

3. Methodology 3.1. Overview In this section, we propose an inference method based on the time-specific marginal structure of the response process. The method is simple to implement but flexible enough to accommodate a wide class of applications. The key idea is to construct an unbiased estimating function, say ij(Yij, Xij, Zij, ℬ), for each time-point j under the ideal situation when neither missing responses nor covariate measurement error is present. Then in § 3.2 we correct for the error effects by using the marginal moments for the error process, and

Author Manuscript

denote by the adjusted estimating functions that are expressed in terms of the responses and the observed covariates. In the next step, in § 3.3 we modify these estimating functions by using inverse probability weights to incorporate missingness effects. Finally, in § 3.4, we use the generalized method of moments to combine those time-specific unbiased estimating functions to formulate one that is efficient in the class of all of their linear combinations.

Biometrika. Author manuscript; available in PMC 2017 August 03.

YI et al.

Page 7

3.2. Corrected scores adjusting for error effects

Author Manuscript

For each time-point j = 1, …, m, let Gij(Yij, Xij, Zij, ℬ) be unbiased estimating functions for ℬ when there is neither measurement error nor missing observations. Specifically, we take

which requires the weakest model assumption for the response process as made in § 2.1. For example, a logit link function for binary data yields , where H(t) = exp(t)/{1 + exp(t)} is the logistic distribution function.

Author Manuscript

In the absence of measurement error and missing observations, ij(Yij, Xij, Zij, ℬ) are unbiased estimating functions for ℬ, and hence they can be used to estimate ℬ. When Xij is subject to measurement error, however, we cannot directly use ij(Yij, Xij, Zij, ℬ) by replacing Xij with its observed value Wij, because the resulting estimating functions are not unbiased. One strategy to remedy this is to work on the conditional expectation E{ ij (Yij, Xij, Zij, ℬ) | Yij, Wij, Zij} to correct for the error effects. However, this approach requires a distributional assumption for Xij, which is difficult to specify in many applications. Alternatively, we proceed with a correction approach which does not require specification of the distribution of Xij. The idea is to construct estimating functions such that

Author Manuscript

(3)

With this construction, unbiasedness of ij(Yij, Xij, Zij, ℬ) under the response model (1) ensures unbiasedness of in the absence of missingness. This strategy has the same spirit as the so-called corrected likelihood method discussed by Nakamura (1990) for generalized linear models, but it applies more generally to estimating functions. With regression models such as linear regression, Gamma regression, inverse Gaussian regression and Poisson regression, Yi (2005) presents the expressions of the functions. However, with logistic regression for binary data under error

Author Manuscript

model (2), there exist no analytical functions such that (3) is true (Stefanski, 1989). This nonexistence is basically caused by the terms like involved in the logistic regression. To develop a more flexible method we next propose to introduce proper weights for the estimating functions ij(Yij, Xij, Zij, ℬ) so that a workable version is readily constructed. To be specific, let η(Xij, Zij, ℬ) be a function that does not depend on the

Biometrika. Author manuscript; available in PMC 2017 August 03.

YI et al.

Page 8

Author Manuscript

response Yij or missing data indicators Ri, but could depend on covariates and parameters. Define

such that

If there is a function

Author Manuscript

then is an unbiased estimating function for ℬ in the absence of missing observations. This scheme can be used in other contexts as well. For instance, with binary data Huang & Wang (2001) use this strategy for the logistic regression model, and in unpublished work, the first author has invoked this technique to construct estimating functions for analysis of interval count data. In particular, for binary data with logistic regression, setting leads to

By the moment identities associated with the error model (2), E (Wij | Xij) = Xij,

Author Manuscript

and , hence we take to be

, where the column vector 0 has the dimension of Zij. 3.3. Inverse probability weights adjusting for missingness effects

Author Manuscript

Let . Then is unbiased by the definition of πij. Indeed let ERi, ERi1 and ERij|R̃ij denote the conditional expectations taken with respect to the conditional densities f(ri | Yi, Wi, Zi), f(ri1 | Yi, Wi, Zi) and f(rij | R̂ij, Yi, Wi, Zi). Then

Biometrika. Author manuscript; available in PMC 2017 August 03.

YI et al.

Page 9

Author Manuscript

where the sign … represents the evaluation for the sequence of conditional expectations ERi3|R̃i3, …, ERi,j–1|R̃i,j–1, and the third identity is due to the assumptions made in § 2.4 and the definition of πij. As a result, established in § 3.2. The unbiasedness of

by the unbiasedness of

allows us to express all unbiased estimating functions in the form

for some function ℋ(·). In cases such as ours where

is linear in Yij, so that

Author Manuscript

say, we propose setting ℋ(Wij, Zij, ℬ) = B(Wij,

Zij, ℬ). This is algebraically equivalent to replacing argument Yij with (Rij/πij)Yij in and we thus propose the estimating function

,

where dependence of Φij(ℬ) on the parameter α is suppressed. 3.4. Generalized method of moments across j = 1, …, m, using the

We now optimally combine the functions

Author Manuscript

generalized method of moments (Hansen, 1982). Let

and

. Then a generalized method of moments estimator of ℬ is obtained by minimizing ΦT(ℬ)GΦ(ℬ), where G is a weight matrix. The asymptotically optimal weight matrix G is the inverse of the covariance matrix of Φi(ℬ). Equivalently, the optimal generalized method of moments estimator solves the estimating equations

(4)

where Ψi(ℬ) = DV−1Φi(ℬ),

and

.

Author Manuscript

In implementing (4), we replace D and V by the consistent estimates and Then solving

leads to the estimator ℬ̂ of ℬ.

Biometrika. Author manuscript; available in PMC 2017 August 03.

. Set Ψ̂i(ℬ) = D̂V̂−1Φi (ℬ).

YI et al.

Page 10

3.5. Asymptotic theory

Author Manuscript

When α is known to be α0, say, then under regularity conditions, n1/2(ℬ̂ – ℬ) has an asymptotic multivariate normal distribution with mean zero and covariance matrix , where Γ0 = E{∂Ψi(ℬ, α0)/∂ℬT}. However, when α is unspecified and estimated the variation of α̂ must be taken into account. A sketch of the technical details is given in Appendix A3. In this case, under regularity conditions, n1/2(ℬ̂ – ℬ) has an asymptotic multivariate normal distribution with mean zero and covariance matrix Γ−1(Γ−1)T, where Γ = E{∂Ψi(ℬ, α)/∂ℬT}, and Qi(ℬ, α) = Ψi(ℬ, α) – E{∂Ψi(ℬ, α)/∂αT}[E{∂Si(α)/∂αT}]−1Si(α). Thus, inference on ℬ can be conducted by replacing the asymptotic covariance matrix with its consistent estimate in the asymptotic normal distribution for ℬ̂. All these quantities can be estimated by method of moments calculations. Specifically, as n → ∞, Γ is estimated by the consistent estimator

Author Manuscript

, and Ω is estimated by , where .

Author Manuscript

The development here primarily focuses on accounting for the variation induced by estimation of α, the parameter associated with the missing data process. The dispersion parameter ϕ in the response model and the parameters governing the measurement error process are typically assumed known. One does not, however, have to be restricted by this assumption. It is straight-forward to modify the proof in Appendix A3 to accommodate the variability due to estimation of those parameters when necessary. For instance, if there are replicates of Wij, one may use the method of moments to estimate parameters, say σ, for the measurement error model. Now let Ui(σ) be the corresponding vector of estimating functions for σ from subject i. Then under regularity conditions, n1/2(ℬ̂ – ℬ) has an asymptotic multivariate normal distribution with mean zero, and its asymptotic covariance matrix assumes the same form as before, except for replacing Si(α) with and replacing α with θ = (σT, αT)T. In the same spirit, if the dispersion parameter ϕ is estimated from an estimating function, one can add this function to to work out the asymptotic covariance matrix for n1/2(ℬ̂ – ℬ).

4. Simulation studies 4.1. Comparison with the other methods

Author Manuscript

In this section, we discuss the results of simulation studies meant to assess the performance of our method and contrast these with other analyses which ignore measurement error or missingness or both. We set n = 500 and m = 5 and generated 1000 simulated datasets for each parameter configuration. We generated response measurements Yij independently from the logistic regression model

(5)

Biometrika. Author manuscript; available in PMC 2017 August 03.

YI et al.

Page 11

where Zi takes values 0 or 1 with probability 0.5. Independent of Zi, Xij = (Xij1, Xij2)T was

Author Manuscript

, ) and generated as N(μx, Σx) where μx = (μx1, μx2)T, while Σx has variances ( correlation ρx, with μxr = 0.5 and σxr = 1 for r = 1, 2. We set β0 = −0.1, βx1 = 0.3, βx2 = 0.6 and βz = 0.5. The surrogate value Wij = (Wij1, Wij2)T was generated as N(Xij, Σj), where Σj has variances ( , ) and correlation ρ. Various configurations were considered to feature distinct scenarios of measurement error in the covariate Xij. Specifically we considered (σ1, σ2) = 0.15, 0.5, 0.75 to feature minor, moderate and severe marginal measurement error, and (ρx, ρ) = (0, 0), (0, 0.5), (0.5, 0) or (0.5, 0.5) to allow for various correlations. Since the main conclusions are similar in all cases, we will display only the last case. We considered a case with drop-outs where the missing data indicator is generated from the model

Author Manuscript

(6) where we set α0 = −0.3, αy = 0.5, αw = 0.2 and az = 0.2. This yields about 45% missingness. Four analyses were conducted: (a) the naive one which ignores both covariate measurement error and missing responses, performed using the usual generalized estimating equations method with an independence correlation matrix employed; (b) the analysis that takes missingness into account but ignores measurement error in covariates; (c) the analysis that accounts for measurement error effects but ignores missingness and (d) our method which accommodates both measurement error and missingness.

Author Manuscript

In Table 1, we report on the case that (ρx, ρ) = (0.5, 0.5), the other configurations being similar. We displayed the biases of the estimates, their mean squared errors and also the coverage rates of nominal 95% confidence intervals. As expected the three analyses that do not accommodate measurement error or missingness produce strikingly biased results. Their biases increase as the degree of measurement error increases. Ignoring measurement error or missingness also has a profound impact on coverage rates of confidence intervals, primarily because of the bias, and the coverage rates decrease as the measurement error variance increases. Biases are also affected by the correlations between the true and observed covariates. In contrast, our method is very satisfactory, with coverage rates close to the nominal 95%.

Author Manuscript

These empirical studies demonstrate that ignoring either missingness or measurement or both could result in visibly biased results. It is important to adjust for effects induced by measurement error and missing data in inferential procedures, as does our method. 4.2. Sensitivity of our method In this section, we assess the sensitivity of our method. In particular, we consider the case that the missing data model is misspecified. To be specific, the missing data indicator for drop-outs is generated from the model which depends on the true covariates Xi through

Biometrika. Author manuscript; available in PMC 2017 August 03.

YI et al.

Page 12

Author Manuscript

(7)

where the parameter values are the same as in §4.1. However, when fitting the data, we use the model (6) that depends on the observed surrogate variable Wi. All other aspects of the data generation process are the same as in §4.1 with the same parameter values, except when one parameter value is changed intentionally to allow us to study its effect. We performed extensive numerical experiments, but here we discuss just one case, when the true covariates Xij1 and Xij2 are correlated with common variance

and correlation 0.5,

and the measurement errors have correlation ρ = 0.5 and common variance

.

Author Manuscript

First we examined the sensitivity of estimation of β relative to the error degree. In particular, we varied the measurement error variance scenarios. Because the covariate variance already nontrivial, while at

from 0.2 to 1.6 to represent a wide range of , when

, the corresponding error is

the error variance is as large as the covariate variance, i.e.,

, the measurement error dominates the noise and the signal are roughly the same. For the signal, so recovering the information in the covariates is not an easy task. The results on βx1 are displayed in Fig. 1(a). It is seen that if the measurement error is not too large, our method possesses a surprising robustness property, in that the average estimate is quite close to the true value. As expected when the measurement error increases, the bias increases.

Author Manuscript

We also experimented with different true values for βx1 in (5) and αw in (7). Figures 1(b) and (c) contain the corresponding estimates of βx1 and their confidence intervals when the αw or βx1 value changes, while all other parameters are kept at the original values. Results in Fig. 1 clearly suggest that the robustness property we observed does not rely on the specific values of the parameters in either model.

5. Empirical example As an illustration, we applied our method to analyse a dataset arising from the Continuing Survey of Food Intake by Individuals (Agricultural Research Service, 1997). The dataset consists of repeated measurements for 1737 individuals with 24 hour recall food intake interviews taken on four different days. Information on age, vitamin A intake, vitamin C intake, total fat intake and total calorie intake is collected at each interview.

Author Manuscript

Individuals with high levels of fat in their diet have higher risks of outcomes such as obesity and cancers. Let Y be the binary response variable to indicate whether or not an individual's reported percentage of calories exceeds 35%, a threshold that had been previously established see Food and Nutrition Board (2005, Ch. 8). Here we study how the fat intake changes with age and how it is associated with the intakes of vitamin A and vitamin C. Often in nutritional epidemiology, 24 hour recalls on caloric intake are treated as missing when the values are physiologically implausible. While there is no universal agreement what values should be taken as implausible (Tooze et al., 2007), Beasley et al. (2008) state that Biometrika. Author manuscript; available in PMC 2017 August 03.

YI et al.

Page 13

Author Manuscript

reported intakes of less than 500 calories for women and less than 800 calories for men are implausibly low. Here we took their definition, and this yields about 4% missingness of the Yij in the data we analysed. We considered the logistic regression model

where Zi denotes the age for subject i, and Yij, Xij1 and Xij2 represent the response, and intakes of vitamins A and C for subject i at interview j, respectively.

Author Manuscript

Vitamins A and C are measured with substantial random error. We set Wij1 to be the logarithm of 0.005 plus the standardized reported vitamin A intake, Wij2 to be similarly defined for reported vitamin C intake and Zi to be the baseline age in years divided by 100. The transformation on the raw scales of the vitamins A and C values allows us to assume a normal error distribution, with the Kolmogorov–Smirnov test yielding a p-value of around 0.2. However, the study does not have sufficient information to estimate the covariance matrix of the measurement errors directly. To obtain an approximate assessment, we first treated the four measurements of the vitamins A and C intakes as repeated measurements of

Author Manuscript

, and their correlation ρ = 0.36. an average intake value, and obtained that Considering that the estimated variances we have obtained in fact contain two sources of variability, the variability of the true vitamin intake near the time of the visits and the measurement error variability, yet we do not have sufficient information to separate these two, we decided to allocate half to each, so that the measurement error variances are 0.45 and 0.42. Using this, we performed the corresponding analyses, and the results are reported in Table 2. The analysis shows a significantly positive correlation between vitamin A intake and over-consumption of fat, while this association is negative for vitamin C. Considering that common sources of vitamin A are meat and animal organs, while that of vitamin C are vegetables and fruits, these results are perhaps plausible. The consequence of ignoring the measurement error is attenuation towards zero, while ignoring the missingness seems to result in slightly overestimating the covariate effects. We also conducted a sensitivity analysis to assess how different degrees of measurement error correlation may affect estimation of the parameters. In this analysis, we fixed the error variances as described above, and let the correlation vary from 0 to 1. Figure 2 shows that the error correlation does have an effect on the quantitative level of the estimation, although its effect is not dramatic and does not alter the qualitative aspects.

Author Manuscript

6. Extensions We have emphasized the marginal generalized linear model together with normally distributed measurement error and corrected scores for functional measurement error estimation. However, our basic methodology applies far more generally. For example, suppose that marginally, we propose a parametric model at each time-point for Yij given (Xij, Zij) in terms of a parameter ℬ. Suppose further that one has a marginal parametric

Biometrika. Author manuscript; available in PMC 2017 August 03.

YI et al.

Page 14

Author Manuscript

model for the measurement error in Wij given (Xij, Zij), in terms of parameters νj that are either known or estimated at the n1/2-rate: this does not have to be an additive measurement error model, nor does it have to be a normal measurement error model. Then Tsiatis & Ma (2004) show how to construct an unbiased estimating function for ℬ based on the observed data, generically denoted here as (Yij, Wij, Zij, ℬ, ν1, …, νm). Replacing with (Yij, Wij, Zij, , ν1, …, νm), all our developments in § 3 carry through. Hall & Ma (2007) show how to extend this to the case that the error model is additive but estimated nonparametrically through replication. Even more generally, as long as one can construct a marginal, unbiased estimating function, then our approach applies. Furthermore, if a large number of covariates are present and there is need to select a subset from it, the method in Ma & Li (2010) can be used to perform variable selection and estimation simultaneously.

Author Manuscript

Although extensive research has been directed to analysis of longitudinal data with either missing values or measurement error, those methods cannot be immediately applied to handle data with both features. Simultaneously, addressing missingness and measurement error is more challenging because these two characteristics could interactively affect inference about response parameters. Our method has applications in a broad variety of settings. It can be directly applied to deal with correlated data, such as clustered data and multivariate data, with missing observations and covariate measurement error. Our method can also be readily modified to handle data with more complex association structures, such as longitudinal data arising in clusters or longitudinal multivariate data.

Acknowledgments Author Manuscript

The authors thank the referees for their helpful comments. Yi's research was supported by the Natural Sciences and Engineering Research Council of Canada. Ma's research was supported by the National Science Foundation and the National Institute of Neurological Disorders and Stroke. Carroll's research was supported by the National Cancer Institute, the National Institute of Neurological Disorders and Stroke and the King Abdullah University of Science and Technology.

Appendix

Appendix

A1. Proof of Lemma 1

Author Manuscript

We consider a simple counterexample where m = 2 and there is no Z. Suppose that Yij = Xi,jβ + ∊ij, j = 1, 2, (∊i1, ∊i2)T ∼ N(0, Σ∊) and (∊i1, ∊i2)T is independent of (Xi1, Xi2)T. Then the Type I assumption holds for the regression mean of Yij on (Xi1, Xi2), j = 1, 2. Suppose further that (Xi1, Xi2) ∼ N(0, I) where I is the 2 × 2 identity matrix, and that (Wi1, Wi2)T = (Xi1, Xi2)T + (ei1, ei2)T where (ei1, ei2) ∼ N(0, Σe) with Σe being not diagonal, and (Xi1, Xi2)T and (ei1, ei2)T are independent. Then it is easily seen that as long as Σe is not diagonal and β ≠ 0, the Type I assumption fails for the observed data. For example, since Xij has mean zero, cov(Yi1, Yi2) = β2I + Σe, cov(Wi1, Wi2) = I + Σe and cov{(Yi1, Yi2), (Wi1, Wi2)} = βI. The joint covariance matrix of (Yi1, Yi2, Wi1, Wi2) is

Biometrika. Author manuscript; available in PMC 2017 August 03.

YI et al.

Page 15

Author Manuscript

Hence, E{(Yi1, Yi2)T | (Wi1, Wi2)} = β(I + Σe)−1(Wi1, Wi2)T. If the diagonal elements of Σe both equal 1, and if the correlation coefficient of ei1 and ei2 is ρe, then the regression mean of Yi1 on (Wi1, Wi2) is Lemma 1.

Appendix

, not a function of Wi1 alone. This proves

A2. Error effect on the model for the missing data process

Author Manuscript

Lemma 2 follows from the following example that the underlying missingness process is missing-at-random in Xi but not in Wi. There are no Zi covariates. Consider a case with drop-outs which is modelled by a probit regression model

(A1)

where F(·) is the standard normal cumulative distribution function. Conditional on Yi and X, the Rij are assumed to be independent. Assume that conditional on Xi, the Yij are independent having a N(β0 + βxXij, σ2) distribution, and that the Wij are independent having distribution. Assume that the Xij are independent having a marginal

a distribution

.

Author Manuscript

Now we show that although (A1) corresponds to missing at random if Xi were available, the conditional distribution of the missing data indicator given the observed surrogate Wi is not missing-at-random. Indeed with the assumption analogous to nondifferential error that pr(Rij = 1 | Yi, Xi, Wi) = pr(Rij = 1 | Yi, Xi), we obtain

Since Xij (Yij, Wij) ∼ N(a + bYij + cWij, d2) for some constants a, b, c, d that are determined by β0, βx, , , σ2 and μx, then using the identity ∫(1/γ)F(δ + u)ϕ(u/γ) du = F{δ/(1 + γ2)1/2} for constants γ and δ, where ϕ(·) is the standard normal density function, we can

Author Manuscript

show that . The dependence of pr(Rij = 1 | Yi, Wi) on the response measurement Yij at time-point j shows that the missingness probability depends on a future unobserved response component, suggesting that a missing-at-random structure is not true for the conditional distribution pr(Rij = 1 | Yi, Wi).

Biometrika. Author manuscript; available in PMC 2017 August 03.

YI et al.

Page 16

Author Manuscript

Appendix A3. Sketch of the arguments for the asymptotic theory Let . Then E{Hi(ℬ, α)} = 0. By Newey & McFadden (1993, Theorem 3.4), under standard regularity conditions there is a unique solution (ℬ̂, α̂) to the equation

with probability approaching 1, that satisfies

. For the estimator ℬ of central interest, we have the asymptotic distribution for n1/2(ℬ̂ – ℬ).

. The central limit theorem then leads to

References Author Manuscript Author Manuscript Author Manuscript

Agricultural Research Service. Design and Operation: The Continuing Survey of Food Intakes by Individuals and the Diet and Health Knowledge Survey, 1994–96. US Department of Agriculture; Hyattsville, MD: 1997. Beasley JM, Riley WT, David A, Singh J. Evaluation of a PDA-based dietary assessment and intervention program: a randomized controlled trial. J Am College Nutr. 2008; 27:280–86. Carroll, RJ., Ruppert, D., Stefanski, LA., Crainiceanu, CM. Measurement Error in Nonlinear Models: A Modern Perspective. 2nd. New York: Chapman & Hall/CRC Press; 2006. Cook J, Stefanski LA. A simulation extrapolation method for parametric measurement error models. J Am Statist Assoc. 1995; 89:1314–28. Food and Nutrition Board. Dietary Reference Intakes for Energy, Carbohydrate, Fiber, Fat, Fatty Acids, Cholesterol, Protein, and Amino Acids (Macronutrients). Washington, D.C.: The National Academies Press; 2005. Hall P, Ma Y. Measurement error models with unknown error structure. J R Statist Soc B. 2007; 69:429–46. Hansen LP. Large sample properties of generalized method of moments estimators. Econometrica. 1982; 50:1029–54. Huang Y, Wang CY. Consistent functional methods for logistic regression with errors in covariates. J Am Statist Assoc. 2001; 96:1469–82. Lai TL, Small D. Marginal regression analysis of longitudinal data with time-dependent covariates: a generalized method of moments approach. J R Statist Soc B. 2007; 69:79–99. Li E, Zhang D, Davidian M. Conditional estimation for generalized linear models when covariates are subject-specific parameters in a mixed model for longitudinal measurements. Biometrics. 2004; 60:1–7. [PubMed: 15032767] Li E, Wang N, Wang NY. Joint models for a primary endpoint and multiple longitudinal covariate processes. Biometrics. 2007; 63:1068–78. [PubMed: 17501940] Liang H. Generalized partially linear mixed-effects models incorporating mismeasured covariates. Ann Inst Statist Math. 2009; 61:27–46. Lin X, Carroll RJ. Semiparametric estimation in general repeated measures problems. J R Statist Soc B. 2006; 68:68–88. Liu W, Wu L. Simultaneous inference for semiparametric nonlinear mixed-effects models with covariate measurement errors and missing responses. Biometrics. 2007; 63:342–50. [PubMed: 17688487] Ma Y, Li R. Variable selection in measurement error models. Bernoulli. 2010; 16:274–300. [PubMed: 20209020] Nakamura T. Corrected score functions for errors-in-variables models: methodology and application to generalized linear models. Biometrika. 1990; 77:127–37.

Biometrika. Author manuscript; available in PMC 2017 August 03.

YI et al.

Page 17

Author Manuscript Author Manuscript Author Manuscript

Newey, WK., McFadden, D. Estimation in large samples. In: McFadden, D., Engler, R., editors. Handbook of Econometrics. Vol. 4. Amsterdam: North-Holland: 1993. Palta M, Lin CY. Latent variables, measurement error and methods for analysing longitudinal binary and ordinal data. Statist Med. 1999; 18:385–96. Pan W, Zeng D, Lin X. Estimation in semiparametric transition measurement error models for longitudinal data. Biometrics. 2009; 65:728–36. [PubMed: 19173696] Pepe MS, Anderson GL. A cautionary note on inference for marginal regression models with longitudinal data and general correlated response data. Commun Statist B. 1994; 23:939–51. Pepe MS, Couper D. Modeling partly conditional means with longitudinal data. J Am Statist Assoc. 1997; 92:991–8. Stefanski LA. Unbiased estimation of a nonlinear function of a normal mean with application to measurement error models. Commun Statist A. 1989; 18:4335–58. Stefanski LA, Carroll RJ. Conditional scores and optimal scores in generalized linear measurement error models. Biometrika. 1987; 74:703–16. Tooze JA, Vitolins MZ, Smith SL, Arcury TA, Davis CC, Bell RA, DeVellis RF, Quandt SA. High levels of low energy reporting on 24-hour recalls and three questionnaires in an elderly lowsocioeconomic status population. J Nutr. 2007; 137:1286–93. [PubMed: 17449594] Tsiatis AA, Ma Y. Locally efficient semiparametric estimators for functional measurement error models. Biometrika. 2004; 91:835–48. Wang CY, Huang Y, Chao EC, Jeffcoat MK. Expected estimating equations for missing data, measurement error, and misclassification, with application to longitudinal nonignorable missing data. Biometrics. 2008; 64:85–95. [PubMed: 17608787] Wang N, Lin X, Gutierrez RG, Carroll RJ. Generalized linear mixed measurement error models. J Am Statist Assoc. 1998; 93:249–61. Xiao Z, Shao J, Palta M. GMM in linear regression for longitudinal data with multiple covariates measured with error. J Appl Statist. 2010; 37:791–805. Yi GY. Robust methods for incomplete longitudinal data with mismeasured covariates. Far East J Theor Statist. 2005; 16:205–34. Yi GY. A simulation-based marginal method for longitudinal data with dropout and mismeasured covariates. Biostatistics. 2008; 9:501–12. [PubMed: 18199691] Yi GY, Liu W, Wu L. Simultaneous inference and bias analysis for longitudinal data with covariate measurement error and missing responses. Biometrics. 2011; 67:67–75. [PubMed: 20528858] Zhou Y, Liang H. Statistical inference for semiparametric varying-coefficient partially linear models with error-prone linear covariates. Ann Statist. 2009; 37:427–58.

Author Manuscript Biometrika. Author manuscript; available in PMC 2017 August 03.

YI et al.

Page 18

Author Manuscript Fig. 1.

Author Manuscript

Sensitivity analysis of our method. The three panels display the average estimates of βx1 (dotted curves) and 95% confidence intervals (dashed curves) as functions of and βx1 (c). Solid lines correspond to the true values of βx1.

Author Manuscript Author Manuscript Biometrika. Author manuscript; available in PMC 2017 August 03.

(a), αw (b)

YI et al.

Page 19

Author Manuscript Author Manuscript

Fig. 2.

Sensitivity analyses for the empirical example in § 5. (a) contains estimates of βx1 (solid curve) and 95% confidence intervals (dashed curves), displayed as a function of the correlation coefficient ρ. (b) contains the same information for βx2.

Author Manuscript Author Manuscript Biometrika. Author manuscript; available in PMC 2017 August 03.

Author Manuscript Table 1

Author Manuscript

Author Manuscript

Author Manuscript

Biometrika. Author manuscript; available in PMC 2017 August 03.

No missing adjusted

No error adjusted

Naive

Our method

No missing adjusted

No error adjusted

Naive

Our method

No missing adjusted

No error adjusted

Naive

Method

x1

−0.11

−0.08

β −0.03 −0.00 0.02

−0.19 −0.02 −0.14 −0.02 −0.06 −0.02

βz βx1 βz

βx1 βz βx1 βz βx1 βz

x1

−0.00

βz

x1

β

−0.17 −0.01

x1

0.01

0.00

βz

β

βz

x1

−0.03

β

−0.08

βz

4.24

15.99

1.32

2.28

1.10

3.92

σ 1 = σ2 = 0.75

1.82

1.20

1.64

3.42

1.40

1.56

1.18

3.24

σ1 = σ2 = 0.50

1.61

0.85

1.50

1.27

1.45

0.01

βz βx1

1.21

2.07

0.72

x1

−0.01

−0.13

MSE

σ1 = σ2 = 0.15

−0.04

β

βz

β

Bias

86.9

68.0

95.0

30.6

88.0

0.3

95.0

94.8

87.4

70.8

94.1

58.5

85.8

3.0

94.8

94.3

87.5

67.4

94.9

89.8

86.5

23.8

CVG (%)

β0

βx2

β0

βx2

β0

βx2

β0

βx2

β0

βx2

β0

βx2

β0

βx2

β0

βx2

β0

−0.72

−0.08

0.17

−0.27

−0.70

−0.38

0.01

0.01

−0.71

−0.14

0.13

−0.21

−0.73

−0.34

0.01

0.01

−0.70

−0.15

0.04

β0 βx2

−0.08

−0.80

−0.26

βx2

β0

βx2

Bias

57.23

18.51

3.76

7.55

50.10

14.88

1.13

1.44

52.36

5.48

2.48

4.85

54.64

12.13

0.96

0.99

50.10

2.87

0.96

1.20

64.42

7.05

MSE

0.5

50.8

43.1

0.7

0.0

0.0

93.6

95.5

0.0

49.9

64.9

12.3

0.0

0.0

93.7

95.0

0.0

39.6

91.0

80.8

0.0

0.9

CVG (%)

Results of the simulation study when both the covariates and the measurement errors have nondiagonal covariance matrices with correlation ρx = ρ = 0.5

YI et al. Page 20

Author Manuscript 0.01

βz 95.7

95.3

β0

βx2 0.03

−0.00 5.01

2.35

MSE

94.2

96.1

CVG (%)

), the measurement error variances; BIAS, the bias; CVG, the actual coverage of a nominal 95% confidence interval. Naive ignores both missing data and

1.78

3.16

Bias

measurement error; No error adjusted accommodates missing data but ignores the measurement error; No missing adjusted adjusts for measurement error but ignores the missing data; Our method accounts for both features.

,

−0.02

βx1

MSE, 100× mean squared error; (

Our method

CVG (%)

Author Manuscript MSE

Author Manuscript

Bias

Author Manuscript

Method

YI et al. Page 21

Biometrika. Author manuscript; available in PMC 2017 August 03.

Author Manuscript

Author Manuscript

0.17

−0.12

βx1

βx2

0.39

0.03

0.03

0.14

Naive

s.e.

0.29

0.00

0.00

0.14

p-value s.e.

p-value

0.42

−0.13

0.12

0.21

0.39

0.03

0.03

0.15

0.29

0.00

0.00

0.14

No error adjusted

est

s.e.

p-value

0.60

−0.31

0.39

0.31

0.40

0.06

0.07

0.15

0.13

0.00

0.00

0.04

No missing adjusted

est

0.57

s.e.

p-value

0.41

0.07

0.07

0.16

0.16

0.00

0.00

0.09

Our method

−0.31

0.28

0.26

est

est, the estimate; s.e., the standard error of the estimate; p-value, the p-value when testing whether the corresponding coefficient is zero; Naive ignores both missing data and measurement error; No error adjusted accommodates missing data but ignores the measurement error; No missing adjusted adjusts for measurement error but ignores the missing data; Our method accounts for both features.

041

0.21

est

β0

βz

Author Manuscript Table 2

Author Manuscript

Results for data analysis of § 5 using (2) as the measurement error structure

YI et al. Page 22

Biometrika. Author manuscript; available in PMC 2017 August 03.

A functional generalized method of moments approach for longitudinal studies with missing responses and covariate measurement error.

Covariate measurement error and missing responses are typical features in longitudinal data analysis. There has been extensive research on either cova...
NAN Sizes 0 Downloads 14 Views