Behav Res DOI 10.3758/s13428-014-0444-4

iVAR: A program for imputing missing data in multivariate time series using vector autoregressive models

Siwei Liu & Peter C. M. Molenaar

© Psychonomic Society, Inc. 2014

Abstract This article introduces iVAR, an R program for imputing missing data in multivariate time series on the basis of vector autoregressive (VAR) models. We conducted a simulation study to compare iVAR with three methods for handling missing data: listwise deletion, imputation with sample means and variances, and multiple imputation ignoring time dependency. The results showed that iVAR produces better estimates of the cross-lagged coefficients than do the other three methods. We demonstrate the use of iVAR with an empirical example of time series electrodermal activity data and discuss the advantages and limitations of the program.

Keywords Time series · Vector autoregressive model (VAR) · Missing data

S. Liu (*)
Human Development and Family Studies, Department of Human Ecology, University of California, Davis, Davis, CA 95616, USA
e-mail: [email protected]

P. C. M. Molenaar
Department of Human Development and Family Studies, Pennsylvania State University, University Park, PA, USA

With the increasing emphasis on studying intraindividual processes and the development of intensive data collection technologies such as personal digital assistants (PDAs), smartphones, and ambulatory sensors, time series analysis has gained enormous popularity in the behavioral sciences in recent years. Time series data are sequences of data points measured repeatedly from the same individual(s) over time. Unlike traditional longitudinal data, time series data typically involve a larger number of time points than of individuals, making them ideal for idiographic examinations of behavioral and psychological phenomena. In psychology, time series data usually come from studies of psychophysiology (e.g., electrodermal activity, heart rate, electroencephalography, and functional magnetic resonance imaging), daily diary studies, or ecological momentary assessments. Applications of time series analysis are seen in studies of emotions (e.g., Chow, Nesselroade, Shifren, & McArdle, 2004), substance use (e.g., Velicer, Redding, Richmond, Greeley, & Swift, 1992; Zheng, Wiebe, Cleveland, Molenaar, & Harris, 2013), treatment adherence patterns (e.g., Aloia et al., 2008), social interactions (e.g., Belz, Beekman, Molenaar, & Buss, 2013; Ferrer & Helm, 2013; Molenaar, Sinclair, Rovine, Ram, & Corneal, 2009), kinesiology (e.g., Wang, Molenaar, & Newell, 2013), and many other aspects of human behavior.

This article introduces an R program, iVAR, for imputing missing data in multivariate time series. Missing data are a common problem in time series analysis, due to attrition, noncompliance of participants, or technical problems during physiological recordings (e.g., disconnection of a sensor). Advanced techniques for handling missing data, such as full information maximum likelihood (FIML) estimation and multiple imputation (MI), have been developed for cross-sectional and longitudinal panel data and can be easily carried out in standard statistical packages (e.g., SPSS and SAS) or standalone software (e.g., NORM; Graham, 2012; Schafer, 1997). FIML reads in the raw data one case at a time and maximizes the likelihood function using whatever information is available for each case. The MI approach first imputes multiple data sets from random samples of the population using techniques such as bootstrapping (Efron, 1982) or data augmentation (Tanner & Wong, 1987), then combines the results from the imputed data sets using Rubin's rules (Rubin, 1987). When conducted properly, both FIML and MI enable researchers to make valid statistical inferences when data are missing at random (Graham, 2009). However, these techniques either have limitations or are difficult to carry out in the time series context.
For example, many time series models can be estimated in structural equation modeling (SEM) programs, which handle missing data using FIML.


However, the SEM approach requires the model to be fully specified, and existing programs do not have an automatic order selection procedure. With respect to MI, although existing programs can appropriately impute missing values in longitudinal panel data (e.g., PAN; Schafer, 2001), they cannot handle missing data in time series. This is because longitudinal panel data and time series data have fundamentally different structures, despite both involving repeated measures over time. In longitudinal panel data, time is fixed, whereas in time series, time is random. In addition, in longitudinal panel data, information is pooled over the units of analysis (e.g., individuals) to impute missing data, whereas time series typically have only one unit of analysis, and information can only be pooled over time.

Due to the difficulty of implementing the FIML and MI approaches, the options available for handling missing data in time series are limited. Most statistical packages either do not allow missing data in time series analysis or allow only ad-hoc procedures such as listwise deletion and mean imputation, which often lead to bias in the estimates.

In this article, we present a new method and an accompanying program, iVAR, for handling missing data in multivariate time series. iVAR uses one-step-ahead predictions to impute missing data on the basis of vector autoregressive (VAR) modeling. In a simulation study, we show that iVAR produces better estimates than do ad-hoc procedures such as listwise deletion and imputation using sample means and variances, as well as the naïve approach of conducting MI on time series data while ignoring the time dependency (i.e., treating observations at different time points as independent). We then demonstrate the use of iVAR with an empirical example and discuss the advantages and limitations of the program.

The vector autoregressive model

The vector autoregressive (VAR) model is an extension of the univariate autoregressive (AR) model and the most commonly used approach for analyzing stationary multivariate time series when no latent variables are involved (as in general state-space models). Any stationary multivariate process can be approximated arbitrarily closely by means of a VAR model of sufficiently large order (Hannan, 1970). The general form of a VAR model of order p is

$$Y_t = V + A_1 Y_{t-1} + A_2 Y_{t-2} + \dots + A_p Y_{t-p} + U_t, \qquad (1)$$

where $Y_t$ is an m × 1 vector of observed data at time t, the $A_k$ are m × m matrices of autoregressive coefficients at lag k, $V$ is an m × 1 vector of intercepts, and $U_t$ represents an m-dimensional white noise process with $E(u_t) = 0$, $E(u_t u_t') = \Sigma_u$, and $E(u_t u_s') = 0$ for $s \neq t$ (Lütkepohl, 2006). The diagonal coefficients in $A_k$ capture the serial dependency of a variable on itself, and the off-diagonal coefficients capture the serial dependency of a variable on other variables. It can be shown that any VAR(p) process can be written as a first-order VAR process by extending the dimension of $Y_t$ (Lütkepohl, 2006). In addition, the means of the data can be removed prior to the analysis by centering the data. Hence, a simplified VAR(1) process with two variables can be written as

$$\begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} y_{1(t-1)} \\ y_{2(t-1)} \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}. \qquad (2)$$

Here, a11 and a22 are the lag-1 autoregressive coefficients of y1 and y2 predicting themselves, a12 is the lag-1 coefficient of y2 predicting y1, and vice versa for a21. Notably, a12 and a21 are measures of Granger causality, which refers to the idea that lagged values of a time series are useful in predicting another time series after all other relevant information (e.g., history of the dependent time series) is taken into account (Granger, 1969; Lütkepohl, 2006). In empirical studies, assessing Granger causality allows us to determine the direction of influences between multiple time series, such as electroencephalography (EEG) signals recorded from different channels or measures obtained from different individuals simultaneously. Therefore, the off-diagonal coefficients in the A matrices are often the parameters of interest in psychological studies involving time series data.
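A process of the form in Eq. 2 is straightforward to simulate. The sketch below is illustrative Python (the article's own tooling is R, and all names here are ours); it generates a bivariate VAR(1) series with the unidirectional structure used later in the simulation study (a12 = 0, a21 = .3):

```python
import numpy as np

rng = np.random.default_rng(0)

# Lag-1 coefficient matrix of the bivariate VAR(1) in Eq. 2;
# a12 = 0 and a21 = 0.3 give a unidirectional effect from y1 to y2.
A1 = np.array([[0.4, 0.0],
               [0.3, 0.4]])

def simulate_var1(A, T, burn_in=100):
    """Simulate a zero-mean VAR(1) series of length T with unit-variance white noise."""
    m = A.shape[0]
    y = np.zeros(m)
    out = np.empty((T, m))
    for t in range(T + burn_in):
        y = A @ y + rng.standard_normal(m)  # y_t = A1 y_{t-1} + u_t
        if t >= burn_in:                    # discard burn-in so the series is stationary
            out[t - burn_in] = y
    return out

data = simulate_var1(A1, T=480)
print(data.shape)  # (480, 2)
```

Regressing each row on the previous row by least squares recovers A1 up to sampling error, including the zero in the a12 position.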

iVAR and alternative methods for handling missing data in time series

iVAR ("imputation with VAR"; the program can be downloaded from http://siweiliu.weebly.com/programs.html) utilizes a recursive procedure that imputes missing values on the basis of a VAR model estimated from a moving window containing previous observations. Specifically, users first specify a window width n. Missing values at time t are then replaced by one-step-ahead predictions using a VAR model based on the observations in the window [t − n, t − 1]. For example, for a first-order bivariate time series with means of zero, the one-step-ahead predictions can be computed as

$$\begin{pmatrix} \hat{y}_{1t} \\ \hat{y}_{2t} \end{pmatrix} = \begin{pmatrix} \hat{a}_{11} & \hat{a}_{12} \\ \hat{a}_{21} & \hat{a}_{22} \end{pmatrix} \begin{pmatrix} y_{1(t-1)} \\ y_{2(t-1)} \end{pmatrix}, \qquad (3)$$

where

$$\begin{pmatrix} \hat{a}_{11} & \hat{a}_{12} \\ \hat{a}_{21} & \hat{a}_{22} \end{pmatrix}$$


is the estimated A1 matrix. The estimation of the VAR is conducted using the vars package in R (Pfaff, 2008), and the order of the VAR is selected using the Akaike information criterion (AIC; Akaike, 1974). This automatic order selection procedure ensures that the VAR model used for prediction represents a faithful description of the sequential structure in the data within each time window. In iVAR, this recursive imputation routine runs from the beginning of the time series toward the end, filling in each missing value.

Notably, iVAR also allows users to specify whether the data need to be detrended. Trends are removed by fitting a polynomial function to each time series. One-step-ahead predictions are then made on the basis of the VAR model fitted to the residuals. The imputed value is computed by summing the predicted value from the polynomial regression model and the one-step-ahead predicted value from the VAR model. For simplicity, in this article we focus on data without trends.

To examine the utility of iVAR, we conducted a simulation study in which we compared iVAR with two popular ad-hoc techniques for handling missing data. The first ad-hoc technique was listwise deletion, in which the whole vector of observations at a particular time point is deleted if one of its elements is missing. This approach shortens the data and results in a decreased sample size. In time series analyses, in which the lag between data points is important, listwise deletion can also distort the autoregressive relations in the data. Despite these drawbacks, the method is used extensively by researchers. In R, listwise deletion is the only option for handling missing data when estimating autoregressive models using the basic stats package (R Development Core Team, 2011).
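The recursive routine described above can be sketched as follows. This is illustrative Python, not the authors' R implementation: the function name is ours, the order is fixed at 1 rather than selected by AIC, no detrending is done, and the first `window` rows are assumed complete (iVAR similarly needs a complete starting stretch to estimate its first model):

```python
import numpy as np

def impute_var1(data, window=100):
    """Fill NaNs by one-step-ahead VAR(1) predictions from a moving window.

    Each time a row with missing entries is met, a VAR(1) model is
    re-estimated by least squares from the `window` preceding rows
    (complete, possibly because they were imputed earlier), and the
    missing entries are replaced by A_hat @ y_{t-1} (cf. Eq. 3).
    """
    data = data.copy()
    T, m = data.shape
    for t in range(window, T):
        if not np.isnan(data[t]).any():
            continue
        X = data[t - window:t - 1]      # predictors: y_{s-1} within the window
        Y = data[t - window + 1:t]      # responses:  y_s within the window
        A_hat = np.linalg.lstsq(X, Y, rcond=None)[0].T
        pred = A_hat @ data[t - 1]      # one-step-ahead prediction
        mask = np.isnan(data[t])
        data[t, mask] = pred[mask]      # keep observed entries, fill the rest
    return data
```

Because the loop runs forward, a value imputed at time t becomes available as a predictor for later time points, which is also why prediction error can accumulate toward the end of the series.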
In SAS, users can choose only among listwise deletion, replacing missing values with the first contiguous observations, and mean imputation when estimating VAR models (SAS Institute Inc., 2010).

Two studies have examined the consequences of listwise deletion in the time series context. Rankin and Marsh (1985) found that the negative effects of this approach are small as long as the missing data do not exceed 20 %. Velicer and Colby (2005) conducted a simulation study based on an ARIMA (autoregressive integrated moving average) model and found that listwise deletion generally yields accurate estimates of intercepts and error variances, but that it tends to overestimate the linear trend in the data. It also underestimates the degree of dependency when the autocorrelation is negative. Both studies, however, considered only univariate time series; it is unclear how listwise deletion affects the estimation of VAR models, which involve multiple time series.

The second ad-hoc technique considered in this study is imputation using sample means and variances. Specifically, for each variable, missing values are substituted by the sum of two components: (1) the mean of the nonmissing observations, and (2) a random component that is normally distributed with a mean of zero and a variance

equivalent to the variance of the nonmissing observations. As compared to listwise deletion, this approach retains the sample size, and thus may be more appealing when the proportion of missing data is large. However, the autoregressive relations may be distorted, because the random component is assumed to have no serial dependency.

We also compared iVAR to a naïve approach of MI in which the time dependency in the data is ignored and observations are treated as independent across time. In this approach, data are imputed solely on the basis of the lag-0 covariance matrix; hence, we would expect to find a distortion in the autoregressive relations between variables.
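For contrast, imputation using sample means and variances can be sketched as follows (illustrative Python; the function name is ours). Note that the imputed noise is serially independent, which is exactly what distorts the autoregressive relations:

```python
import numpy as np

def impute_mean_variance(data, seed=0):
    """Replace each NaN by the column mean plus normal noise whose
    variance equals the variance of that column's nonmissing values."""
    rng = np.random.default_rng(seed)
    data = data.copy()
    for j in range(data.shape[1]):
        col = data[:, j]                 # view into the copied array
        miss = np.isnan(col)
        mu = col[~miss].mean()
        sd = col[~miss].std(ddof=1)
        col[miss] = mu + sd * rng.standard_normal(miss.sum())
    return data
```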

A simulation study

To compare the four methods, we simulated time series data in R 2.11.1 (R Development Core Team, 2011) on the basis of the bivariate VAR(1) model shown in Eq. 2. All data were generated from processes with a unidirectional effect from y1 to y2 (a21 ≠ 0, a12 = 0). The simulation of A1 followed a fully crossed design with three factors. The first factor was Magnitude of a11 and a22; we set these two parameters to be identical, with absolute values of 0.4 or 0.8. The second factor was Sign of a11 and a22, which could be either positive or negative. The third factor was Sign of a21; we set this parameter to either 0.3 or −0.3. In total, we created eight (2 × 2 × 2) different patterns for A1. These patterns are shown in Table 1 and approximate a wide range of autoregressive patterns seen in empirical studies. The error process was set to be a bivariate white noise process with unit variances and zero covariances:

$$\Sigma_u = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

The length of the time series was 240.² Because iVAR required a starting model, the data that we actually simulated had 480 time points. Missing data were created by randomly eliminating values from the last 240 time points of the time series, with the proportion of missing data being 10 %, 20 %, or 30 %. For each pattern of A1, we simulated 100 sets of data.

We aimed to compare the four missing-data techniques in their abilities to recover the cross-lagged coefficients, a12 and a21. For MI, 20 sets of data were imputed for each set of missing data using the mi package in R (Su, Gelman, Hill, & Yajima, 2011), and the averaged estimates over the 20 imputed data sets were used in the comparison, as suggested by Rubin (1987). Model estimation was conducted in R

² The length of the data was arbitrary. We discuss the effect of data length on iVAR later in this article.
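The missingness mechanism of the simulation can be sketched as follows (illustrative Python; the function name is ours, and deleting individual cells rather than whole rows is our reading of "randomly eliminating values"):

```python
import numpy as np

def make_missing(data, prop, start=240, seed=42):
    """Set a proportion `prop` of the cells in rows start..T-1 to NaN,
    chosen uniformly at random, as in the simulation design."""
    rng = np.random.default_rng(seed)
    data = data.astype(float).copy()
    T, m = data.shape
    n_cells = (T - start) * m                       # cells eligible for deletion
    k = int(round(prop * n_cells))
    idx = rng.choice(n_cells, size=k, replace=False)
    data[idx // m + start, idx % m] = np.nan        # map flat indices to (row, col)
    return data
```

Applied to a 480 × 2 series with prop = .2, this leaves the first 240 rows intact and deletes 96 of the remaining 480 cells.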

Table 1 Estimated VAR models on the complete data, broken down by conditions

A1 [a11 a12; a21 a22]    VAR(1)a    DTF12b           DTF21b
[−.4  0; −.3  −.4]       88 %       .0529 (.0395)    .2827 (.0953)
[−.4  0;  .3  −.4]       90 %       .0459 (.0375)    .2867 (.0707)
[−.8  0; −.3  −.8]       89 %       .0423 (.0360)    .2930 (.0840)
[−.8  0;  .3  −.8]       82 %       .0417 (.0319)    .2790 (.0937)
[ .4  0; −.3   .4]       89 %       .0465 (.0377)    .2673 (.0795)
[ .4  0;  .3   .4]       85 %       .0509 (.0382)    .2552 (.0853)
[ .8  0; −.3   .8]       85 %       .0435 (.0446)    .2759 (.0924)
[ .8  0;  .3   .8]       93 %       .0367 (.0338)    .3008 (.0713)

N = 100. a Percentage of first-order VAR models in all estimated models. b Means and standard deviations (in parentheses) of the estimated directed transfer functions (DTF)
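The DTF values reported in Table 1 are obtained by transforming the estimated VAR coefficients into the frequency domain and row-normalizing the resulting transfer matrix (the transform is given formally in Eqs. 4–6). A sketch in illustrative Python, not the authors' R code:

```python
import numpy as np

def dtf(A_list, n_freqs=128):
    """Directed transfer function from the lag matrices [A_1, ..., A_p]
    of an estimated VAR(p). Returns an (n_freqs, m, m) array of DTF_ij(f),
    normalized so that the squared entries of each row sum to 1."""
    m = A_list[0].shape[0]
    out = np.empty((n_freqs, m, m))
    for n in range(n_freqs):
        f = n / n_freqs                        # f = n/N
        Af = np.eye(m, dtype=complex)          # A(f) = I - sum_k A_k exp(-2*pi*i*k*f)
        for k, Ak in enumerate(A_list, start=1):
            Af = Af - Ak * np.exp(-2j * np.pi * k * f)
        H = np.linalg.inv(Af)                  # transfer matrix H(f) = A(f)^{-1}
        out[n] = np.abs(H) / np.sqrt((np.abs(H) ** 2).sum(axis=1, keepdims=True))
    return out

# For the unidirectional VAR(1) of the simulations (a12 = 0, a21 = .3),
# DTF12 is zero and DTF21 is clearly positive at every frequency.
A1 = np.array([[0.4, 0.0],
               [0.3, 0.4]])
d = dtf([A1]).mean(axis=0)  # average over frequencies, as in the article
```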

2.11.1 using the Yule–Walker method (Lütkepohl, 2006; R Development Core Team, 2011). The order of the VAR was determined by the AIC. Because the estimated models were not necessarily of order 1, direct comparisons of the parameter estimates were not possible. Therefore, the estimated VAR models were transformed into the frequency domain using the formula

$$A(f) = I - \sum_{k=1}^{p} A_k \exp(-2\pi k f i), \qquad (4)$$

where $A_k$ is the autoregressive coefficient matrix at lag k in the estimated VAR(p) model without intercepts, i is the square root of −1, and

$$f = \frac{n}{N}, \quad n = 0, 1, \ldots, N-1, \qquad (5)$$

where N is the number of time points (Schlögl & Supp, 2006). This transformation allowed us to compute a frequency-domain statistic representing a composite measure of the directional influences between y1 and y2. This statistic is called the directed transfer function (DTF) and can be computed as

$$\mathrm{DTF}_{ij}(f) = \frac{\left|H_{ij}(f)\right|}{\sqrt{\sum_{k=1}^{m} \left|H_{ik}(f)\right|^2}} = \frac{\left|H_{ij}(f)\right|}{\sqrt{H_{i:}(f)\, H_{i:}^{H}(f)}}, \qquad (6)$$

where $H(f) = A^{-1}(f)$ and $H_{i:}(f)$ is the ith row of H(f) (Kamiński & Blinowska, 1991). The DTFs are normalized between 0 and 1, with higher values indicating larger influences. They have been shown to be valid measures of Granger causality in the frequency domain (Kamiński, Ding, Truccolo, & Bressler, 2001). In recent years, DTFs have gained increasing popularity in psychological studies for assessing the strength of influences in dynamic interactions, especially in the area of brain imaging (Blinowska, 2011; Schlögl & Supp, 2006). For the first-order bivariate VAR model shown in Eq. 2, DTF12 corresponds to a12, and DTF21 corresponds to a21. We computed the averaged DTFs across frequencies as dependent variables in the later analyses.

We conducted repeated measures analysis of variance (ANOVA) to compare the four missing-data techniques. According to the design of the study, the three between-subjects factors were Magnitude of a11 and a22, denoted as AR_LARGE, with 0 representing small and 1 representing large; Sign of a11 and a22, denoted as AR_POSITIVE, with 0 representing negative and 1 representing positive; and Sign of a21, denoted as CR_POSITIVE, with 0 representing negative and 1 representing positive. In addition, there were two within-subjects factors: Missing-Data Technique and Proportion of Missing Data. We tested for sphericity in the repeated measures ANOVA models and adjusted the F tests with the Huynh–Feldt epsilon if sphericity was violated. Because the p values of the effects are influenced by the number of replications, which is arbitrary in a simulation study, we estimated effect sizes using η2. According to Cohen (1988), an effect size of η2 = .01 can be considered "small," η2 = .06 "medium," and η2 = .14 "large." To focus on the most important effects, we report only results that were both significant and had an effect size of η2 ≥ .06.

Complete data analysis

Table 1 shows the results from the complete data analysis, broken down by conditions. We can see that even with complete data, not all estimated VAR models were of order 1: the percentages of VAR(1) models ranged from 82 % to 93 % across conditions. Looking at the average DTFs over frequencies, we see that DTF12 (i.e., influences from y2 to y1) was close to zero, whereas DTF21 (influences from y1 to y2) was close to .3, which reflects the equivalence between the time-domain VAR models and their frequency-domain counterparts.

Directed transfer function from y1 to y2 (DTF21)

A five-way repeated measures ANOVA (Technique × Proportion of Missing × AR_LARGE × AR_POSITIVE ×


CR_POSITIVE) was first conducted with DTF21 as the dependent variable. The complete case scenario was not included in this model, because it was not part of the factorial design and its inclusion led to an unbalanced data structure. None of the interaction effects involving both technique and proportion of missing data had an effect size of η2 ≥ .06. In other words, all effects involving technique applied similarly across proportions of missing data, and vice versa. This justifies analyses including only technique or only proportion of missing data.

Next, we conducted a four-way repeated measures ANOVA (Technique × AR_LARGE × AR_POSITIVE × CR_POSITIVE) to test the difference between the complete data and the four missing-data techniques, averaged across proportions of missing data. No interaction effect involving CR_POSITIVE had an effect size of η2 ≥ .06, indicating that the sign of a21 did not affect the magnitude of DTF21; hence, we removed this factor from the model. The results of the trimmed model are shown in Table 2. We found two interaction effects with η2 ≥ .06: Technique × AR_LARGE, F(3.33, 2652.63) = 294.74, p < .001, η2 = .27 (Fig. 1), and Technique × AR_POSITIVE, F(3.33, 2652.63) = 271.39, p < .001, η2 = .25 (Fig. 2). Post-hoc contrasts indicated that AR_LARGE had different effects on listwise deletion (η2 = .07), imputation using sample means and variances (η2 = .07), and MI (η2 = .31) than on the complete data.

Looking at Fig. 1, we can see that, averaging across proportions of missing data and the signs of the autoregressive and cross-lagged coefficients, iVAR produced estimates similar to the complete data, whereas the two ad-hoc techniques and MI all produced biased estimates. For listwise deletion, a small magnitude of autoregressive coefficients (solid line, a11 = a22 = ±.4) led to a considerably larger downward bias than did a large magnitude of autoregressive coefficients

Fig. 1 Estimated marginal means of DTF21 by technique and AR_LARGE. Complete = complete data. Listwise = listwise deletion. Mean = imputation using sample means and variances. MI = multiple imputation

(dotted line, a11 = a22 = ±.8). For the other two imputation methods, the pattern was reversed. In particular, AR_LARGE had the most salient effect on MI. When the autoregressive coefficients were ±.8, MI produced a much larger downward bias than when the autoregressive coefficients were ±.4. For the interaction of technique and AR_POSITIVE (Fig. 2), post-hoc

Table 2 Effects of technique, AR_LARGE, and AR_POSITIVE on DTF21, averaged across proportions of missing data

Source                                    df          F          p Value    η2
Within-Subjects Effects
  Technique                               3.33        960.51     .000       .55
  Technique × AR_POSITIVE                 3.33        271.39     .000       .25
  Technique × AR_LARGE                    3.33        294.74     .000       .27
  Technique × AR_POSITIVE × AR_LARGE      3.33        22.35      .000       .03
  Error (Technique)                       2,652.63
Between-Subjects Effects
  Intercept                               1           23,227.17  .000       .97
  AR_POSITIVE                             1           47.47      .000       .06
  AR_LARGE                                1           5.57       .000       .01
  AR_POSITIVE × AR_LARGE                  1           39.82      .000       .05
  Error                                   796

N = 100

Fig. 2 Estimated marginal means of DTF21 by technique and AR_POSITIVE. Complete = complete data. Listwise = listwise deletion. Mean = imputation using sample means and variances. MI = multiple imputation


contrasts indicated that AR_POSITIVE had different effects on listwise deletion (η2 = .39) than on the complete data. When a11 and a22 were positive, listwise deletion yielded estimates that were close to the complete data, but when a11 and a22 were negative, it resulted in a large downward bias. The estimates from iVAR were very similar to those from the complete data analysis, whereas imputation using sample means and variances and MI resulted in severe underestimation of DTF21.

Another four-way repeated measures ANOVA was conducted to examine the effects of proportion of missing data, AR_LARGE, AR_POSITIVE, and CR_POSITIVE on DTF21, averaged across techniques. The only interaction effect with a medium or large effect size was Proportion of Missing Data × AR_LARGE, F(2.38, 1882.90) = 64.80, p < .001, η2 = .08 (Fig. 3). When a11 and a22 were small in magnitude, downward bias in the estimates of DTF21 increased as the proportion of missing data increased, roughly following a linear pattern. In contrast, when a11 and a22 were large in magnitude, an inflection point appeared in the trend, such that 10 % missing data resulted in a considerable downward bias, but 20 % and 30 % missing data were associated not with larger bias but with a slight improvement. In other words, averaging across techniques and the other parameters in the simulation, the proportion of missing data did not seem to affect the estimates of DTF21 when a11 and a22 were large in magnitude.

To get a more detailed picture of the severity of bias, estimates of DTF21 are plotted in Fig. 4, broken down by technique, proportion of missing data, AR_LARGE, and AR_POSITIVE. The solid and dotted horizontal lines represent the means and 95 % confidence intervals of DTF21 from the complete data analysis. In all conditions, the means of DTF21 using iVAR fell within the 95 % confidence intervals. In contrast, none of those estimated using listwise deletion, imputation with sample means and variances, or MI fell within these intervals. In most cases, these techniques resulted in a downward bias. However, when a11 = a22 = .8, listwise deletion produced an upward bias, which increased as the proportion of missing data increased. This unusual pattern appears to be the underlying cause of the interaction effects seen in Figs. 1, 2, and 3.

DTF21 from first-order VAR models

To examine whether bias in the estimated DTF21 was due to model misspecification for the three poorly performing techniques (listwise deletion, imputation using sample means and variances, and MI), we looked at cases in which the estimated VAR models were of order 1. Average DTF21 values from these cases are shown in Table 3, along with the average estimates from all cases and from the complete data analysis. It should be noted that when the autoregressive coefficients were large in magnitude, none of the cases yielded a VAR(1) model when the two imputation methods were used. Estimates of DTF21 tended to be higher when models were of order 1 than for the whole sample, regardless of technique, proportion of missing data (including complete data), and the simulated values of the autoregressive coefficients. Hence, there appears to be no systematic reduction in bias when the order of the VAR is correctly specified. Most importantly, estimates from VAR(1) models using these three techniques were still outside the 95 % confidence intervals of the complete data estimates in most conditions. The only exceptions were when a11 = a22 = .4, missing data were less than or equal to 20 %, and MI was used. However, in this scenario the sample size was only 16, and the standard error of the complete data estimates was large as compared to the other conditions, suggesting a high level of uncertainty. These results indicate that even when the order of the estimated model was identical to that of the data-generating model, listwise deletion, imputation using sample means and variances, and MI still produced biased estimates of DTF21.

Directed transfer function from y2 to y1 (DTF12)

We conducted a similar set of repeated measures ANOVAs for DTF12. We found no effect with η2 ≥ .06, indicating that the estimate of the influence from y2 to y1 was not severely affected by missing-data technique, the proportion of missing data, or the pattern of A1.

Summary and discussion of simulation results

Fig. 3 Estimated marginal means of DTF21 by proportion of missing data and AR_LARGE

We simulated data such that a unidirectional effect existed between time series y1 and y2. Specifically, y2 was predicted


Fig. 4 Estimated marginal means of DTF21 (panels, left to right: iVAR, Listwise, Mean, MI), broken down by technique, proportion of missing data, AR_LARGE, and AR_POSITIVE. The solid and dotted horizontal lines represent the mean and the borders of the 95 % confidence interval from the complete data analysis. Listwise = listwise deletion. Mean = imputation using sample means and variances. MI = multiple imputation

by y1 at lag 1 with a coefficient of .3 or –.3, and y1 was not predicted by previous values of y2. Because of variation in the order of the estimated VAR models, we compared the frequency-domain measures of lagged influence—the directed transfer functions (DTF)—yielded by the four methods for handling missing data. DTF21 reflected influences from y1 to y2, combining all lags, and vice versa for DTF12. Hence, an accurately estimated VAR model should have a relatively large DTF21 and a DTF12 close to zero. The results indicated that all four techniques yielded acceptable estimates of DTF12. With regard to DTF21, iVAR produced the most accurate estimates, and its performance was the most stable over simulation conditions. Specifically, the average DTF21 produced by iVAR always fell within the 95 % confidence intervals of the complete data estimates, regardless of the simulation parameter values and proportion of missing data. The other two imputation methods

consistently resulted in large downward biases. The performance of MI depended heavily on the magnitude of the autoregressive coefficients: larger autoregressive coefficients (±.8) had a more detrimental effect on MI than did smaller ones (±.4), yet even in the latter scenario the performance of MI was unacceptable. Listwise deletion also showed substantial variation across simulation conditions. In most cases it resulted in large downward biases, but when a11 and a22 were positive and large, it resulted in an upward bias. Correctly specifying the order of the VAR did not systematically improve the poor performance of the latter three methods.

The poor performance of imputation using sample means and variances is to be expected, because values imputed in this way are independent both across time and across variables, which leads to underestimation of the cross-lagged coefficients. MI ignores the serial dependency in time series data but accounts

Table 3 Average estimates of DTF21 from all cases versus from subsets of cases with first-order VAR models, broken down by simulation conditions

                         a11 = a22 = −.4      a11 = a22 = −.8      a11 = a22 = .4       a11 = a22 = .8
Technique                All       VAR(1)     All       VAR(1)     All       VAR(1)     All       VAR(1)
Listwise Deletion        n = 200   n = 84 a   n = 200   n = 90 a   n = 200   n = 124 a  n = 200   n = 127 a
  Complete               .2847     .3134      .2860     .3198      .2613     .2851      .2883     .3148
                         (.0059)   (.0055)    (.0063)   (.0032)    (.0058)   (.0048)    (.0059)   (.0032)
  Missing = 10 %         .2030     .2154      .1994     .2092      .2417     .2642      .3064     .3440
  Missing = 20 %         .1474     .1696      .1506     .1336      .2378     .2479      .3431     .3766
  Missing = 30 %         .1120     .1490      .1489     .1488      .2164     .2275      .3770     .4020
Mean Imputation          n = 200   n = 95 a   n = 200   n = 0 a    n = 200   n = 98 a   n = 200   n = 0 a
  Complete               .2847     .3010      .2860     –          .2613     .2824      .2883     –
                         (.0059)   (.0048)    (.0063)              (.0058)   (.0054)    (.0059)
  Missing = 10 %         .2342     .2531      .1387     –          .2187     .2374      .1499     –
  Missing = 20 %         .1887     .2041      .1606     –          .1796     .1931      .1632     –
  Missing = 30 %         .1538     .1660      .1552     –          .1472     .1558      .1678     –
Multiple Imputation      n = 200   n = 17 a   n = 200   n = 0 a    n = 200   n = 16 a   n = 200   n = 0 a
  Complete               .2847     .3134      .2860     –          .2613     .2817      .2883     –
                         (.0059)   (.0088)    (.0063)              (.0058)   (.0177)    (.0059)
  Missing = 10 %         .2544     .2845      .1034     –          .2303     .2638 b    .1093     –
  Missing = 20 %         .2210     .2637      .1154     –          .2089     .2500 b    .1153     –
  Missing = 30 %         .1921     .2373      .1186     –          .1824     .2311      .1180     –

Standard errors are included in parentheses for the complete data estimates. a Number of cases yielding VAR(1) across levels of the proportion of missing data (including complete data). b Means within the 95 % confidence intervals of the complete data estimates


for the lag-0 covariance between variables. As a result, its performance is compromised most when the autoregressive coefficients are large in magnitude. The performance of listwise deletion is consistent with the findings of Velicer and Colby (2005) and is not surprising. Under this approach, a missing observation is effectively replaced by the first complete vector (i.e., with no missing values) at a later time point. If the time series is spiky, such that the observations at time t differ substantially from those at times t − 1 and t + 1, the estimation of the VAR will be severely affected; this is the case when a11 and a22 are negative. In contrast, if the time series is smooth, such that observations at time t are similar to those at times t − 1 and t + 1, listwise deletion has a less detrimental impact; this is the case when a11 and a22 are positive. The direction of the bias for DTF21 can be explained by mathematical deduction. With a12 = 0, Eq. 2 can be written as

$$\begin{aligned} y_{1t} &= a_{11}\, y_{1(t-1)} + u_{1t},\\ y_{2t} &= a_{21}\, y_{1(t-1)} + a_{22}\, y_{2(t-1)} + u_{2t}. \end{aligned} \qquad (7)$$

Hence,

$$\begin{aligned} y_{2t} &= a_{21}\left(a_{11}\, y_{1(t-2)} + u_{1(t-1)}\right) + a_{22}\left(a_{21}\, y_{1(t-2)} + a_{22}\, y_{2(t-2)} + u_{2(t-1)}\right) + u_{2t}\\ &= \left(a_{21} a_{11} + a_{22} a_{21}\right) y_{1(t-2)} + a_{22}^{2}\, y_{2(t-2)} + a_{21}\, u_{1(t-1)} + a_{22}\, u_{2(t-1)} + u_{2t}, \end{aligned} \qquad (8)$$

and so on for larger lags. Therefore, when a11 = a22 = .8, the regression coefficient for predicting y2 from y1 at a larger lag would be greater than a21, leading to an overestimation of DTF21. When a11 = a22 = .4, DTF21 would be underestimated. Overall, the severity of the bias indicates that, of the methods examined in this study, only iVAR is appropriate for handling missing data in time series analysis.

It should be noted that in the simulation study, the data were simulated to be stationary; that is, the process generating the data did not change over time. Violation of stationarity would likely impair the performance of the procedure. For multivariate Gaussian processes, nonstationarity is reflected in changes in the means and/or changes in the variances and covariances (contemporaneous and lagged) over time. iVAR allows users to accommodate changes in the means by removing polynomial trends from the data. However, users should be cautious about possible nonstationarity in the variance–covariance structure. In addition, because missing data at later time points are predicted with an estimated model that may itself depend on imputed data, prediction error accumulates as the recursive procedure runs from the beginning to the end of the time series. Thus, the accuracy of the estimated VAR model in the first window is crucial to the performance of iVAR. We recommend that users of iVAR first test for the homogeneity of the process over time, especially between the first window and later time windows. One way of doing this is to compute the block-Toeplitz covariance matrix for different sections of the data and test for their equivalence in SEM programs. The block-Toeplitz matrix of lag p is

$$S(p) = \begin{pmatrix} C(0) & & & \\ \vdots & \ddots & & \\ C(p-1) & \cdots & C(0) & \\ C(p) & \cdots & C(1) & C(0) \end{pmatrix}, \qquad (9)$$

where C(0) and C(p) are the lag-0 and lag-p covariance matrices of the data (Molenaar, 1985; Zhang, Hamaker, & Nesselroade, 2008). If the block-Toeplitz matrices of different sections of the time series are statistically equivalent, iVAR can be used to impute missing values. This testing procedure is illustrated in the following empirical example.
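For readers who wish to script this check outside of an SEM program, the block-Toeplitz matrix itself is straightforward to assemble from the estimated lagged covariance matrices. The sketch below is a minimal illustration in Python/NumPy (iVAR itself is written in R); the function names and the choice of a simple (T − k)-denominator covariance estimator are our own assumptions, and the upper off-diagonal blocks are filled with the transposes C(k)′.

```python
import numpy as np

def lagged_cov(y, k):
    """Lag-k covariance matrix C(k) = Cov(y_t, y_{t-k}) for a (T x m) series."""
    y = y - y.mean(axis=0)          # demean each variable
    T = y.shape[0]
    return y[k:].T @ y[:T - k] / (T - k)

def block_toeplitz(y, p):
    """Assemble the (p+1)m x (p+1)m block-Toeplitz matrix S(p) of Eq. 9:
    block (i, j) holds C(i - j) for i >= j, and C(j - i)' above the diagonal."""
    m = y.shape[1]
    S = np.zeros(((p + 1) * m, (p + 1) * m))
    for i in range(p + 1):
        for j in range(p + 1):
            C = lagged_cov(y, abs(i - j))
            S[i*m:(i+1)*m, j*m:(j+1)*m] = C if i >= j else C.T
    return S
```

Computing S(p) separately for, say, the first and second sections of a series yields the matrices whose equivalence would then be tested.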

An empirical example

We drew data from a psychophysiological study of dyadic interactions (Hedman, 2010). In this study, electrodermal activity (EDA) was measured simultaneously, every 1 s, from pairs of children with sensory processing disorder (SPD) and their therapists during psychotherapy.3 Here, we used the EDA of one therapist–child dyad within a 4-min time window (i.e., 240 data points). We first fitted a VAR model to the complete data and computed the DTFs to examine the directional influences between the child and the therapist. The averaged DTF from the therapist to the child (DTFCT) across frequencies was .12, and the averaged DTF from the child to the therapist (DTFTC) was .04. This suggests a unidirectional influence from the therapist to the child.4 Next, we randomly generated 10 % missing data in the last 120 time points. To evaluate the validity of using iVAR, we computed the pairwise lag-5 block-Toeplitz covariance matrices for the first 120 and the last 120 data points. In a SEM program, we constrained the first two columns of the two block-Toeplitz matrices to be equivalent (because the data were bivariate) and evaluated the fit of the model. The chi-square test showed that the two matrices were not significantly different [χ2(23) = 28.71, p = .19]. In addition, we obtained a root mean square error of approximation (RMSEA) of .05, a comparative fit index (CFI) of 1.00, and a nonnormed fit index (NNFI) of .99, indicating excellent fit. Hence, we used iVAR to impute the missing data. The estimated DTFCT and DTFTC from the imputed data set are shown in Table 4, along with the estimates from listwise deletion, imputation using sample means and variances, and MI. Among the four methods, iVAR produced the estimates closest to those from the complete data.

3 The raw measurements were taken at 0.5-s intervals. The raw data contain long strings of identical values, indicating a problem of oversampling; hence, we subsampled the data at 1-s intervals.
4 Standard errors of the DTFs can be computed using bootstrapping, jackknife, or surrogate-data methods (Efron, 1981; Kamiński et al., 2001; Schlögl & Supp, 2006), but this computation is not incorporated in existing statistical programs.
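The core idea behind the imputation, stripped of iVAR's moving-window re-estimation and model-selection machinery, can be sketched as follows. This is not the iVAR source code but a minimal Python illustration under simplifying assumptions: a VAR(1) fitted by ordinary least squares to a complete, demeaned initial window, with each later missing value replaced by its one-step-ahead prediction from the most recent (possibly imputed) observations. The function names are our own.

```python
import numpy as np

def fit_var1(y):
    """OLS fit of y_t = A y_{t-1} + u_t for a (T x m) complete, demeaned series."""
    X, Y = y[:-1], y[1:]
    A = np.linalg.solve(X.T @ X, X.T @ Y).T   # (m x m) coefficient matrix
    return A

def impute_var1(y, window):
    """Fill NaNs recursively: fit a VAR(1) on the complete initial window, then
    replace each missing value by its one-step-ahead prediction A @ y[t-1]."""
    y = y.copy()
    A = fit_var1(y[:window])
    for t in range(window, y.shape[0]):
        miss = np.isnan(y[t])
        if miss.any():
            pred = A @ y[t - 1]               # y[t-1] is observed or already imputed
            y[t, miss] = pred[miss]
    return y
```

Because each prediction can feed into the next, errors compound along the series, which is why the accuracy of the model estimated in the first window matters so much.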

Discussion

In this study, we have introduced an R program, iVAR, for imputing missing data in multivariate time series on the basis of VAR modeling. We showed that iVAR outperformed listwise deletion, imputation using sample means and variances, and a naïve approach to MI in estimating the lagged relations between variables. Although our study considered only data generated by a VAR model, iVAR can be used for any multivariate Gaussian time series as long as the process is stationary, because the VAR model is an extremely flexible tool that can describe any stationary time series given a sufficiently large order (Hannan, 1970). In the following, we discuss some limitations of the program.

First, as we noted previously, the validity of iVAR depends on the stationarity of the process. Of particular concern here is the homogeneity of the variance–covariance structure, because changes in the means can be handled by removing trends from the data. Hence, a test of the equivalence of the block-Toeplitz matrices needs to be carried out, as was illustrated in the empirical example. If the block-Toeplitz matrix of the data is time-variant, iVAR is not appropriate, and researchers will need to consider alternative approaches such as state-space modeling.

Table 4 Estimated directed transfer functions (DTFs) between child and therapist

         Complete Data   iVAR   Listwise   Mean   MI
DTFCT    .12             .14    .08        .18    .08
DTFTC    .04             .03    .04        .06    .06

DTFCT represents the influence from the therapist to the child, and DTFTC represents the influence from the child to the therapist. Listwise = listwise deletion. Mean = imputation using sample means and variances. MI = multiple imputation.
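The DTFs in Table 4 can be computed from fitted VAR coefficient matrices through the transfer function H(f) = A(f)⁻¹, as defined by Kamiński and Blinowska (1991). The following Python sketch is our own illustration (iVAR is written in R, and its exact frequency grid and averaging are not reproduced here); it returns the normalized DTF from series j to series i at each frequency.

```python
import numpy as np

def dtf(A_list, freqs):
    """Normalized DTF from a fitted VAR: A_list = [A_1, ..., A_p] coefficient
    matrices, freqs in cycles/sample (0 to 0.5). Returns gamma2[f, i, j], the
    directed transfer from series j to series i at each frequency."""
    m = A_list[0].shape[0]
    gamma2 = np.zeros((len(freqs), m, m))
    for n, f in enumerate(freqs):
        # A(f) = I - sum_k A_k exp(-2*pi*i*f*k)
        Af = np.eye(m, dtype=complex)
        for k, Ak in enumerate(A_list, start=1):
            Af -= Ak * np.exp(-2j * np.pi * f * k)
        H = np.linalg.inv(Af)                         # transfer function H(f)
        P = np.abs(H) ** 2
        gamma2[n] = P / P.sum(axis=1, keepdims=True)  # normalize over inflows to i
    return gamma2
```

Averaging gamma2[:, i, j] over frequencies yields a scalar summary analogous to the DTFCT and DTFTC values reported above.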


Second, as was demonstrated in Lütkepohl (2006), the mean square error (MSE) matrix of the one-step-ahead predictions of a VAR model is

$$\Sigma_{\hat{y}}(1) = \Sigma_u + \frac{mp + 1}{n}\,\Sigma_u, \qquad (10)$$

where Σu is the covariance matrix of the residuals, m is the dimension of the time series, p is the order of the VAR, and n is the number of time points. Hence, a shorter length of data for estimating the VAR is associated with larger prediction errors when all other parameters are the same. Since iVAR requires complete data in the first window, it is most appropriate when no data are missing at the beginning of the time series. Otherwise, users are forced to choose a small window size, and the predictions may be subject to large prediction errors.

Third, iVAR assumes the input variables to be multivariate normal. In practice, data transformation (e.g., taking logarithms) may be necessary for highly skewed variables, just as in MI programs that assume multivariate normal distributions (e.g., NORM). The imputed data can then be transformed back to their original scale. Other violations of this assumption may compromise the performance of iVAR to unknown degrees.

Finally, in this study we considered only the scenario in which data were missing completely at random; that is, the cause of the missing data was independent of both the observed and the missing values. A less stringent missing-data mechanism, missing at random (MAR), may be more realistic in practice. MAR refers to the case in which missingness is related to the observed values, but not to the missing values. In time series analysis, we can think of two scenarios in which data would be MAR. In the first, missingness depends only on one or more time series variables that are included in the VAR model for imputation. For example, imagine a time series in which, whenever the value at time t is larger than a constant, the value at time t + 1 is missing. In this case, we would still expect iVAR to produce valid predictions, because the cause of missingness is taken into account in the imputation model. This is analogous to including auxiliary variables in the imputation model of MI. In the second scenario, missingness depends on one or more variables outside of the VAR model. For example, in a daily diary study, missing data may be more likely on weekends than on weekdays. Because the day of the week cannot be incorporated into iVAR in its current form, the imputed data may be biased. Despite these limitations, iVAR provides an important alternative to current methods for handling missing data in multivariate time series. Further extensions of the program to include other types of variables and to incorporate various missing-data mechanisms would be worthwhile.

Author note This study was supported by a Dissertation Award to S.L. from the Society of Multivariate Experimental Psychology, and by National Science Foundation Grant No. 1157220 to P.C.M.M. We thank Matthew Goodwin and Elliot Hedman for providing the empirical data, and Lawrence Lo for assisting with the simulation.

References

Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19, 716–723. doi:10.1109/TAC.1974.1100705
Aloia, M. S., Goodwin, M. S., Velicer, W. F., Arnedt, J. T., Zimmerman, M., Skrekas, J., … Millman, R. P. (2008). Time series analysis of treatment adherence patterns in individuals with obstructive sleep apnea. Annals of Behavioral Medicine, 36, 44–53. doi:10.1007/s12160-008-9052-9
Belz, A. M., Beekman, C., Molenaar, P. C. M., & Buss, K. A. (2013). Mapping temporal dynamics in social interactions with unified structural equation modeling: A description and demonstration revealing time-dependent sex differences in play behavior. Applied Developmental Science, 17, 152–168. doi:10.1080/10888691.2013.805953
Blinowska, K. J. (2011). Review of the methods of determination of directed connectivity from multichannel data. Medical & Biological Engineering & Computing, 49, 521–529. doi:10.1007/s11517-011-0739-x
Chow, S.-M., Nesselroade, J. R., Shifren, K., & McArdle, J. J. (2004). Dynamic structure of emotions among individuals with Parkinson's disease. Structural Equation Modeling, 11, 560–582. doi:10.1207/s15328007sem1104_4
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
Efron, B. (1981). Nonparametric estimates of standard error: The jackknife, the bootstrap and other methods. Biometrika, 68, 589–599. doi:10.1093/biomet/68.3.589
Efron, B. (1982). The jackknife, the bootstrap, and other resampling plans. Philadelphia, PA: Society for Industrial and Applied Mathematics.
Ferrer, E., & Helm, J. (2013). Dynamical systems modeling of physiological coregulation in dyadic interactions. International Journal of Psychophysiology, 88, 296–308. doi:10.1016/j.ijpsycho.2012.10.013
Graham, J. W. (2009). Missing data analysis: Making it work in the real world. Annual Review of Psychology, 60, 549–576. doi:10.1146/annurev.psych.58.110405.085530
Graham, J. W. (2012). Missing data: Analysis and design. New York, NY: Springer.
Granger, C. W. J. (1969). Investigating causal relations by econometric models and cross-spectral methods. Econometrica, 37, 424–438.
Hannan, E. J. (1970). Multiple time series. New York, NY: Wiley.
Hedman, E. B. (2010). In-situ measurement of electrodermal activity during occupational therapy (Unpublished master's thesis). Cambridge, MA: Massachusetts Institute of Technology.
Kamiński, M. J., & Blinowska, K. J. (1991). A new method of the description of the information flow in the brain structures. Biological Cybernetics, 65, 203–210. doi:10.1007/BF00198091
Kamiński, M. J., Ding, M., Truccolo, W. A., & Bressler, S. L. (2001). Evaluating causal relations in neural systems: Granger causality, directed transfer function and statistical assessment of significance. Biological Cybernetics, 85, 145–157. doi:10.1007/s004220000235
Lütkepohl, H. (2006). New introduction to multiple time series analysis. Berlin, Germany: Springer.
Molenaar, P. C. M. (1985). A dynamic factor model for the analysis of multivariate time series. Psychometrika, 50, 181–202. doi:10.1007/BF02294246
Molenaar, P. C. M., Sinclair, K. O., Rovine, M. J., Ram, N., & Corneal, S. E. (2009). Analyzing developmental processes on an individual level using nonstationary time series modeling. Developmental Psychology, 45, 260–271. doi:10.1037/a0014170
Pfaff, B. (2008). VAR, SVAR, and SVEC models: Implementation within R package vars. Journal of Statistical Software, 27. Retrieved from www.jstatsoft.org/v27/i04/
R Development Core Team. (2011). R: A language and environment for statistical computing (ISBN 3-900051-07-0). Vienna, Austria: R Foundation for Statistical Computing. Retrieved from www.R-project.org
Rankin, E. D., & Marsh, J. C. (1985). Effects of missing data on the statistical analysis of clinical time series. Social Work Research and Abstracts, 21, 13–16. doi:10.1093/swra/21.2.13
Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York, NY: Wiley.
SAS Institute Inc. (2010). SAS/IML 9.22 user's guide. Cary, NC: SAS Institute Inc. Retrieved from http://support.sas.com/documentation/cdl/en/imlug/63541/PDF/default/imlug.pdf
Schafer, J. L. (1997). Analysis of incomplete multivariate data. London, UK: Chapman & Hall/CRC Press.
Schafer, J. L. (2001). Multiple imputation with PAN. In L. M. Collins & A. G. Sayer (Eds.), New methods for the analysis of change (pp. 357–377). Washington, DC: American Psychological Association.
Schlögl, A., & Supp, G. (2006). Analyzing event-related EEG data with multivariate autoregressive parameters. Progress in Brain Research, 159, 135–147. doi:10.1016/S0079-6123(06)59009-0
Su, Y.-S., Gelman, A., Hill, J., & Yajima, M. (2011). Multiple imputation with diagnostics (mi) in R: Opening windows into the black box. Journal of Statistical Software, 45, 1–31. Retrieved from http://hdl.handle.net/10022/AC:P:15342
Tanner, M. A., & Wong, W. H. (1987). The calculation of posterior distributions by data augmentation (with discussion). Journal of the American Statistical Association, 82, 528–540. Retrieved from www.jstor.org/stable/2289457
Velicer, W. F., & Colby, S. M. (2005). A comparison of missing-data procedures for ARIMA time-series analysis. Educational and Psychological Measurement, 65, 596–615. doi:10.1177/0013164404272502
Velicer, W. F., Redding, C. A., Richmond, R., Greeley, J., & Swift, W. (1992). A time series investigation of three nicotine regulation models. Addictive Behaviors, 17, 325–345. doi:10.1016/0306-4603(92)90039-X
Wang, Z., Molenaar, P. C. M., & Newell, K. M. (2013). The effects of foot position and orientation on inter- and intra-foot coordination in standing postures: A frequency domain PCA analysis. Experimental Brain Research, 230, 15–27. doi:10.1007/s00221-013-3627-9
Zhang, Z., Hamaker, E. L., & Nesselroade, J. R. (2008). Comparisons of four methods for estimating a dynamic factor model. Structural Equation Modeling, 15, 377–402. doi:10.1080/10705510802154281
Zheng, Y., Wiebe, R. P., Cleveland, H. H., Molenaar, P. C. M., & Harris, K. S. (2013). An idiographic examination of day-to-day patterns of substance use craving, negative affect, and tobacco use among young adults in recovery. Multivariate Behavioral Research, 48, 241–266. doi:10.1080/00273171.2013.763012
