Original Paper Caries Res 2014;48:13–18 DOI: 10.1159/000351642

Received: January 22, 2013 Accepted after revision: March 31, 2013 Published online: November 6, 2013

Statistical Power of Multilevel Modelling in Dental Caries Clinical Trials: A Simulation Study

G. Burnside a, C.M. Pine b, P.R. Williamson a

a Department of Biostatistics, University of Liverpool, Liverpool, and b Barts and The London Dental Institute, Queen Mary University of London, London, UK

Abstract

Outcome data from dental caries clinical trials have a naturally hierarchical structure, with surfaces clustered within teeth, clustered within individuals. Data are often aggregated into the DMF index for each individual, losing tooth- and surface-specific information. If these data are to be analysed by tooth or surface, allowing exploration of effects of interventions on different teeth and surfaces, appropriate methods must be used to adjust for the clustered nature of the data. Multilevel modelling allows analysis of clustered data using individual observations without aggregating data, and has been little used in the field of dental caries. A simulation study was conducted to investigate the performance of multilevel modelling methods and standard caries increment analysis. Data sets were simulated from a three-level binomial distribution based on analysis of a caries clinical trial in Scottish adolescents, with varying sample sizes, treatment effects and random tooth level effects based on trials reported in Cochrane reviews of topical fluoride, and analysed to compare the power of multilevel models and traditional analysis. A total of 40,500 data sets were simulated. Analysis showed that estimated power for the traditional caries increment method was similar to that for multilevel modelling, with more variation in smaller data sets. Multilevel modelling may not allow significant reductions in the number of participants required in a caries clinical trial, compared to the use of traditional analyses, but investigators interested in exploring the effect of their intervention in more detail may wish to consider the application of multilevel modelling to their clinical trial data. © 2013 S. Karger AG, Basel

Girvan Burnside, School of Dentistry, Research Wing, University of Liverpool, Daulby Street, Liverpool L69 3GN (UK), E-Mail g.burnside@liv.ac.uk

In studies with dental caries as an outcome, data are collected during a clinical examination by a trained examiner [Ismail, 2004], where each surface on each tooth is assessed by the examiner as to whether it is sound, decayed or filled, or if the tooth is missing as a result of extraction due to caries. These data are traditionally aggregated into the DMF index, a single measure for each individual. This approach means that the tooth- and surface-specific information is lost. These data have a naturally hierarchical structure, with surfaces clustered within teeth, which are clustered within individuals, who in turn may be clustered within social


Key Words Multilevel modelling · Randomised controlled trials · Statistical analysis


The aim of this study is to investigate the potential to increase efficiency by reducing sample size in caries clinical trials, by assessing the power of multilevel modelling of caries data compared to traditional caries increment analysis.

Materials and Methods

The simulated data sets were based on real data from a clinical trial of the efficacy of chlorhexidine varnish on caries in 12- to 16-year-old schoolchildren in Scotland [Forgie et al., 2000]. The participants (n = 987) had a mean D1MFS of 10.7 at baseline (standard deviation, SD 10.2), with a mean D1FS caries increment of 6.4 (SD 7.5) over the trial period. There was no statistically significant intervention effect in this trial, but in the two varnish groups (chlorhexidine and placebo) those participants who were compliant with the protocol had a significantly lower caries increment than those who were not. Therefore, protocol compliance was used instead of intervention to generate groups with significantly different caries levels. Based on previously published work by the authors [Burnside et al., 2007], the model chosen to analyse the data was a three-level logistic model, with the hierarchy individual – tooth – surface, and predictor variables for group and tooth position. The tooth position classification used has 10 levels: incisors, canines, premolars, first molars and second molars, separately for the upper and lower jaw. The following model was fitted to the data:

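\[
\operatorname{logit}(\pi_{ijk}) = \beta_0 + \beta_1 x_1 + \sum_{m=2}^{10} \beta_m x_m + v_{0k} + u_{0jk} + e_{0ijk}
\]
\[
y_{ijk} \sim \operatorname{Bin}(1, \pi_{ijk})
\]
\[
\operatorname{var}(v_{0k}) = \sigma^2_{v0}, \qquad \operatorname{var}(u_{0jk}) = \sigma^2_{u0}, \qquad \operatorname{var}(y_{ijk} \mid \pi_{ijk}) = \pi_{ijk}(1 - \pi_{ijk})
\]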

The outcome variable for a specific surface i, on tooth j, in individual k, is denoted by $y_{ijk}$, which takes the value 0 if the surface does not develop caries and 1 if it does develop caries, and is assumed to follow a binomial distribution with probability of success $\pi_{ijk}$. The logistic model uses $\pi_{ijk}$ as the dependent variable, which represents the probability that the ith surface becomes affected by caries during the course of the study. The right-hand side of the equation includes categorical predictors $x_1$, the indicator variable for group allocation based on compliance to protocol, and $x_2, \ldots, x_{10}$, indicator variables for tooth position. The additional terms on the right-hand side of the equation are the random effects, $v_{0k}$ representing the individual participant effect and $u_{0jk}$ representing the tooth level effect. Also, $\sigma^2_{v0}$ is the variance of the error term at the participant level, and $\sigma^2_{u0}$ is the variance of the error term at the tooth level. Both quantities are estimated in the modelling process. Simplified two-level models were also fitted for comparison, specified as above, but removing the random effect at the tooth level. The model was fitted using MLwiN software [Rasbash


or administrative groupings. This type of clustered data is a common feature of oral health research, with multiple sites being studied within an individual [Begg, 2009]. If analysis is to be performed at the tooth or surface level, statistical methods appropriate for clustered data must be used. Failure to do so could result in confidence intervals which are too narrow, and therefore increased probabilities of incorrect conclusions [Donner and Klar, 2000]. Most reports of dental caries clinical trials have ignored this hierarchical structure of caries data in the analysis [Burnside et al., 2006], although the issue has been discussed in the dental literature since the widespread availability of software to perform more complex analyses [Macfarlane and Worthington, 1999; Gilthorpe et al., 2000].
There has been much discussion in the literature around ways to improve efficiency in caries clinical trials. Efficiency has many specific definitions in the literature, but in general refers to the principle of the effect achieved in relation to the resources expended [Hausen, 2004]. Key drivers for enhancing efficiency relate to the search to reduce costs of trials in terms of time or sample size. The use of analysis methods with increased power may have the potential to reduce the number of participants required in a clinical trial, although work using a surface-specific generalised estimating equations-based method suggests that the gain may be modest [Mancl et al., 2004].
Various methods have been suggested for the analysis of caries data at tooth or surface level. These include an adjustment to the χ² test to account for the clustering [Ahn et al., 2002], and several investigations of clustered time-to-event analysis at surface level [Hannigan et al., 2001; Stephenson et al., 2010; Wong et al., 2011].
A previously published study conducted by the authors has introduced the use of three-level logistic multilevel modelling of caries clinical trial data, and shown that this approach can increase understanding of the patterns of caries development, and allows for full use of the data collected at surface level [Burnside et al., 2007]. This method uses surface as the basic unit of analysis, but accounts for the clustering by partitioning the variance in the outcome variable into variance at individual, tooth and surface level [Goldstein, 2003]. This work showed that this method was potentially useful for the analysis of caries data, but the question remains whether it could lead to a gain in efficiency by reducing the required sample size. None of the previous work published on efficiency gains using tooth and surface level analysis has used this analysis method.

Table 1. Parameters estimated from model fitted to trial data set

Variable                              Parameter (SE)
Constant                              –3.247 (0.090)
Group (compliant or not)               0.417 (0.130)
Upper incisor (reference category)
Upper canine                          –1.040 (0.133)
Upper premolar                        –0.655 (0.093)
Upper first molar                      1.439 (0.088)
Upper second molar                     1.200 (0.085)
Lower incisor                         –2.510 (0.160)
Lower canine                          –1.928 (0.177)
Lower premolar                        –0.883 (0.096)
Lower first molar                      1.930 (0.088)
Lower second molar                     1.361 (0.085)
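As a worked illustration of how the fixed-effect estimates in table 1 translate into probabilities, the sketch below applies the inverse logit to the linear predictor for a single surface. The names `TOOTH_POSITION` and `surface_probability` are hypothetical, and treating group = 1 as the higher-risk group is an assumption based only on the positive sign of the group coefficient.

```python
import math

# Fixed-effect estimates copied from table 1 (log-odds scale).
CONSTANT = -3.247
GROUP = 0.417
TOOTH_POSITION = {
    "upper incisor": 0.0,  # reference category
    "upper canine": -1.040,
    "upper premolar": -0.655,
    "upper first molar": 1.439,
    "upper second molar": 1.200,
    "lower incisor": -2.510,
    "lower canine": -1.928,
    "lower premolar": -0.883,
    "lower first molar": 1.930,
    "lower second molar": 1.361,
}


def surface_probability(position, group=0, v_person=0.0, u_tooth=0.0):
    """Inverse logit of the linear predictor for one surface.

    v_person and u_tooth are the participant and tooth random effects;
    both default to zero, i.e. an 'average' participant and tooth.
    """
    eta = CONSTANT + GROUP * group + TOOTH_POSITION[position] + v_person + u_tooth
    return 1.0 / (1.0 + math.exp(-eta))


for position in ("upper incisor", "lower first molar"):
    print(position, round(surface_probability(position), 3))
```

With both random effects at zero, this gives a probability of roughly 0.04 for an upper incisor surface and roughly 0.21 for a lower first molar surface in the reference group. These are conditional on the random effects and are not population-averaged caries rates.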

Choice of Simulation Parameters

In order to assess power in different situations, some parameters of the model were varied. The parameters which vary in the simulations are number of individuals per group, coefficient of intervention effect, and variance of the random effect at tooth level. The values of the first two parameters were chosen based on observed values in trials reported in the Cochrane review on topical fluorides for preventing dental caries in children and adolescents [Marinho et al., 2003]. The review included 133 trials, with group size varying from 10 to 708. The median group size was 158, with quartiles at 93 and 254. Based on this distribution, values of 50, 150 and 300 individuals per group were chosen for the simulations to represent small, medium and large sample sizes. The intervention effect in the 133 trials ranged from a prevented fraction (PF) of 0 to 80%. The PF is defined as the difference in caries increment between intervention and control group, expressed as a percentage of the increment in the control group. The median PF was 25%, with quartiles of 17 and 32%. The three values chosen for the simulation were 15, 25 and 35%, representing small, medium and large PFs. As the trials in the Cochrane review generally do not use multilevel analysis, the values for the third parameter, variance of the random effect at tooth level, cannot be chosen in the same way. Previous analyses have shown variance of the random effect at approximately 1 [Burnside et al., 2007], and the values for the simulation were chosen as 0 (no random effect at tooth level), 1 and 2. These 3 varying parameters give a total of 27 combinations of parameter values. The remaining parameters (variance of random effect at individual level, and coefficients of tooth position variables) were kept constant at the values from the fitted model. A total of 1,500 data sets were simulated for each of the 27 combinations of parameters detailed above using the MLwiN syntax language.
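The simulation design above can be enumerated in a few lines. This is only a sketch of the parameter grid and of the PF definition; the constant and function names are hypothetical, and the increments passed to `prevented_fraction` are made-up numbers chosen for illustration.

```python
from itertools import product

GROUP_SIZES = (50, 150, 300)         # individuals per group
PREVENTED_FRACTIONS = (15, 25, 35)   # per cent
TOOTH_VARIANCES = (0, 1, 2)          # variance of the tooth-level random effect
DATA_SETS_PER_COMBINATION = 1500


def prevented_fraction(control_increment, intervention_increment):
    """PF: difference in increment as a percentage of the control increment."""
    return 100.0 * (control_increment - intervention_increment) / control_increment


grid = list(product(GROUP_SIZES, PREVENTED_FRACTIONS, TOOTH_VARIANCES))
print(len(grid))                              # 27 combinations
print(len(grid) * DATA_SETS_PER_COMBINATION)  # 40500 data sets in total
print(prevented_fraction(8.0, 6.0))           # 25.0 for these hypothetical increments
```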
For each simulated participant, predicted values of the outcome variable for each tooth surface were calculated based on the model parameters. Random effects at individual and tooth level were then applied to these predicted values, resulting in a probability of decay for each surface. The data were then sampled from the binomial distributions with these probabilities to obtain the simulated data sets. The two-level model ignoring the random effect at tooth level was fitted to each of the data sets. In addition, the full three-level model was also fitted to each data set. All models were fitted using the second-order PQL method. As PQL can have convergence problems [Goldstein, 2003], the maximum number of iterations was set at 100 for each data set, and if this maximum was reached, the model for that data set was considered to have failed to converge. All models which did converge did so in fewer than 50 iterations. Finally, the traditional analysis of comparing DFS

The results of the initial analysis of the trial data set are shown in table 1. The observed coefficient of group effect here (0.417) corresponds to a PF of 25%, which was the median value found in the Cochrane review [Marinho et al., 2003]. The second-order PQL analysis in MLwiN had some convergence problems. Table 2 shows the proportion of analyses which converged for each combination of parameters. All models converged for the data sets with no random effect at tooth level. For data sets with random effect at tooth level set at 2, almost all models failed to converge. This suggests that this variance was unrealistic in combination with the rest of the parameters in the model, and for the remainder of the analysis, these data sets have been excluded. The models with random effect of variance 1, which is the closest value to that observed from the caries data set, had more success in converging in larger data sets, and where the PF was larger. In order to investigate if the failure to converge is systematic, table 3 compares the coefficients of intervention group estimated in a simpler two-level model, ignoring tooth level to give a hierarchy of individual – surface, between those data sets where the three-level model converged, and those where it did


Table 1 (continued). Random effect variances (SE)

Random effect – participant           1.332 (0.105)
Random effect – tooth                 1.099 (0.063)

caries increment using t tests was performed on each simulated data set. As a check, data sets were also simulated with no treatment effect, to check the value of α. For each combination of parameters, the proportion of data sets which showed a statistically significant difference (p < 0.05) from each method of analysis was used to estimate power.
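The power calculation described above (the proportion of simulated data sets with p < 0.05) can be sketched as follows. This is not the authors' MLwiN procedure: `simulate_increments` is a deliberately simplified stand-in generator with a participant random effect only, the baseline log-odds of -3.0 and the 128 surfaces per participant are illustrative assumptions, and the p value uses a normal approximation to the two-sample t test so that the example needs only the standard library.

```python
import math
import random
from statistics import NormalDist, mean, variance

N_SURFACES = 128  # assumed number of surfaces at risk per participant


def simulate_increments(n, log_odds, sd_person, rng):
    """Simplified stand-in generator: DFS increment for each of n participants."""
    increments = []
    for _ in range(n):
        eta = log_odds + rng.gauss(0.0, sd_person)  # participant random effect
        p = 1.0 / (1.0 + math.exp(-eta))
        increments.append(sum(rng.random() < p for _ in range(N_SURFACES)))
    return increments


def two_sample_p_value(a, b):
    """Two-sided p value for a difference in means (normal approximation)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0.0:
        return 1.0
    z = (mean(a) - mean(b)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def estimate_power(n_per_group, group_coef, n_sims=200, alpha=0.05, seed=0):
    """Proportion of simulated data sets in which p < alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        lower_risk = simulate_increments(n_per_group, -3.0, 1.15, rng)
        higher_risk = simulate_increments(n_per_group, -3.0 + group_coef, 1.15, rng)
        hits += two_sample_p_value(lower_risk, higher_risk) < alpha
    return hits / n_sims


power = estimate_power(50, 0.417)    # group effect present
alpha_hat = estimate_power(50, 0.0)  # no group effect: checks the type I error rate
print(power, alpha_hat)
```

Running the same function with a group coefficient of zero mirrors the check of α described above: the rejection rate should then be close to the nominal 0.05.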

Results


et al., 2005], using the iterative generalised least squares method [Goldstein, 1986], and second-order penalised quasi-likelihood (PQL) method to linearise the model [Goldstein and Rasbash, 1996]. This method can sometimes have convergence problems, and can sometimes show bias [Browne and Draper, 2006]. Markov Chain Monte Carlo (MCMC) estimation would be preferable, but these methods are much more computationally intensive, and would take a prohibitively long time to run many simulations. Some MCMC simulations have been carried out in this paper to investigate the accuracy of the PQL estimates. Based on the results of this analysis, the simulated data sets were created by sampling from the binomial distribution with parameters generated by the values of the estimates from the model.
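The sampling step can be sketched in plain Python as below. This is an illustration under stated assumptions rather than the MLwiN syntax used in the study: the number of teeth per position and surfaces per tooth (28 teeth; 4 surfaces on anterior and 5 on posterior teeth) are assumed, group = 1 is taken to be the higher-risk group, and all function names are hypothetical. The coefficients and the participant-level variance are those reported in table 1.

```python
import math
import random

CONSTANT = -3.247
VAR_PERSON = 1.332  # participant-level variance from table 1

# (tooth-position coefficient, teeth per position, surfaces per tooth).
# The tooth and surface counts are illustrative assumptions.
POSITIONS = [
    (0.0, 4, 4),     # upper incisors (reference)
    (-1.040, 2, 4),  # upper canines
    (-0.655, 4, 5),  # upper premolars
    (1.439, 2, 5),   # upper first molars
    (1.200, 2, 5),   # upper second molars
    (-2.510, 4, 4),  # lower incisors
    (-1.928, 2, 4),  # lower canines
    (-0.883, 4, 5),  # lower premolars
    (1.930, 2, 5),   # lower first molars
    (1.361, 2, 5),   # lower second molars
]


def simulate_participant(group, group_coef, var_tooth, rng):
    """Return a list of 0/1 surface outcomes for one simulated participant."""
    v = rng.gauss(0.0, math.sqrt(VAR_PERSON))  # participant random effect
    outcomes = []
    for coef, n_teeth, n_surfaces in POSITIONS:
        for _ in range(n_teeth):
            # Tooth random effect, shared by all surfaces of this tooth.
            u = rng.gauss(0.0, math.sqrt(var_tooth)) if var_tooth > 0 else 0.0
            eta = CONSTANT + group_coef * group + coef + v + u
            p = 1.0 / (1.0 + math.exp(-eta))
            outcomes.extend(1 if rng.random() < p else 0 for _ in range(n_surfaces))
    return outcomes


def simulate_data_set(n_per_group, group_coef, var_tooth, seed=None):
    """Return a list of (group, surface outcomes) pairs for both groups."""
    rng = random.Random(seed)
    return [
        (group, simulate_participant(group, group_coef, var_tooth, rng))
        for group in (0, 1)
        for _ in range(n_per_group)
    ]


data = simulate_data_set(n_per_group=50, group_coef=0.417, var_tooth=1.0, seed=1)
print(len(data), len(data[0][1]))
```

Summing each participant's outcomes gives a per-person caries increment of the kind used in the traditional analysis, while the surface-level records are what a multilevel model would use.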

Table 2. Percentage of models which successfully converged within 100 iterations¹

Group size       PF      Random effect variance
                         0        1        2
50 per group     15%     100      72       0
                 25%     100      84       1
                 35%     100      91       3
150 per group    15%     100      72       0
                 25%     100      94       0
                 35%     100      100      0
300 per group    15%     100      86       0
                 25%     100      99       0
                 35%     100      100      0

¹ All models which successfully converged did so within 50 iterations.

Table 3. Group coefficients (mean ± SD) from two-level model by convergence status of three-level models, for simulated data sets with variance of random tooth effect equal to 1

Group size       PF      Coefficient of group from two-level model
                         failed to converge           converged
50 per group     15%     0.234 ± 0.212 (n = 427)      0.267 ± 0.233 (n = 1,073)
                 25%     0.332 ± 0.225 (n = 244)      0.387 ± 0.222 (n = 1,256)
                 35%     0.497 ± 0.195 (n = 135)      0.530 ± 0.230 (n = 1,365)
150 per group    15%     0.226 ± 0.136 (n = 413)      0.247 ± 0.132 (n = 1,087)
                 25%     0.344 ± 0.141 (n = 84)       0.372 ± 0.127 (n = 1,416)
300 per group    15%     0.224 ± 0.096 (n = 211)      0.240 ± 0.094 (n = 1,289)

Table 4. Mean ± SD of group coefficient estimates for simulated data sets for which the three-level model converged

Group size       PF      True coefficient value    Random effect variance 0    Random effect variance 1
50 per group     15%     0.268                     0.269 ± 0.246               0.306 ± 0.266
                 25%     0.417                     0.419 ± 0.249               0.443 ± 0.253
                 35%     0.591                     0.599 ± 0.246               0.607 ± 0.258
150 per group    15%     0.268                     0.260 ± 0.147               0.283 ± 0.151
                 25%     0.417                     0.412 ± 0.144               0.427 ± 0.145
                 35%     0.591                     0.593 ± 0.148               0.606 ± 0.147
300 per group    15%     0.268                     0.271 ± 0.101               0.276 ± 0.108
                 25%     0.417                     0.415 ± 0.100               0.425 ± 0.105
                 35%     0.591                     0.589 ± 0.103               0.603 ± 0.104

Table 5. Estimated power for each combination of parameters, using the three-level model and the traditional DFS increment analysis (percentage)

Group size       PF      Random effect variance 0                       Random effect variance 1
                         three-level with         DFS increment         three-level with         DFS increment
                         tooth position                                 tooth position
50 per group     15%     18                       21                    20                       23
                 25%     39                       40                    39                       43
                 35%     68                       67                    64                       67
150 per group    15%     43                       44                    44                       48
                 25%     81                       81                    80                       81
                 35%     98                       98                    98                       98
300 per group    15%     76                       74                    71                       73
                 25%     99                       97                    98                       98
                 35%     100                      100                   100                      100

not. The comparisons which are excluded are those where the number of failures was less than 20. As PQL methods have been shown to exhibit bias in certain circumstances, MCMC estimation was carried out on 100 of the data sets with 50 simulated participants per group. The mean difference between the group coefficient estimates in the two methods was 0.0002 (SD 0.02), suggesting that PQL is giving similar results to MCMC in this case.

Tables 2 and 3 both suggest that convergence difficulties increase as the group effect decreases. In table 2, data sets simulated using lower PFs have higher proportions of convergence failure. Table 3 shows that on average, the group coefficient estimated from the two-level model appears to be slightly lower in those data sets where the three-level model failed to converge. However, as this difference is quite small in most cases, the subsequent investigation of the estimated power for the various analysis methods will use only those data sets for which the three-level model converged.

Table 4 shows the mean group coefficient estimates for the simulated data sets. The analysis of data sets simulated with no treatment effect showed that α was within a sampling tolerance of 0.05 for all combinations of parameters and tests. The mean estimated coefficients from the three-level models of the simulated data sets with no random effect are mostly similar to the true values. As this model specification is the one used in the simulation, this is to be expected. For the data sets simulated with the random effect included with variance 1, the estimates from the three-level model are biased upwards. This is likely to be partly due to the exclusion of the data sets where the three-level model did not converge, as they tended to have smaller group effects.

Table 5 shows the estimated power for each combination of parameters, using the three-level model and the traditional DFS increment analysis. The estimated power for the traditional DFS increment method was also very similar to that of the three-level multilevel analysis, with more variation in the smaller data sets (50 per group).
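To judge how much of the difference between two estimated power values could be simulation noise, it can help to look at the Monte Carlo standard error of a proportion. The sketch below uses the usual binomial formula; the paper does not state how its sampling tolerance was defined, so the 1.96 multiplier and the function names are assumptions made for illustration.

```python
import math


def monte_carlo_se(p, n_sims):
    """Binomial standard error of a proportion estimated from n_sims data sets."""
    return math.sqrt(p * (1.0 - p) / n_sims)


def approx_interval(p, n_sims, z=1.96):
    """Approximate 95% range for the estimated proportion."""
    se = monte_carlo_se(p, n_sims)
    return p - z * se, p + z * se


# Type I error check: nominal alpha of 0.05 with 1,500 simulated data sets.
print(approx_interval(0.05, 1500))
# A power estimate of about 40% based on 1,256 converged data sets.
print(approx_interval(0.40, 1256))
```

Under these assumptions, an estimated α between roughly 0.039 and 0.061 would be compatible with the nominal 0.05, and a power estimate near 40% carries an uncertainty of about ±2.7 percentage points. Differences of a few percentage points between methods at 50 per group are therefore of the same order as the simulation error.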

Discussion

This simulation study has shown little difference between the estimated power of multilevel models and traditional DFS increment analysis. It should be noted that although a range of parameters were simulated, all data sets were based on a population of Scottish adolescents with a particular disease level, and results may vary in populations with higher or lower levels of caries. This work does give some indication that the advantages of multilevel modelling in dental caries clinical trials may lie in greater understanding of the data structure and within-mouth patterns of caries development, rather than reduction of required sample size. For example, the analysis presented in table 1 shows the differing effects of tooth position on the probability of developing caries. The multilevel framework could also allow investigation of the efficacy of agents on different surface types (fissure or smooth). Published work on the estimation methods in MLwiN [Browne and Draper, 2006] shows that the default estimation method for multilevel logistic regression, first-order MQL, is significantly biased, with the second-order PQL method giving more accurate estimates, although still sometimes biased, particularly in estimating random effect. Another disadvantage of PQL is that it can fail to converge in some models, as was observed in this article. The MCMC routine has been shown to give the least biased estimates, but this method takes a prohibitively long time to converge, which makes it unrealistic for large-scale simulations. Although the second-order PQL method used has been shown to give reasonably accurate estimates, its failure to converge on some data sets has limited the data available for certain parameter combinations in the simulation study. The lack of evidence for a reduction in required sample size in analysing at tooth and surface level was similar to conclusions drawn by Mancl et al. [2004], in relation to time-to-event analysis with surface as the unit of analysis, where the authors stated that ‘the gain in efficiency due to the use of surface-specific information will most likely be small under most circumstances’ [Mancl et al., 2004]. The work presented here is the first to consider this issue in caries data using a multilevel modelling framework. There is no indication from this work that the use of multilevel modelling using the natural hierarchy of surfaces and teeth clustered within individuals will allow investigators to make significant reductions in the number of participants required in a clinical trial of a caries-preventive agent, compared to the use of the traditional comparison of caries increments. However, multilevel modelling does have the advantage of allowing greater understanding of the patterns of caries development within the mouth. Modelling the random structure allows estimates to be made of the relative variance at individual, tooth and surface level [Burnside et al., 2007]. Therefore, investigators who are interested in exploring the effect of their intervention in more detail may wish to consider the application of multilevel modelling to their clinical trial data.
Acknowledgements

This work has benefited from a research donation from Unilever Oral Care. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Conceived the project: G.B., C.M.P., P.R.W.; planned simulations: G.B., P.R.W.; carried out simulations and data analysis: G.B.; drafted the paper: G.B.; commented on and edited the paper: C.M.P., P.R.W.

Disclosure Statement

There are no conflicts of interest.



References

Ahn C, Jung SH, Donner A: Application of an adjusted chi2 statistic to site-specific data in observational dental studies. J Clin Periodontol 2002;29:79–82.
Begg MD: Analysis of correlated responses; in Lesaffre E, Feine J, Leroux BG, Declerck D (eds): Statistical and Methodological Aspects of Oral Health Research. Chichester, Wiley, 2009.
Browne WJ, Draper D: A comparison of Bayesian and likelihood-based methods for fitting multilevel models. Bayesian Anal 2006;1:473–514.
Burnside G, Pine CM, Williamson PR: Statistical aspects of design and analysis of clinical trials for the prevention of caries. Caries Res 2006;40:360–365.
Burnside G, Pine CM, Williamson PR: The application of multilevel modelling to dental caries data. Stat Med 2007;26:4139–4149.
Donner A, Klar N: Design and Analysis of Cluster Randomization Trials in Health Research. London, Arnold, 2000.
Forgie AH, Paterson M, Pine CM, Pitts NB, Nugent ZJ: A randomised controlled trial of the caries-preventive efficacy of a chlorhexidine-containing varnish in high-caries-risk adolescents. Caries Res 2000;34:432–439.
Gilthorpe MS, Maddick IH, Petrie A: Introduction to multilevel modelling in dental research. Community Dent Health 2000;17:222–226.
Goldstein H: Multilevel mixed linear model analysis using iterative generalized least squares. Biometrika 1986;73:43–56.
Goldstein H: Multilevel Statistical Models, ed 3. London, Arnold, 2003.
Goldstein H, Rasbash J: Improved approximations for multilevel models with binary responses. J R Stat Soc Ser A 1996;159:505–513.
Hannigan A, O’Mullane DM, Barry D, Schafer F, Roberts AJ: A re-analysis of a caries clinical trial by survival analysis. J Dent Res 2001;80:427–431.
Hausen H: How to improve the effectiveness of caries-preventive programs based on fluoride. Caries Res 2004;38:263–267.
Ismail AI: Visual and visuo-tactile detection of dental caries. J Dent Res 2004;83:C56–C66.
Macfarlane TV, Worthington HV: Some aspects of data analysis in dentistry. Community Dent Health 1999;16:216–219.
Mancl LA, Hujoel PP, DeRouen TA: Efficiency issues among statistical methods for demonstrating efficacy of caries prevention. J Dent Res 2004;83(Spec No C):C95–C98.
Marinho VC, Higgins JP, Logan S, Sheiham A: Topical fluoride (toothpastes, mouthrinses, gels or varnishes) for preventing dental caries in children and adolescents. Cochrane Database Syst Rev 2003;4:CD002782.
Rasbash J, Steele F, Browne W, Prosser B: A User’s Guide to MLwiN Version 2.0. Bristol, Centre for Multilevel Modelling, 2005.
Stephenson J, Chadwick BL, Playle RA, Treasure ET: Modelling childhood caries using parametric competing risks survival analysis methods for clustered data. Caries Res 2010;44:69–80.
Wong MC, Lam KF, Lo EC: Analysis of multilevel grouped survival data with time-varying regression coefficients. Stat Med 2011;30:250–259.
