DENTAL-2472; No. of Pages 18

ARTICLE IN PRESS d e n t a l m a t e r i a l s x x x ( 2 0 1 4 ) xxx.e1–xxx.e18

Available online at www.sciencedirect.com

ScienceDirect journal homepage: www.intl.elsevierhealth.com/journals/dema

Two regression methods for estimation of a two-parameter Weibull distribution for reliability of dental materials

Lukas Bütikofer a, Bogna Stawarczyk b, Malgorzata Roos a,∗

a Division of Biostatistics, Institute of Social and Preventive Medicine, University of Zurich, Switzerland
b Department of Prosthodontics, Dental School, Ludwig-Maximilians University, Munich, Germany

Article info

Article history:
Received 31 October 2013
Received in revised form 9 April 2014
Accepted 3 November 2014
Available online xxx

Keywords:
Weibull modulus
Weibull characteristic strength
Least squares
Failure probability
Plotting positions
Mean ranks
Median ranks
Hazen ranks
Confidence interval
Coverage

Abstract

Objectives. Comparison of estimation of the two-parameter Weibull distribution by two least squares (LS) methods with interchanged axes. Investigation of the influence of plotting positions and sample size. Derivation of 95% confidence intervals (95%CI) for Weibull parameters applicable in the context of LS estimation. Preparation of a freely available Excel template for computation of point estimates and 95%CI for Weibull modulus (m) and characteristic strength (s).

Methods. Monte Carlo simulation covering a wide range of Weibull parameters and sample sizes. Mathematical derivation of formulae for computation of 95%CI according to a Menon-type approach for both m and s. Empirical proof that the practically observed coverage agrees with the nominal one of 95%.

Results. Relative and absolute performance of LS estimators depended on sample size, plotting positions and parameter to be estimated. For most situations they outperformed the corresponding Maximum Likelihood (ML) estimator in terms of bias, while precision was almost the same. Naïve Wald-type 95%CI based on standard errors of LS regression coefficients did not reach targeted coverage. An easy-to-apply alternative based on asymptotic standard errors (Menon 95%CI) resulted in excellent coverage.

Conclusion. Accuracy of the LS methods for Weibull modulus and characteristic strength depends essentially on plotting position and sample size. Large sample sizes (n ≥ 30) support credible estimation of the Weibull parameters. An important complement of the point estimates of Weibull parameters is provided by the Menon 95%CI. A freely available Excel template that considerably facilitates computation of point and interval estimates of Weibull parameters is provided.

© 2014 Academy of Dental Materials. Published by Elsevier Ltd. All rights reserved.

1. Introduction

Dental restorative ceramics have excellent properties in terms of esthetics [1] and biocompatibility [2] but are susceptible to brittle fracture, a type of failure that is particularly difficult to predict. In general, ceramics are sensitive to defects or flaws within the material, motivating the application of probabilistic concepts and in particular the weakest link (or largest flaw) model. When measuring the flexural strength of a material, the true value may be envisaged as that of a perfect flaw-free specimen, and variations from this true value are due to defects introduced during preparation. Each specimen contains a discrete flaw population and the largest flaw within each test specimen will cause failure if a uniform tension is applied. The distribution of the largest flaws of a sample of test specimens (and therefore the failure strength) follows an extreme value distribution such as Gumbel, Fréchet or Weibull. Among these, the Weibull distribution [3,4] is most frequently used since it is bounded (lowest possible flexural strength is zero), has great shape flexibility, can provide accurate failure predictions even with a small number of test specimens, and provides a simple interpretation as a linear regression model (Eqs. (3) and (5)) [5].

∗ Corresponding author at: Hirschengraben 84, 8001 Zurich, Switzerland. Tel.: +41 44 634 46 48. E-mail address: [email protected] (M. Roos).

http://dx.doi.org/10.1016/j.dental.2014.11.014 0109-5641/© 2014 Academy of Dental Materials. Published by Elsevier Ltd. All rights reserved.

Please cite this article in press as: Bütikofer L, et al. Two regression methods for estimation of a two-parameter Weibull distribution for reliability of dental materials. Dent Mater (2014), http://dx.doi.org/10.1016/j.dental.2014.11.014

1.1. Weibull distribution

The Weibull distribution allows for prediction of the failure probability at any level of stress, providing information about the reliability of a material. The three-parameter Weibull cumulative distribution function (cdf) relates the probability of failure G(x) = P(X ≤ x) to stress x:

$$G(x) = 1 - \exp\left[-\left(\frac{x - s_0}{s}\right)^{m}\right]$$

where s is a scale parameter called characteristic strength, m a shape parameter called the Weibull modulus and $s_0$ a threshold representing the minimum stress below which a specimen will not break. When the sample size is small (n < 50, typically the case for dental material investigations) and when there is no physical rationale for a non-zero threshold [6], it has been suggested to consider the two-parameter Weibull distribution with $s_0$ = 0 instead [7]. The corresponding cdf and probability density function (pdf) are:

$$G(x) = 1 - \exp\left[-\left(\frac{x}{s}\right)^{m}\right] \qquad (1)$$

$$g(x) = \frac{m}{s}\left(\frac{x}{s}\right)^{m-1} \exp\left[-\left(\frac{x}{s}\right)^{m}\right] \qquad (2)$$
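As a minimal sketch, the two-parameter cdf and pdf of Eqs. (1) and (2) can be written as two small Python functions. The function names and the parameter values (m = 2, s = 10, borrowed from Example 1) are chosen here for illustration only and are not part of the original analysis.

```python
import math


def weibull_cdf(x, m, s):
    """Failure probability G(x) of Eq. (1) for a stress x >= 0."""
    return 1.0 - math.exp(-((x / s) ** m))


def weibull_pdf(x, m, s):
    """Density g(x) of Eq. (2) for a stress x > 0."""
    return (m / s) * (x / s) ** (m - 1) * math.exp(-((x / s) ** m))


# Illustrative parameter values (m = 2, s = 10, as in Example 1).
m, s = 2.0, 10.0
for stress in (5.0, 10.0, 15.0):
    print(stress, weibull_cdf(stress, m, s), weibull_pdf(stress, m, s))
```

At x = s the cdf equals 1 − exp(−1) ≈ 0.632 for any modulus m, which is why the characteristic strength can be read as the stress at which about 63% of specimens have failed.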

Unlike for the normal distribution, the expected value and variance depend on both parameters of the Weibull distribution (Fig. 1a and b). Nevertheless, in the context of dental studies the characteristic strength s can be interpreted approximately as the “location” of the distribution, i.e. the stress at which an average specimen breaks. The Weibull modulus m can be seen as an approximate measure of the precision (inverse of the spread) of the distribution. A larger value indicates a narrower distribution and hence a higher precision; a larger fraction of specimens breaks within a smaller range of applied stress. A large Weibull modulus is a highly desirable property for a dental material as it guarantees more uniform performance among different specimens and therefore a higher reliability.

The Weibull distribution shows great shape flexibility (Fig. 1a and b) [7,8]. For 2 < m < 4 the density is fairly symmetric and is well approximated by a normal distribution (for m = 3.4 the best symmetry is attained). A low value of the modulus (m < 1.25) gives a right-skewed density curve, whereas a high value of the modulus (m > 10) gives a left-skewed one.

Note that there are different parameterizations of the Weibull distribution [4] and, depending on the software package, parameters other than m and s may be estimated. The R function survreg from package survival [9,10], for example, estimates the transformed quantities 1/m and log(s), respectively (Appendix B).

1.2. Monte Carlo simulation

Monte Carlo simulation is a well known approach for performance evaluation of statistical methods [8,11]. In the context of the two-parameter Weibull distribution it consists of the generation of random numbers (strength data) for a given modulus (m) and characteristic strength (s).

Example 1 (Appendix A.3): A sample of n = 5 strength values was generated from a Weibull distribution with m and s set to 2 and 10, respectively.

Using Monte Carlo methods, it is certain that the assumption of the underlying Weibull sampling distribution holds and the exact parameter values of this distribution (m, s) are known. The advantages of such methodology are manifold. First, strength data from a Weibull distribution with precisely defined m and s can be generated without any necessity for execution of real experiments. Second, parameter values can be freely chosen and, e.g. set to values typical for dental material research. Third, as much data as desirable can be generated and sample size can be varied freely. Fourth, a statistical technique can be applied to a given simulated sample in the same way as to measured data. However, in contrast to measured data, parameter estimates obtained for a simulated sample ($\hat{m}$, $\hat{s}$) can be compared to the known true values (m, s) to evaluate the performance of the statistical technique in terms of absolute error and bias. The absolute error is defined as $|\hat{m} - m|$ and $|\hat{s} - s|$, respectively. The bias is defined as the difference between the expected value of the estimator (that can be approximated by the mean of $\hat{m}$ and $\hat{s}$ over a large number of samples) and the true values (m, s).

$$\mathrm{bias}(\hat{m}) = E(\hat{m}) - m \approx \mathrm{mean}(\hat{m}_i) - m$$

$$\mathrm{bias}(\hat{s}) = E(\hat{s}) - s \approx \mathrm{mean}(\hat{s}_i) - s$$

Error and bias close to 0 indicate that the statistical technique works well. A large absolute error and bias indicate that the statistical technique performs poorly when estimating the Weibull parameters of interest.

Example 2 (Table A1): For YonX (Eq. (3)) and mean plotting positions (Eq. (7)), $\hat{m}$ = 1.646 with an absolute error of 0.354. In contrast, for XonY (Eq. (5)) and hazen plotting positions (Eq. (9)), $\hat{m}$ = 3.948 with an absolute error of 1.948.
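The Monte Carlo logic described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: samples are drawn by inverting Eq. (1), and the plain sample mean serves as a deliberately crude stand-in estimator of s (it is not one of the estimators studied in the paper), chosen because its bias is known in closed form as s·Γ(1 + 1/m) − s. All function names are hypothetical.

```python
import math
import random


def simulate_weibull(n, m, s, rng):
    """Draw n strengths by inverting Eq. (1): x = s * (-log(1 - U))**(1/m)."""
    return [s * (-math.log(1.0 - rng.random())) ** (1.0 / m) for _ in range(n)]


def absolute_error(estimate, true_value):
    """Absolute error |estimate - true value| of a single estimate."""
    return abs(estimate - true_value)


def bias(estimates, true_value):
    """Mean of the estimates minus the true value (approximates E(est) - true)."""
    return sum(estimates) / len(estimates) - true_value


def crude_s_estimate(sample):
    """Stand-in estimator (NOT from the paper): the plain sample mean."""
    return sum(sample) / len(sample)


rng = random.Random(1)  # fixed seed, for reproducibility only
m_true, s_true, n, reps = 2.0, 10.0, 5, 2000
estimates = [crude_s_estimate(simulate_weibull(n, m_true, s_true, rng))
             for _ in range(reps)]
print("simulated bias of the stand-in estimator:", bias(estimates, s_true))
print("theoretical bias:", s_true * math.gamma(1.0 + 1.0 / m_true) - s_true)
```

Because the true parameters are known, the simulated bias can be checked against theory; with real measurements no such check is possible.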

1.3. Estimation of the modulus (m) and characteristic strength (s) in a two-parameter Weibull distribution

In practice, the least squares (LS) technique for the estimation of the Weibull parameters is preferred due to its simplicity and independence from sophisticated statistical software [6–8]. As we describe below, there are actually two competing methods for LS estimation [7]. We call them here YonX and XonY, respectively. In fact, both approaches can lead to differing estimates of Weibull parameters. What is more, for LS methods the failure probability for each observation (G(x_i) = G_i, refer to Eq. (1)) has to be estimated. These failure probability estimates are typically referred to as plotting positions, and the definition of the plotting positions affects the LS estimates. Below, three definitions for plotting positions will be investigated: mean ranks, median ranks and hazen ranks. Up to now, a clear evaluation of the performance of the two possible LS methods (YonX vs. XonY) and their interplay with plotting positions for varying sample sizes is still missing.

In general, the LS method is based on the linearized form of the Weibull cdf (taking the logarithm of Eq. (1) twice), allowing for parameter estimation by simple linear regression (ordinary least squares). To fix the notation, assume that Y stands for log(−log(1 − G(x))) and X for log(x). Here, log(x) denotes the logarithm with base e. Let us stress that both LS approaches lead to numerically differing estimates of the Weibull parameters.

Example 3 (Table A1, Fig. 2a): For YonX (Eq. (3)), median plotting positions (Eq. (8)), $\hat{m}$ = 1.911 and for XonY (Eq. (5)), median plotting positions, $\hat{m}$ = 3.431.

Fig. 1 – (a and b) Weibull densities for different values of modulus m and characteristic strength s.
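The two ingredients of the LS approach, the plotting positions (mean, median and hazen ranks, Eqs. (7)–(9)) and the transformed coordinates X = log(x) and Y = log(−log(1 − G)), can be sketched as follows. The function names and the five strength values are hypothetical and serve only as an illustration; ties between observations are not handled.

```python
import math


def plotting_positions(n, kind="median"):
    """Estimated failure probabilities for ranks 1..n (Eqs. (7)-(9))."""
    ranks = range(1, n + 1)
    if kind == "mean":
        return [r / (n + 1) for r in ranks]
    if kind == "median":
        return [(r - 0.3) / (n + 0.4) for r in ranks]
    if kind == "hazen":
        return [(r - 0.5) / n for r in ranks]
    raise ValueError("kind must be 'mean', 'median' or 'hazen'")


def linearize(strengths, kind="median"):
    """Return (X, Y) with X = log(x) and Y = log(-log(1 - G)) for sorted data."""
    xs = sorted(strengths)
    g = plotting_positions(len(xs), kind)
    X = [math.log(x) for x in xs]
    Y = [math.log(-math.log(1.0 - gi)) for gi in g]
    return X, Y


# Hypothetical strength values, for illustration only.
strengths = [4.2, 7.9, 9.6, 12.3, 15.1]
for kind in ("mean", "median", "hazen"):
    X, Y = linearize(strengths, kind)
    print(kind, [round(y, 3) for y in Y])
```

The X values are the same for all three definitions; only the Y values move, which is how the choice of plotting positions feeds into the regression estimates.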

1.3.1. Least squares (YonX)

This frequently used LS approach [5,12] resembles the visual approach for Weibull parameters estimation [8]. Its advantage is that the Weibull modulus can be estimated by the slope of the YonX regression directly. The YonX linear regression takes the form Y = bX + a and the correspondence of the intercept (a) and slope (b) to the functions of m and s can be accomplished by equating the regression parameters with the regression equation obtained from:

$$\log(-\log(1 - G(x))) = m \log(x) - m \log(s) \;\rightarrow\; Y = mX - m \log(s) \qquad (3)$$

leading to:

$$m = b, \qquad s = \exp\left(\frac{-a}{b}\right) \qquad (4)$$

Computation of the least squares regression for Y at the ordinate and X at abscissa leads to the estimates $\hat{a}$ and $\hat{b}$ of the regression coefficients and finally to estimates $\hat{m}$ and $\hat{s}$ of the Weibull parameters.

Example 4 (Table A1, Fig. 2a): For YonX (Eq. (3)), median plotting positions (Eq. (8)), $\hat{a}$ = −5.091 and $\hat{b}$ = 1.911. Consequently, the Weibull parameter estimates can be obtained: $\hat{m} = \hat{b}$ = 1.911, $\hat{s} = \exp(-\hat{a}/\hat{m})$ = 14.350. Absolute errors are 0.089 and 4.350, respectively. Note that in case of YonX the characteristic strength $\hat{s}$ is usually estimated as the uniform stress at which the probability of failure is 0.63 [5,13] rather than by the formula suggested above, $\hat{s} = \exp(-\hat{a}/\hat{m})$.

1.3.2. Least squares (XonY)

The LS regression XonY takes the form X = dY + c with X at ordinate and Y at abscissa. Such a choice is motivated by the fact that X (the actual measurement) varies more than Y and should therefore be the dependent variable in the regression [7,8]. However, this regression is less straightforward for the practical use since the inverse of the Weibull modulus is estimated by its slope.

$$\log(x) = \frac{1}{m} \log(-\log(1 - G(x))) + \log(s) \;\rightarrow\; X = \frac{1}{m} Y + \log(s) \qquad (5)$$

leading to

$$m = \frac{1}{d}, \qquad s = \exp(c) \qquad (6)$$

Computation of the least squares regression for X at the ordinate and Y at abscissa leads to the estimates $\hat{c}$ and $\hat{d}$ of the regression coefficients.

Example 5 (Table A1, Fig. 2a): For XonY (Eq. (5)), median plotting positions (Eq. (8)), $\hat{c}$ = 2.550 and $\hat{d}$ = 0.291. Consequently, the Weibull parameter estimates can be obtained as follows: $\hat{m} = 1/\hat{d}$ = 3.431, and $\hat{s} = \exp(\hat{c})$ = 12.807. Absolute errors of the estimates are 1.431 and 2.807, respectively.
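Both regressions can be sketched with a hand-written ordinary least squares fit, converting the coefficients to m and s via Eqs. (4) and (6). This is a minimal illustration, assuming median ranks (Eq. (8)) as plotting positions; the function names and the sample data are hypothetical, and the data are not those of Appendix A.3.

```python
import math


def ols(u, v):
    """Ordinary least squares fit v = slope * u + intercept."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uu = sum((ui - mean_u) ** 2 for ui in u)
    s_uv = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
    slope = s_uv / s_uu
    return slope, mean_v - slope * mean_u


def weibull_ls(strengths, method="YonX"):
    """LS estimates (m_hat, s_hat) using median ranks as plotting positions."""
    xs = sorted(strengths)
    n = len(xs)
    g = [(r - 0.3) / (n + 0.4) for r in range(1, n + 1)]   # Eq. (8)
    X = [math.log(x) for x in xs]
    Y = [math.log(-math.log(1.0 - gi)) for gi in g]
    if method == "YonX":
        b, a = ols(X, Y)               # Y = b X + a
        return b, math.exp(-a / b)     # Eq. (4)
    if method == "XonY":
        d, c = ols(Y, X)               # X = d Y + c
        return 1.0 / d, math.exp(c)    # Eq. (6)
    raise ValueError("method must be 'YonX' or 'XonY'")


# Hypothetical strength values, for illustration only.
strengths = [4.2, 7.9, 9.6, 12.3, 15.1]
print("YonX:", weibull_ls(strengths, "YonX"))
print("XonY:", weibull_ls(strengths, "XonY"))
```

Because the product of the two fitted slopes equals the squared correlation between X and Y, the XonY estimate of m is never smaller than the YonX estimate for the same plotting positions, which is consistent with Examples 4 and 5.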

1.3.3. Estimation of G(x): plotting positions for LS

A critical point in LS Weibull modeling is the estimation of the failure probability G(x). Discrepancies in the LS Weibull statistics found for different software packages can for example be explained by different estimation techniques for G(x) [6]. Typically an estimate of the failure probability for each observation (termed $\hat{G}_i$) is computed as an order statistic, i.e. a function of the ordered observations ranked from 1 (weakest specimen) to n (strongest specimen). These estimates $\hat{G}_i$ are sometimes referred to as plotting positions. Among numerous different strategies that have been suggested to estimate failure probabilities, the following three are most frequently used [14]:

$$\hat{G}_i = \frac{R_i}{n + 1}, \qquad (7)$$

$$\hat{G}_i = \frac{R_i - 0.3}{n + 0.4}, \qquad (8)$$

$$\hat{G}_i = \frac{R_i - 0.5}{n}, \qquad (9)$$

where R_i indicates the rank of the ith observation and n the sample size. These estimates are called mean (Eq. (7)), median (Eq. (8)) and hazen ranks (Eq. (9)). LS linear regressions of YonX and XonY using the different plotting positions are shown in Fig. 2b and c. Mean ranks have been suggested early [12] and are widely used. Median ranks have been claimed to be most accurate [7], whereas hazen ranks have been shown to lead to the least biased estimates for medium to large sample sizes (n > 20) [11,15]. In particular, mean ranks have been used in combination with regression of YonX [12] and median ranks in combination with XonY [7]. Bias caused by plotting positions in the context of the XonY estimation is also discussed in [8].

Example 6 (Table A1, Fig. 2b): Errors and bias can also be caused by the plotting positions definition within one LS estimation technique. For example for YonX regression: $\hat{m}$ (mean ranks) = 1.646, $\hat{m}$ (median ranks) = 1.911, $\hat{m}$ (hazen ranks) = 2.164 can differ considerably. Absolute errors of the estimates are 0.354, 0.089 and 0.164, respectively.

Fig. 2 – The two different least square linear regression approaches for the data in Appendix A.3 generated for m = 2, s = 10 and n = 5. (a) Comparison of YonX and XonY in the same plot using median ranks for plotting positions. Note that the two regressions do not have identical slopes and intercepts. (b and c) Influence of plotting positions on regression of YonX and XonY.

1.3.4. Maximum likelihood (ML)

ML is an alternative estimation technique. It has attractive mathematical properties such as consistency, asymptotic normality and efficiency and allows a simple construction of confidence intervals [7,8]. ML estimates are calculated by maximizing the likelihood of m and s given the data.

$$\arg\max_{m,s} L(m, s \mid x_1, \ldots, x_n) = \arg\max_{m,s} \prod_{i=1}^{n} g(x_i) = \arg\max_{m,s} \prod_{i=1}^{n} \frac{m}{s}\left(\frac{x_i}{s}\right)^{m-1} \exp\left[-\left(\frac{x_i}{s}\right)^{m}\right]$$

Solution of the maximization problem typically requires statistical software packages.
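For readers without access to a statistical package, the ML estimates can also be obtained with a short numerical routine. The sketch below is one standard way of doing so and is not the procedure used in the paper: for a fixed m the likelihood is maximized by s^m = mean(x_i^m), and the remaining one-dimensional equation in m is solved by bisection. The function name `weibull_ml` and the sample data are hypothetical.

```python
import math


def weibull_ml(strengths, rel_tol=1e-12):
    """ML estimates (m_hat, s_hat) for a two-parameter Weibull sample."""
    xs = list(strengths)
    n = len(xs)
    if n < 2 or min(xs) <= 0 or min(xs) == max(xs):
        raise ValueError("need at least two distinct positive strengths")
    x_max = max(xs)
    logs = [math.log(x) for x in xs]
    mean_log = sum(logs) / n

    def score(m):
        # Derivative of the profile log-likelihood in m, divided by n.
        w = [(x / x_max) ** m for x in xs]   # rescaled to avoid overflow
        weighted_log = sum(wi * li for wi, li in zip(w, logs)) / sum(w)
        return 1.0 / m + mean_log - weighted_log

    lo, hi = 1e-3, 1.0
    while score(hi) > 0:                     # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > rel_tol * hi:            # bisection: score decreases in m
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    m_hat = 0.5 * (lo + hi)
    s_hat = x_max * (sum((x / x_max) ** m_hat for x in xs) / n) ** (1.0 / m_hat)
    return m_hat, s_hat


# Hypothetical strength values, for illustration only.
print(weibull_ml([4.2, 7.9, 9.6, 12.3, 15.1]))
```

Bisection is chosen here for robustness rather than speed; a Newton iteration would converge faster but needs a reasonable starting value.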

1.4. Estimation of 95% confidence intervals (95%CI) for characteristic strength and modulus of a two-parameter Weibull distribution

As a matter of fact, point estimates (e.g. $\hat{m}$ and $\hat{s}$ for the Weibull parameters m and s) are imprecise and depend on a given data sample. Even when a large sample is provided the estimates $\hat{m}$ and $\hat{s}$ will never be exactly equal to the true and unknown values m and s (except for infinite sample size or zero variability within the sample).

Example 7 (Table A1): There is always an error in $\hat{m}$ and $\hat{s}$, ranging from 0.089 to 1.948 and from 2.681 to 4.668, respectively.

In order to obtain a measure of how well a point estimate describes the true value, there is a strong need for complementing the point estimates with interval estimates. In fact, it is good statistical practice to not only report point but also interval estimates [16]. Such an interval estimate is provided by a 95%CI. It defines a range that would, in the long run, include the true parameter value in 95% of the cases. The width of a 95%CI provides information about the precision of the point estimate. A narrow 95%CI results from large sample size and/or low variation within the sample and indicates a precise estimation.

Example 8 (Table A1): The 95%CI suggests a range that will in most cases include the true value. In our example the true parameters m = 2 and s = 10 are covered by all 95%CI. For example for YonX with mean plotting positions the true value 2 is contained in (0.656; 4.128) and 10 is included in (8.250; 26.078). However, the 95%CI are very wide, reflecting low sample size and high variability within the sample.

A very important measure for evaluation of a 95%CI estimate is the coverage. It is defined as the fraction of repeated experiments for which the 95%CI actually contains the true value and is desired to be close to 95%. Of course, it is impossible to check the coverage for a real sample due to the unknown underlying sampling distribution and true parameter values. For Monte Carlo simulated data, however, the coverage of the 95%CI can be investigated directly.

Although the importance and relevance of estimation of 95%CI for Weibull parameters is widely acknowledged [7,8,17], they are not always reported. The main reason is that, although the computation of the point estimates of the Weibull distribution for both LS approaches discussed above (YonX and XonY) is well defined, it is actually unclear how the 95%CI estimates should be computed in such a case. For ML estimators, in contrast, the construction of 95%CI is well-established using the so-called Wald approach:

$$95\%\,\mathrm{CI}_{w} = \text{estimate} \pm z_{0.975} \times \mathrm{se}(\text{estimate}) \qquad (10)$$

where z0.975 = 1.96 denotes the 0.975 quantile of the standard normal distribution and “se” the standard error. 95%CI can also be calculated based on 1-to-1 transformation of point estimates (e.g. the logarithm) if standard errors are adapted appropriately (e.g. using the delta method [18]). Such 95%CI may have superior properties in some cases. In context of LS estimation of Weibull parameters, it is necessary to clarify if the Wald-type 95%CI may be used together with se obtained from LS estimation (YonX or XonY). If not, it is essential to develop a method which provides se and 95%CI for those estimators.

To our knowledge there are no public domain programs providing Weibull parameter estimates for both LS procedures YonX and XonY for different plotting positions techniques (mean, median and hazen ranks), which are complemented by 95%CI. Therefore, it is highly desirable to develop a freely accessible Excel template, where practitioners can input their data and obtain instantaneously not only point estimates for modulus (m) and characteristic strength (s) by all methods discussed here but also the corresponding 95%CI.

1.5. Hypotheses

Using stochastic simulations, this study compares linear regression of YonX and XonY in combination with different plotting positions. The aim is to investigate whether the two methods are equivalent and whether plotting positions do influence the estimated Weibull parameters or the relative performance of the methods. Simulations include a broad spectrum of sample sizes and absolute values of the Weibull parameters to test for respective influences. The two LS methods are evaluated against the well-known ML estimators. Finally, two different types of 95%CI are outlined and their feasibility for LS estimators is tested.

2. Materials and methods

2.1. Software

All analyses were performed in the R programming language Version 2.7.0 (www.r-project.org) [9].

2.2. Simulation procedure

Monte Carlo simulation was performed according to potential values in dental studies [6]. The exact values chosen were as follows:

m = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, (11)

s = 10, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, (12)

n = 3, 4, 5, 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100. (13)

Procedures of the stochastic simulations are outlined in Fig. 3. For a specific set of parameter values, a master sample of size N was simulated from the corresponding Weibull distribution. Samples of different sizes n were generated by using the first n data points of the master sample. Smaller samples were therefore always subsets of larger samples. This procedure reduced the random variability between samples of different sizes and allowed for improved analysis of the influence of sample size. Two different procedures were used to either directly compare the performance of the two estimation methods under investigation (YonX vs. XonY) (Fig. 3a) or to evaluate their characteristics together with the ML (Fig. 3b).
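The nested sampling scheme described above (one master sample, with each sample of size n taken as its first n points) can be sketched as follows. The sketch assumes that the master sample size N equals the largest n in Eq. (13); the source does not state N here, so this choice, the seed and the function names are illustrative assumptions.

```python
import math
import random

# Sample sizes of Eq. (13).
SAMPLE_SIZES = [3, 4, 5, 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100]


def master_sample(size, m, s, rng):
    """Simulate `size` Weibull strengths by inverse-transform sampling."""
    return [s * (-math.log(1.0 - rng.random())) ** (1.0 / m) for _ in range(size)]


def nested_samples(master, sizes):
    """Take the first n points of the master sample for every n in sizes."""
    return {n: master[:n] for n in sizes}


rng = random.Random(42)  # arbitrary seed, for reproducibility only
master = master_sample(max(SAMPLE_SIZES), m=2.0, s=10.0, rng=rng)  # assumes N = 100
samples = nested_samples(master, SAMPLE_SIZES)
print(samples[3])
print(samples[5][:3] == samples[3])  # smaller samples are prefixes of larger ones
```

Reusing the same draws across sample sizes means that differences between, say, n = 5 and n = 10 reflect the five added observations rather than an entirely new random sample.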

2.3. Comparison of LS approaches YonX and XonY for point estimation of m and s

For direct comparison of two estimation methods (Fig. 3a), R = 10,000 samples were simulated for a specific set of parameter values and sample size (Eqs. (11)–(13)). For one particular choice of m and s, Weibull parameters were estimated with the two methods under comparison (YonX and XonY), leading to 10,000 method-specific $\hat{m}_i$ and $\hat{s}_i$. The method resulting in estimates closer to the true value (i.e. smaller absolute error $\varepsilon(m) = |\hat{m}_i - m|$ or $\varepsilon(s) = |\hat{s}_i - s|$, respectively) was evaluated. The fraction of the 10,000 samples for which one method resulted in estimates with smaller absolute errors was computed. This was achieved by summing up an appropriate indicator function. The indicator function equals 1 if its condition is fulfilled (e.g. $\varepsilon_{YonX}(m)$ smaller than $\varepsilon_{XonY}(m)$) and 0 otherwise.
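The indicator-function comparison described above amounts to counting, over paired simulated samples, how often one method has the smaller absolute error. A minimal sketch, with a hypothetical function name and four made-up pairs of estimates in place of the 10,000 simulated ones:

```python
def fraction_better(estimates_a, estimates_b, true_value):
    """Fraction of paired samples in which method A has the smaller absolute error."""
    indicator = [1 if abs(a - true_value) < abs(b - true_value) else 0
                 for a, b in zip(estimates_a, estimates_b)]
    return sum(indicator) / len(indicator)


# Hypothetical estimates of m from four simulated samples (true m = 2).
m_yonx = [1.9, 2.4, 1.5, 2.1]
m_xony = [2.3, 2.2, 2.6, 1.7]
print(fraction_better(m_yonx, m_xony, true_value=2.0))
```

A fraction above 0.5 would indicate that the first method is closer to the true value in the majority of samples; ties count against it in this sketch because the comparison is strict.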

2.4. Evaluation of bias for different estimators of m and s

For a specific set of parameter values and sample size (Eqs. (11)–(13)), simulations were repeated R = 1000 times. Estimates for Weibull parameters ($\hat{m}_i$ and $\hat{s}_i$) were computed based on YonX, XonY or ML methods for each repetition (Fig. 3b).

Fig. 3 – Scheme of Monte Carlo simulations for (a) direct comparison of YonX and XonY or (b) evaluation of the characteristics of a specific estimation method with respect to (b1) point estimates and (b2) interval estimates. R: number of repetitions, 1000 or 10,000; cv: coefficient of variation; sd: standard deviation; cov: coverage; CI_M: Menon 95%CI; CI_LS: least squares 95%CI; CI_ML: maximum likelihood 95%CI; I: indicator function, 1 if condition fulfilled, 0 otherwise.

The R estimates obtained for a specific method, set of parameter values and sample size were averaged (leading to $\bar{\hat{m}}$ and $\bar{\hat{s}}$) and normalized with respect to the true values of m and s ($\bar{\hat{m}}/m$ and $\bar{\hat{s}}/s$). The deviation of these averaged normalized estimates from one (exact match) can be interpreted as an approximation for the relative bias:

$$\text{relative bias}(\hat{m}) = \frac{\mathrm{bias}(\hat{m})}{m} = \frac{E(\hat{m}) - m}{m} \approx \frac{\mathrm{mean}(\hat{m}_i) - m}{m} = \frac{\bar{\hat{m}}}{m} - 1 \qquad (14)$$

where the expectation is approximated by the average over a large number of repetitions (R). The same equation can be formulated for s. Standard deviation (sd) and coefficient of variation (cv = sd/mean) were calculated in order to obtain information about the precision of the estimators under investigation. For an overview, relative biases and cv were averaged for 64 different combinations of true parameter values (m = 1, 2, 3, 4, 5, 10, 15, 20, s = 10, 20, 40, 100, 200, 500, 1000, 2000).

2.5. Interval estimates: 95%CI

2.5.1. Formulae for 95%CI

The following formulas are derived and empirically verified in Appendix B.

We suggest the use of a Wald-type 95%CI based on a se formulated by Menon [19]. It is termed Menon 95%CI (95%CI_M).

$$\begin{aligned}
95\%\,\mathrm{CI}_{M}(m),\ \text{lower} &= \frac{\hat{m}}{\exp\left(z_{0.975}\sqrt{1.1/n}\right)} \\
95\%\,\mathrm{CI}_{M}(m),\ \text{upper} &= \hat{m}\,\exp\left(z_{0.975}\sqrt{\frac{1.1}{n}}\right)
\end{aligned} \qquad (15)$$

$$\begin{aligned}
95\%\,\mathrm{CI}_{M}(s),\ \text{lower} &= \frac{\hat{s}}{\exp\left(z_{0.975}\sqrt{1.168/n}\,(1/\hat{m})\right)} \\
95\%\,\mathrm{CI}_{M}(s),\ \text{upper} &= \hat{s}\,\exp\left(z_{0.975}\sqrt{\frac{1.168}{n}}\,\frac{1}{\hat{m}}\right)
\end{aligned} \qquad (16)$$

These 95%CI_M are compared with naïve Wald-type 95%CI based on se of the regression coefficients (YonX: intercept a and slope b; XonY: intercept c and slope d). They are denoted CI_LS.

$$\begin{aligned}
95\%\,\mathrm{CI}_{LS,YonX}(m),\ \text{lower} &= \hat{b} - z_{0.975}\,\mathrm{se}(\hat{b}) \\
95\%\,\mathrm{CI}_{LS,YonX}(m),\ \text{upper} &= \hat{b} + z_{0.975}\,\mathrm{se}(\hat{b})
\end{aligned} \qquad (17)$$

$$\begin{aligned}
95\%\,\mathrm{CI}_{LS,YonX}(s),\ \text{lower} &= \exp\left[\exp\left(\log\left(-\frac{\hat{a}}{\hat{b}}\right) - z_{0.975}\sqrt{\left(\frac{\mathrm{se}(\hat{a})}{\hat{a}}\right)^{2} + \left(\frac{\mathrm{se}(\hat{b})}{\hat{b}}\right)^{2}}\right)\right] \\
95\%\,\mathrm{CI}_{LS,YonX}(s),\ \text{upper} &= \exp\left[\exp\left(\log\left(-\frac{\hat{a}}{\hat{b}}\right) + z_{0.975}\sqrt{\left(\frac{\mathrm{se}(\hat{a})}{\hat{a}}\right)^{2} + \left(\frac{\mathrm{se}(\hat{b})}{\hat{b}}\right)^{2}}\right)\right]
\end{aligned} \qquad (18)$$

$$\begin{aligned}
95\%\,\mathrm{CI}_{LS,XonY}(m),\ \text{lower} &= \left[\hat{d} - z_{0.975}\,\mathrm{se}(\hat{d})\right]^{-1} \\
95\%\,\mathrm{CI}_{LS,XonY}(m),\ \text{upper} &= \left[\hat{d} + z_{0.975}\,\mathrm{se}(\hat{d})\right]^{-1}
\end{aligned} \qquad (19)$$

$$\begin{aligned}
95\%\,\mathrm{CI}_{LS,XonY}(s),\ \text{lower} &= \exp\left[\hat{c} - z_{0.975}\,\mathrm{se}(\hat{c})\right] \\
95\%\,\mathrm{CI}_{LS,XonY}(s),\ \text{upper} &= \exp\left[\hat{c} + z_{0.975}\,\mathrm{se}(\hat{c})\right]
\end{aligned} \qquad (20)$$

Note that the CI_LS,YonX for s is computed on the double logarithmic scale and back-transformed using the double exponential function. However, it has not yet been investigated whether these naïve 95%CI_LS fulfill basic requirements, i.e. whether they cover the true parameter value in the assigned fraction of cases.

For ML, Wald-type 95%CI_ML are frequently applied and well-established. Standard errors for ML estimates can be computed based on the square root of the inverse observed Fisher Information matrix and are typically given by statistical software packages. If different parameterizations are used, corresponding 95%CI_ML can be back-transformed to m and s using the appropriate transformation.

2.5.2. Coverage of 95%CI_M

Simulations for analysis of the coverage of 95%CI_M were performed according to the outline in Fig. 3b with R = 10,000 repetitions. A variety of true parameter values and sample sizes were used (Eqs. (11)–(13)). 95%CI_M,i were calculated based on YonX and XonY for each of the R = 10,000 samples simulated from a Weibull distribution with specific true parameter values m and s, and sample size n. Coverage of 95%CI_M with respect to m and s was computed as the fraction of CI_M,i that contained the true parameter value m and s, respectively.

2.5.3. Comparison of 95%CI_M, 95%CI_LS and 95%CI_ML

Simulations were performed according to the scheme in Fig. 3b with fixed true parameter values m = 2 and s = 10 and R = 1000 repetitions. A variety of sample sizes was used (Eq. (13)). For each of R = 1000 samples of a specific size n, 95%CI_M,i and 95%CI_LS,i for LS estimators as well as 95%CI_ML,i for ML estimators were computed. Coverage with respect to m and s was evaluated as the fraction of 95%CI_i that contained the true parameter values.

2.5.4. Excel-template (Appendix A)

A free Excel-template “Weibull YonX XonY Calculator.xlsx” is provided,1 calculating point and interval estimates (95%CI) of Weibull parameters for a given sample. Point estimates based on all LS methods and plotting positions described above were obtained by applying formulae provided in Eqs. (4) and (6). 95%CI were calculated based on the Menon approach using Eqs. (15) and (16). The use of the template is twofold. First, measurements can be input to get instantaneous estimates for Weibull parameters according to both YonX and XonY estimation approaches for mean, median and hazen plotting positions (sheet1). Second, given a sample size and estimates of the modulus and the characteristic strength obtained by an LS approach the corresponding 95%CI can be computed (sheet2).

1 Excel-template is available as Online Supplementary material.
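The second use of the template (sample size plus point estimates in, 95%CI out) can be mirrored by a short function implementing Eqs. (15) and (16). This is a sketch under stated assumptions rather than a transcription of the Excel template: the function name `menon_ci` is hypothetical, and the input ŝ = 14.668 is not quoted in the text but back-calculated here as the geometric midpoint of the interval reported in Example 8.

```python
import math
from statistics import NormalDist


def menon_ci(m_hat, s_hat, n, level=0.95):
    """Menon-type confidence intervals for m and s (Eqs. (15) and (16))."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)            # z_0.975 for level = 0.95
    factor_m = math.exp(z * math.sqrt(1.1 / n))
    factor_s = math.exp(z * math.sqrt(1.168 / n) / m_hat)
    ci_m = (m_hat / factor_m, m_hat * factor_m)
    ci_s = (s_hat / factor_s, s_hat * factor_s)
    return ci_m, ci_s


# YonX with mean ranks, n = 5 (Example 8); s_hat is an inferred value, see above.
ci_m, ci_s = menon_ci(m_hat=1.646, s_hat=14.668, n=5)
print("95%CI for m:", ci_m)
print("95%CI for s:", ci_s)
```

With these inputs the intervals should land close to those quoted in Example 8, up to rounding of the point estimates. Both intervals are symmetric on the logarithmic scale, so the bounds for m and s are always positive.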

3. Results

3.1. YonX or XonY

The fraction of samples for which YonX resulted in smaller absolute errors than XonY for estimation of m and s, respectively, is shown in Fig. 4. Each plot represents these fractions for combinations of m along values in Eq. (11) and s along values in Eq. (12) for a specific sample size (5, 20 or 100) and plotting positions (mean, median or hazen ranks) for estimation of either m or s (left and right half of Fig. 4). Strikingly, the magnitude of the true parameter value did not appear to have an influence on which method resulted in more accurate estimates, as the plots were very homogeneous. In the majority of samples XonY resulted in more accurate estimates for m in combination with mean and median ranks and YonX in combination with hazen ranks. With increasing n, the fraction of samples for which XonY performed better tended to increase for median and hazen ranks. For example, XonY was clearly the better method in combination with median ranks for large samples, while the difference was less clear for small samples. Similarly, YonX was better with hazen ranks for small samples but both methods performed equally well for large samples. A difference between the two linear regression methods was hardly detectable for $\hat{s}$. XonY performed slightly better than YonX in combination with mean ranks.

3.2. Bias and variability of the estimators

Relative biases of $\hat{m}$ and $\hat{s}$ averaged for different combinations of true parameter values and evaluated as a function of sample size are shown in Fig. 5. Generally, estimates obtained with all methods under investigation were approaching the true values with increasing sample size (Fig. 5a). Improvement was most striking for sample sizes between three and 30 and rather small for samples larger than 50. Differences between estimation methods decreased with sample size and eventually all estimates almost coincided at the true parameter values. Relative biases were clearly larger in terms of absolute values for $\hat{m}$ than for $\hat{s}$. This was true for almost all sample sizes but particularly evident for small samples. A rough summary of the following findings is shown in Table 2a.
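The summary quantities reported in this section, relative bias (Eq. (14)) and coefficient of variation (cv = sd/mean), can be computed from a set of simulated estimates as sketched below. The function names and the five estimates are hypothetical, and the sample standard deviation (divisor R − 1) is an assumption; the text does not state which divisor was used.

```python
import statistics


def relative_bias(estimates, true_value):
    """Mean estimate divided by the true value, minus one (Eq. (14))."""
    return statistics.mean(estimates) / true_value - 1.0


def coefficient_of_variation(estimates):
    """cv = sd / mean, using the sample standard deviation (assumed divisor R - 1)."""
    return statistics.stdev(estimates) / statistics.mean(estimates)


# Hypothetical estimates of m from five simulated samples (true m = 2).
m_estimates = [1.8, 2.3, 2.6, 1.9, 2.4]
print("relative bias:", relative_bias(m_estimates, 2.0))
print("cv:", coefficient_of_variation(m_estimates))
```

A positive relative bias corresponds to overestimation of the parameter on average and a negative one to underestimation, independently of the magnitude of the true value.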


Fig. 4 – Fraction of 10,000 simulated samples for which YonX has a smaller absolute error than XonY for estimation of Weibull modulus m (left half) and characteristic strength s (right half). Different combinations of true parameter values m and s, and different sample sizes n are indicated. Numbers represent the average over the respective field.

Mean ranks based estimates for m showed relatively low biases when sample size was small but the improvement with increasing sample size was only moderate. In accordance with the previous findings, the regression of XonY performed better than YonX over almost the whole range but especially for small samples. Hazen ranks based estimates for m showed very little bias when the sample size was large (n > 50) but were more biased than those based on mean ranks for small samples (n < 10). In contrast to mean ranks, YonX was less biased than XonY in combination with hazen ranks for all sample sizes tested. Median ranks based estimates for m showed intermediate characteristics. Regression of YonX appeared to result in less biased estimates for small samples, whereas XonY was better for large samples. Interestingly, hazen ranks overestimated m in almost all situations, whereas mean ranks tended to underestimate m as long as sample size was not very small (n ≥ 5). Median ranks resulted in overestimation in combination with XonY and underestimation with YonX. In comparison, ML estimates for m behaved poorly when the sample size was small (n < 10), were generally more biased than most LS estimates for samples smaller than 50 and tended to overestimate the true parameter value. A dramatic overestimation of m of more than two-fold was found for small samples.

Estimate ŝ depended less on the estimation approach and most methods provided good estimates even for sample sizes as small as three. In general, mean ranks were worse than median and hazen ranks and XonY performed slightly better than YonX in most situations. However, differences were small and corresponding confidence intervals were overlapping (data not shown). The coefficient of variation (cv = sd/mean) decreased with increasing sample size, reflecting the associated gain in precision (Fig. 5b). Somewhat similar to bias, improvement was greatest between 3 and 30 and only moderate for samples larger than 50. Generally, cv of m̂ was larger than that of ŝ, suggesting that s can be estimated not only with less bias but also with higher precision. Surprisingly, all regression methods showed a very similar cv, and only for large sample sizes may ML be slightly more precise. In order to analyze the influence of the true parameter values on relative bias, the results for a number of different parameter values are shown separately in Fig. 6. Relative bias of m̂ was only marginally influenced by the magnitude of the true values of m or s over the whole range tested. Relative bias of ŝ was not much affected by the true value of s either but clearly increased with decreasing m, in particular for small m. However, the relationship between estimation


Fig. 5 – Bias and variability for estimation of Weibull modulus m and characteristic strength s by YonX, XonY, and ML. (a) Relative bias of m̂ and ŝ estimated by the different methods. (b) Variability of the estimates in terms of coefficient of variation (cv = sd/mean). 64 combinations of true values between 1 and 10 (m) and 10 and 2000 (s) were used.


Fig. 6 – Influence of the true values of Weibull modulus m and characteristic strength s on relative bias of YonX, XonY, and ML. Median ranks were used for plotting positions.

methods seemed not to be affected by the magnitude of the true parameter values. Similar results were obtained for the cv (results not shown).
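The relative bias and coefficient of variation discussed in this section can be approximated with a short Monte Carlo sketch. The Python below is illustrative only and is not the simulation code used for the study: the function names are hypothetical, the linearization Y = log(−log(1 − G_i)), X = log(x_i) and the median-rank formula (i − 0.3)/(n + 0.4) are assumed standard choices, and relative bias is assumed to be (mean estimate − true value)/true value.

```python
import math
import random
import statistics


def ls_estimate(sample, method="YonX"):
    """LS estimates (m_hat, s_hat) with median ranks (assumed formula)."""
    n = len(sample)
    x = [math.log(v) for v in sorted(sample)]
    y = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    if method == "YonX":
        # Y regressed on X: slope = m, intercept = -m * log(s)
        slope = sxy / sum((a - xbar) ** 2 for a in x)
        return slope, math.exp(xbar - ybar / slope)
    # X regressed on Y: slope = 1/m, intercept = log(s)
    slope = sxy / sum((b - ybar) ** 2 for b in y)
    return 1.0 / slope, math.exp(xbar - slope * ybar)


def relative_bias_and_cv(m, s, n, n_sim=1000, method="YonX", seed=0):
    """Monte Carlo relative bias and cv (= sd/mean) of m_hat and s_hat."""
    rng = random.Random(seed)
    m_hats, s_hats = [], []
    for _ in range(n_sim):
        # random.weibullvariate(alpha, beta): alpha is the scale, beta the shape
        sample = [rng.weibullvariate(s, m) for _ in range(n)]
        m_hat, s_hat = ls_estimate(sample, method)
        m_hats.append(m_hat)
        s_hats.append(s_hat)
    result = {}
    for name, estimates, true_value in (("m", m_hats, m), ("s", s_hats, s)):
        mean = statistics.mean(estimates)
        result[name] = {
            "rel_bias": (mean - true_value) / true_value,
            "cv": statistics.stdev(estimates) / mean,
        }
    return result


for n in (5, 20, 100):
    for method in ("YonX", "XonY"):
        print(n, method, relative_bias_and_cv(2.0, 10.0, n, method=method))
```

With a fixed seed the output is reproducible, but the exact numbers depend on the number of simulated samples and will not match the figures digit for digit.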

3.3. Coverage

Coverage of 95%CIM for m and s estimated by LS methods was computed for a variety of true parameter values according to Eqs. (11)–(13). Results for m < 1 and m ≥ 1 were identical and only the latter are shown in Fig. 7. Importantly, coverage of 95%CIM did not depend on the true parameter values. Coverage of m was generally excellent and only YonX together with mean ranks showed some undercoverage. Coverage of s tended to be low for small samples (n = 5) if median and hazen ranks were used but was in the targeted region for samples with n ≥ 20, regardless of estimator, plotting positions or true parameter value. In order to compare 95%CIM and 95%CILS, true parameter values were fixed at m = 2 and s = 10. Coverage for 95%CIM was in the targeted region for all estimation methods and, surprisingly, even for small sample sizes (Fig. 8a). In accordance with the findings above, some differences between estimation techniques were found, in particular for coverage of m (refer to Table 2b for a summary). Coverage of s was not affected by whether YonX or XonY was used but depended on plotting positions for small sample sizes. Mean ranks resulted in best coverage for n < 10, while hazen ranks showed clear undercoverage. Median ranks again had intermediate properties. Table 1 further reveals characteristics of Menon-type CI applied to the LS estimators. 95%CIM were not symmetric and


Fig. 7 – Coverage of Weibull modulus m (top half) and characteristic strength s (bottom half) for 95%CIM based on YonX (left half) and XonY (right half) for 10,000 simulated samples. Different combinations of true parameter values m and s, and different sample sizes n are indicated. Numbers represent the average over the respective field.

always positive, as they were calculated on the log-scale. Their widths decreased with increasing sample size. 95%CIM were getting wider from mean to median to hazen ranks as well as from YonX to XonY for m, whereas the opposite is the case for s.

Surprisingly, coverage was very poor for naïve 95%CILS for all sample sizes (Fig. 8b) and the application of these CI cannot be recommended. In most situations dramatic undercoverage was found and only about 20% of the 95%CILS actually included the true parameter values. 95%CIML , in contrast,


Fig. 8 – Coverage of Weibull modulus m and characteristic strength s for (a) 95%CIM and (b) 95%CILS (for YonX and XonY) or 95%CIML (for ML). True parameter values: m = 2 and s = 10.

showed targeted coverage for sufficiently large sample sizes (n > 15).
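Empirical coverage of this kind can be estimated by simulating many samples and counting how often the interval contains the true value. The following Python is a minimal sketch under stated assumptions, not the authors' simulation code: function names are hypothetical, median ranks (i − 0.3)/(n + 0.4) are assumed for the plotting positions, and the interval half-widths on the log scale are taken as z·sqrt(1.1/n) for m and z·sqrt(1.168/n)/m̂ for s, following the Menon-type construction in Appendix B.

```python
import math
import random
from statistics import NormalDist


def ls_estimate(sample, method="XonY"):
    """LS estimates (m_hat, s_hat) with median ranks (assumed formula)."""
    n = len(sample)
    x = [math.log(v) for v in sorted(sample)]
    y = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    if method == "YonX":
        slope = sxy / sum((a - xbar) ** 2 for a in x)
        return slope, math.exp(xbar - ybar / slope)
    slope = sxy / sum((b - ybar) ** 2 for b in y)
    return 1.0 / slope, math.exp(xbar - slope * ybar)


def menon_coverage(m, s, n, n_sim=2000, method="XonY", seed=0):
    """Fraction of simulated 95% Menon-type CI that contain the true m and s."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)
    half_m = z * math.sqrt(1.1 / n)  # does not depend on m_hat
    hits_m = hits_s = 0
    for _ in range(n_sim):
        # random.weibullvariate(alpha, beta): alpha is the scale, beta the shape
        sample = [rng.weibullvariate(s, m) for _ in range(n)]
        m_hat, s_hat = ls_estimate(sample, method)
        half_s = z * math.sqrt(1.168 / n) / m_hat
        # The CI is symmetric on the log scale, so compare log distances
        hits_m += abs(math.log(m_hat) - math.log(m)) <= half_m
        hits_s += abs(math.log(s_hat) - math.log(s)) <= half_s
    return hits_m / n_sim, hits_s / n_sim


for n in (5, 20, 100):
    print(n, menon_coverage(2.0, 10.0, n))
```

Checking coverage on the log scale is equivalent to checking whether the back-transformed interval contains the true value, because the exponential function is monotone.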

4. Discussion

The Weibull distribution is widely used in material science to model fracture probability as a function of applied stress [12]. In particular, brittle fracture of ceramic materials such as dental restorative ceramics has been predicted based on the Weibull distribution [5,20]. A three-parameter Weibull model may lead to the best fit as the third parameter is necessary to model strength data collected from specimens that contain residual stress, which exists, for example, in bilayered ceramic–ceramic and metal–ceramic restorations. However, the two-parameter model has been recommended for small sample sizes and in the absence of a physical justification for a non-zero threshold [6,7]. As discussed in [6], when analyzing real data, no strong


Table 1 – Characteristics of 95%CI for Weibull modulus m and characteristic strength s averaged over 1000 samples of size n simulated from a Weibull distribution with true parameter values m = 2 and s = 10. 95%CIM and 95%CIML are shown for LS (YonX, XonY) and ML, respectively.

                         Weibull modulus m                  Weibull characteristic strength s
Method  n    Gi          95%CI          Width   Coverage    95%CI            Width   Coverage
YonX    5    Mean        [0.69, 4.32]   3.63    0.93        [5.60, 21.55]    15.95   0.96
        5    Median      [0.80, 5.05]   4.25    0.96        [5.96, 18.65]    12.69   0.92
        5    Hazen       [0.92, 5.75]   4.84    0.96        [6.25, 16.89]    10.64   0.89
        20   Mean        [1.13, 2.82]   1.70    0.92        [7.80, 13.65]    5.86    0.96
        20   Median      [1.21, 3.03]   1.82    0.95        [7.87, 13.24]    5.37    0.95
        20   Hazen       [1.28, 3.21]   1.93    0.96        [7.93, 12.94]    5.01    0.95
        100  Mean        [1.55, 2.34]   0.79    0.92        [9.02, 11.30]    2.27    0.96
        100  Median      [1.59, 2.40]   0.81    0.94        [9.02, 11.22]    2.21    0.95
        100  Hazen       [1.63, 2.45]   0.83    0.96        [9.01, 11.16]    2.15    0.95
XonY    5    Mean        [0.76, 4.79]   4.03    0.96        [5.77, 19.19]    13.42   0.93
        5    Median      [0.89, 5.61]   4.72    0.95        [6.11, 16.96]    10.85   0.90
        5    Hazen       [1.02, 6.42]   5.40    0.95        [6.37, 15.55]    9.17    0.86
        20   Mean        [1.19, 2.98]   1.79    0.95        [7.79, 13.18]    5.40    0.96
        20   Median      [1.28, 3.20]   1.92    0.97        [7.85, 12.83]    4.97    0.94
        20   Hazen       [1.36, 3.40]   2.04    0.96        [7.91, 12.55]    4.64    0.94
        100  Mean        [1.58, 2.39]   0.80    0.95        [8.99, 11.21]    2.21    0.95
        100  Median      [1.62, 2.45]   0.83    0.96        [8.99, 11.14]    2.15    0.95
        100  Hazen       [1.66, 2.50]   0.84    0.95        [8.98, 11.08]    2.10    0.94
ML      5    –           [1.36, 5.65]   4.29    0.87        [6.71, 14.60]    7.89    0.83
        20   –           [1.52, 3.04]   1.52    0.93        [8.02, 12.47]    4.45    0.93
        100  –           [1.74, 2.37]   0.63    0.94        [9.02, 11.06]    2.04    0.95

motivation for the use of the third parameter was found. Therefore, we concentrated on the two-parameter Weibull distribution. Estimation of Weibull parameters is typically based on least squares linear regression (LS) or maximum likelihood (ML) methods. However, linear regression can lead to different results depending on whether Y is regressed on X or alternatively X regressed on Y and depending on how failure probability is estimated (choice of plotting position). We decided to use the Monte Carlo simulation technique to evaluate performance of the different methods. A broad range of true Weibull parameter values and sample sizes was assumed for simulations (Eqs. (11)–(13)). The choice of values for m and s reflects the experience gained with Weibull statistical analysis on several dental material data sets and values found in the literature. Sample sizes ranged from small values frequently found in applications to large values that are frequently suggested by statisticians. The performance of all methods greatly improved with increasing sample size and we strongly recommend relying on large sample sizes (n ≥ 30) for estimation of Weibull parameters. Monte Carlo simulations allowed for determination of bias since the true parameter values were known and could be compared to the expectation, approximated by the average over a large number of simulations. The following main findings have been obtained:

First, relative bias was larger for m̂ than for ŝ as long as the true value of m was in a range typical for dental studies (1 < m < 20).

Second, least biased point estimates for m for small samples (n ≤ 10) were obtained by regression of XonY in combination with mean ranks and for large samples (n ≥ 30) by regression of YonX with hazen ranks (Table 2a). These findings are in agreement with previous studies showing that least biased estimates are obtained with hazen ranks when sample


size is rather large [11,15]. Median ranks are a compromise between the former two, resulting in acceptable estimates over the whole range. Using median ranks, bias tends to be smaller if X is regressed on Y unless the sample size is small. However, since bias is less of a problem for large samples and since YonX results in underestimation rather than overestimation of m, YonX may actually be the better choice in combination with median ranks. It is important to keep in mind that an overestimation of m is a more serious problem than underestimation, especially for dental material investigations. A larger modulus suggests a smaller variability of strength, leading to overconfidence in the performance of the material and to an increased proportion of specimens that break at a certain expected level of stress. Estimates for m increase from mean to median to hazen ranks and from YonX to XonY. This indicates that the least biased method (YonX, hazen for large sample sizes) also has the highest potential for overestimation. Mean ranks with YonX, on the other hand, are very conservative, most likely leading to an underestimation of m. Median ranks together with YonX may be a clever compromise, tending toward underestimation while still providing estimates with a low bias.

Third, least biased estimates for s were obtained by regression of YonX and hazen ranks or XonY and median ranks (Table 2a). However, ŝ depended only weakly on estimation method and reasonable estimates were obtained with all methods tested.

Fourth, ML estimates were usually more biased than the best linear regression based estimate. In particular for small sample sizes (n < 10), ML could result in dramatic overestimation of m by up to 2-fold, which should be avoided for dental materials unless an unbiasing correction is used [5,20].


Table 2 – (a) Methods to obtain least biased point estimates of Weibull modulus m and characteristic strength s for small (n ≤ 10), medium (10 < n < 30) and large (n ≥ 30) samples. (b) Methods to obtain 95%CIM with coverage of m and s closest to 0.95 for each sample size category. (c) Recommended method to estimate both m and s for each sample size category.

n        Weibull modulus m                            Weibull characteristic strength s

(a) Point estimates
Small    XonY/mean                                    YonX,XonY/hazen; XonY/median
Medium   XonY/median                                  YonX,XonY/hazen; XonY/median
Large    YonX/hazen                                   YonX,XonY/hazen; XonY/median

(b) Confidence intervals
Small    XonY/mean,median; YonX/median,hazen          YonX,XonY/mean
Medium   XonY/mean,median,hazen; YonX/median,hazen    YonX,XonY/mean,median
Large    XonY/mean,median,hazen; YonX/median,hazen    YonX,XonY/median,hazen

(c) Suggestions (both m and s)
Small    XonY/mean; YonX/median
Medium   XonY/mean; YonX/median; XonY/median
Large    XonY/median; YonX/hazen

Fifth, the magnitude of the true values of m and s significantly influenced neither the bias of the estimates nor the performance of the estimation methods. For small m the bias of ŝ increased rapidly, whereas the bias of m̂ remained constant. Thus, it became increasingly difficult to estimate s as m gets smaller. This finding makes intuitive sense, as estimation of the location (s) is more difficult for a distribution with a large spread (small m). However and importantly, relative performance of the estimation method is not affected.

Sixth, the coefficient of variation (cv) of m̂ was clearly higher than that of ŝ, indicating that the latter can be estimated more precisely.

Seventh, cv depended only weakly on estimation method, indicating that all methods under investigation have a similar precision. Only ML estimates for m may have a slightly lower cv than linear regression estimates, as expected due to their asymptotic efficiency. However, differences were small and linear regression estimates were surprisingly precise.

These findings suggest that the choice of regression method should depend on sample size, plotting positions, and the parameter to be estimated. Since estimation of m is both more biased and less precise, and since decisions in dental material investigations are often based on m [13], it may be reasonable to adapt the method for estimation of m. Rather conservative methods may be considered to avoid overestimation of m, in particular for small sample sizes. Another possibility is to report the stress level corresponding to a 5% probability of failure, x = s[−log(0.95)]^(1/m). This measure is actually less affected by overestimation of m because of the inverse in the exponent. Findings presented here help to critically evaluate the validity of frequently applied Weibull parameter estimation techniques. One of the first methods to be applied to dental materials was YonX together with mean ranks [12]. Unfortunately, this method is still frequently used [13]. We found that it is not optimal at all and discourage its application in basically all situations. Later, regression of XonY was strongly recommended [7]. However, only median ranks, LS regression and mainly large sample sizes were considered. In such situations, XonY may indeed be beneficial. Our results suggest that this may not hold in general, although median ranks with regression of XonY is among the least biased methods.
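The stress level at a 5% probability of failure follows from inverting the two-parameter Weibull distribution function F(x) = 1 − exp(−(x/s)^m). The short Python sketch below evaluates it for given m and s; the function names are hypothetical and the parameter values in the usage loop are chosen purely for illustration.

```python
import math


def stress_at_failure_probability(m, s, p=0.05):
    """Stress x with failure probability p: x = s * (-log(1 - p)) ** (1 / m)."""
    return s * (-math.log(1.0 - p)) ** (1.0 / m)


def failure_probability(x, m, s):
    """Two-parameter Weibull distribution function F(x) = 1 - exp(-(x/s)^m)."""
    return 1.0 - math.exp(-((x / s) ** m))


# Illustrative values only: how the 5% stress level moves when m changes
for m in (8.0, 10.0, 12.0):
    x = stress_at_failure_probability(m, s=100.0)
    print(m, round(x, 2), round(failure_probability(x, m, 100.0), 3))
```

Substituting the returned stress back into the distribution function recovers p, which is a simple way to check the inversion.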

We strongly suggest not only computation of point estimates but also of the corresponding 95%CI. In particular the Menon approach to derive standard errors (seM) and construct 95%CI was found to perform well. It is based on the large sample distributions of estimators for the smallest extreme value (SEV) parameters [19]. Although an asymptotic justification is missing, we found empirically that the seM can also be applied to the LS estimators YonX and XonY to construct 95%CIM via a Wald-type approach (Appendix B). Interestingly, 95%CIM showed excellent coverage of the true parameter value, even for small sample sizes. 95%CIM for m performed best for XonY regardless of whether mean, median or hazen ranks were used. For median and hazen ranks, they may also be applied together with YonX (Table 2b). 95%CIM for s were less influenced by regression method than by plotting positions and sample size. Undercoverage was observed for small samples (n ≤ 10) when median and especially hazen ranks were used. Therefore, YonX should not be used with mean ranks and hazen ranks should generally not be applied when sample size is small. However, these combinations also performed poorly with respect to bias and should generally be avoided. Depending on the hypothesis of interest, undercoverage may be more problematic than overcoverage, as it leads to 95%CIM that are too narrow and potentially to false positive findings. Overcoverage, on the other hand, may increase false negative findings, i.e. reduce the power of the analysis. Importantly, coverage of 95%CIM was not influenced by the magnitude of the true parameter value for all values tested, motivating their usage for all kinds of parameter values typical for dental studies. The empirical properties of 95%CIM for LS estimators were at least as good as (or even better than) those of the well-established 95%CIML. Together with the strikingly simple way of construction this makes 95%CIM very attractive for practical use.
Therefore, the biggest advantage of the ML over the LS method, the easy way to construct confidence intervals, may be compensated for. Roos and Stawarczyk [6] suggested the use of 95%CILS based on standard errors of the linear regression coefficients. Our findings show that this is definitely not an optimal approach. CILS do not cover the true parameter value in the targeted fraction of cases. Regression based on the linearized Weibull model may not fulfill basic assumptions of linear regression, i.e.


uncorrelated, homoscedastic and normally distributed error terms. In fact, residuals were clearly correlated and hardly normally distributed (data not shown). Therefore, standard errors and hence confidence intervals derived from the linear model are not reliable.

Taking results from point and interval estimation together, some recommendations about which method to use in which situation can be formulated (Table 2c). Recommendations are based on combinations of regression method and plotting positions and are made separately for three sample size categories: small (n ≤ 10), medium (10 < n < 30) and large (n ≥ 30). For small samples, XonY/mean ranks is the least biased estimator for m and results in excellent coverage of m and s for 95%CIM. In addition, m tends to be underestimated rather than overestimated. Bias for s is relatively large but is unproblematic for m in the tested range. An alternative with very similar characteristics is YonX/median ranks. A disadvantage is undercoverage of s for 95%CIM, in particular when sample size is very small (n ≤ 5). For medium sized samples the same two methods, YonX/median ranks and XonY/mean ranks, are recommended. The former is better with respect to bias of m whereas the latter results in 95%CIM with overcoverage rather than undercoverage of m and s. However, differences are very small. In some cases XonY/median ranks may also be considered, as bias for m and s is actually smaller than for the former two approaches and coverage of 95%CIM is excellent. However, XonY/median ranks tends to overestimate rather than underestimate m, which can potentially be dangerous. This tendency clearly decreases with sample size and may not be a problem for n > 20. For large samples, most methods perform well. XonY/median ranks shows very small bias for m and s as well as 95%CIM with good coverage. Coverage of m might be slightly too high, making the CIM more conservative. YonX/hazen ranks may even be better with respect to bias of m and s and coverage for CIM. However, risk of overestimation of m is somewhat increased and a sufficiently large sample size is very important. YonX/hazen ranks is the method of choice for very large samples (n > 40). Finally, if one method had to be chosen for all sample sizes, we would recommend YonX/median ranks since this method has a low bias for m for reasonable sample sizes (n > 5), avoids overestimation of m and allows for application of Menon methodology for construction of confidence intervals.

A clear strength of our study is that the findings are generally applicable for a wide range of underlying Weibull distributions and sample sizes thanks to the use of Monte Carlo simulations. At the same time, as the Monte Carlo method generates only ideal data, our conclusions are limited to cases in which the assumption of an underlying Weibull distribution is correct. In practice, however, the underlying distribution is unknown. There are several competing distributions [6,8] that might approximate the unknown distribution well, with Weibull being only one of several possible choices. Therefore, differences between estimators may actually be more pronounced for data violating the Weibull assumption.

In summary, we found that for small sample sizes the simple LS estimators for the modulus and characteristic strength of a two-parameter Weibull distribution perform surprisingly well with respect to both bias and precision. We verified empirically that adaptation of the method suggested by Menon [19] results in an adequate and easy way to compute 95%CI for Weibull parameters. Following the trend initiated by [6] that the code used should be laid open and be accessible to everyone, we presented mathematical formulae explicitly. What is more, we provided a free and ready-to-use Excel template “Weibull YonX XonY Calculator.xlsx” in which point estimates and 95%CI for both YonX and XonY approaches and mean, median and hazen plotting positions can be easily computed.

5. Conclusion

Within the limitations of this in silico investigation it can be concluded that:

- LS linear regression of YonX and XonY are not equivalent. Accordingly, estimates of characteristic strength (s) and modulus (m) obtained by YonX and XonY are not equal.
- Sample size and plotting positions influence the relative performance of the two LS methods but the magnitude of the true parameter values does not.
- Sample sizes smaller than 30 may result in biased and imprecise estimates of the Weibull parameters and should generally be avoided.
- LS and ML methods are not equivalent. For most settings, LS is less biased for small n while precision is about equal.
- An easy-to-apply Menon-type approach for 95%CI computation has been found to perform excellently for LS estimates for most settings.
- Naïve estimation of 95%CI using standard errors of regression coefficients is not feasible for LS methods.
- All procedures developed and discussed here have been programmed in a free Excel template “Weibull YonX XonY Calculator.xlsx”.

Conflict of interest

The authors report no conflict of interest. The authors alone are responsible for the content and writing of the manuscript.

Appendix A. Excel template “Weibull YonX XonY Calculator.xlsx”

A.1. Equations used in the template

Point estimates: Eq. (4) (YonX), Eq. (6) (XonY)
95%CIM: Eq. (15) (m) and Eq. (16) (s)

A.2. Suggestions to the use of the template

- directly insert strength values in the data column on sheet1,
- copy formulae used in the template for external usage,
- insert m̂, ŝ and n on sheet2 to obtain 95%CIM.


A.3. Example

A sample of size n = 5 with the following strength values 8.456, 9.378, 9.471, 22.750, 9.862 was obtained by Monte Carlo simulation from a two-parameter Weibull distribution with m = 2 and s = 10. Respective point and interval estimates are shown in Table A1.
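The point estimates for this sample can be recomputed outside the template with a few lines of Python. This is a sketch under stated assumptions rather than the template's own formulae: the function names are hypothetical, the linearization Y = log(−log(1 − G_i)) against X = log(x_i) is assumed, and the plotting positions are assumed to be the standard mean ranks i/(n + 1), median ranks (i − 0.3)/(n + 0.4) and hazen ranks (i − 0.5)/n.

```python
import math


def plotting_positions(n, kind="median"):
    """Estimated failure probabilities G_i for ranks i = 1..n (assumed formulas)."""
    if kind == "mean":
        return [i / (n + 1) for i in range(1, n + 1)]
    if kind == "median":
        return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]
    if kind == "hazen":
        return [(i - 0.5) / n for i in range(1, n + 1)]
    raise ValueError(f"unknown plotting position: {kind}")


def ls_weibull(strengths, kind="median", method="YonX"):
    """Return (m_hat, s_hat) from LS regression on the linearized Weibull model."""
    n = len(strengths)
    x = [math.log(v) for v in sorted(strengths)]
    y = [math.log(-math.log(1.0 - g)) for g in plotting_positions(n, kind)]
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    if method == "YonX":
        # Y regressed on X: slope = m, intercept = -m * log(s)
        slope = sxy / sum((a - xbar) ** 2 for a in x)
        intercept = ybar - slope * xbar
        return slope, math.exp(-intercept / slope)
    if method == "XonY":
        # X regressed on Y: slope = 1/m, intercept = log(s)
        slope = sxy / sum((b - ybar) ** 2 for b in y)
        intercept = xbar - slope * ybar
        return 1.0 / slope, math.exp(intercept)
    raise ValueError(f"unknown method: {method}")


sample = [8.456, 9.378, 9.471, 22.750, 9.862]
for method in ("YonX", "XonY"):
    for kind in ("mean", "median", "hazen"):
        m_hat, s_hat = ls_weibull(sample, kind, method)
        print(f"{method:5s} {kind:7s} m_hat={m_hat:.3f} s_hat={s_hat:.3f}")
```

The printed values can be compared with the m̂ and ŝ rows of Table A1.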

Appendix B. Menon approach for 95%CI

B.1. Mathematical derivation of 95%CIM

The Menon approach is based on asymptotics for the smallest extreme value (SEV) distribution. The SEV distribution relates to the Weibull distribution in a similar way as the normal to the lognormal distribution. If a random variable X is Weibull distributed then log(X) is SEV distributed. Menon [19] suggested estimators and respective large sample distributions for the SEV parameters that can be used to construct CI. Due to the close relationship of SEV and Weibull distribution, Menon standard errors (se) and confidence intervals (CI) can be applied to the Weibull parameters m and s after appropriate transformation. The pdf of the SEV distribution is:

$$f(x) = \frac{1}{\delta}\exp\left(\frac{x-\xi}{\delta}\right)\exp\left[-\exp\left(\frac{x-\xi}{\delta}\right)\right]$$

Note that SEV parameters δ and ξ are closely related to the Weibull parameters m and s:

$$\delta = \frac{1}{m} \quad\text{and}\quad \xi = \log(s).$$

Consider estimators for the SEV parameters according to Menon [19]:

$$\hat{\delta} = \left[\frac{6}{\pi^{2}}\,\frac{1}{n-1}\left(\sum_{i=1}^{n}\log(X_i)^{2} - \frac{\left(\sum_{i=1}^{n}\log(X_i)\right)^{2}}{n}\right)\right]^{1/2} = \frac{\sqrt{6}}{\pi}\,\mathrm{sd}(\log(X))$$

and

$$\hat{\xi} = \frac{\sum_{i=1}^{n}\log(X_i)}{n} + 0.5772\,\hat{\delta} = \mathrm{mean}(\log(X)) + 0.5772\,\hat{\delta}$$

Corresponding large-sample distributions are:

$$\hat{\delta} \overset{asy}{\sim} N\left(\delta, \frac{1.1\,\delta^{2}}{n}\right) \quad\text{and}\quad \hat{\xi} \overset{asy}{\sim} N\left(\xi, \frac{1.168\,\delta^{2}}{n}\right)$$

A Menon estimator for m, called m̂_M, and its asymptotic distribution are obtained by the relationship to δ:

$$\hat{m}_M \overset{asy}{\sim} N\left(m, \frac{1.1\,m^{2}}{n}\right)$$

With the variance stabilizing log-transformation, Menon standard errors (seM) for log(m̂) and Menon 95% confidence intervals (CIM) for Weibull modulus m can be obtained:

$$\mathrm{se}_M(\log(\hat{m})) = \frac{1}{\hat{m}}\,\mathrm{se}_M(\hat{m}) = \frac{1}{\hat{m}}\sqrt{\frac{1.1\,\hat{m}^{2}}{n}} = \sqrt{\frac{1.1}{n}}$$

$$95\%\,\mathrm{CI}_M(m) = \exp\left[\log(\hat{m}) \pm z_{0.975}\sqrt{\frac{1.1}{n}}\right] = \left[\frac{\hat{m}}{\exp\left(z_{0.975}\sqrt{1.1/n}\right)},\; \hat{m}\exp\left(z_{0.975}\sqrt{1.1/n}\right)\right]$$

Note that the seM of log(m̂) does not depend on m̂. A Menon estimator for s is given by ŝ_M = exp(ξ̂), leading to the following 95%CIM for Weibull strength s:

$$\mathrm{se}_M(\log(\hat{s})) = \mathrm{se}_M(\hat{\xi}) = \sqrt{\frac{1.168}{n}}\,\hat{\delta} = \sqrt{\frac{1.168}{n}}\,\frac{1}{\hat{m}}$$

$$95\%\,\mathrm{CI}_M(s) = \exp\left[95\%\,\mathrm{CI}_M(\xi)\right] = \exp\left[\hat{\xi} \pm z_{0.975}\sqrt{\frac{1.168}{n}}\,\hat{\delta}\right] = \exp\left[\log(\hat{s}) \pm z_{0.975}\sqrt{\frac{1.168}{n}}\,\frac{1}{\hat{m}}\right]$$

$$= \left[\frac{\hat{s}}{\exp\left(z_{0.975}\sqrt{1.168/n}\,(1/\hat{m})\right)},\; \hat{s}\exp\left(z_{0.975}\sqrt{1.168/n}\,(1/\hat{m})\right)\right]$$

Strictly speaking, the 95%CIM derived above hold theoretically only for the Menon estimators of modulus and characteristic strength, m̂_M and ŝ_M, as the asymptotic distributions have only been formulated for the corresponding Menon estimators δ̂ and ξ̂.
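The intervals derived in this section reduce to two closed-form expressions that are easy to evaluate. The Python below is a minimal sketch of that computation; the function name is hypothetical, and the inputs in the usage line are the rounded YonX/mean ranks estimates for the example sample in Table A1, so small rounding differences from the tabulated limits are to be expected.

```python
import math
from statistics import NormalDist


def menon_ci(m_hat, s_hat, n, level=0.95):
    """Menon-type CI for m and s, built on the log scale and back-transformed."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # z_0.975 for level = 0.95
    se_log_m = math.sqrt(1.1 / n)                # does not depend on m_hat
    se_log_s = math.sqrt(1.168 / n) / m_hat
    factor_m = math.exp(z * se_log_m)
    factor_s = math.exp(z * se_log_s)
    ci_m = (m_hat / factor_m, m_hat * factor_m)
    ci_s = (s_hat / factor_s, s_hat * factor_s)
    return ci_m, ci_s


ci_m, ci_s = menon_ci(m_hat=1.646, s_hat=14.668, n=5)
print("95% CI for m:", ci_m)
print("95% CI for s:", ci_s)
```

Because the interval is constructed on the log scale, both limits are positive and the interval is not symmetric around the point estimate.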

B.2. Empirical verification of the Menon procedure for calculation of 95%CIM for LS estimators

An empirical approach was used to test whether CIM can also be applied to LS estimators. CIM are based on standard errors calculated with the Menon procedures (seM). The idea is to show empirically that these seM are good estimates of the standard deviation for Weibull parameters estimated not only by Menon but also by LS methods. Therefore, the standard deviation computed over a large number of samples (an empirical estimate of the standard error) was compared to the averaged seM. N = 1000 large samples (of size n = 100) were simulated from a Weibull distribution and Weibull parameters were estimated using LS or the Menon estimators (m̂_M = 1/δ̂ and ŝ_M = exp(ξ̂)). The Menon estimators were used as a control as the CIM are known to be valid for those estimators [19]. Standard


Table A1 – Point and interval estimates calculated with the free Excel template “Weibull YonX XonY Calculator.xlsx” for a random sample of size n = 5 generated from a Weibull distribution with modulus m = 2 and characteristic strength s = 10. YonX

Slope Intercept ˆ m ˆ − m| |m ˆ se(log(m)) 95%CIM lower 95%CIM upper sˆ |ˆs − s| se(log(ˆs)) 95%CIM lower 95%CIM upper

Mean

Median

1.646 −4.421 1.646 0.354 0.469 0.656 4.128 14.668 4.668 0.294 8.250 26.078

1.911 −5.091 1.911 0.089 0.469 0.762 4.793 14.350 4.350 0.253 8.742 23.556

ˆ and log(ˆs) were computed based on the errors for log(m) asymptotic procedure from Menon (seM ) for each sample and averaged over the 1000 samples. 1 ˆ i )) = mean(seM (log(m ˆ i ))) seM (log(m N N

ˆ = seM (log(m))



i=1

= mean

1.1 n





=

1.1 n

1 seM (log(ˆsi )) = mean(seM (log(ˆsi ))) N

XonY Hazen

Mean

Median

Hazen

2.164 −5.734 2.164 0.164 0.469 0.863 5.426 14.152 4.152 0.223 9.135 21.924

0.344 2.565 2.907 0.907 0.469 1.159 7.290 12.997 2.997 0.166 9.383 18.004

0.291 2.550 3.431 1.431 0.469 1.368 8.602 12.807 2.807 0.141 9.717 16.880

0.253 2.540 3.948 1.948 0.469 1.575 9.900 12.681 2.681 0.122 9.976 16.119

ˆ does only depend on sample size and Note that seM (log(m)) ˆ not on m. Empirical standard errors (seemp ) were obtained by calcuˆ and log(ˆs) lating standard deviations over N samples of log(m) estimated by LS or Menon procedures.

! ! ˆ =" seemp (log(m))

N

seM (log(ˆs)) =

i=1

= mean



1.168 1 n m ˆi



1 N−1

N 

2

ˆ i ) − mean(log(m ˆ i ))] = sd(log(m ˆ i )) [log(m

i=1

! !

seemp (log(ˆs)) = "

1 N−1

N 

2

[log(ˆsi ) − mean(log(ˆsi ))] = sd(log(ˆsi ))

i=1
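The Menon standard errors and the 95%CI$_M$ limits listed in Table A1 can be reproduced in a few lines. The following is a minimal sketch, not the authors' Excel template. It assumes $se_M(\log(\hat{m})) = \sqrt{1.1/n}$ and $se_M(\log(\hat{s})) = \sqrt{1.168/n}/\hat{m}$, and it assumes that the limits are computed on the log scale as $\exp(\log(\text{estimate}) \pm z \cdot se)$. That interval form, and the conversion from slope and intercept to $\hat{m}$ and $\hat{s}$, are inferred from the values in Table A1 rather than quoted from this appendix. All function names are illustrative.

```python
import math
from statistics import NormalDist

# Two-sided 95% normal quantile (about 1.96).
Z_975 = NormalDist().inv_cdf(0.975)


def menon_se_log_m(n):
    """Menon standard error of log(m_hat); depends only on sample size n."""
    return math.sqrt(1.1 / n)


def menon_se_log_s(n, m_hat):
    """Menon standard error of log(s_hat); depends on n and on m_hat."""
    return math.sqrt(1.168 / n) / m_hat


def log_scale_ci(estimate, se_log, z=Z_975):
    """Interval computed on the log scale and back-transformed (assumed form)."""
    half_width = z * se_log
    return estimate * math.exp(-half_width), estimate * math.exp(half_width)


def from_yonx(slope, intercept):
    """YonX regression: slope = m, intercept = -m * log(s) (inferred from Table A1)."""
    return slope, math.exp(-intercept / slope)


def from_xony(slope, intercept):
    """XonY regression: slope = 1/m, intercept = log(s) (inferred from Table A1)."""
    return 1.0 / slope, math.exp(intercept)


if __name__ == "__main__":
    n = 5
    # YonX with mean ranks, first column of Table A1.
    m_hat, s_hat = from_yonx(1.646, -4.421)
    print("m_hat, CI:", m_hat, log_scale_ci(m_hat, menon_se_log_m(n)))
    print("s_hat, CI:", s_hat, log_scale_ci(s_hat, menon_se_log_s(n, m_hat)))
```

Because the slope and intercept in Table A1 are rounded to three decimals, values recomputed from them agree with the tabulated $\hat{s}$ and its limits only to about two decimals.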

Fig. A1 – Ratio of Menon standard errors ($se_M$) and empirical standard errors ($se_{emp}$) for $\log(\hat{m})$ and $\log(\hat{s})$ determined by YonX, XonY and, as a control, Menon estimators (+, dotted line). 100 repetitions with corresponding smoothing lines are shown. Standard errors are based on 1,000 samples of size n = 100 simulated from a Weibull distribution with m = 2 and s = 10.


Finally, the ratios of $se_M$ to $se_{emp}$ for $\log(\hat{m})$ and $\log(\hat{s})$ were computed for the different estimation methods. Menon estimators served as a control, for which $se_M$ and $se_{emp}$ should not differ. Applying CI$_M$ to LS estimators is justified empirically if $se_M$ is also valid for LS estimates, i.e. if it does not deviate substantially from $se_{emp}$.

Simulations from a Weibull distribution with fixed parameters m = 2 and s = 10 were performed. The ratios of $se_M$ to $se_{emp}$ for the different estimation methods are shown in Fig. A1 for 100 repetitions, together with the corresponding smoothing lines. As expected, $se_M$ for the Menon estimators of $\log(\hat{m})$ and $\log(\hat{s})$ were in close agreement with $se_{emp}$, confirming that the asymptotic distribution described by Menon [19] holds. Moreover, $se_M$ based on YonX and XonY linear regression estimates were almost as close to $se_{emp}$ as those based on the Menon estimates. For the Weibull modulus, $se_M$ based on YonX were even better estimates of the standard deviation than $se_M$ based on the Menon estimator. Deviations between LS and Menon estimators tended to be largest for mean ranks (in particular for estimation of s) and almost disappeared for Hazen ranks.

These results indicate that $se_M$, and hence CI$_M$, work for both YonX and XonY LS estimators just as well as for Menon estimators. CI$_M$ were therefore applied to LS methods for 95%CI computation.
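The comparison described above can be sketched for one case: the YonX estimator of the modulus with Hazen plotting positions. This is a minimal illustration under stated assumptions, not the authors' simulation code. It assumes Hazen positions $F_i = (i - 0.5)/n$, a YonX regression of $Y = \log(-\log(1 - F_i))$ on $X = \log(x_{(i)})$ whose slope estimates m, and $se_M(\log(\hat{m})) = \sqrt{1.1/n}$. The function names, the seed and the use of Python's standard library random generator are choices made for this example.

```python
import math
import random
import statistics


def yonx_hazen_modulus(sample):
    """Estimate m as the slope of Y = log(-log(1 - F)) regressed on X = log(x)."""
    n = len(sample)
    xs = [math.log(v) for v in sorted(sample)]
    # Hazen plotting positions F_i = (i - 0.5) / n for the ordered observations.
    ys = [math.log(-math.log(1.0 - (i - 0.5) / n)) for i in range(1, n + 1)]
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def se_ratio(m=2.0, s=10.0, n=100, n_samples=1000, seed=1):
    """Ratio of the Menon standard error to the empirical one for log(m_hat)."""
    rng = random.Random(seed)
    log_m_hats = []
    for _ in range(n_samples):
        # random.weibullvariate takes (scale, shape), i.e. (s, m).
        sample = [rng.weibullvariate(s, m) for _ in range(n)]
        log_m_hats.append(math.log(yonx_hazen_modulus(sample)))
    se_emp = statistics.stdev(log_m_hats)  # sd over samples, N - 1 denominator
    se_menon = math.sqrt(1.1 / n)          # identical for every sample
    return se_menon / se_emp


if __name__ == "__main__":
    print("se_M / se_emp for log(m_hat), YonX with Hazen ranks:", se_ratio())
```

A ratio near 1 corresponds to the agreement reported for Fig. A1. Repeating the call with different seeds gives the spread over repetitions that the figure displays.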

Appendix C. Supplementary data

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.dental.2014.11.014.

References

[1] Rinke S, Fischer C. Range of indications for translucent zirconia modifications: clinical and technical aspects. Quintessence Int 2013;44:557–66.
[2] Piconi C, Maccauro G. Zirconia as a ceramic biomaterial. Biomaterials 1999;20:1–25.
[3] Weibull W. A statistical distribution function of wide applicability. J Appl Mech 1951;18:293–7.
[4] Hallinan Jr A. A review of the Weibull distribution. J Qual Technol 1993;25:85–93.
[5] Quinn JB, Quinn GD. A practical and systematic review of Weibull statistics for reporting strengths of dental materials. Dent Mater 2010;26:135–47.
[6] Roos M, Stawarczyk B. Evaluation of bond strength of resin cements using different general-purpose statistical software packages for two-parameter Weibull statistics. Dent Mater 2012;28:76–88.
[7] Abernethy R. The new Weibull handbook. 5th ed. North Palm Beach: Dr. Robert B. Abernethy; 2009.
[8] Rinne H. The Weibull distribution: a handbook. Chapman & Hall/CRC; 2009.
[9] R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2010. ISBN 3-900051-07-0. http://www.R-project.org
[10] Therneau T. A package for survival analysis in S. R package version 2.37-7; 2014. http://CRAN.R-project.org/package=survival
[11] Khalili A, Kromp K. Statistical properties of Weibull estimators. J Mater Sci 1991;26:6741–52.
[12] McCabe JF, Carrick TE. A statistical approach to the mechanical testing of dental materials. Dent Mater 1986;2:139–42.
[13] Ilie N, Bauer H, Draenert M, Hickel R. Resin-based composite light-cured properties assessed by laboratory standards and simulated clinical conditions. Oper Dent 2013;38:159–67.
[14] Bergman B. On the estimation of the Weibull modulus. J Mater Sci Lett 1984;3:689–92.
[15] Trustrum K, Jayatilaka S. On estimating the Weibull modulus for a brittle material. J Mater Sci 1979;14:1080–4.
[16] Altman DG, Machin D, Bryant TN, Gardner MJ. Statistics with Confidence. 2nd ed. BMJ Books; 2000.
[17] Minitab 16 statistical software. Minitab Inc.: State College, PA; 2010.
[18] Held L. Methoden der statistischen Inferenz. Heidelberg: Spektrum Akademischer Verlag; 2008.
[19] Menon M. Estimation of the shape and scale parameters of the Weibull distribution. Technometrics 1963;5:175–82.
[20] Steen M, Sinnema S, Bressers J. Statistical analysis of bend strength data according to different evaluation methods. J Eur Ceram Soc 1992;9:437–45.
