Accident Analysis and Prevention 74 (2015) 162–168


Motor vehicle drivers’ injuries in train–motor vehicle crashes

Shanshan Zhao, Aemal Khattak *

Department of Civil Engineering and Nebraska Transportation Center, University of Nebraska–Lincoln, 330 Whittier Research Center, Lincoln, NE 68583-0851, United States

ARTICLE INFO

Article history: Received 30 June 2014; received in revised form 23 October 2014; accepted 24 October 2014; available online 14 November 2014.

Keywords: Highway–rail grade crossings; Trains; Motor vehicles; Injury severity; Crashes

ABSTRACT

The objectives of this research were to: (1) identify a more suitable model for modeling injury severity of motor vehicle drivers involved in train–motor vehicle crashes at highway–rail grade crossings from among three commonly used injury severity models and (2) investigate factors associated with injury severity levels of motor vehicle drivers involved in train–motor vehicle crashes at such crossings. The 2009–2013 highway–rail grade crossing crash data and the national highway–rail crossing inventory data were combined to produce the analysis dataset. Four-year (2009–2012) data were used for model estimation while 2013 data were used for model validation. The three injury severity levels (fatal, injury and no injury) were based on the reported intensity of motor vehicle drivers’ injuries at highway–rail grade crossings. The three injury severity models evaluated were: ordered probit, multinomial logit and random parameter logit. A comparison of the three models based on different criteria showed that the random parameter logit model and multinomial logit model were more suitable for injury severity analysis of motor vehicle drivers involved in crashes at highway–rail grade crossings. Some of the factors that increased the likelihood of more severe crashes included higher train and vehicle speeds, freight trains, older drivers, and female drivers. Where feasible, reducing train and motor vehicle speeds and providing nighttime lighting may help reduce injury severities of motor vehicle drivers. © 2014 Elsevier Ltd. All rights reserved.

* Corresponding author. Tel.: +1 402 472 8126. E-mail address: [email protected] (A. Khattak).
http://dx.doi.org/10.1016/j.aap.2014.10.022
0001-4575/© 2014 Elsevier Ltd. All rights reserved.

1. Introduction

The objectives of this research were: (1) to identify a more suitable model for injury severity of drivers involved in train–motor vehicle crashes at highway–rail grade crossings (HRGCs) from among three commonly used injury severity models and (2) to investigate factors associated with injury severity levels sustained by motor vehicle drivers involved in train–motor vehicle crashes at HRGCs. This study used train–motor vehicle crashes reported at HRGCs over the most recent five years (2009–2013), which were matched to the national highway–rail crossing inventory data using the US Department of Transportation (USDOT) crossing identification numbers. Both the HRGC crash data and the national inventory were obtained from the Federal Railroad Administration Office of Safety Analysis (2014) (FRA, http://www.safetydata.fra.dot.gov). The FRA crash reporting form F6180.57 reports motor vehicle drivers’ crash injury severity in three categories: fatal, injury and no injury (i.e., property damage only). Data from 2009 to 2012 were used for model estimation while 2013 data were used for model validation. For the four-year model estimation dataset, the matching procedure produced 5641 train–motor vehicle crashes with complete variable information, comprising 475 (8.5%) motor vehicle driver fatal crashes, 1610 (28.5%) motor vehicle driver injury crashes and 3556 (63.0%) crashes with no injuries. For the validation dataset, there were 1233 train–motor vehicle crashes that included 92 (7.5%) motor vehicle driver fatal crashes, 383 (31.0%) motor vehicle driver injury crashes and 758 (61.5%) no injury crashes. Thus, the percentages of crashes by severity level were fairly consistent between the two datasets.

Accidents at HRGCs almost invariably involve a single train and a single motor vehicle; a check of the 2009–2013 data revealed only one accident involving two motor vehicles. In that case a motor vehicle was unable to stop for a passing train and struck the side of the lead engine, which resulted in this vehicle striking another motor vehicle. The report included the injury of the driver of the motor vehicle that struck the train. All other accidents in the datasets involved a single train and a single motor vehicle.

S. Zhao, A. Khattak / Accident Analysis and Prevention 74 (2015) 162–168

Three commonly used models were considered for motor vehicle driver injury severity in crashes reported at HRGCs: an ordered probit (OP) model for the ordered response framework, and a multinomial logit (MNL) model and a random parameter logit (RPL) model for the unordered response framework. The nested logit model structure was also investigated but abandoned in favor of the MNL and is therefore not discussed in this paper. The three models were compared to each other across four aspects: number of statistically significant parameters included in the model specification, models’ interpretative power, goodness-of-fit, and classification accuracy. The more suitable model among the three was then used to investigate factors associated with motor vehicle driver injury severity in train-involved crashes at HRGCs.

The remainder of the paper is organized as follows. After this introduction, a review of previous studies on injury severity models, model comparisons and train–motor vehicle crash injury severities is provided. Section 3 presents a brief modeling background. Section 4 presents modeling results and comparisons among the models. Section 5 provides a summary and conclusions.

2. Literature review

This section presents a review of previous studies on injury severity models, train–motor vehicle crash injury severities and different modeling comparisons reported in the reviewed literature.

2.1. Modeling crash injury severity

For dependent variables with multiple response outcomes, models are generally classified as either nominal (ignoring the ordinal nature of injury data) or ordinal (considering the ordinal nature of injury data). Among the nominal models, the multinomial logit (MNL), nested logit (NL), and random parameter logit (RPL, also referred to as the mixed logit model) are commonly employed. Commonly employed ordinal models include the ordered logit (OL), ordered probit (OP) and generalized ordered logit (GOL) models.

For a detailed review, interested readers are referred to Savolainen et al. (2011) and Mannering and Bhat (2014). Savolainen et al. (2011) summarized the evolution of research pertaining to statistical analysis of motor vehicle crash injury severities. They discussed the strengths and weaknesses of different approaches and listed relevant research for each approach. The approaches discussed included binary outcome models, ordered discrete outcome models, unordered multinomial discrete outcome models and non-parametric models such as artificial neural networks (ANN) and classification and regression tree (CART) approaches. Mannering and Bhat (2014) reviewed the evolution of methodological applications and available data in highway-accident research and included an update of the literature.

2.2. Factors associated with train–motor vehicle crash injury severity

Although published research is available on crash injury severity analysis, the majority of it concerns crashes reported on highway segments or at intersections rather than the injury severity of drivers involved in train–motor vehicle crashes at HRGCs. The research that did focus on HRGCs employed MNL and OL or OP models. Hu et al. (2010) formulated a generalized logit model using data from 592 highway–railway crossings in Taiwan. Railway, highway, crossing, traffic control and land use features were considered in their research. Results showed that an increase in the number of daily trains and daily trucks increased the likelihood of more severe crash injuries. The presence of highway separation and obstacle detection devices was also associated with more severe crash injuries.


Eluru et al. (2012) developed a latent segmentation based OL model using FRA crash data from 1997–2006. The crossings were first assigned probabilistically to different segments based on their attributes. Attributes such as a higher number of trains, existence of pavement markings for stop signs, and lower maximum posted train speed limits were associated with low-risk crossing segments. Within each segment, an OL model was applied to analyze crash-related attributes. Comparison of the results across different segments showed different variables associated with crash injury severities. Hao and Daniel (2013) used FRA crash data from 2002 to 2011 and an OP model to determine factors influencing drivers’ injury severity levels at HRGCs. Factors found to be related to higher injury severities included: crashes reported during peak-hour traffic, adverse weather (e.g., cloudy, rain, fog, sleet and snow), low visibility, vehicular speed greater than 50 mph, highway average annual daily traffic (AADT) of over 10,000, train speed greater than 50 mph, trucks and truck-trailers, and crashes reported in open areas. Russo and Savolainen (2013), using FRA HRGC crash data from 2011, assessed the effects of rail, highway, traffic and driver characteristics on the frequency and severity of HRGC collisions. Injury severity was investigated using an OL model. Factors that increased the likelihood of fatal injuries included train speeds greater than 60 mph, driver age over 60 years, female drivers and motorists who did not stop at crossings. Fan and Haile (2014) used 2005–2012 FRA HRGC crash data and an MNL model to explore the impacts of various explanatory variables on crash injury severity levels. Results showed that chances of fatalities increased when rail equipment traveling at high speed struck a vehicle and when crashes were reported at higher air temperatures.
Male drivers aged 25 years and above, pickup trucks, and concrete and rubber crossing surfaces were associated with more severe crash injuries, while truck-trailers, foggy and snowy weather conditions, certain land development types and higher daily vehicle traffic volumes were associated with less severe crash injuries.

2.3. Comparison of models

Abdel-Aty (2003), Abdel-Aty and Abdelwahab (2004), and Haleem and Abdel-Aty (2010) reported comparisons related to injury severity estimation of accidents reported on roads or at intersections. Abdel-Aty (2003) developed OP models for roadway sections, signalized intersections, and toll plazas in Central Florida. Besides comparing factors that affect injury severities at different locations, comparisons between OP, MNL and NL models were reported. Results showed that the OP approach was simple and produced better results than the MNL approach in terms of the number of variables entering the model specification and the models’ goodness of fit. The NL method was more complicated than the other two and did not produce better results than the OP model. In Abdel-Aty and Abdelwahab (2004), a comparison between two artificial neural network (ANN) paradigms and the OP model, using the test for the difference in two proportions, showed a more accurate prediction capability and a lower misclassification rate for the ANN than for the OP method. Subsequent research (Haleem and Abdel-Aty, 2010) on injury severity at unsignalized intersections showed that the aggregate binary probit model offered superior goodness-of-fit compared to the disaggregated OP and NL models.

Yasmin and Eluru (2013) compared ordered response and unordered response models in the context of driver injury severity in traffic crashes. The alternative modeling approaches were the OL and GOL models for the ordered response framework and the MNL, NL and ordered generalized extreme value logit (OGEV) models for the unordered response framework. They reported that the NL and OGEV models collapsed to the MNL model. The GOL model had clear superiority in terms of data fit compared to the OL and MNL models. The criteria used to compare the performance of different models included the Bayesian Information Criterion (BIC), the corrected Akaike Information Criterion (AICc) and Ben-Akiva and Lerman’s adjusted likelihood ratio (BL) test. To identify the appropriate ordered response structure for modeling pedestrian injury severity, Yasmin et al. (2014) compared three alternative ordered response models: the OL model, the GOL model and the latent segmentation based ordered logit (LSOL) model. Their results showed that the LSOL model was better than the other two for examining pedestrian injury severity. The comparison criteria were similar to those of Yasmin and Eluru (2013).

Ye and Lord (2014) examined the effects of sample size on three commonly used crash severity models (the MNL, OP and RPL models) using a Monte-Carlo analysis based on simulated and observed data. Criteria used for comparison of the three models included the mean of absolute-percentage-bias (APB), maximum APB and total root-mean-square-error (RMSE). The authors concluded that the RPL model had better interpretive power than the MNL model, while the latter had superior interpretive power compared to the OP model. The OP model had a slightly better goodness-of-fit than the MNL and RPL models, but the RPL model had a better fit than the MNL model. Sample size requirements increased from the OP to the MNL to the RPL model.

2.4. Literature summary

In summary, nominal and ordinal models are available for application to HRGC crashes. Each type of model has its own strengths and weaknesses, while some are improved versions of others. Comparisons between ordinal and nominal models, and between fixed parameter and random parameter models, for motor vehicle crash severities sustained at HRGCs were not found in the reviewed literature.
HRGCs have unique characteristics compared to ordinary roadway segments and intersections. Therefore, there is a need to compare commonly used injury severity models for crashes at HRGCs. As mentioned before, the models chosen in this research included the OP model for the ordered response framework and the MNL and RPL models for the unordered response framework. These models are typical examples of ordinal, nominal and random parameter models, respectively. Section 3 briefly describes the modeling background; readers familiar with the models may go to Section 4.

3. Modeling background

This section briefly presents the background for the MNL, RPL, and OP models. The unordered response framework of the MNL and RPL implies that the models treat the dependent variable as a set of discrete outcomes and neglect the ordering in injury severities. The ordered response framework of the OP model takes into account the information reflected by the ordering of the outcomes.

3.1. Multinomial logit model (MNL)

The MNL is a special case of a general model of utility maximization. In the context of injury severity, assume driver i experiences an injury severity level of j. The severity propensity function for the outcome is:

$$U_{ij} = \alpha_j + \beta_j X_{ij} + \varepsilon_{ij} \tag{1}$$

where $U_{ij}$ is a function of covariates that determines the severity, $\alpha_j$ is a constant parameter for injury severity level j, $\beta_j$ is a vector of coefficients to be estimated for severity level j, $X_{ij}$ is a vector of independent variable values for driver i and injury severity level j, and $\varepsilon_{ij}$ is a random error term; the errors are assumed to be independently and identically distributed with a type 1 extreme value distribution. Based on the above specification, let $P_i(j)$ be the probability of driver i experiencing injury severity level j; the MNL probability is then expressed as:

$$P_i(j) = \frac{\exp\left(\alpha_j + \beta_j X_{ij}\right)}{\sum_{j=1}^{J} \exp\left(\alpha_j + \beta_j X_{ij}\right)} \tag{2}$$

3.2. Random parameter logit model (RPL)

RPL probabilities are the integrals of standard logit probabilities over a density of parameters. The RPL shares the same structure of injury severity propensity function as the MNL. The probability of driver i experiencing crash injury severity level j is expressed as:

$$P_i(j) = \int \left( \frac{\exp\left(\alpha_j + \beta_j X_{ij}\right)}{\sum_{j} \exp\left(\alpha_j + \beta_j X_{ij}\right)} \right) f(\beta \mid \theta)\, d\beta \tag{3}$$

where $f(\beta \mid \theta)$ is a density function with $\beta$ and $\theta$ referring to the mean and variance of the density function. The RPL is a mixture of the logit function evaluated at different $\beta$’s with $f(\beta \mid \theta)$ as the mixing distribution; $f(\beta \mid \theta)$ is generally specified as continuous. The normal or any other distribution can be used as the density function for $\beta$. The RPL becomes the standard MNL when $f(\beta)$ is degenerate at fixed parameters $b$, that is, $f(\beta \mid \theta) = 1$ for $\beta = b$ and $f(\beta) = 0$ for $\beta \neq b$.

3.3. Ordered probit model (OP)

The basic ordered choice model uses a latent regression to determine injury severity levels experienced by a driver:

$$y_i^{*} = \beta' X_i + \varepsilon_i \tag{4}$$

where $\varepsilon_i$ is a random error term following the standard normal distribution in the OP model, $X_i$ is a vector of explanatory variables for crash i, and $\beta'$ is a vector of the coefficients for the explanatory variables. The value of the dependent variable $y_i$ is then determined by the following:

$$y_i = \begin{cases} 0 & \text{if } y_i^{*} \le \mu_0 \\ 1 & \text{if } \mu_0 < y_i^{*} \le \mu_1 \\ 2 & \text{if } \mu_1 < y_i^{*} \le \mu_2 \\ \ldots \end{cases} \tag{5}$$

where $\mu_0, \mu_1, \mu_2, \ldots$ are threshold values for all injury severity categories. The probability functions are:

$$\begin{aligned} P_n(1) &= \Phi\left(\alpha_1 - \beta_j X_n\right) \\ P_n(j) &= \Phi\left(\alpha_j - \beta_j X_n\right) - \Phi\left(\alpha_{j-1} - \beta_j X_n\right), \quad j = 2, \ldots, J-1 \\ P_n(J) &= 1 - \sum_{j=1}^{J-1} P_n(j) \end{aligned} \tag{6}$$

where F is the cumulative standard normal distribution function. A limitation of the ordered probit is the difficulty with interpretation of intermediate categories as it is not clear what effect a positive or negative estimated parameter has on the probabilities of the interior categories. To obtain a sense of the direction of the effects on interior categories marginal effects are computed for each category (Washington et al., 2011).
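To make the relationship between Eqs. (2) and (3) concrete, the following is a minimal Python sketch, not the estimation code used in the study. It evaluates the MNL probability directly and approximates the RPL integral by averaging logit probabilities over quasi-random normal draws of the coefficient. The function names, the single-covariate setup, the base-2 Halton sequence and all numeric values are assumptions made for illustration only.

```python
import math
from statistics import NormalDist


def mnl_probabilities(utilities):
    """Eq. (2): logit probabilities from the systematic utilities alpha_j + beta_j * x."""
    shift = max(utilities)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(u - shift) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def halton(index, base=2):
    """Return the index-th (1-based) point of the Halton sequence, strictly inside (0, 1)."""
    value, fraction = 0.0, 1.0 / base
    while index > 0:
        value += fraction * (index % base)
        index //= base
        fraction /= base
    return value


def rpl_probabilities(alphas, beta_means, beta_sds, x, n_draws=200):
    """Eq. (3) by simulation: average the logit probabilities over normal draws of beta."""
    inv_cdf = NormalDist().inv_cdf
    totals = [0.0] * len(alphas)
    for r in range(1, n_draws + 1):
        z = inv_cdf(halton(r))  # quasi-random standard normal draw
        utilities = [a + (b + s * z) * x
                     for a, b, s in zip(alphas, beta_means, beta_sds)]
        for j, p in enumerate(mnl_probabilities(utilities)):
            totals[j] += p
    return [t / n_draws for t in totals]


# Hypothetical values for illustration only (not estimates from the paper).
# Outcome order: no injury (baseline), injury, fatal.
alphas = [0.0, -1.0, -2.5]
beta_means = [0.0, 0.02, 0.04]
beta_sds = [0.0, 0.0, 0.03]   # only the fatal-level coefficient is random here
x = 30.0                      # one hypothetical covariate, e.g. a speed in mph

p_mnl = mnl_probabilities([a + b * x for a, b in zip(alphas, beta_means)])
p_rpl = rpl_probabilities(alphas, beta_means, beta_sds, x)
print("MNL:", p_mnl)
print("RPL:", p_rpl)
```

Setting every standard deviation to zero makes each draw identical, so the simulated RPL probabilities collapse to the MNL probabilities, which mirrors the degenerate-density case described in Section 3.2. A model with several random parameters would need independent draws for each (for example, Halton sequences in different prime bases); the sketch shares one draw for simplicity.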


4. Model estimation results and discussion

4.1. Modeling results

The variable representing motor vehicle drivers’ injury severity levels was named “severity” and used as the dependent variable in the analysis. Table 1 presents a summary of the 10 independent variables used in the modeling effort, based on the 2009–2012 estimation dataset. Selection of the independent variables for inclusion in a model was based on guidance from previous research findings and/or statistical results from the modeling process. NLOGIT version 4.0 (Greene, 2007) was used for model estimation. Various independent variables were tried in the model specifications and those not showing statistical significance at the 5% level were excluded.

In the OP model, the first cutoff point (the cutoff point between no-injury and injury) was set to zero as an identification constraint, which is standard practice for discrete outcome models. The second cutoff point between injury and fatality, μ(1), was estimated as 1.282 (see Table 2). Positive parameter estimates in the OP model indicate an increasing likelihood of the most severe injury outcome, or more specifically of a fatal injury outcome, while negative estimates indicate an increasing likelihood of a no-injury outcome.

In the MNL model, the category of no injury was set as the baseline category. Negative parameter estimates indicate a decreased likelihood of a particular crash injury severity compared to the no injury category. During the MNL model estimation, the coefficients for a particular variable appearing in the severity propensity functions of different injury severity levels were allowed to differ from each other unless a likelihood ratio test showed that they were not statistically significantly different at the 5% significance level (in which case the coefficients were set to be the same across the severity propensity functions).
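As an illustration of how the two OP cutoff points translate into outcome probabilities, the short Python sketch below uses the thresholds reported above (zero and 1.282). The function name and the latent-index values are hypothetical, chosen only to show the direction of the effects; they are not fitted values from the study.

```python
from statistics import NormalDist

MU_0 = 0.0     # first cutoff, fixed at zero for identification (from the text)
MU_1 = 1.282   # second cutoff between injury and fatality (from the text)


def ordered_probit_probabilities(xb, mu0=MU_0, mu1=MU_1):
    """Return (P(no injury), P(injury), P(fatal)) for a latent index xb = beta'X."""
    cdf = NormalDist().cdf
    p_none = cdf(mu0 - xb)
    p_injury = cdf(mu1 - xb) - cdf(mu0 - xb)
    p_fatal = 1.0 - cdf(mu1 - xb)
    return p_none, p_injury, p_fatal


# Hypothetical latent-index values (assumption, for illustration only).
for xb in (-1.0, 0.0, 1.0, 2.0):
    probs = ordered_probit_probabilities(xb)
    print(xb, [round(p, 3) for p in probs])
```

As the latent index grows, the no-injury probability falls and the fatal probability rises, whereas the probability of the interior injury category first rises and then falls. This is why the sign of an OP coefficient alone does not settle the direction of the effect on the interior category.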
In the RPL model estimation, all parameters were first assumed random, and both the normal and the uniform distributions were tested for randomness. The parameter estimation was based on a simulation-based maximum likelihood method with 200 Halton draws. Two parameters for the fatal crash level, estimated vehicle speed and truck vehicle indicator, were found to follow the normal distribution, implying that the parameter estimates for these two variables could vary across individual crashes. The remaining independent variables in the model were restricted to fixed parameters. Table 2 presents modeling results for the three models; they are compared and discussed in the following paragraphs.

4.2. Model comparisons

Based on the previous research on model comparisons (Abdel-Aty and Abdelwahab, 2004; Abdel-Aty, 2003; Haleem and Abdel-Aty, 2010; Yasmin and Eluru, 2013; Yasmin et al., 2014; Ye and Lord, 2014), the following criteria were used in this study: number of statistically significant parameters, models’ interpretation power, goodness-of-fit, and classification accuracy. These criteria were employed to identify a more suitable model for modeling injury severity of motor vehicle drivers involved in train–motor vehicle crashes at HRGCs.

The RPL model had the highest number of statistically significant parameters (18) compared to the MNL model (16) and the OP model (12). A greater number of statistically significant parameters leads to an apparently better model in terms of a higher adjusted R-square (0.170, 0.169 and 0.162 for the RPL, MNL and OP models, respectively) and helps identify additional explanatory variables impacting the dependent variable.

In terms of interpretation power, compared to the MNL model, the RPL model overcomes individual variation issues and does not exhibit the independence of irrelevant alternatives (IIA) property. Two parameters for the fatal crash level, estimated vehicle speed and truck vehicle indicator, were found to be normally distributed, implying that these two parameters vary across individual crashes rather than remain the same

Table 1 Descriptive statistics for the variables included in the injury severity models.

| Variable type | Variable | Description and coding | Mean | Standard deviation | Frequency (1 = yes) | Frequency (0 = no) |
|---|---|---|---|---|---|---|
| Dependent variable | | | | | | |
| Injury severity level | | Fatal (2); Injury (1); No injury (0) | 0.454 | 0.645 | 2 (475); 1 (1610); 0 (3556) | |
| Railway characteristics | | | | | | |
| Train speed | Actual train speed | Actual speed of train in mph (0–88 mph) | 30.009 | 18.568 | N/A | N/A |
| Train type | Freight indicator | Freight train involved (1 = yes; 0 = no) | 0.744 | 0.436 | 4197 | 1444 |
| Motor vehicle, driver and crash characteristics | | | | | | |
| Motor vehicle speed | Motor vehicle speed | Estimated speed of vehicle in mph (0–80 mph) | 8.547 | 11.804 | N/A | N/A |
| Motor vehicle type | Truck vehicle indicator | Highway motor vehicle type was truck, including truck and truck-trailer (1 = yes; 0 = no) | 0.425 | 0.494 | 2396 | 3245 |
| Driver information | Driver aged over 70 indicator | Driver’s age was above 70 (1 = yes; 0 = no) | 0.079 | 0.270 | 445 | 5196 |
| | Female driver indicator | Driver was female (1 = yes; 0 = no) | 0.257 | 0.437 | 1449 | 4192 |
| Crash type | Rail struck highway user indicator | Circumstance of the accident was rail equipment struck motor vehicle/driver (1 = yes; 0 = no) | 0.817 | 0.387 | 4608 | 1033 |
| | Driver inside motor vehicle indicator | Driver was inside vehicle (1 = yes; 0 = no) | 0.834 | 0.372 | 4707 | 934 |
| Environment characteristics | | | | | | |
| Visibility | Day indicator | Daylight (1 = yes; 0 = no) | 0.644 | 0.479 | 3634 | 2007 |
| | Dusk indicator | Dusk (1 = yes; 0 = no) | 0.042 | 0.200 | 236 | 5405 |


Table 2 Driver injury severity: OP, MNL and RPL models.

| Variable | OP model | MNL model: Fatal | MNL model: Injury | RPL model: Fatal | RPL model: Injury |
|---|---|---|---|---|---|
| Constant | 2.992 (30.01) | 8.727 (28.67) | 5.135 (24.69) | 8.999 (26.03) | 5.161 (24.66) |
| Rail characteristics | | | | | |
| Actual train speed | 0.025 (25.13) | 0.070 (21.75) | 0.031 (15.86) | 0.074 (19.04) | 0.030 (15.81) |
| Freight train indicator | 0.113 (2.81) | 0.249 (3.36) | 0.249 (3.36) | 0.252 (3.38) | 0.252 (3.38) |
| Vehicle characteristics | | | | | |
| Estimated vehicle speed | 0.016 (10.96) | 0.041 (9.13) | 0.025 (8.91) | 0.036 (5.17) | 0.025 (8.85) |
| Standard deviation of distribution | N/A | N/A | N/A | 0.025 (2.03) | N/A |
| Truck indicator | 0.287 (6.58) | 0.802 (5.54) | 0.426 (5.30) | 1.818 (2.83) | 0.435 (5.41) |
| Standard deviation of distribution | N/A | N/A | N/A | 1.771 (2.90) | N/A |
| Driver characteristics | | | | | |
| Driver aged over 70 indicator | 0.330 (5.41) | 0.954 (6.27) | N/S | 0.989 (6.16) | N/S |
| Female driver indicator | 0.180 (4.40) | 0.349 (4.65) | 0.349 (4.65) | 0.346 (4.60) | 0.346 (4.60) |
| Crash characteristics | | | | | |
| Rail struck highway user indicator | 0.293 (6.17) | 1.290 (6.93) | 0.256 (2.99) | 1.392 (6.69) | 0.256 (2.99) |
| Driver in vehicle indicator | 1.696 (21.46) | 3.132 (18.25) | 3.132 (18.25) | 3.162 (18.25) | 3.162 (18.25) |
| Environment characteristics | | | | | |
| Day indicator | 0.100 (2.42) | 0.392 (3.23) | N/S | 0.415 (3.24) | N/S |
| Dusk indicator | 0.187 (2.05) | 0.893 (2.82) | N/S | 0.942 (2.84) | N/S |
| Threshold parameter | | | | | |
| μ(1) | 1.282 (45.24) | N/A | N/A | N/A | N/A |

| Model characteristics | OP model | MNL model | RPL model |
|---|---|---|---|
| Number of parameters | 12 | 16 | 18 |
| Log likelihood function, constant only | −4834.878 | −4834.878 | −4834.878 |
| Log likelihood function | −4042.062 | −4007.084 | −4004.478 |
| Inf. Cr. AIC | 1.437 | 1.426 | 1.426 |
| Inf. Cr. BIC | 1.451 | 1.445 | 1.447 |
| Inf. Cr. HQIC | 1.442 | 1.433 | 1.434 |
| AIC | 8108.124 | 8046.168 | 8044.956 |
| AICc | 8108.179 | 8046.265 | 8045.078 |
| BIC | 8187.778 | 8152.373 | 8164.437 |
| Adjusted R-square | 0.162 | 0.169 | 0.170 |
| Likelihood ratio chi-square | 1585.632 (df = 10) | 1655.587 (df = 14) | 1660.799 (df = 16) |

Values in parentheses are Student’s t-statistics for each estimated parameter. N/A is not applicable. N/S implies not significant at the 5% level. All other values are statistically significant at the 5% level.

for all motor vehicle–train crashes at HRGCs. The variability in parameters is reflective of individual variation as well as unobserved heterogeneity, which could be due to a variety of factors for which there might be no data available. Further discussion of this issue is provided in Section 4.3. Compared to the MNL and the RPL models, the OP model did not offer better interpretation power although it took into consideration the ordinal information of crash severities (Ye and Lord, 2014). In fact, the OP model restricted the effects of explanatory variables by using identical coefficients for a variable across all crash injury severity levels. Overall, the RPL model had more flexible parameter estimates and thus better interpretation power than the MNL model, which in turn had better interpretation power than the OP model.

The likelihood ratio test, AICc and BIC were compared to judge model fit. The standard MNL is a special case of the RPL in which the mixing distribution of parameters is degenerate at fixed parameters. Therefore, the MNL model is nested within the RPL model and the likelihood ratio test (LRT) can be used to compare the two models. From the model estimation results presented in Table 2, the log-likelihood at convergence for the RPL model was −4004.478 with 18 parameters, compared to −4007.084 for the MNL model with 16 parameters. Therefore, the log-likelihood ratio statistic was 2(−4004.478 − (−4007.084)) = 5.212 with 2 degrees of freedom, which was smaller than the chi-square critical value of 5.99 at the 5% significance level, showing that the RPL model was not statistically better than the standard MNL model in this case.

Because the OP model is not nested with the MNL model or the RPL model, the likelihood ratio test was not appropriate for it. Therefore, the AICc and the BIC were used to compare model fit between the OP and the other two models. The AICc (BIC) values for the OP, MNL and RPL models were 8108.179 (8187.778), 8046.265 (8152.373) and 8045.078 (8164.437), respectively. Models with lower AICc and BIC values are preferable; therefore, the RPL and MNL models were superior to the OP model. The RPL model had a slightly better fit than the MNL model based on the AICc criterion, but the MNL model was better based on the BIC criterion. Overall, the different goodness-of-fit criteria showed that the RPL and MNL models were preferable to the OP model, while there was no compelling evidence for a preference between the RPL and the MNL models.

All three models were classifier models and their prediction accuracy was compared using the 2013 HRGC crash data. The severity outcomes of the 2013 crashes were consistent with the 2009–2012 crashes; as mentioned before, there were 63.0% no-injury crashes, 28.5% injury crashes and 8.5% fatal crashes in the 2009–2012 dataset, while the corresponding percentages in the 2013 dataset were 61.5%, 31.0% and 7.5%, respectively. Table 3 presents prediction successes and failures for the three models for 2013. In the table the row value is the actual injury outcome while the column value is the predicted result. A model’s prediction was characterized as a success if the observed 2013 injury level was assigned the highest probability. Comparison of the predictions showed that the OP model correctly classified 65.0% of the 2013 observations while the MNL and the RPL models correctly classified 57.4% and 57.5% of the observations, respectively. However, for fatal crashes the OP model classification was markedly less accurate than the other two models (2 of the 92 fatal crashes correctly classified, versus 21 and 22 for the MNL and RPL models; see Table 3); given the importance of fatalities, this appears to be a limitation of the OP model.

Statistical tests may be used to measure differences in performance between classifier models. The test for the difference of two proportions is popular (Abdel-Aty and Abdelwahab, 2004); therefore, it was utilized for comparing the three models in this study. For the OP and RPL models, the null hypothesis was that the two correct classification proportions (65.0% versus 57.5%) were equal. The calculated z-statistic was 3.845, which was greater than the critical value of 1.96 for the 5% significance level. The null hypothesis was therefore rejected, implying that the OP model had a higher overall prediction capability than the RPL model. The same test between the MNL (57.4%) and RPL (57.5%) models did not reveal a statistically significant difference at the 5% significance level. Therefore, notwithstanding underestimation of the critical fatal category, the OP model had higher overall prediction capability than the other two models, while the MNL model and RPL model had similar prediction capabilities.

Model comparisons showed that the RPL model had the largest number of significant parameters in its specification and the best interpretation power of the three models. The RPL model, however, did not show significantly better goodness-of-fit than the MNL model, although both fit better than the OP model.
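The comparison statistics quoted in this section can be re-derived from the figures in Tables 2 and 3. The Python sketch below is an illustrative cross-check, not the authors' procedure. It assumes the standard AIC, AICc and BIC formulas and a pooled two-proportion z-test; these assumptions do reproduce the reported values. Log-likelihoods are entered with their negative sign, and the function names are hypothetical.

```python
import math


def lr_test_2df(ll_restricted, ll_full, alpha=0.05):
    """Likelihood ratio test when the full model adds exactly two parameters."""
    stat = 2.0 * (ll_full - ll_restricted)
    critical = -2.0 * math.log(alpha)  # closed-form chi-square quantile for 2 df
    p_value = math.exp(-stat / 2.0)    # closed-form chi-square tail for 2 df
    return stat, critical, p_value


def information_criteria(ll, k, n):
    """Return (AIC, AICc, BIC) for log-likelihood ll, k parameters, n observations."""
    aic = -2.0 * ll + 2.0 * k
    aicc = aic + 2.0 * k * (k + 1) / (n - k - 1)
    bic = -2.0 * ll + k * math.log(n)
    return aic, aicc, bic


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z-statistic for the difference between two proportions."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (x1 / n1 - x2 / n2) / se


N_EST = 5641   # crashes in the 2009-2012 estimation dataset
N_VAL = 1233   # crashes in the 2013 validation dataset

# MNL (restricted) versus RPL (full): two additional parameters.
stat, critical, p_value = lr_test_2df(-4007.084, -4004.478)
print("LR statistic:", round(stat, 3), "critical:", round(critical, 3))

# Information criteria for the OP, MNL and RPL models.
for name, ll, k in (("OP", -4042.062, 12), ("MNL", -4007.084, 16), ("RPL", -4004.478, 18)):
    aic, aicc, bic = information_criteria(ll, k, N_EST)
    print(name, round(aic, 3), round(aicc, 3), round(bic, 3))

# Correct classifications on the 2013 data: OP 802, MNL 708, RPL 709.
print("z (OP vs RPL):", round(two_proportion_z(802, N_VAL, 709, N_VAL), 3))
print("z (RPL vs MNL):", round(two_proportion_z(709, N_VAL, 708, N_VAL), 3))
```

The closed forms in `lr_test_2df` hold only for two degrees of freedom; a general chi-square routine would be needed for other parameter differences.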
In terms of prediction capability, the OP model had better overall classification accuracy but underestimated fatal crashes compared to the other two models. Overall, the RPL model and MNL model were deemed more suitable for injury severity analysis of motor vehicle drivers involved in crashes at highway–rail grade crossings than the OP model. The factors associated with drivers’ injury severities in crashes at HRGCs identified by the RPL model are discussed in the remainder of the paper. However, it is worth noting that this study only compared two unordered response models (RPL and MNL) with the traditional OP model and did not include other ordered response models such as the generalized ordered logit (GOL) model, which relaxes the restrictive thresholds in the ordered response model by allowing for individual level exogenous variable impacts on the threshold parameters (Yasmin et al., 2014). It is possible that the GOL model would outperform the MNL or RPL models under the same circumstances. Such a comparison is suggested for a future study.

Table 3
Prediction success table for OP, MNL and RPL models using 2013 data.

Actual category      Predicted fatal   Predicted injury   Predicted no injury   Total observed

OP model
Fatal                              2                 70                    20               92
Injury                             1                161                   221              383
No injury                          2                117                   639              758
Total                              5                348                   880             1233
Percent correctly classified = 802/1,233 = 65.0%

MNL model
Fatal                             21                 37                    34               92
Injury                            42                141                   200              383
No injury                         40                172                   546              758
Total                            103                350                   780             1233
Percent correctly classified = 708/1,233 = 57.4%

RPL model
Fatal                             22                 37                    34               92
Injury                            42                141                   200              383
No injury                         39                172                   546              758
Total                            103                350                   780             1233
Percent correctly classified = 709/1,233 = 57.5%

4.3. Model interpretation

For the fatal crash level in the RPL model, the parameters for estimated vehicle speed and for the truck vehicle indicator were found to be random and normally distributed, implying that they can vary from crash to crash. This variability in the parameters may reflect unobserved heterogeneity, the possible sources of which may be determined through interactions of each random parameter with other potential attributes or variables. Several variables that might be the sources of heterogeneity were tried; the investigation revealed that randomness in the vehicle speed parameter may be explained, in part, by differences in drivers’ gender as well as the circumstances of the accidents (whether the rail equipment struck the highway user or was struck by the highway user). Similarly, the investigation revealed that randomness in the parameter for the truck indicator may also be partially due to differences in the circumstances of the accidents. The two random parameters, however, became either fixed or statistically not significant after the interaction terms were added to the model specifications. There might be a variety of other factors that caused heterogeneity for which data were not available. Therefore, the original RPL model was retained and the interaction terms were left out of the model specification.

Based on the RPL model, 10 independent variables were found to have positive or negative associations with different levels of drivers’ injuries in train–motor vehicle crashes at HRGCs at the 95% confidence level.
Drivers’ injury severity increased with higher actual train speed and higher estimated motor vehicle speed; both findings are reasonable, as higher vehicular speeds are commonly known to result in more severe injuries. Freight train involvement in a crash increased the driver’s injury severity, which is again reasonable because of the large mass of transported freight. Drivers aged above 70 years and female drivers sustained more severe injuries; this finding is consistent with previous studies (Khattak, 2013). The likelihood of severe injuries also increased in the following situations: when trains struck motor vehicles rather than being struck by motor vehicles (e.g., when a motor vehicle hit the side of a train, usually under low visibility conditions) and when drivers were inside their motor vehicles. Both findings are reasonable, as drivers’ injuries would be more severe when trains strike motor vehicles rather than the other way around and when drivers are inside their vehicles (in some crashes motor vehicle drivers get out of vehicles that are stuck on the crossing, resulting in less severe injuries). Drivers of trucks and truck-trailers experienced relatively less severe injuries in train-involved crashes at HRGCs, likely due to the large size of the vehicles. Finally, fair visibility (day and dusk compared to dawn and dark) was associated with less severe injuries of motor vehicle drivers, possibly due to evasive maneuvers on the part of the drivers.

5. Summary and conclusion

The objectives of this research were to identify a more suitable model for injury severity of drivers involved in train–motor vehicle crashes at HRGCs from among three commonly used models and to investigate the associations of various factors with injury severity levels of motor vehicle drivers involved in train–motor vehicle crashes at HRGCs. Three models were estimated and compared to
each other based on the number of statistically significant parameters included, the models’ interpretative power, classification accuracy and overall goodness-of-fit. The RPL model included a greater number of statistically significant parameters than the other two models. The RPL model, which showed that parameters for train speed and truck vehicle indicator were normally distributed rather than fixed, had the best interpretation power compared to the other two models. As to the goodness-of-fit, the MNL and RPL models performed better than the OP model, while there was no evidence that the RPL model was statistically significantly better than the MNL model. In terms of prediction accuracy, the OP model had the highest overall accuracy rate; however, it underestimated the category of fatal injuries. Based on the findings, the conclusion was that the RPL model was more appropriate for analyzing drivers’ injury severities in train–motor vehicle crashes reported at HRGCs. The possibility that other models may outperform the RPL model considered in this study is acknowledged; in particular, a GOL framework might improve on the OP fit to a level beyond that of the RPL model.

Results from the RPL model showed that higher train and vehicle speeds, freight train involvement, non-truck vehicles, older and female motor vehicle drivers, trains striking highway vehicles, highway users being inside their vehicles, and dawn and nighttime conditions were associated with higher levels of drivers’ injury severity. Where feasible, reducing train and motor vehicle speeds and providing nighttime lighting may help reduce injury severity of motor vehicle drivers.

References

Abdel-Aty, M., 2003. Analysis of driver injury severity levels at multiple locations using ordered probit models. J. Saf. Res. 34 (5), 597–603.
Abdel-Aty, M.A., Abdelwahab, H.T., 2004. Predicting injury severity levels in traffic crashes: a modeling comparison. J. Transp. Eng. 130 (2), 204–210.
Eluru, N., Bagheri, M., Miranda-Moreno, L.F., Fu, L., 2012. A latent class modeling approach for identifying vehicle driver injury severity factors at highway–railway crossings. Accid. Anal. Prev. 47, 119–127.
Fan, W.D., Haile, E.W., 2014. Analysis of Severity of Vehicle Crashes at Highway–Rail Grade Crossings: Multinomial Logit Modeling, Transportation Research Board 93rd Annual Meeting, No. 14-058.
Federal Railroad Administration Office of Safety Analysis, 2014. FRA Safety Data. Retrieved from http://safetydata.fra.dot.gov/OfficeofSafety/Default.aspx.
Greene, W.H., 2007. NLOGIT Version 4.0: Reference Guide. Econometric Software, Plainview, NY.
Haleem, K., Abdel-Aty, M., 2010. Examining traffic crash injury severity at unsignalized intersections. J. Saf. Res. 41 (4), 347–357.
Hao, W., Daniel, J., 2013. Motor Vehicle Driver Injury Severity Study at Highway–Rail Grade Crossings in the United States, Transportation Research Board 92nd Annual Meeting.
Hu, S.-R., Li, C.-S., Lee, C.-K., 2010. Investigation of key factors for accident severity at railroad grade crossings by using a logit model. Saf. Sci. 48 (2), 186–194.
Khattak, A., 2013. Severity of Pedestrian Crashes at Highway–Rail Grade Crossings, Transportation Research Board 92nd Annual Meeting, No. 13-458.
Mannering, F.L., Bhat, C.R., 2014. Analytic methods in accident research: methodological frontier and future directions. Anal. Methods Accid. Res. 1, 1–22.
Russo, B., Savolainen, P.T., 2013. An Examination of Factors Affecting the Frequency and Severity of Crashes at Rail-Grade Crossings, Transportation Research Board 92nd Annual Meeting, No. 13-016.
Savolainen, P.T., Mannering, F.L., Lord, D., Quddus, M.A., 2011. The statistical analysis of highway crash-injury severities: a review and assessment of methodological alternatives. Accid. Anal. Prev. 43 (5), 1666–1676.
Washington, S., Karlaftis, M., Mannering, F., 2011. Statistical and Econometric Methods for Transportation Data Analysis, second ed. Chapman and Hall/CRC, Boca Raton, FL.
Yasmin, S., Eluru, N., 2013. Evaluating alternate discrete outcome frameworks for modeling crash injury severity. Accid. Anal. Prev. 59 (1), 506–521.
Yasmin, S., Eluru, N., Ukkusuri, S., 2014. Alternative ordered response frameworks for examining pedestrian injury severity in New York City. J. Transp. Saf. Secur. 6 (4), 275–300.
Ye, F., Lord, D., 2014. Comparing three commonly used crash severity models on sample size requirements: multinomial logit, ordered probit and mixed logit models. Anal. Methods Accid. Res. 1, 72–85.
