This article was downloaded by: [New York University] On: 03 May 2015, At: 04:17 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

International Journal of Injury Control and Safety Promotion Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/nics20

Prediction of road traffic death rate using neural networks optimised by genetic algorithm

Seyed Ali Jafari (a), Sepideh Jahandideh (b), Mina Jahandideh (c) & Ebrahim Barzegari Asadabadi (d)

(a) Civil Engineering Department, University of Sistan and Baluchestan, Sistan and Baluchestan, Iran
(b) Alzahra Heart Charity Hospital, Shiraz University of Medical Sciences, Shiraz, Iran
(c) Department of Applied Mathematics, Faculty of Science, University of Zanjan, Zanjan, Iran
(d) Department of Biophysics, Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran

Published online: 04 Dec 2013.

To cite this article: Seyed Ali Jafari, Sepideh Jahandideh, Mina Jahandideh & Ebrahim Barzegari Asadabadi (2013): Prediction of road traffic death rate using neural networks optimised by genetic algorithm, International Journal of Injury Control and Safety Promotion, DOI: 10.1080/17457300.2013.857695 To link to this article: http://dx.doi.org/10.1080/17457300.2013.857695


International Journal of Injury Control and Safety Promotion, 2013 http://dx.doi.org/10.1080/17457300.2013.857695

Prediction of road traffic death rate using neural networks optimised by genetic algorithm

Seyed Ali Jafari (a)*, Sepideh Jahandideh (b), Mina Jahandideh (c) and Ebrahim Barzegari Asadabadi (d)

(a) Civil Engineering Department, University of Sistan and Baluchestan, Sistan and Baluchestan, Iran; (b) Alzahra Heart Charity Hospital, Shiraz University of Medical Sciences, Shiraz, Iran; (c) Department of Applied Mathematics, Faculty of Science, University of Zanjan, Zanjan, Iran; (d) Department of Biophysics, Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran


(Received 2 July 2013; final version received 14 October 2013)

Road traffic injuries (RTIs) are recognised as a major public health problem at global, regional and national levels. Prediction of the road traffic death rate can therefore help in its management. We used an artificial neural network model optimised through a genetic algorithm to predict mortality. A five-fold cross-validation procedure on a data set covering a total of 178 countries was used to verify the performance of the models. The best-fit model was selected according to the root mean square error (RMSE). The genetic algorithm, a powerful method that has not been applied to mortality prediction to this extent in previous studies, showed high performance. The lowest RMSE obtained was 0.0808. Such satisfactory results could be attributed to the use of the genetic algorithm as a powerful optimiser which selects the best input feature set to be fed into the neural networks. Seven factors were identified, with high accuracy, as the most influential on the road traffic mortality rate. The results show that our model is very promising and may play a useful role in developing a better method for assessing the influence of road traffic mortality risk factors.

Keywords: road traffic death rate; prediction; artificial neural network; genetic algorithm

*Corresponding author. Email: [email protected]
© 2013 Taylor & Francis

Introduction

Road traffic injuries show a steep socioeconomic gradient, with those from more disadvantaged backgrounds at higher risk than their more affluent counterparts (WHO, 2009). The mortality rate of traffic injuries depends on several contributing factors. Many researchers have attempted to establish a predictive model that identifies possible contributing factors, such as traffic characteristics, road environment and human error. Identification of these factors has greatly decreased the risk of traffic accident-related mortality (Lascala, Gerber, & Gruenewald, 2000).

To date, the majority of studies in this field have focused mainly on predicting the number of road traffic deaths from a limited number of input variables (Bastos, de Andrade, Soares, & Matsuo, 2005; Pereira et al., 2011). Besides those variables, which include the number of registered vehicles and population, other factors are also influential: the World Health Organization (WHO) states that demographic and socioeconomic statistics, registered vehicles, national legislation, institutional framework, policy and emergency care all affect mortality (WHO, 2009). In this study, we have attempted to use a comprehensive set of predictive parameters as inputs to our models. The other drawback of previous studies is that they investigate road fatalities as case studies in a small area rather than at the global scale (Bastos et al., 2005; Holmgren, Holmgren, & Ahlner, 2005; Pereira et al., 2011; Ziyab & Akhtar, 2012). Our study aims to predict fatalities at the global scale using a database provided by the WHO.

Artificial neural networks (ANNs) have been suggested as a supplement or alternative to standard statistical methods for predicting complex phenomena (Sargent, 2001). In the past decade, ANN models have been applied to different types of problems in transportation systems and traffic engineering. There are many uses of ANNs in the fields of travel behaviour, traffic flow and traffic management (Himanen, Nijkamp, Reggiani, & Raitio, 1998). Some researchers have suggested the use of neural network models in highway safety research (Cansiz, Calisici, & Miroglu, 2009; Mussone, Ferrari, & Oneta, 1999; Ozgan & Demirci, 2008). In addition, neural networks have been employed to model the relationship between the number of fatalities and population or the number of registered vehicles (Cansiz et al., 2009).

In our study, the ANN is trained using a genetic algorithm (GA) by adjusting its weights and biases in each layer. A GA tries to simulate the natural evolution process. Its purpose is to optimise a set of parameters; the GA thus finds the parameter set that gives the best fit to the data. In this study, a genetic algorithm-based neural network (GANN) approach has been utilised as an alternative to predict the road traffic death rate. Prediction of risk factors will be helpful in assessing the comprehensive impact that a set of data has on road traffic mortality.

Methods

Data set

A total of 178 countries with different Human Development Index (HDI) values (very high, high, medium and low) were included in this study, and the data of these countries were used as the input data. This gives the data set an appropriate distribution across development levels and makes the results of our study more accurate. The HDI is a worldwide scale; we used it to make our research valid and global. The collected data were reported by WHO (2007). The countries are grouped as shown in Figure 1. The distribution of countries by HDI was 18%, 23%, 46% and 13% for very high HDI, high HDI, medium HDI and low HDI countries, respectively (United Nations Development Program, 2009). The average road traffic death rate (per 100,000 population) of countries with different HDI values is shown in Table 1.

Figure 1. Distribution of different countries based on HDI.

Table 1. The average road traffic death rate (per 100,000 people) based on HDI of countries.

HDI          Death rate (per 100,000 population)
Very high    10
High         17
Medium       21
Low          32

Definition of the input parameters

The road traffic death rate of the countries studied has been reported by WHO (2007). Among the different parameters which affect the road traffic mortality rate, 22 factors were selected as the most influential: the population, income level, number of registered vehicles, existence of a formal pre-hospital care system, existence of national speed limits, adaptation of the national speed limit at a local level, specification of the national speed law by vehicle type, maximum speed limit (urban/rural), existence of a national drink-driving law, blood alcohol concentration (BAC) limit (for the general population, young/novice drivers, professional/commercial drivers), existence of a national seat-belt law, applicability of the seat-belt law to all occupants, seat-belt wearing rate (front seat, rear seat), existence of a national child-restraint law, existence of a national motorcycle helmet law, requirement that motorcycle helmets adhere to a standard, and applicability of the national motorcycle helmet law to all occupants (drivers, adult passengers, child passengers). These were reported by WHO for each country in 2007.

Non-linear model development using artificial neural network

In this study, the networks were trained over three hidden layers of neurons. Networks with one, two or three hidden layers and with different learning constants and numbers of hidden nodes were examined during training. The estimated road traffic death rate (per 100,000 population) was used as the dependent variable of the neural network and was divided into three classes (Class 1: 0–9, Class 2: 10–19 and Class 3: ≥ 20 deaths per 100,000 people). We used three categories for the target variable because the number of samples is limited and does not allow more than three classes. On the other hand, dividing the sample into two groups leads to highly dispersed classes.

Model development

ANNs were used as a non-algorithmic model. In order to train and test the models, five-fold cross-validation was carried out to avoid any possible bias in selecting the individuals of the testing set. In the cross-validation procedure, 36 cases (here called the testing set) were removed from the data set and training was performed using the remaining cases; the testing set was then examined with the resulting model. This procedure was repeated until all cases within the data set had been tested. Five models were thus built and evaluated, each time removing 36 countries from the data set and training the model with the remaining 142. The average results of the five simulations were reported. In this study, the GA was used to optimise the input feature set of the neural networks, and the optimised network was then applied as a predictor model to predict the road traffic death rate based on different factors.
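The three target classes and the five-fold split described in the Methods can be illustrated with a minimal sketch. It is not the authors' code. The function names, the fixed random seed and the treatment of non-integer rates (anything below 10 falls in Class 1) are assumptions made for illustration. Because 178 is not divisible by five, the sketch produces folds of 36, 36, 36, 35 and 35 countries rather than five folds of exactly 36.

```python
import numpy as np

def death_rate_class(rate):
    """Map a death rate per 100,000 population to the three classes in the text."""
    if rate < 10:
        return 1  # Class 1: 0-9 (assumption: non-integer rates below 10 go here)
    if rate < 20:
        return 2  # Class 2: 10-19
    return 3      # Class 3: >= 20

def five_fold_indices(n_samples, n_folds=5, seed=0):
    """Return (train_idx, test_idx) pairs so that every sample is tested once."""
    rng = np.random.default_rng(seed)      # fixed seed is an assumption
    order = rng.permutation(n_samples)     # shuffle to avoid selection bias
    folds = np.array_split(order, n_folds)
    splits = []
    for k in range(n_folds):
        test_idx = folds[k]
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        splits.append((train_idx, test_idx))
    return splits

splits = five_fold_indices(178)
fold_sizes = [len(test_idx) for _, test_idx in splits]
table1_classes = [death_rate_class(r) for r in (10, 17, 21, 32)]  # Table 1 averages
```

Each country appears in exactly one testing set, so averaging a score over the five folds uses every case once for testing.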

The properties of the optimised neural network are shown in Table 2.

Table 2. Optimised neural network parameters.

Parameter                        Value
Learning rate                    0.1
Error goal                       0
Number of input nodes            22
Number of output nodes           1
Number of hidden layers          3
Number of neurons (layer 1)      10
Number of neurons (layer 2)      5
Number of neurons (layer 3)      10
Training function                'Trainrp'
Response accuracy                62%
RMSE (of GANN model)             0.0808

Feature selection using genetic algorithm-based neural networks

GAs are a type of search algorithm designed to mimic the principles of biological evolution in natural genetic systems. GAs are also known as stochastic sampling methods. These algorithms maintain and manipulate a population of solutions and search for better solutions using a 'survival of the fittest' strategy. GAs solve linear and non-linear problems by exploring all regions of the feature space and exploiting promising areas through mutation, crossover and selection operations applied to individuals in the population.

In a GA, each individual in the population needs to be described by a chromosome representation. In this study, GAs were used as a feature selection tool rather than as a parameter value optimisation tool. For this purpose, bit-string chromosomes were used, in which 1s indicated the input parameters that would participate in the predictive model and 0s indicated those not entered into the model. Each chromosome was composed of 22 genes, representing the 22 factors which affect the road traffic mortality rate. All chromosome bits were mutated with a probability of 0.01, and two-point crossover was applied to chromosomes with a probability of 0.8.

Selection of individuals to produce successive generations plays an extremely important role in a GA. Selection means that two individuals are chosen as 'parents', and the choice depends on each individual's fitness function value. There are several selection schemes. In this paper, roulette wheel selection was used. This scheme simulates a roulette wheel in which the area of each section is proportional to its expectation. The algorithm then uses a random number to select one of the sections with a probability equal to its area.

Model evaluation

In order to evaluate the performance of the models, two statistical indices were used: prediction accuracy (PA) and RMSE. The first index, PA (Hajmeer & Basheer, 2003), shows the percentage of data set samples correctly classified:

$$\mathrm{PA} = \frac{N_{TN} + N_{TP}}{N_{total}} \times 100,$$

where $N_{TP}$ is the number of true positive predictions and $N_{TN}$ is the number of true negative predictions. This index was used for performance evaluation of the ANN model. The prediction error of the GANN model was calculated in terms of the root mean squared error (RMSE). RMSE is a frequently used measure of the differences between the values predicted by a model or an estimator and the values actually observed. This value was regarded as a measure of each individual's fitness in the corresponding generation. RMSE is defined as

$$\mathrm{RMSE}(t) = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(W_o - W_p\right)^2},$$

where $W_o$ is the observed road traffic death rate (per 100,000 population) and $W_p$ is the predicted value of the road traffic death rate (per 100,000 population).

Results

Results of ANN model

The ANN-based models were fed with the 22 factors listed above. Learning constants of 0.08, 0.1 and 0.2 were tested, and the learning constant of 0.1 was found to give the best accuracy. The optimised network structure comprised an input layer, three hidden layers and one output neuron. The ANN was used as the predictor model on the data set with the 'trainrp' (resilient propagation) training method. The optimal network was trained and tested using five-fold cross-validation, in which 142 subjects were entered into the network as a training set and the remainder (36 subjects) were used as a testing set. The average PA of the five runs was taken as the total accuracy of the model. The PA obtained with this network was 62% (Table 2).
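The two indices defined under Model evaluation can be computed as in the following minimal sketch. The function names are hypothetical. Since the target has three classes, PA is written here as the percentage of samples whose predicted class equals the observed class, which reduces to the (N_TN + N_TP)/N_total form in the two-class case.

```python
import numpy as np

def prediction_accuracy(observed_classes, predicted_classes):
    """Percentage of samples whose predicted class matches the observed class."""
    observed = np.asarray(observed_classes)
    predicted = np.asarray(predicted_classes)
    return float(np.mean(observed == predicted) * 100.0)

def rmse(observed, predicted):
    """Root mean squared error between observed (W_o) and predicted (W_p) values."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))

# Made-up values purely to exercise the functions; these are not study data.
pa_example = prediction_accuracy([1, 2, 3, 3], [1, 2, 2, 3])   # 3 of 4 correct
rmse_example = rmse([0.0, 0.0], [3.0, 4.0])                    # sqrt((9 + 16) / 2)
```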
Results of GANN model

The properties of the GA used in this study as a feature selection tool are shown in Table 3. Seven of the 22 factors were determined to be the most important: income level of the country, existence of a formal pre-hospital care system in the country, existence of a national speed limit, adaptation of the national speed limit at a local level, existence of a national motorcycle helmet law, applicability of the national motorcycle helmet law to all occupants, and applicability of the national motorcycle helmet law to adult passengers. The lowest RMSE obtained was 0.0808. There is no consensus on which RMSE values are acceptable; in general, the closer the value is to zero, the more acceptable it is. On this basis, the RMSE we obtained is acceptable.

Table 3. Parameters of the genetic algorithm feature selection tool used in this study.

Parameter                          Value
Population type                    Bit string
Population size                    35
Creation function                  Uniform
Selection function                 Roulette
Elite count                        2
Crossover fraction                 0.8
Mutation function                  Uniform
Mutation rate                      0.01
Crossover function                 Two point
Migration direction                Forward
Migration fraction                 0.2
Migration interval                 20
Generation (stopping criterion)    100
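The feature-selection GA described in the Methods, with the settings of Table 3, can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' implementation. In the study, the fitness of a chromosome is the RMSE of a neural network trained on the selected inputs; here a cheap stand-in, `toy_error`, with an arbitrary hidden target mask is used so that the sketch runs quickly. The conversion of errors to roulette weights (inverse error) is also an assumption, and the migration settings of Table 3 are omitted because a single population is used.

```python
import numpy as np

N_GENES = 22         # one bit per candidate input factor
POP_SIZE = 35        # Table 3: population size
ELITE_COUNT = 2      # Table 3: elite count
P_CROSSOVER = 0.8    # Table 3: crossover fraction
P_MUTATION = 0.01    # Table 3: mutation rate (per bit)
N_GENERATIONS = 100  # Table 3: stopping criterion

# Hypothetical target mask: NOT the factors found in the study.
TARGET = np.zeros(N_GENES, dtype=int)
TARGET[:7] = 1

def toy_error(chrom):
    """Stand-in for the network RMSE: fraction of bits that differ from TARGET."""
    return float(np.mean(chrom != TARGET))

def roulette_select(weights, rng):
    """Pick one index with probability proportional to its weight."""
    return rng.choice(len(weights), p=weights / weights.sum())

def two_point_crossover(a, b, rng):
    """Swap the segment between two distinct cut points."""
    i, j = sorted(rng.choice(np.arange(1, len(a)), size=2, replace=False))
    child_a, child_b = a.copy(), b.copy()
    child_a[i:j], child_b[i:j] = b[i:j], a[i:j]
    return child_a, child_b

def mutate(chrom, rng):
    """Flip each bit independently with probability P_MUTATION."""
    flips = rng.random(len(chrom)) < P_MUTATION
    return np.where(flips, 1 - chrom, chrom)

def run_ga(error_fn, rng, n_generations=N_GENERATIONS):
    pop = rng.integers(0, 2, size=(POP_SIZE, N_GENES))  # uniform random bit strings
    for _ in range(n_generations):
        errors = np.array([error_fn(c) for c in pop])
        weights = 1.0 / (errors + 1e-9)   # assumption: lower error, larger section
        elite = np.argsort(errors)[:ELITE_COUNT]
        next_pop = [pop[k].copy() for k in elite]        # elites survive unchanged
        while len(next_pop) < POP_SIZE:
            p1 = pop[roulette_select(weights, rng)]
            p2 = pop[roulette_select(weights, rng)]
            if rng.random() < P_CROSSOVER:
                c1, c2 = two_point_crossover(p1, p2, rng)
            else:
                c1, c2 = p1.copy(), p2.copy()
            next_pop.extend([mutate(c1, rng), mutate(c2, rng)])
        pop = np.array(next_pop[:POP_SIZE])
    errors = np.array([error_fn(c) for c in pop])
    best = int(np.argmin(errors))
    return pop[best], float(errors[best])

best_mask, best_error = run_ga(toy_error, np.random.default_rng(0), n_generations=20)
```

Because the two best chromosomes are copied unchanged into each new generation, the best error in the population never increases from one generation to the next. Replacing `toy_error` with a function that trains a network on the selected columns and returns its cross-validated RMSE would give the wrapper-style selection the paper describes.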

Discussion

The aim of this study was to predict the road traffic death rate accurately using ANNs and to assess the influence of road traffic mortality risk factors using a GA model. The results indicate the importance of seven parameters in predicting the road traffic death rate; such satisfactory results could be attributed to the use of the GA as a powerful optimiser which selects the best input feature set to be fed into the neural networks. The optimised neural network parameters are summarised in Table 2. Based on the results obtained, ANNs optimised by GA show high performance in terms of accuracy.

Previous investigations have also identified similar parameters as the most influential factors in injury severity. For example, one investigation used the driver's age and gender, the use of a seat belt, the type and safety of the vehicle, weather conditions, road surface, speed ratio, crash time, crash type, collision type and traffic flow as input variables and obtained satisfactory results (Metin Kunt, Aghayan, & Noii, 2011). Another study indicated that gender, vehicle speed, seat belt use, type of vehicle, point of impact and area type (rural vs. urban) affect the likelihood of each injury severity level (Abdel-Aty & Abdelwahab, 2004). Many of these parameters, though expressed in different types of quantities, are in accordance with our selected parameters. Thus, our results are in agreement with those of the previous studies.

Many factors affect the road traffic death rate; hence a comprehensive approach must be applied to it. Many studies have been performed to predict the road traffic death rate so far (Bastos et al., 2005; Pereira et al., 2011), but all of them have used a limited number of input variables. Therefore, to lessen this drawback of previous research, we have used a comprehensive set of input parameters provided by WHO as inputs to our models. In the next step, to increase the accuracy of the model, neural networks optimised by GA were used. Therefore, our technique shows a clear advantage over previous studies.

Conclusion

Accurate prediction of the road traffic death rate plays an important role in the mortality management system. This research was therefore devoted to offering a suitable model to predict this quantity. First, ANNs were used to predict the road traffic death rate, and response accuracy was calculated as a measure of model performance. Despite the great potential of non-linear models such as ANNs, which can consider many factors at the same time, application of these models to the prediction of the road traffic death rate has not been reported to this extent until now. The results suggest that the parameters affecting the road traffic death rate have a non-linear relationship with it, and that artificial intelligence models can provide high PA. Although the limitations of gradient search techniques applied to complex non-linear optimisation problems, such as the ANN, are well known, many researchers still choose to use these methods for network optimisation (Okyay, 2003). The results obtained from the GANN emphasise the value of efficient feature selection tools and correct feature selection strategies for finding out which factors determine the road traffic death rate. In conclusion, our results are promising, and prediction of the road traffic death rate using neural networks optimised by GA may play a useful role in establishing a proper road traffic death rate management service.

References

Abdel-Aty, M., & Abdelwahab, H. (2004). Predicting injury severity levels in traffic crashes: A modeling comparison. Journal of Transportation Engineering, 130(2), 204–210.

Bastos, Y.G., de Andrade, S.M., Soares, D.A., & Matsuo, T. (2005). Seat belt and helmet use among victims of traffic accidents in a city of southern Brazil, 1997–2000. Public Health, 119, 930–932.

Cansiz, O.F., Calisici, M., & Miroglu, M.M. (2009, December). Use of artificial neural network to estimate number of persons fatally injured in motor vehicle accidents. Paper presented at the meeting of 3rd International Conference on Applied Mathematics, Simulation, Modelling, Circuits, Systems and Signals, Vouliagmeni, Athens, Greece.

Hajmeer, M., & Basheer, I. (2003). Comparison of logistic regression and neural network-based classifiers for bacterial growth. Food Microbiology, 20, 43–55.

Himanen, V., Nijkamp, P., Reggiani, A., & Raitio, J. (1998). Neural networks in transport applications. USA: Ashgate, 311–340.

Holmgren, P., Holmgren, A., & Ahlner, J. (2005). Alcohol and drugs in drivers injured in traffic accidents in Sweden during the years 2000–2002. Forensic Science International, 151, 11–17.

Lascala, E.A., Gerber, D., & Gruenewald, P.J. (2000). Demographic and environmental correlates of pedestrian injury collisions: A spatial analysis. Accident Analysis and Prevention, 32, 651–658.

Metin Kunt, M., Aghayan, I., & Noii, N. (2011). Prediction for traffic accident severity: Comparing the artificial neural network, genetic algorithm, combined genetic algorithm and pattern search methods. Transport, 26(4), 353–366.

Mussone, L., Ferrari, A., & Oneta, M. (1999). An analysis of urban collisions using an artificial intelligence model. Accident Analysis and Prevention, 31(6), 705–718.

Okyay, K. (2003). Artificial neural networks and neural information. Berlin Heidelberg: Springer.

Ozgan, E., & Demirci, R. (2008). Neural networks based modeling of traffic accidents in interurban rural highways, Duzce sampling. Journal of Applied Sciences, 8(1), 146–151.

Pereira, R.E., Perdona, G. da S., Zini, L.C., Cury, M.B., Ruzzene, M.A., Martin, C.C., & De Martinis, B.S. (2011). Relation between alcohol consumption and traffic violations and accidents in the region of Ribeirão Preto, São Paulo State. Forensic Science International, 207, 164–169.

Sargent, D.J. (2001). Comparison of artificial neural networks with other statistical approaches: Results from medical data sets. Cancer, 91(8 Suppl), 1636–1642.

United Nations Development Program (2009). Human development reports. Retrieved from http://hdr.undp.org/en/media/HDR_2009_EN_Complete.pdf

World Health Organization. (2007). World health statistics. Retrieved from http://apps.who.int/gho/data/node.main.A989?lang=en

World Health Organization. (2009). Global status report on road safety: time for action. Retrieved from http://whqlibdoc.who.int/publications/2009/9789241563840_eng.pdf

Ziyab, A.H., & Akhtar, S. (2012). Incidence and trend of road traffic injuries and related deaths in Kuwait: 2000–2009. International Journal of the Care of the Injured, 43, 2018–2022.
