Accident Analysis and Prevention 73 (2014) 274–287


Accident Analysis and Prevention journal homepage: www.elsevier.com/locate/aap

A model to identify high crash road segments with the dynamic segmentation method

Amin Mirza Boroujerdian a,*, Mahmoud Saffarzadeh a, Hassan Yousefi b,c, Hassan Ghassemian d

a Faculty of Civil & Environmental Engineering, Tarbiat Modares University, Tehran, Iran
b School of Civil Engineering, Faculty of Engineering, Tehran University, Tehran, Iran
c Institute of Structural Mechanics (ISM), Bauhaus-University Weimar, 1599423 Weimar, Germany
d Faculty of Electrical & Computer Engineering, Tarbiat Modares University, Tehran, Iran

ARTICLE INFO

Article history: Received 20 November 2013; received in revised form 6 September 2014; accepted 11 September 2014; available online xxx

Keywords: High crash road segment; Segmentation; Prioritization; Wavelet theory; Multiple resolutions

ABSTRACT

High social and economic costs, together with physical and mental consequences, place road safety among the most important current issues. This paper presents a novel approach capable of identifying both the location and the length of high crash road segments. It focuses on the locations of accidents along the road and their effective regions. Because of practical and budget limitations in improving the safety of road segments, it is not possible to address all high crash road segments. It is therefore of utmost importance to identify high crash road segments and their real lengths so that safety improvements can be prioritized. After evaluating the deficiencies of current road segmentation models, the paper addresses the different kinds of errors caused by these methods. One of their main deficiencies is that they cannot identify the length of high crash road segments. Here, the length of high crash road segments (corresponding to the arrangement of accidents along the road) is identified by converting accident data into the response signal of the road to through traffic, using a dynamic model based on wavelet theory. The significant advantage of the presented method is multi-scale segmentation: the model identifies high crash road segments of different lengths and can also recognize short segments within long segments. Applying the model to a real case to identify 10–20 percent of high crash road segments showed an improvement of 25–38 percent relative to the existing methods.

© 2014 Elsevier Ltd. All rights reserved.

1. Introduction

Accident prevention is the most effective way to improve the safety of road networks. Because accident causes are widespread and complex, identifying high crash road segments and proposing countermeasures is difficult. To evaluate the accident-proneness of a road, the road must be divided into segments, and the accident risk of each segment is then predicted by collecting and studying its physical and traffic characteristics. Safety assessment becomes more costly and time consuming as the number of segments increases, and inaccurate evaluations also become more likely. Some of the current segmentation methods found in the literature are reviewed in the following section.

* Corresponding author. Tel.: +98 21 82884367; fax: +98 21 82884915. E-mail addresses: [email protected], [email protected] (A.M. Boroujerdian), [email protected] (M. Saffarzadeh), hyosefi@ut.ac.ir (H. Yousefi), [email protected] (H. Ghassemian). http://dx.doi.org/10.1016/j.aap.2014.09.014 0001-4575/© 2014 Elsevier Ltd. All rights reserved.

2. Current segmentation methods

Identifying high crash road segments is a critical stage in road safety studies. Segmentation makes it possible to assign accidents to specific road segments and thereby to identify high crash road segments. Previous research shows that in many countries the first step is to divide the road of interest into equal-length segments and then to study the accidents in each segment using one of the identification methods for high crash road segments. Although segment lengths are defined differently in different countries, a single length is used when evaluating a specific road. Kononov and Allery (2003) studied the level of road safety service; after separating some parts of the road, they divided the road into 2-mile segments to identify high crash road segments.

According to a Federal Highway Administration report, the length of high crash road segments in road segmentation is 0.3 miles (Federal Highway Administration, 1981). According to the Texas Transportation Institute, the length of high crash road segments should be at least 0.1 miles (Bonneson and Zimmerman, 2006). In Ohio, segmentation is applied differently: a road is divided into segments with the same characteristics. In that research, the length of road segments is 0.25 miles, and no segment should be too long or too short. Depending on conditions, segments of less than 0.25 miles and of 0.25–0.5 miles are also defined in the list of segments (Pant et al., 2003). A different method for road segmentation is proposed by Troche: the road is divided into 0.2 km segments, which may be continued along the highway or included in other segments or intersections (Troche, 2007).

In a project undertaken by European researchers, the current methods of high crash road segment management and road network safety analysis are evaluated (Elvik, 2008). In Austria, a fixed 2.5 km segment moves along the road as a template. The segments defined by this template that meet specific accident-proneness criteria are defined as high crash road segments (Troche, 2007). In Denmark, high crash road segments are defined by dividing the road system into different kinds of road segments and intersections (Vistisen, 2002). A test based on the Poisson distribution is used to identify high crash road segments. The minimum number of accidents considered for a high crash road segment is 4 accidents in a period of 5 years. Segmentation is then carried out with the defined template, whose length depends on the normal number of accidents in each segment (Vistisen, 2002).
In Belgium, based on police reports, every segment in which three or more accidents occur during 3 years is defined as a high crash road segment. In this method, a 100 m template is used to identify high crash road segments, so segments with a maximum length of 100 m and at least 3 accidents are recorded (Geurts, 2006). In Romania, there are two definitions of high crash road segments: (1) outside residential areas, a high crash road segment is a location in which at least 4 accidents occur in 3 years within a length of less than 1000 m; (2) in residential areas, a segment is identified as a high crash road segment if at least 4 accidents occur in 3 years within a length of less than 100 m. In this method, the template (100 m or 1000 m) is used for segmentation (Elvik, 2008). In Iran, the road is divided into 1 km segments and the accidents in each year are then counted. Table 1 summarizes the common definitions of high crash road segments in several countries (Elvik, 2008).

In addition to the different lengths of high crash road segments, the segmentation methods also differ in the definition of the starting point. There are basically three starting point definitions:
(1) fixed, successive segments are defined from the beginning of the road;
(2) the segments are shifted by half the length of the fixed segments and the accidents in the new segments are then studied, so that analysis errors may be reduced;
(3) high crash road segments are identified by floating the fixed template segment along the road.
The last definition is the most accurate.

3. Evaluation of current segmentation methods

The current segmentation methods fall into two categories: static and dynamic segmentation methods. In static segmentation methods the length of each segment is fixed. High crash road segments are identified by dividing the road into segments of a specific length and counting the accidents in these segments according to the definition of high crash road segments. The segments with high priority in terms of accident-proneness are then identified. Considering the distribution of accidents along the road and their causes, the risk index, or probability of accident occurrence, may vary along the road because of the interaction between the safety factors of the road, vehicles, and humans. For example, the friction coefficient may be unsuitable along a length of road and lead to accidents at certain times of the year, or accidents may occur on part of a road because of inadequate sight distance, such as on a curve.

From these examples it follows that the length of high crash road segments may vary along the road depending on the extent of the accident causes. Static segmentation methods may therefore lead to errors in the analysis of results and in the identification of high crash road segments, or may even fail to identify some high crash road segments. Three main deficiencies of static segmentation methods are as follows:

Table 1. Definition of high crash road segments in some countries (Elvik, 2008).

Country | Definition
Germany | 300 m road segments; more than 3 accidents during 1 year and more than 5 accidents during 3 years
United Kingdom | 300 m road segments; locations in which the total number of accidents is more than 12 during 3 years
Portugal | 200 m segments; more than 5 accidents during 1 year
Spain | 1 km segments; more than 5 injury accidents or 2 fatal accidents during 1 year; more than 10 injury accidents or 5 fatal accidents during 3 years
Norway | 100 m road segments; more than 4 fatal accidents during 1 year
Czech Republic | 250 m road segments; at least 3 injury accidents during 1 year or 3 similar injury accidents during 3 years; at least 5 similar accidents during 1 year
Netherlands | At least 10 total accidents or at least 5 accidents with certain specifications


Fig. 1. The number of accidents counted by two segmentation methods (panels: Static Segmentation Method; Dynamic Segmentation Method).

(a) Probability of omitting some road segments with high accident density. In static segmentation, the boundary between two segments may fall in the middle of a high accident density region, so that half of the accidents are located in one segment and half in the other. Because the accident density in each of these two segments is thereby reduced, the high crash road segment may not be identified and the analysis may be inaccurate. This deficiency is illustrated in Fig. 1: with the static segmentation method the number of accidents does not exceed 10 in any segment, whereas the dynamic method (based on accident density) identifies segments with more than 10 accidents.

(b) Failure to identify segments at different levels. If there is one general problem along a length of road together with several local problems within it, the current methods are not able to identify them. This problem is shown in Fig. 2. Regarding the distribution of accidents along the road, a specific cause may be the main cause of accident-proneness over a longer length of road, while the severity of that cause may be greater over a shorter length within it. The probability of accident occurrence at this location is then greater than at other locations on the road, and the current segmentation models may not be able to identify it.

(c) Mismatch between the segment length specified by the current static segmentation methods and the length of the real high crash road segments. If the length of a high crash road segment is greater or less than the specified segment length, the suitable length cannot be identified by the current methods. The length of a high crash road segment should be defined with regard to the distribution of accident causes along the road. When road safety is evaluated, assuming fixed segment lengths may lead to errors in identifying high crash road segments.
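Deficiency (a) can be illustrated with a small sketch. The accident positions, the 1 km segment length and the 1 km floating window are made-up values for illustration only, and `static_counts` and `sliding_max` are hypothetical helper names that are not part of the paper's model.

```python
import numpy as np

def static_counts(positions_m, road_length_m, segment_m):
    """Count accidents in fixed, consecutive segments starting at 0."""
    edges = np.arange(0, road_length_m + segment_m, segment_m)
    counts, _ = np.histogram(positions_m, bins=edges)
    return counts

def sliding_max(positions_m, window_m):
    """Largest number of accidents covered by a window of the given
    length when the window may start at any accident position."""
    pos = np.sort(np.asarray(positions_m, dtype=float))
    # index of the first accident beyond each window [p, p + window_m]
    ends = np.searchsorted(pos, pos + window_m, side="right")
    return int(np.max(ends - np.arange(len(pos))))

# Hypothetical data: a cluster of 14 accidents straddling the 1000 m boundary
cluster = np.linspace(700, 1300, 14)
scattered = np.array([150.0, 1900.0, 2500.0])
positions = np.concatenate([cluster, scattered])

fixed = static_counts(positions, road_length_m=3000, segment_m=1000)
floating = sliding_max(positions, window_m=1000)
print(fixed, floating)
```

In this toy case the fixed boundary at 1000 m splits the cluster, so no fixed segment reaches the cluster's full count, while a floating window of the same length covers the whole cluster.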
Considering the given explanations and examples, most of the current methods for identifying high crash road segments have serious deficiencies. In several countries, however, the segmentation method has been improved to reduce these deficiencies: instead of simple segmentation, a fixed-length template is created based on the length of the road and moves along the road incrementally with a specific step. This method is somewhat similar to dynamic segmentation, but the model proposed in this research, in addition to solving all of the problems described in this section, provides new analytical capabilities that are described in the following sections.

4. The algorithm of wavelet theory

In this research, the data are processed after the accident locations along the road have been specified from the accident data reported by the police. Then, considering the relative accident locations (accident spread or density), the road segments are identified according to their accident-proneness. The accident data are converted into a discrete signal that can be analyzed with wavelet theory. The accident density in each segment of a road is treated as the local response of that segment to the passing traffic, and the length of an accident-prone segment is measured from the accident distribution.

One of the common signal processing tools is the Fourier transform. It assumes that a smooth function can be decomposed into infinitely many harmonic waves of infinite support, so the frequency content of the function can easily be captured from the harmonic waves. This transform, however, provides information only about the frequency content; the positions (times) at which the frequencies occur are lost. If a process is stationary, the Fourier transform is sufficient to study the data. For non-stationary data (data with transient features), however, both the frequency content and the corresponding spatial positions are important. Road accident data are non-stationary: the accident frequency and the corresponding spatial position are equally important. In such cases, the features obtained by the Fourier transform alone are no longer sufficient.
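The loss of position information can be seen in a minimal sketch. The two signals below are invented for illustration: they contain the same accident cluster at different positions along a hypothetical road of 256 sampling units.

```python
import numpy as np

n = 256
# Two hypothetical accident signals: the same 5-sample cluster at different positions
early = np.zeros(n)
early[40:45] = 3.0
late = np.zeros(n)
late[200:205] = 3.0

mag_early = np.abs(np.fft.fft(early))
mag_late = np.abs(np.fft.fft(late))

# The magnitude spectra coincide although the clusters are 160 samples apart
same_spectrum = np.allclose(mag_early, mag_late)
print(same_spectrum)
```

Because a shift in position changes only the phase of the discrete Fourier transform, the magnitude spectrum alone cannot tell where along the road the cluster lies.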
Fig. 2. The number of accidents counted by two segmentation methods (panels: Static Segmentation Method; Dynamic Segmentation Method).

Several approaches have been developed to study non-stationary data in the spatial and frequency domains simultaneously. The best known is the windowed Fourier transform, in which the data are studied locally by means of the Fourier transform. The data are observed through a window of spatially compact support (with constant width in the spatial domain), and the data within the corresponding range are used for the Fourier transform. The resulting frequency content is then associated with the window center. This local observe-and-transform process is repeated continuously throughout the data, and finally the spatio-frequency features of the data are captured. In this transform, the window width (support) is assumed to be constant. In brief, the transform can be expressed as (Yousefi and Noorzad, 2004):

\[ \mathrm{WFT}(b,\omega) = \int_{-\infty}^{+\infty} f(x)\, g(x-b)\, e^{-i\omega x}\, dx \tag{1} \]

where f(x) and g(x) are the data and the compact support window, respectively. The parameter b denotes the center of the window g(x) in the spatial domain.

The windowed Fourier transform is inefficient for data whose features have different frequency contents and different supports (in the spatial domain); it cannot detect all the features simultaneously. This is because the window has a constant support size, so the transform is useful only where the feature supports are approximately equal to the window support size. Wavelet theory was developed to remedy this drawback (Mallet, 1998; Lewalle, 1995). A window-based approach is used here too, but the size of the support is no longer constant; the window is denoted by ψ(x). With regard to the window support, there are two kinds of window functions in wavelet theory: (1) windows with completely compact support, whose width is spatially limited (they have non-zero values only in a compact range), like the Mexican hat wavelets (Mallet, 1998; Lewalle, 1995); and (2) windows of infinite width, whose values nevertheless vanish rapidly away from the center of the wavelet function, like the Newland wavelet family (Mallet, 1998). In both cases the wavelet functions have finite energy, i.e.,

\[ \int_{-\infty}^{+\infty} \left|\psi(x)\right|^{2} dx < C \]

where C is a finite positive constant. The support size of a scaled wavelet function varies with the frequency content of the data: windows of narrow support are used in high-frequency zones, while windows of wide support are employed in smooth portions. The continuous wavelet transform can then be defined as:

\[ W_{\psi} f(a,b) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{+\infty} f(x)\, \psi^{*}\!\left(\frac{x-b}{a}\right) dx \tag{2} \]

where f(x) is the data; ψ(x) denotes the wavelet (a small wave: the window); a is the scale controlling the window width; b is the spatial position to which the wavelet is shifted; and the symbol * denotes the complex conjugate of a function.

The shifted and scaled version of the wavelet ψ(x) is denoted by ψ_{a,b}(x) and is as follows:

\[ \psi_{a,b}(x) = \frac{1}{\sqrt{|a|}}\, \psi\!\left(\frac{x-b}{a}\right) \tag{3} \]

In this formulation, the shifting and dilation sizes are b and a, respectively. In the wavelet transform, for a constant scale, the variable b is varied continuously and the wavelet transform W_ψ f(a,b) is evaluated. The transform process is then repeated for other scale values. Finally the initial data {x, f(x)} are mapped to the space {a, b, W_ψ f(a,b)}, a 3D surface in which the spatio-frequency features are presented simultaneously. In this theory, small scales are used to capture high-frequency features and large scales to capture smooth features. The "Mexican hat" wavelet, illustrated in Fig. 3, is defined as (Lewalle, 1995):

\[ \psi(x) = \frac{2}{\sqrt{3}\,\pi^{1/4}} \left(1 - x^{2}\right) e^{-x^{2}/2} \tag{4} \]

where it satisfies the conditions:

\[ \int \psi_{a,b}(x)\, dx = 0 \tag{5a} \]

\[ \int \left[\psi_{a,b}(x)\right]^{2} dx = 1 \tag{5b} \]

Fig. 3. Mexican hat wavelet.
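Conditions (5a) and (5b) can be checked numerically for the Mexican hat wavelet of Eq. (4), scaled and shifted as in Eq. (3). The following is only a sketch: the integration grid and the (a, b) pairs are arbitrary choices made for this check, and `scaled_wavelet` is a hypothetical helper name.

```python
import numpy as np

def mexican_hat(x):
    """Mexican hat wavelet of Eq. (4)."""
    c = 2.0 / (np.sqrt(3.0) * np.pi ** 0.25)
    return c * (1.0 - x**2) * np.exp(-x**2 / 2.0)

def scaled_wavelet(x, a, b):
    """Shifted and scaled wavelet of Eq. (3)."""
    return mexican_hat((x - b) / a) / np.sqrt(abs(a))

# Fine grid, wide enough for the wavelet tails to be negligible (assumed values)
x = np.linspace(-200.0, 200.0, 400001)
dx = x[1] - x[0]
checks = []
for a, b in [(1.0, 0.0), (5.0, 12.0), (10.0, -30.0)]:
    psi = scaled_wavelet(x, a, b)
    mean = np.sum(psi) * dx         # condition (5a): should be close to 0
    energy = np.sum(psi**2) * dx    # condition (5b): should be close to 1
    checks.append((a, b, mean, energy))
    print(a, b, mean, energy)
```

The 1/sqrt(|a|) factor in Eq. (3) is what keeps the energy equal to one at every scale.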

Condition (5a) means that the scaled wavelets have zero average; in other words, they measure variations (details) of the data. Condition (5b) shows that the scaled wavelets have unit energy (for this purpose a scaled wavelet is normalized as in Eq. (3)); hence multiplying a function by them does not change the energy of the function during the wavelet transform of Eq. (2) (an energy-preserving transform). From Fig. 3 it is clear that the region of positive ψ(x) values has a width of 2a; this property is used in this work to identify accident zones, capturing variations of accidents in both low- and high-frequency zones simultaneously. In the following, the performance of the wavelet-based approach is confirmed by some simple examples in which accident zones can easily be distinguished.

5. The identification model of high crash road segments

Mathematical models can be used for analyzing logical concepts in nature. Many theories and models have been proposed to analyze accidents and identify high crash road segments, each with its own advantages and disadvantages. This research aims to find a suitable mathematical model that reduces the aforementioned errors of the current models used in applied studies. Based on the distribution of accident data along the road, it is first assumed that an accumulation of accidents along a limited length of road indicates one or several accident causes at that location. Therefore, at this stage, the goal is to identify the locations of meaningful accident accumulation along the road. Because the real length of high crash road segments varies, being by definition proportional to the extent of the risk factors along the road, a model that defines the initial segmentation length according to the distribution of accident locations is more efficient in identifying high crash road segments. The structure of the proposed model is shown in Fig. 4.

Fig. 4. The procedure of the model for identifying high crash road segments using wavelet theory. The flowchart steps are:
(1) Collecting accident data along the road, including the location of each accident
(2) Converting discrete, spatially non-uniform data to spatially uniform data
(3) Analysis of the produced signal using wavelet theory
(4) Omitting disturbances from the accident signal
(5) Analysis of the wavelet transform and identification of the center and length of high crash road segments
(6) Dividing the road into segments according to the accident probability
(7) Prioritizing high crash road segments based on the magnitude of the wavelet transform

Fig. 5. The instance of accident arrangement along the road.

6. Collecting the accident data along the road including each accident location

Determining the location and length of high crash road segments depends on determining the exact location of each accident. Since the duration of data collection affects the analysis output, the accident data should cover an entire 3–5 year period, to allow for the random nature of crash occurrence and for regression effects.

7. Converting discrete spatially non-uniform data to a spatially uniform one

One point in the flowchart of Fig. 4 should be explained in more detail. The wavelet transform considered in this study works on uniform grids (wavelet transforms of this kind are known as first generation wavelets). It is therefore necessary to pre-process the data, which are actually located at non-uniform positions, by remapping the irregular data onto a uniform grid. The important point is that this mapping should not alter the total number of accidents. Linear interpolation is therefore used to remap a datum of magnitude F located between two surrounding grid points. Let the distances of an accident from its two neighboring grid points be x (from the left grid point) and Δ − x (from the right grid point), where Δ is the distance between uniform grid points (the sampling step in the spatial domain) and 0 ≤ x ≤ Δ. The datum is then mapped with the following simple linear interpolation formula:

\[ F_L = \left(1 - \frac{x}{\Delta}\right) F \quad \text{and} \quad F_R = \frac{x}{\Delta}\, F \]

where F_L and F_R are the maps of F onto the left and right grid points, respectively. Note that this mapping preserves the total number of accidents, since F_L + F_R = F.
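The remapping step can be sketched as follows. This is a minimal illustration under stated assumptions: the accident positions are invented, the grid starts at position 0, and `remap_to_uniform_grid` is a hypothetical helper name rather than the authors' implementation.

```python
import numpy as np

def remap_to_uniform_grid(positions, magnitudes, step, n_points):
    """Distribute each datum F between its two neighbouring grid points
    with the linear weights F_L = (1 - x/step) F and F_R = (x/step) F."""
    signal = np.zeros(n_points)
    for p, f in zip(positions, magnitudes):
        left = int(np.floor(p / step))      # index of the left grid point
        x = p - left * step                 # distance from the left grid point
        signal[left] += (1.0 - x / step) * f
        if x > 0.0:                         # a datum exactly on a grid point has no right share
            signal[left + 1] += (x / step) * f
    return signal

# Hypothetical accident locations (m) on a 1 km road, one accident each
positions = [30.0, 250.0, 260.0, 275.0, 900.0]
signal = remap_to_uniform_grid(positions, np.ones(len(positions)),
                               step=100.0, n_points=11)
print(signal, signal.sum())
```

The three accidents between 200 m and 300 m are shared between the grid points at 200 m and 300 m, and the sum of the remapped signal stays equal to the number of accidents.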

8. Analyzing accident signal and identifying high crash road segments

Segments in which accidents occur because of particular causes can be identified by both dilating and shifting the mother wavelet window, as described before. The wavelet transform values are then represented as a function of the scale (which measures frequency) and the spatial position (the locations of the wavelet function centers) in a contour plot. The values of the transform coefficients correspond to the accident density and the risk index, which helps to improve safety in such high risk portions. Sometimes one factor is the cause of accidents along a longer length of road, while within a shorter segment of this road there are accidents with a different cause. Thanks to the multi-resolution analysis obtained by changing the window scale at each point of the road, it is possible to identify a high crash road segment of limited length within a longer segment.

As mentioned in the previous section, different windows can be used to analyze a signal with wavelet theory. Here the Mexican hat window is used, because its characteristics enable the analyst to identify both the center of a high crash road segment and its effective region. The locations identified as peaks in the wavelet analysis diagram are the centers of high crash road segments. Since accident data are positive, the maximum wavelet transform value occurs at the scale for which the local information of interest lies within the positive extent of the wavelet window. Therefore, the length of a high crash road segment can be identified by doubling the scale at which the wavelet transform peaks at the center of the segment. This capability is evaluated using the following example.

In the example shown in Fig. 5, the capabilities of the proposed model are introduced by constructing accident data in a manner similar to what happens on a real road. The example can also be analyzed manually, which allows the results of the proposed model to be checked. The example was designed to include high crash road segments of different lengths, high crash road segments of limited length within longer segments, and scattered accidents that do not indicate accident causes in the road and may be due to causes unrelated to the road characteristics. The accident arrangement of this example and the output of the wavelet analysis are shown in Figs. 5 and 6, respectively. The model can be evaluated by comparing these two figures.

Fig. 6. The output of wavelet model to analyze the example in Fig. 5 (Boroujerdian, 2011).

Fig. 7. The output of wavelet conversion after removing the effect of scattered data (Boroujerdian, 2011).
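The rule of reading the center from the peak position and the length from twice the peak scale can be sketched on a synthetic cluster. Everything specific here is an assumption of the sketch: the 2 km cluster, the direct-summation transform `cwt_mexican_hat` (a hypothetical helper), and the extra division of the transform by sqrt(a) before the peak search, which is this sketch's reading of the scale normalization.

```python
import numpy as np

def mexican_hat(x):
    c = 2.0 / (np.sqrt(3.0) * np.pi ** 0.25)
    return c * (1.0 - x**2) * np.exp(-x**2 / 2.0)

def cwt_mexican_hat(signal, scales):
    """Direct-summation wavelet transform on a unit-spaced grid.
    Row i holds W(a_i, b) for every grid position b."""
    idx = np.arange(len(signal))
    diff = idx[None, :] - idx[:, None]          # diff[b, x] = x - b
    out = np.empty((len(scales), len(signal)))
    for i, a in enumerate(scales):
        out[i] = mexican_hat(diff / a) @ signal / np.sqrt(a)
    return out

step_m = 100.0                                  # sampling unit of 100 m
signal = np.zeros(260)
signal[100:120] = 2.0                           # hypothetical 2 km cluster, 2 accidents per unit

scales = np.arange(1, 41)
w = cwt_mexican_hat(signal, scales)
w_scaled = w / np.sqrt(scales)[:, None]         # extra 1/sqrt(a) factor: an assumption of this sketch

i, b = np.unravel_index(np.argmax(w_scaled), w_scaled.shape)
center_m = b * step_m
length_m = 2 * scales[i] * step_m
print(center_m, length_m)
```

For a uniform cluster, this normalization places the peak at a scale equal to half the cluster length, so doubling the peak scale recovers the 2 km length; with only the 1/sqrt(a) factor of Eq. (2), the peak of a uniform cluster would sit at a somewhat larger scale.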

In the example of Fig. 5, the length of road along which the accidents happen is 26,000 m. In Fig. 5 the horizontal axis represents the sample number, that is, the ratio of the distance along the road to the length of the sampling unit (100 m), and the vertical axis represents the number of accidents in each sampling unit. The horizontal axis of Fig. 6 is the same as that of Fig. 5 and shows the position along the road, while its vertical axis represents the wavelet scale. Therefore, the x value of an extreme point in Fig. 6 gives its position along the road, its y value gives the scale at which the wavelet transform is at its maximum, and the contour line passing through the point gives the magnitude of the wavelet transform there. As Fig. 6 shows, the effect of every accident in Fig. 5 can be recognized. The centers of high crash road segments are the contour peaks in Fig. 6, and the scale of each peak represents the length of the high crash road segment. However, the analysis is complicated by extra data caused by scattered accidents. High crash road segments are therefore identified from the wavelet transform after de-noising and analysis of the new signal. De-noising is discussed in the next section.

9. De-noising

In this method, the analysis output is de-noised by calculating the wavelet transform of the high accident-proneness threshold (e.g., one accident in a sampling unit) and omitting the areas whose wavelet transform is less than this value. The wavelet applied in this analysis is the Mexican hat, whose function is formulated as follows:


Fig. 8. Segmentation of the road through the output of corrected wavelet conversion (Boroujerdian, 2011).

\[ \psi(x) = \frac{2}{\sqrt{3}\,\pi^{1/4}} \left(1 - x^{2}\right) e^{-x^{2}/2} \tag{Repetition 4} \]

The wavelet transform of data including one accident is calculated as follows:

\[ W(a,b) = \frac{1}{\sqrt{a}} \int \psi\!\left(\frac{x-b}{a}\right) \delta(x)\, dx \;\xrightarrow{\,b=0\,}\; W_{cr} = \frac{1}{\sqrt{a}} \int \psi\!\left(\frac{x}{a}\right) \delta(x)\, dx = \frac{1}{\sqrt{a}}\, \psi(0) = \frac{1}{\sqrt{a}} \cdot \frac{2}{\sqrt{3}\,\pi^{1/4}} \tag{6} \]

In the above equation, δ(x) is the delta function, whose characteristics are defined by the following equations:

\[ \delta(x) = \begin{cases} 0, & x \neq 0 \\ \infty, & x = 0 \end{cases} \tag{7} \]

\[ \int_{-\infty}^{+\infty} \delta(x)\, dx = 1 \tag{8} \]

\[ \int f(x)\, \delta(x - x_0)\, dx = f(x_0) \tag{9} \]
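The single-accident response of Eq. (6) and the thresholding step can be sketched as follows. The sketch follows the normalization of Eq. (6) as written; the coefficient table is invented, `denoise` and `single_accident_response` are hypothetical helper names, and the threshold level is left as a parameter (`n_threshold`), so the value actually applied should be taken from the text rather than from this sketch.

```python
import numpy as np

def mexican_hat(x):
    c = 2.0 / (np.sqrt(3.0) * np.pi ** 0.25)
    return c * (1.0 - x**2) * np.exp(-x**2 / 2.0)

def single_accident_response(a):
    """Eq. (6): transform of one accident (a unit impulse) at its own position."""
    return mexican_hat(0.0) / np.sqrt(a)

def denoise(w, scales, n_threshold=1.0):
    """Zero every coefficient below the response of `n_threshold` isolated accidents."""
    limits = n_threshold * single_accident_response(np.asarray(scales, dtype=float))
    return np.where(w >= limits[:, None], w, 0.0)

# Numerical check of the sifting step in Eq. (6) on a unit-spaced grid
x = np.arange(-50, 51)
impulse = (x == 0).astype(float)
for a in (1.0, 4.0, 9.0):
    direct = np.sum(mexican_hat(x / a) * impulse) / np.sqrt(a)
    assert np.isclose(direct, single_accident_response(a))

# Hypothetical coefficient table: rows are scales, columns are positions
scales = [1.0, 4.0]
w = np.array([[0.2, 1.5, 0.9],
              [0.1, 0.5, 0.3]])
cleaned = denoise(w, scales)
print(cleaned)
```

Because the single-accident response shrinks as 1/sqrt(a), the cut-off applied to each row of the coefficient table depends on its scale.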

Since the least value of “a” is equal to 1, the least value of wavelet transform is 1.73 based on the Eq. (7). The values of wavelet

transform that are more than the threshold value for identifying high crash road segments are shown in Fig. 7. As the above figure shows, the noises have been removed in Fig. 7b as compared with Fig. 7a. It should be considered that the p values of wavelet transform are divided into a. This performance will be discussed in the next sections. 10. Defining the length of high crash road segments In Fig. 8, the road is divided into segments based on the output of wavelet transform without the scattered accidents. In this figure, the high crash road segments are identified by the dynamic segmentation method based on the location of each accident. For better analysis, the axes which specify the center of high crash road segments are labeled with numbers and the specified segments in this example have been shown in two separate rows to represent the multi-scale property for recognizing high crash road segments with different length. The aforementioned scales are labeled largescale segmentation (LS) and small-scale segmentation (SS). As it was discussed in the previous sections, the center and length of high crash road segments are identified by the line which passes through the peak of the wavelet transform surface and the peak scale size respectively. The SS and LS sections reveals themselves as local peaks in the wavelet transform representation (b,a,W(a,b)) with effective length 2a; where a denotes the scale of the wavelet function (see Fig. 3); in

A.M. Boroujerdian et al. / Accident Analysis and Prevention 73 (2014) 274–287

281

Fig. 9. Artificial peaks in the Mexican hat wavelet transform for a constant function with bounded spatial support.

the length 2a, the Mexican hat wavelet is positive. The SS and LS sections locate around small and large a values, respectively. Smaller a value is a more localized feature. In this regard, the axes 3, 4, 6, 8, 9, 10, 12, and 13 specify the locations of high crash road segments with limited length (with slightly different support lengths: 2a). As it can be seen in the arrangement of accidents, many accidents have happened along the short length next to these axes. These segments may be located within the confines of other longer high crash road segments. This situation may happen in real roads when the road safety is

inappropriate due to a general cause (such as a nonstandard friction coefficient in the long segment of a road) and when more accidents occur along this segment or along one or more shorter parts of the road due to another cause (such as an unsuitable curve radius). These kinds of segments could be identified by the multiple clarity application which is involved in this method. As it can be seen in Fig. 8, the axes 1, 2, 3, 5 and 13 indicate the high crash road segments with different lengths where the occurring accidents are more than the threshold value of high accidentproneness. The model is sensitive to axis 9, which is a long segment

Fig. 10. Segmentation of the road through the output of corrected wavelet conversion (Boroujerdian, 2011).


Fig. 11. The arrangement of recorded accidents in more than 50 km of Shahrood–Sabzevar road (Boroujerdian, 2011).

with a fixed high accident-proneness level. Thus it is clear that the hazardousness of the identified segments depends on both the value of the wavelet transform and the segment length, as discussed in Section 15. As can be seen in the output of the wavelet transform, segment 9 is made up of three distinct, shorter high crash road segments. The axes 7 and 11 are (small) peaks that are detected artificially by the wavelet transform. Mathematically, this can be explained with Fig. 9. There, a constant function f(x) with finite

support (defined on a bounded spatial domain) is considered. In Fig. 9(a), the constant function and the wavelet are shown; the wavelet lies completely within the support of f(x). The corresponding wavelet transform is therefore zero, since the positive and negative parts of the transform cancel each other. As the wavelet approaches the edge of f(x) (the end of the support), the transform value W(a,b) at first increases (Fig. 9(b)), because the positive and negative contributions to W(a,b) are no longer balanced. The maximum value is reached when one negative part of the wavelet is

Fig. 12. The segmentation of Shahrood–Sabzevar road (Boroujerdian, 2011).


Fig. 13. The signal wavelet conversion amount of Shahrood–Sabzevar road accidents.

completely outside the support (Fig. 9(c)). After this point, as the wavelet is shifted further, W(a,b) decreases until it becomes zero (Fig. 9(d)). With further shifting it continues to decrease (Fig. 9(e)) until W(a,b) reaches a minimum (negative) value. Thereafter, W(a,b) rises back toward zero until the wavelet is completely outside the support of f(x); beyond this point, W(a,b) is zero (Fig. 9(f)). In this way, the wavelet transform artificially detects a maximum near the edge of f(x) (Fig. 9(c)).
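The edge effect described with Fig. 9 can be reproduced numerically. The following is only a minimal sketch under stated assumptions: the Mexican hat is taken in the unnormalised form (1 − t²)·exp(−t²/2), which is positive over a length 2a; the support [0, 100], the scale a = 2 and the grid spacing are hypothetical values chosen for illustration; and the integral of Eq. (10) is approximated by a simple sum.

```python
import numpy as np

def mexican_hat(t):
    """Mexican hat mother wavelet (assumed unnormalised form); positive for |t| < 1."""
    return (1.0 - t**2) * np.exp(-0.5 * t**2)

def wavelet_transform(signal, x, a, b):
    """Approximate W(a,b) = a**-0.5 * integral of psi((x - b)/a) * signal(x) dx."""
    dx = x[1] - x[0]
    return float(np.sum(mexican_hat((x - b) / a) * signal) * dx / np.sqrt(a))

# Constant function f(x) = 1 on [0, 100] and zero elsewhere (hypothetical support).
x = np.linspace(-50.0, 150.0, 4001)
f = ((x >= 0.0) & (x <= 100.0)).astype(float)
a = 2.0

# Wavelet well inside the support: positive and negative parts cancel.
w_interior = wavelet_transform(f, x, a, b=50.0)

# Slide the wavelet across the right-hand edge of the support.
shifts = np.linspace(50.0, 110.0, 601)
w_edge = np.array([wavelet_transform(f, x, a, b) for b in shifts])
b_max = shifts[np.argmax(w_edge)]  # artificial peak just inside the edge
b_min = shifts[np.argmin(w_edge)]  # matching trough just outside the edge

print(f"interior W = {w_interior:.2e}")
print(f"artificial maximum near b = {b_max:.1f}, minimum near b = {b_min:.1f}")
```

Under these assumptions the artificial maximum sits about one scale length a inside the edge and the minimum about one scale length outside it, which is the sequence of panels (c) to (e) in Fig. 9.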

With regard to the division of high crash road segments into LS and SS segments, and depending on the maximum accident density index in each segment (Fig. 10; the numbers in the rectangles are the W(a,b) values), it may be concluded that segments 10, 12, 9, 6, 4 and 8 are the SS high crash road segments of the road of interest and segments 3, 9, 5, 2, 13 and 1 are the LS high crash road segments, each list ordered from the most to the least hazardous.
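The reading of segments from the transform surface (center at the peak position b, effective length 2a, ranking by W(a,b)) can be sketched as follows. This is an illustration, not the authors' implementation: the function name `find_segments`, the 8-neighbour definition of a local peak, the small coefficient grid and the threshold are all hypothetical.

```python
import numpy as np

def find_segments(W, scales, positions, threshold):
    """Return (centre b, length 2a, peak value) for each local peak of W above threshold.

    W[i, j] is the coefficient at scale scales[i] and position positions[j].
    """
    W = np.asarray(W, dtype=float)
    n_scales, n_pos = W.shape
    segments = []
    for i in range(n_scales):
        for j in range(n_pos):
            value = W[i, j]
            if value < threshold:
                continue
            # Compare with the neighbouring cells (up to 8) of the scale-position grid.
            hood = W[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2]
            if value >= hood.max() and np.count_nonzero(hood == value) == 1:
                segments.append((positions[j], 2 * scales[i], value))
    # Most hazardous first, following the ordering by W(a,b).
    return sorted(segments, key=lambda s: s[2], reverse=True)

# Hypothetical coefficient surface: rows are scales a = 1, 2, 4; columns are positions b.
W = [[0, 5, 1, 0, 9, 0],
     [0, 2, 1, 0, 3, 0],
     [0, 0, 6, 0, 0, 0]]
segments = find_segments(W, scales=[1, 2, 4], positions=[0, 1, 2, 3, 4, 5], threshold=4)
for b, length, value in segments:
    print(f"centre b = {b}, length 2a = {length}, W = {value}")
```

In this toy surface the long segment centred at b = 2 (length 8) contains the two short segments at b = 1 and b = 4, mirroring the way SS segments can lie inside an LS segment.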

11. The relationship between size of the wavelet transform in different segments and the probability of accident occurrence in each segment

Based on the rules of the wavelet method for analyzing accident data, a road segment with a higher accident density has a higher improvement priority. With this method, therefore, the length of each high crash road segment can be identified and its relative priority determined. If the data are adjusted, for example by an equivalent coefficient of accident severity, the effect of accident severity can also be taken into account in the analysis. Prioritization by this method is valid provided that the fundamentals of the theory used for prioritizing high crash road segments are consistent with the hypothesis of the wavelet transform method. As seen in Eq. (10), to calculate the wavelet transform, an accident frequency index around a point is divided by the square root of the wavelet scale, which corresponds to the length of the effective area of this point.

$$W(a,b) = \frac{1}{\sqrt{a}} \int \psi\left(\frac{x-b}{a}\right) d(x)\,dx \qquad (10)$$

Therefore, a density index is obtained by dividing the wavelet transform value at each point by the square root of the wavelet scale at that point; this index can be used for the relative prioritization of segments by probability of high accident-proneness.

Table 2
The priority of segments based on crash density.

Segment No.   1    2    3    4    5    6    7    8    9    10   11   12
Priority      6    5    4    10   8    12   1    9    2    7    11   3

12. Genuine example

In this section, the capabilities of the proposed model are evaluated using a genuine example. The accident data were collected along the road between Shahrood and Sabzevar, which is 50 km in length, over a period of 3 years, i.e., 2004–2007 (Boroujerdian, 2011). The pattern of reported accidents in the case study is illustrated in Fig. 11, where the horizontal and vertical axes represent the length of the road (km) and the number of accidents, respectively. As the figure shows, no accident was reported in the first 3 km of the road, and the accidents reported in the final 20 km of the road under study are scattered. One of the most important characteristics of these data is their resolution of 1 km; this means that the shortest identifiable high crash road segment is about 1 km long. The threshold value that defines a high crash road segment must be set rigorously. The model is expected to remove the effect of scattered data from the analysis and to identify the high crash road segments together with their lengths.

13. Converting the accident signal wavelet and de-noising

The accident signal is analyzed by the wavelet transform method and the results are shown in Fig. 12. As discussed in the previous section, the accident data can be de-noised by omitting the contour lines whose values are less than the threshold value.
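The de-noising step can be sketched as a simple cut-off on the transform coefficients. The sketch below rests on several assumptions that are not taken from the paper: the unnormalised Mexican hat (1 − t²)·exp(−t²/2), a single scale a = 1 km, a discrete sum in place of the integral, and a threshold equal to the transform of one isolated kilometre carrying 10 accidents. The counts are the per-kilometre values for km 15–21 listed in Table 3.

```python
import numpy as np

def mexican_hat(t):
    """Mexican hat mother wavelet (assumed unnormalised form)."""
    return (1.0 - t**2) * np.exp(-0.5 * t**2)

def transform_row(counts, a, step=1.0):
    """W(a, b) at every sampling point b for one scale a (discrete sum, step in km)."""
    counts = np.asarray(counts, dtype=float)
    x = np.arange(counts.size) * step
    return np.array([
        np.sum(mexican_hat((x - b) / a) * counts) * step / np.sqrt(a) for b in x
    ])

def denoise(coeffs, threshold):
    """Set every coefficient below the threshold to zero."""
    return np.where(coeffs >= threshold, coeffs, 0.0)

counts = [14, 8, 6, 47, 1, 14, 35]  # accidents per km, km 15-21 (Table 3)
a = 1.0                             # hypothetical scale in km

# ASSUMPTION: threshold = transform of one isolated kilometre with 10 accidents.
threshold = 10.0 * mexican_hat(0.0) / np.sqrt(a)

row = transform_row(counts, a)
clean = denoise(row, threshold)
print(np.round(clean, 1))
```

With these choices only the kilometres carrying 47 and 35 accidents survive the cut-off; a different threshold policy, as the text notes, would change which points are kept.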


Table 3
The number of accidents by different segmentations (Boroujerdian, 2011).

Columns: km; number of accidents per kilometer; fixed segmentation (2 km, 3 km and 5 km segments, and 2 km floating segment); dynamic segmentation (SS, LS).

km          1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
Accidents   0   0   0  25  21  20  14  35   7  20  16   4  24   4  14   8   6

km         18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34
Accidents  47   1  14  35   7  41   9  21  12   6  34   5   4   1   1   3   0

km         35  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51
Accidents   1   0   0   0   0   0   1   0   0   0   0   0   0   1   0   0   0

0

0

46

0 0

0

0

46

66

142

25 66 41 96 49

56

20 49

40

7 36

27 20

62

76 28

28

42 4 22

22 61

76

53 15

50

1 49 113

57

7 50

50 39

33 61

40 9

47 15

227

35 7 41 9 21 18

9 5

6

34 17

17

1 4

3 1

1

1 0 0

0

0 0

0 1 1

1 1 0

0

0 0

0 1

1

0

1 1 0

Table 4 The accident density in different segments (Boroujerdian, 2011). km Fixed segmentation

40 43

2

4 24 4 14 14

53

42

33

14 35 7 36

0 0

With regard to the step length of these data, which is 1 km, the threshold value of the wavelet transform has been calculated based on 10 accidents per kilometer. Clearly, the accident-proneness threshold can be defined in each area according to the safety policies of the local road organizations.

14. Road segmentation using the wavelet transform output

In this section, high crash road segments are identified using the modified wavelet transform surface of the segments. As shown in Fig. 12, twelve high crash road segments of different lengths were identified in the example, among which segments 2 and 10 are the longest. To display the segments more clearly, the results are shown on two axes. Around axis 1, three segments with high numbers of accidents lie close together, so the dynamic segmentation model defines a segment long enough to include all three. A similar situation occurs around axis 4, where the model defines the segment length according to the location of the accidents. The minimum segment length identifiable by this method is the sampling step, which is the length of many of the SS high crash road segments. Fig. 12 illustrates the segmentation of the Shahrood–Sabzevar road. The proximity of many accidents may indicate a specific cause along a particular part of the road, so the problem can be addressed along segments 2 and 10.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51

Dynamic segmentation

2 km segment

3 km segment

5 km segment

2 km segment (floating)

SS

LS

0.0

0.0

9.2

0 0

0.0

0.0

23

22.0

20.3

12.5 22.0 20.5 19.2 24.5

18.7

20 24.5

13.3

7 18

13.5 10.0

12.4

10.9 14

14.0

14.0 4 11

11.0 20.3

15.2

26.5 7.5

16.7

1 24.5 22.6

19.0

7 25

25.0 13.0

16.5 12.2

20.0 4.5

4.5 1.7

1.2

1 2

1.5 0.3

0.5

0.5 0.0 0.0

0

0.0 0

0.0 0.3 0.5

0.5 0.2 0

0.0

0.0 0

0.0 0.3

0.2

0

0.5 0.5 0.0

47.0 7.5

20.6

35.0 7.0 41.0 9.0 21.0 9.0

20 14.3

1.0

4.0 24.0 4.0 14.0 7.0

26.5

21.0

16.5

14.0 35.0 7.0 18.0

0.0 0

34.0 0.7

0.7


Table 5
Comparing the statistical indices of segmentation by static and dynamic segmentation methods (Boroujerdian, 2011).

                                   Fixed segmentation                       Dynamic segmentation
Index                              2 km (floating)  5 km    3 km    2 km    LS      SS
The number of identified segments  30               11      17      26      5       21
The maximum density                26.5             22.6    22.0    26.5    21.8    47.0
The total segments density         250.5            92.4    154     231     55.1    354.9
The variance of segments density   94.4             64.5    73.0    86.3    91.3    177.3

15. Prioritizing high crash road segments by the wavelet transform

Since the value of the wavelet transform, shown in contour form in Fig. 13, indicates the accident density index along the road, it can be used to determine the priority of the segments. Note that the priorities of SS and LS high crash road segments can be compared with each other. Table 2 shows the result of the prioritization.

16. Dynamic segmentation model evaluation

Two factors, one mathematical and one economic, are used to study the validity of the segmentation model. Both are defined before the analysis. Because the length of high crash road segments is defined in this research according to the arrangement of accidents, the best segmentation procedure divides the road so that the difference in the number of accidents between adjacent segments is maximized. This means maximizing the accident density along some high crash road segments and minimizing it along the other segments. Thus, a larger variance of accident density across the segments indicates that the segmentation has been performed well. The other parameter for comparing segmentation methods is the number of accidents per unit length in the segments with higher priority. Given budget limitations, it is assumed that safety improvement is possible only along a limited length of the road. Therefore, some segments with high priority (with maximum accident density along the segment) are chosen through every segmentation

method such that their total length equals the maximum length that can be improved under the economic limitations. Under these conditions, the best segmentation is the one that captures the maximum total number of accidents in the selected segments. To compare fixed and dynamic segmentation methods, the road of the previous example is divided into fixed segments of 2, 3 and 5 km in length and also into floating segments of 2 km in length. The number of accidents along the segments is then determined for each method, and the accident density of all segments is calculated for both the fixed and the dynamic segmentation methods. The number of accidents along the road is shown in Table 3. The accident density in each segment, calculated from the data in Table 3, is shown in Table 4. The maximum density, the total accident density and the variance of accident density in the segments defined by each segmentation method are given in Table 5. A segmentation method that yields a higher maximum accident density has identified the highest crash road segment along the road of interest better than the other methods. If the total accident density of a segmentation is higher than that of the others, the segment lengths it defines are more compatible with the real lengths of the high crash road segments. Finally, the segmentation with the larger variance of accident density is better than the others (Boroujerdian, 2011). As can be seen in Table 5, the dynamic segmentation method is more suitable on each of the three indices. Considering the characteristics of fixed segmentation

Fig. 14. Comparing the accidents per the percentage of road length (Boroujerdian, 2011). Horizontal axis: percentage of road length (ΣL/Lt); vertical axis: percentage of accidents (Σn/nt). Curves: fixed segmentation (2 km, 3 km and 5 km), floating segmentation (2 km) and dynamic segmentation.

Table 6
The percentage of counted accidents of high crash road segments by selecting the 15 percent of road length for safety improvement (Boroujerdian, 2011).

Percentage of covered accidents   Segmentation method
34                                Fixed segmentation (segments length is 5 km)
36                                Fixed segmentation (segments length is 3 km)
41                                Fixed segmentation (segments length is 2 km)
42                                Floating segmentation (segments length is 2 km)
54                                Dynamic segmentation

method, it is usually compared with the dynamic SS segmentation model. As seen, no method except the dynamic segmentation method has identified a density of 47. This indicates that a high number of accidents occurred along a short length of the road. Because the length of high crash road segments is not fixed in the dynamic method, the segment length in this part of the road is defined almost the same as the real length of the high crash road segment, whereas the density found by the fixed segmentation methods of different lengths is about half of that found by the dynamic segmentation method. In general, the total density of segments in each method indicates how well the segment lengths match the arrangement of accidents. This index is 354.8 for the dynamic SS segmentation method, much higher than for the other methods. Finally, the variance index, which indicates the variation of accident density across segments, is 177.3 for dynamic segmentation, which means that the dynamic segmentation model is more suitable than the fixed segmentation model (Boroujerdian, 2011). As already mentioned, a second method is used to evaluate the validity of the proposed model; it compares segmentation methods in practical terms, based on the limited budget for road repair and maintenance available for road safety improvement. For this purpose, the segments obtained by each method are ordered by priority (accident density along the segment). Then, the number of accidents in the selected part of the road is calculated for each segmentation method. Fig. 14 shows the relationship between the cumulative percentage of accidents and the percentage of road length covered in each method. Given the budget limitation for road repair and maintenance, the segmentation that captures the largest number of accidents in the shortest length of road is the best segmentation method.
According to Table 6, when the budget for safety improvement is limited, improving 15 percent of the road length covers 54 percent of the accidents along the road if the dynamic segmentation method is used. This figure is 42 percent for the floating segmentation method with segments of 2 km in length (Boroujerdian, 2011). As seen in Fig. 14, the total length of the segments that contain 70 percent of the accidents differs between methods. For example, this length is 21 percent of the total road length with the dynamic segmentation method, 27 percent with the floating segmentation method (segment length 2 km), and 34, 30 and 35 percent with the fixed segmentation method for segment lengths of 2, 3 and 5 km, respectively. Note that in the dynamic method the segments are selected by priority, not by LS or SS scale. Thus, for a given length of road, the dynamic segmentation method captures more accidents than the other segmentation methods. Therefore, the dynamic segmentation method is more efficient and accurate than the other methods for solving such problems.
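For the fixed segmentation methods, both evaluation steps can be sketched directly from the per-kilometre counts of Table 3. The code below is an illustration under assumptions, not the authors' procedure: segments start at km 1, every segment (including a shorter final one) is divided by the nominal segment length, and the function names are hypothetical. The variance index is left out because the source does not state which variance estimator was used; the dynamic segmentation columns are not reproduced either, since they require the wavelet-based segment boundaries.

```python
import math

# Accidents per kilometre for km 1-51, as listed in Table 3.
ACCIDENTS = [0, 0, 0, 25, 21, 20, 14, 35, 7, 20, 16, 4, 24, 4, 14, 8, 6,
             47, 1, 14, 35, 7, 41, 9, 21, 12, 6, 34, 5, 4, 1, 1, 3, 0,
             1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]

def fixed_segments(counts, length):
    """Accident totals of consecutive fixed-length segments (the last may be shorter)."""
    return [sum(counts[i:i + length]) for i in range(0, len(counts), length)]

def density_indices(counts, length):
    """Number of segments, maximum density and total density for one segment length."""
    densities = [total / length for total in fixed_segments(counts, length)]
    return len(densities), max(densities), sum(densities)

def coverage_curve(counts, length):
    """Cumulative (share of road length, share of accidents), densest segments first."""
    totals = sorted(fixed_segments(counts, length), reverse=True)
    road, accidents = len(counts), sum(counts)
    curve, covered = [], 0
    for rank, total in enumerate(totals, start=1):
        covered += total
        curve.append((min(rank * length, road) / road, covered / accidents))
    return curve

for length in (2, 3, 5):
    n, d_max, d_total = density_indices(ACCIDENTS, length)
    print(f"{length} km: {n} segments, max density {d_max:.1f}, total density {d_total:.1f}")

share_len, share_acc = coverage_curve(ACCIDENTS, 2)[3]  # four densest 2 km segments
print(f"{share_len:.1%} of the road covers {share_acc:.1%} of the accidents")
```

Under these assumptions the number of segments, the maximum density and the total density appear to agree with the fixed-segmentation entries of Table 5. The coverage curve is a step function evaluated at whole segments, so it need not match the interpolated percentages of Table 6 exactly.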

It is concluded that the hypothesis studied in this paper is a suitable replacement for the current segmentation models.

17. Conclusion

The main outcomes of this research are as follows:

- With the dynamic segmentation model, the length of high crash road segments can be identified so that it matches the accident arrangement along the road.
- High crash road segments and their priorities can be identified simultaneously by the dynamic segmentation method.
- The dynamic segmentation model improves, and may optimize, the budget assignment process for road repair and maintenance. The amount of improvement was studied in a test that counted the accidents in the highest-priority segments of a specified total length under different models. The comparison diagram indicates that the dynamic segmentation model covers 40–65 percent of accidents when 10–20 percent of the road length is improved, whereas the floating segmentation method covers 29–52 percent. Therefore, the identification of high crash road segments by the dynamic segmentation model is improved by 25–38 percent compared with the floating segmentation method.
- The wavelet transform signal analysis method can be used in the dynamic segmentation method.
- The dynamic segmentation method can identify shorter high crash road segments within a longer high crash road segment.
- The Mexican hat window can be used as the wavelet for identifying high crash road segments by the wavelet transform method.
- To de-noise the signal, the wavelet transform coefficients that are smaller than the wavelet transform value corresponding to the threshold of high accident-proneness can be omitted from the output.
- The final accident cause may be taken into account in the prioritization process by the caused-based prioritization model.
- From the aspect of accident aggregation, high crash road segments with higher priority are identified by the caused-based model.
- By using the caused-based prioritization model, high crash road segments can be identified together with the accident cause.
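As a check on the arithmetic behind the 25–38 percent improvement stated in the conclusions, the relative gain of the dynamic over the floating segmentation method can be computed at both ends of the quoted ranges. Pairing 40 with 29 percent (10 percent of road length) and 65 with 52 percent (20 percent of road length) is an assumption, since the source does not show the calculation:

$$\frac{40-29}{29} \approx 0.38, \qquad \frac{65-52}{52} = 0.25$$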

References

Bonneson, J., Zimmerman, K., 2006. Procedure for Using Accident Modification Factors in the Highway Design Process. Report No. 0–4703-P5. Texas Transportation Institute.
Boroujerdian, A., 2011. Developing Evaluation Model of Road Safety Based on the Dynamic Segmentation and Caused Based Prioritization. Ph.D. Dissertation. Faculty of Civil and Environmental Engineering, Tarbiat Modares University.
Federal Highway Administration, 1981. Highway Safety Improvement Program, FHWA-TS-81-218. US Department of Transportation, Washington, DC, December.
Geurts, K., 2006. Ranking and Profiling Dangerous Accident Locations Using Data Mining and Statistical Techniques. Doctoral Dissertation. Faculty of Applied Economics, Hasselt University, Hasselt.
Lewalle, J., 1995. Tutorial on Continuous Wavelet Analysis of Experimental Data. Syracuse University, April.
Kononov, J., Allery, B., 2003. Level of Service of Safety Conceptual Blueprint and Analytical Framework. Transportation Research Record 1840, Paper No. 03–2112, TRB, Washington, D.C.
Troche, L.R., 2007. Methodology to Identify Hazardous Locations for Highways in Puerto Rico. Thesis submitted in partial fulfillment of the requirements for the degree of Master of Science.
Pant, P.D., Rajagopal, A.S., Cheng, Y., 2003. Rational Schedule of Base Accident Rates for Rural Highways in Ohio (Phase II). Report No. FHWA/OH-2003/008, June.
Elvik, R., 2008. A survey of operational definitions of hazardous road locations in some European countries. Accid. Anal. Prev. 40, 1830–1835.
Mallat, S., 1998. A Wavelet Tour of Signal Processing. Academic Press.
Vistisen, D., 2002. Models and Methods for Hot Spot Safety Work. PhD Dissertation. Department for Informatics and Mathematical Models, Technical University of Denmark, Lyngby.
Yousefi, H., Noorzad, A., 2004. The Application of Wavelet Theory in Solving the Linear Vibration Equations. M.Sc. Dissertation. Technical Faculty, Tehran University.
