Accident Analysis and Prevention 71 (2014) 327–336



Latent risk and trend models for the evolution of annual fatality numbers in 30 European countries

Emmanuelle Dupont a,∗, Jacques J.F. Commandeur b, Sylvain Lassarre c, Frits Bijleveld b, Heike Martensen a, Constantinos Antoniou d, Eleonora Papadimitriou d, George Yannis d, Elke Hermans e, Katherine Pérez f, Elena Santamariña-Rubio f, Davide Shingo Usami g, Gabriele Giustiniani g

a BRSI, Belgian Road Safety Institute, Belgium
b SWOV Institute for Road Safety Research and VU University Amsterdam, The Netherlands
c IFSTTAR, French Institute of Science and Technology for Transports, Development, and Networks, France
d NTUA, National Technical University of Athens, Greece
e Hasselt University, Transportation Research Institute, Belgium
f ASPB, Agència de Salut Pública de Barcelona, CIBER de Epidemiología y Salud Pública (CIBERESP), Institut d’Investigació Biomèdica Sant Pau (IIB Sant Pau), Barcelona, Spain
g La Sapienza University of Rome, Research Centre for Transport and Logistics, Italy

Article info

Article history:
Received 18 January 2014
Received in revised form 10 May 2014
Accepted 12 June 2014
Available online 9 July 2014

Keywords: Risk; Exposure; Structural time series models; Latent risk model; Stochastic trend model; Forecasting

Abstract

In this paper a unified methodology is presented for the modelling of the evolution of road safety in 30 European countries. For each country, annual data of the best available exposure indicator and of the number of fatalities were simultaneously analysed with the bivariate latent risk time series model. This model is based on the assumption that the amount of exposure and the number of fatalities are intrinsically related. It captures the dynamic evolution in the fatalities as the product of the dynamic evolution in two latent trends: the trend in the fatality risk and the trend in the exposure to that risk. Before applying the latent risk model to the different countries it was first investigated and tested whether the exposure indicator at hand and the fatalities in each country were in fact related at all. If they were, the latent risk model was applied to that country; if not, a univariate local linear trend model was applied to the fatalities series only, unless the latent risk time series model was found to yield better forecasts than the univariate local linear trend model. In either case, the temporal structure of the unobserved components of the optimal model was established, and structural breaks in the trends related to external events were identified and captured by adding intervention variables to the appropriate components of the model. As a final step, for each country the optimally modelled developments were projected into the future, thus yielding forecasts for the number of fatalities up to and including 2020. © 2014 Elsevier Ltd. All rights reserved.

∗ Corresponding author. Tel.: +32 22441540; fax: +32 22164342. E-mail address: [email protected] (E. Dupont).
http://dx.doi.org/10.1016/j.aap.2014.06.009
0001-4575/© 2014 Elsevier Ltd. All rights reserved.

1. Introduction

The temporal evolution of the number of accidents and victims (fatalities, severely injured, injured) is a major topic of interest in many road safety studies (see, e.g., COST 329, 2004; Lassarre et al., 2012; Commandeur et al., 2013). These quantities are counted on a monthly or yearly basis in all European countries. Both basic and more sophisticated statistical models have been proposed to capture these evolutions. Many models assume only a (log) linear function of time for the modelling of the number of fatalities, for instance some of the models discussed in Elvik (2010).

Accidents and their consequences in terms of victims are occurrences of failures in the road transportation system. Each time a road user makes a trip, he or she is exposed to harm with a certain probability of being involved in an accident or being killed. The number of traffic fatalities in a certain period is obviously dependent on the total exposure resulting from the amount of traffic in the traffic system in that same period. The amount of exposure determines the scale of the road safety problem and is therefore an essential factor in the assessment of road safety. Many models include some measure of traffic volume as an approximation of exposure. Frequently (e.g., Oppe, 1989, 1991) the total number of


motor vehicle kilometres travelled is used as a measure of traffic volume, which is only an approximation of exposure (because it is based on survey data, for example). It is then used to calculate the empirical risk as the number of fatalities divided by the number of kilometres travelled.

This paper tries to overcome the limitations of previous models by considering three issues. First, it is acknowledged that the actual exposure is latent and can only be approximated by whichever measure of traffic volume is chosen. As a consequence of the assumption of latent exposure, the risk derived analogously to that reported by Oppe (1989, 1991) now becomes a latent risk by definition. Second, latent exposure and latent risk are used simultaneously for the modelling of the number of fatalities, instead of in two separate steps as was done in Oppe (1989, 1991). When seeking to understand the evolution in the number of fatalities or accidents, this approach makes it easier to decide whether a change is to be attributed to a change in exposure or to a change in the risk road users are exposed to. Third, as indicated in Elvik (2010), the trends underlying the developments of the number of traffic fatalities need not be stable over time, with the consequence that modelling them as stable may yield poor in-sample predictions as well as poor forecasting results. Relations in data spanning an extended period of time need not be invariant over the whole observed period. One way to handle such changes is to allow the model parameters to be time-varying as well, thus yielding improved in-sample predictions and forecasting results. The model described and applied in this paper, called the Latent Risk Time series model (Bijleveld et al., 2008; Bijleveld, 2008; COST 329, 2004), is designed to accommodate all three of the above-mentioned properties.
The aim of the analyses presented in this paper is to obtain forecasts for the number of traffic fatalities in each of the European countries in 2020 in a similar way by means of the structural time series approach, using comparable data as much as possible. As the purpose is to provide comparable, robust forecasts to help policy makers develop long-term targets and strategies for successful road safety policies, the models focus on the analysis of two basic components, exposure and risk, and the introduction of idiosyncratic explanatory factors has been avoided whenever possible. In total, the results for some 30 countries are presented in this paper. They are based on the work performed for the EC FP7 project DaCoTA (see Martensen and Dupont, 2010; Dupont and Martensen, 2012; Lassarre et al., 2012). The use of a homogeneous modelling technique makes it possible to compare the past and future developments of the various countries and to address the following important questions:

• Has there been a continuous, smooth development of road safety or were there abrupt changes in these developments?
• If there were changes, can these be attributed to changes in the risk, or rather to changes in exposure?
• Where is the past development heading (if continued)?

This last issue is particularly important for the setting of realistic road safety targets by policy makers. The European Commission has set the target to halve the number of road deaths in 2020 as compared to 2010.

2. Methodology

One of the most important outcomes of road safety, quantified as the number of fatalities, is a joint function of the “level of dangerousness” of the traffic system, or road risk, and of the extent to which road users are confronted with this risk, here defined as the exposure to risk. This framework, where the fatality trend is decomposed into a risk and exposure trend, was made popular by Oppe (1989, 1991). This decomposition implies that two series of observations have to be analysed in parallel in order to model the development of road safety: one for the road safety indicator, the other for the exposure indicator. In the models presented here, the number of fatalities is the road safety indicator. The indicator for exposure is related to traffic volume and either the number of vehicle kilometres travelled or the size of the vehicle fleet can be used, depending on the availability of mobility data in the different European countries. The assumption that the development in traffic safety is the product of the respective developments in exposure and risk can be summarised as follows:

\[
\begin{aligned}
\text{Vehicle kilometres} &= \text{Exposure} \\
\text{Fatalities} &= \text{Exposure} \times \text{Risk}
\end{aligned}
\tag{1}
\]

Except for the time dependent specification, these two equations define the Latent Risk Time series model (LRT). In the LRT model both traffic volume and fatalities are treated as dependent variables. Traffic volume is modelled as a measure of “exposure” which can be subject to error. The number of fatalities, on the other hand, is defined as the product of “exposure” and “risk” and is also subject to random variation. Traffic volume and the number of fatalities are considered to be the manifest counterparts of “exposure” and “exposure times risk”, respectively, where “exposure” and “risk” are treated as latent (i.e., unobserved) variables. By taking the logarithm of the two equations in (1) (thus turning the multiplicative model into an additive one), and adding an error term (also known as a disturbance term) to the latent variables, we obtain:

\[
\begin{aligned}
\log(\text{Traffic Volume}) &= \log(\text{Exposure}) + \text{error}(\text{Exposure}) \\
\log(\text{Fatalities}) &= \log(\text{Exposure}) + \log(\text{Risk}) + \text{error}(\text{Fatalities})
\end{aligned}
\tag{2}
\]
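As a short worked step, exponentiating both sides of (2) shows where the error terms sit on the original scale. This is only a restatement of (2), not an additional assumption:

```latex
\[
\begin{aligned}
\text{Traffic Volume} &= \text{Exposure} \times e^{\,\text{error}(\text{Exposure})} \\
\text{Fatalities} &= \text{Exposure} \times \text{Risk} \times e^{\,\text{error}(\text{Fatalities})}
\end{aligned}
\]
```

An error of zero leaves the observed value equal to its latent counterpart, while a positive or negative error scales it up or down by a factor.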

This implies that the disturbances in the original model formulation (1) should also be considered multiplicative. This may seem a questionable assumption. One should note, however, that the additive Gaussian noise model (with constant variance) might not be appropriate for fatality count data, and possibly not even for traffic volume data. The implicit assumption of multiplicative errors is actually quite commonly applied, if only for practical reasons, as it substantially simplifies modelling.

Because the equations in (2) define the way in which the latent variables exposure and risk can be inferred from the observations, they are called the measurement equations. When observed over time, these equations can therefore be interpreted as a decomposition of an observed time series (e.g., $\log(\text{Traffic Volume}_t)$) into a trend, which is the latent variable $\log(\text{Exposure}_t)$, and an error term, $\text{error}(\text{Exposure}_t)$, which is then also known as an irregular component. As can be seen in (2), log(Exposure) is present in both measurement equations. The bivariate process thus has a trend that depends on the exposure in both equations, plus a specific trend related to the risk in the equation for the number of fatalities.

In order to specify the dynamics of the model, two linear state equations are introduced for each of the latent variables log(Exposure) and log(Risk) in addition to the measurement equations in (2). One of these state equations is called the level equation, and the other the slope equation. The equations are linear, and define that the slope component at a certain time point is equal to the slope component at the previous time point plus some additive random disturbance, while the level at a certain time point is equal to the level at the previous time point plus the slope at the previous time point plus some additive random disturbance. In the absence of any random disturbance, this means that the level follows a straight line, and the slope is constant. Introducing a positive random disturbance in the slope component means that the level component will start to increase faster (the slope will become steeper), and continue to do so until a new random disturbance changes the slope. Likewise, introducing a positive random disturbance in the level component means that the level component will stay at a higher level until a new random disturbance changes it. Both disturbances have a long-term effect, as opposed to the disturbances in the measurement equations, which have a short-term effect. This specification is identical to the specification of the local linear trend model (see Harvey, 1989; Commandeur and Koopman, 2007; Durbin and Koopman, 2012).¹

We can now provide the complete formulation of the LRT model as it was presented in Bijleveld et al. (2008):

\[
\begin{aligned}
\log(\text{Traffic Volume}_t) &= \text{level}(\log(\text{Exposure}_t)) + \varepsilon^{e}_{t} \\
\text{level}(\log(\text{Exposure}_{t+1})) &= \text{level}(\log(\text{Exposure}_t)) + \text{slope}(\log(\text{Exposure}_t)) + \xi^{e}_{t+1} \\
\text{slope}(\log(\text{Exposure}_{t+1})) &= \text{slope}(\log(\text{Exposure}_t)) + \zeta^{e}_{t+1} \\
\log(\text{Fatalities}_t) &= \text{level}(\log(\text{Exposure}_t)) + \text{level}(\log(\text{Risk}_t)) + \varepsilon^{f}_{t} \\
\text{level}(\log(\text{Risk}_{t+1})) &= \text{level}(\log(\text{Risk}_t)) + \text{slope}(\log(\text{Risk}_t)) + \xi^{r}_{t+1} \\
\text{slope}(\log(\text{Risk}_{t+1})) &= \text{slope}(\log(\text{Risk}_t)) + \zeta^{r}_{t+1}
\end{aligned}
\tag{3}
\]

where

– $\varepsilon^{e}_{t}$ is the irregular component (disturbance term) for the measurement equation of traffic volume,
– $\varepsilon^{f}_{t}$ is the irregular component (disturbance term) for the measurement equation of the fatalities,
– $\xi^{e}_{t+1}$ is the disturbance term for the state equation of the level of the latent exposure,
– $\xi^{r}_{t+1}$ is the disturbance term for the state equation of the level of the latent risk,
– $\zeta^{e}_{t+1}$ is the disturbance term for the state equation of the slope of the latent exposure,
– $\zeta^{r}_{t+1}$ is the disturbance term for the state equation of the slope of the latent risk.

These terms constitute the random components of the model. Because the two dependent variables are both in logarithms, the two slope components in (3) can be interpreted as a yearly rate of change.
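The recursions in (3) can be made concrete with a small simulation. The following is a minimal sketch in Python (the analyses in the paper were carried out in R), and it assumes that all disturbances are drawn independently, whereas the model itself allows them to be correlated. The function name `simulate_lrt`, its parameters, and the starting values are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_lrt(n_years, level_e, slope_e, level_r, slope_r,
                 sd_obs=0.0, sd_level=0.0, sd_slope=0.0, seed=0):
    """Simulate log traffic volume and log fatalities from the recursions in (3).

    Assumption of this sketch: all disturbances are independent Gaussians.
    """
    rng = random.Random(seed)
    log_volume, log_fatalities = [], []
    for _ in range(n_years):
        # Measurement equations: observed series = latent trend(s) + irregular.
        log_volume.append(level_e + rng.gauss(0.0, sd_obs))
        log_fatalities.append(level_e + level_r + rng.gauss(0.0, sd_obs))
        # Level equations: new level = old level + old slope + disturbance.
        level_e = level_e + slope_e + rng.gauss(0.0, sd_level)
        level_r = level_r + slope_r + rng.gauss(0.0, sd_level)
        # Slope equations: new slope = old slope + disturbance.
        slope_e = slope_e + rng.gauss(0.0, sd_slope)
        slope_r = slope_r + rng.gauss(0.0, sd_slope)
    return log_volume, log_fatalities


# Deterministic case: every disturbance standard deviation is zero, so the
# levels follow straight lines and the slopes stay constant.
vol, fat = simulate_lrt(5, level_e=math.log(100.0), slope_e=0.02,
                        level_r=math.log(10.0), slope_r=-0.05)

# The yearly change in log fatalities is slope_e + slope_r; converted to a
# percentage change on the original scale:
pct_change_fatalities = 100.0 * (math.exp(fat[1] - fat[0]) - 1.0)
```

With a log-scale slope of -0.03 for the fatalities, the implied yearly change is slightly less than 3% in absolute value, which illustrates why small slopes can be read approximately as yearly rates of change.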


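The covariance matrix of the disturbances specifies the dynamics in and between the components of the models due to their randomness:

\[
\Omega = \begin{bmatrix} A & 0 & 0 \\ 0 & B & 0 \\ 0 & 0 & C \end{bmatrix}
\tag{4}
\]

with

\[
A = \begin{bmatrix} \sigma^2_{\varepsilon^e} & \operatorname{cov}(\varepsilon^e, \varepsilon^f) \\ \operatorname{cov}(\varepsilon^e, \varepsilon^f) & \sigma^2_{\varepsilon^f} \end{bmatrix}, \quad
B = \begin{bmatrix} \sigma^2_{\xi^e} & \operatorname{cov}(\xi^e, \xi^r) \\ \operatorname{cov}(\xi^e, \xi^r) & \sigma^2_{\xi^r} \end{bmatrix}, \quad \text{and} \quad
C = \begin{bmatrix} \sigma^2_{\zeta^e} & \operatorname{cov}(\zeta^e, \zeta^r) \\ \operatorname{cov}(\zeta^e, \zeta^r) & \sigma^2_{\zeta^r} \end{bmatrix}
\]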
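To illustrate the block structure of (4), the sketch below assembles the full disturbance covariance matrix from three 2×2 blocks and derives the correlation implied by a block. It is written in plain Python; the helper names `block_diag` and `correlation` and all numerical values are hypothetical and serve only as an illustration.

```python
import math


def block_diag(*blocks):
    """Stack square blocks (lists of lists) along the diagonal, zeros elsewhere."""
    size = sum(len(b) for b in blocks)
    out = [[0.0] * size for _ in range(size)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                out[offset + i][offset + j] = value
        offset += len(b)
    return out


def correlation(block):
    """Correlation implied by a 2x2 covariance block; None if a variance is zero."""
    var_1, cov, var_2 = block[0][0], block[0][1], block[1][1]
    if var_1 <= 0.0 or var_2 <= 0.0:
        return None
    return cov / math.sqrt(var_1 * var_2)


# Hypothetical values, chosen only for illustration.
A = [[4e-4, 1e-4], [1e-4, 9e-4]]   # irregular components
B = [[0.0, 0.0], [0.0, 2e-4]]      # level disturbances (exposure level has zero variance)
C = [[1e-4, 2e-4], [2e-4, 4e-4]]   # slope disturbances
omega = block_diag(A, B, C)

slope_correlation = correlation(C)  # covariance equals the product of the standard deviations
level_correlation = correlation(B)  # undefined: one variance is zero
```

In this example the slope block has a covariance equal to the product of the two standard deviations, so the implied correlation is 1, while the zero variance in the level block means that the corresponding component does not vary randomly and no correlation is defined.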



The values of the variances of the disturbances on the diagonal of matrices B and C determine whether the corresponding level and slope components are stochastic or deterministic. If a disturbance variance is zero, then the corresponding component is deterministic and fixed: a slope component with zero disturbance variance, for instance, does not change over time. If the value of the variance is larger than zero, on the other hand, then the corresponding component is stochastic, and its value therefore changes over time. The tests conducted in order to determine whether components should be considered as deterministic or stochastic are described in the “Model selection” section below.

3. Investigation of correlations between components

As indicated in (4), some of the random components in the model may be correlated, such as the disturbances of the slope components. When such a correlation happens to be (close to) plus or minus one, we have the special situation that the level and/or slope components are said to be common. When unobserved components are common, the changes or “shocks” driving the dynamics of the (two or multiple) time series are perfectly linearly related. Stated differently, the level and/or slope components then change in the same way at the same points in time, and the corresponding time series therefore display common behaviour. The identification of common factors is important because it allows models to be made more efficient, but also because common components are informative in themselves about the dynamics governing the evolution of the trends considered.
To explore this structure, a bivariate SUTSE (Seemingly Unrelated Time Series Equations) model (Harvey, 1989) of the traffic volume and the number of fatalities with a complete covariance structure of the disturbances of the levels, slopes, and irregulars can be applied:

\[
\begin{aligned}
\log(\text{Traffic Volume}_t) &= \text{level}(\log(\text{Traffic Volume}_t)) + \varepsilon^{e}_{t} \\
\text{level}(\log(\text{Traffic Volume}_{t+1})) &= \text{level}(\log(\text{Traffic Volume}_t)) + \text{slope}(\log(\text{Traffic Volume}_t)) + \xi^{e}_{t+1} \\
\text{slope}(\log(\text{Traffic Volume}_{t+1})) &= \text{slope}(\log(\text{Traffic Volume}_t)) + \zeta^{e}_{t+1} \\
\log(\text{Fatalities}_t) &= \text{level}(\log(\text{Fatalities}_t)) + \varepsilon^{f}_{t} \\
\text{level}(\log(\text{Fatalities}_{t+1})) &= \text{level}(\log(\text{Fatalities}_t)) + \text{slope}(\log(\text{Fatalities}_t)) + \xi^{f}_{t+1} \\
\text{slope}(\log(\text{Fatalities}_{t+1})) &= \text{slope}(\log(\text{Fatalities}_t)) + \zeta^{f}_{t+1}
\end{aligned}
\tag{5}
\]

¹ ARMA and ARIMA models or their extensions are frequently applied analysis techniques for the forecasting of fatality numbers. When it comes to linear Gaussian time series models, there are many equivalencies between the ARIMA and the structural time series framework. It can be shown, for example, that the local level model and the ARIMA(0,1,1) model yield identical forecasts; the same applies to the local linear trend model and the ARIMA(0,2,2) model. For a comprehensive overview of the equivalencies between ARIMA and structural time series models we refer to Appendix 1 in Harvey (1989). The reason that we prefer the structural time series approach is that these models do not require the time series to be stationary, and that missing observations and multivariate time series are easily handled in this framework, while this is relatively difficult in a pure ARIMA modelling context. Moreover, the structural time series framework yields interpretable components such as trends, cycles, and seasonal patterns, which can also be inspected for additional model validation; this is not the case in ARIMA models.

Note that SUTSE model (5) is different from LRT model (3) and (4) because it no longer contains a latent risk variable. The disturbances of the unobserved level components of the fatalities and traffic volume can be correlated and the disturbances of the unobserved slope components of the fatalities and traffic volume can be correlated as well. There is an intimate relation between such common component models and what is known in the time series literature as cointegration.
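The way the slope-disturbance correlation of the SUTSE model steers the choice of model (summarised in Table 1) can be sketched as a simple decision rule. This is an illustration only: the paper relies on statistical tests of whether the correlation differs from 0 and from 1, not on fixed cut-offs, and it additionally compares forecasting accuracy before settling on a model. The function name `choose_model` and the `tolerance` parameter are hypothetical; the value 0.1 merely mirrors the 0.1–0.9 range quoted for medium correlations.

```python
def choose_model(slope_correlation, tolerance=0.1):
    """Map a slope-disturbance correlation to the model type listed in Table 1.

    Assumption of this sketch: fixed cut-offs stand in for the significance
    tests used in the paper.
    """
    r = abs(slope_correlation)
    if r < tolerance:
        # Negligible correlation: exposure carries no extra information.
        return "univariate LLT"
    if r > 1.0 - tolerance:
        # (Near-)perfect correlation: common slopes.
        return "bivariate LRT with deterministic risk trend"
    # Intermediate correlation: covariance between slope disturbances is estimated.
    return "bivariate LRT with stochastic risk trend"


suggested = [choose_model(r) for r in (0.0, 0.5, 1.0)]
```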


Table 1
Rules for the use of a LRT model depending on the correlation between the slope components of the bivariate SUTSE model.

| Type of correlation between the slope disturbances | No correlation (0) | Full correlation (1) | Medium correlation (0.1–0.9) |
|---|---|---|---|
| Consequences for the model | Independence between fatalities and exposure: $E(\text{fat.} \mid \text{exp.}) = E(\text{fat.})$ | Strong dependency: common components (same stochastic slope), cointegration. Long-term linear relationship: $\log(\text{Fat}_t) = b\,\text{level}(\log(\text{Exp}_t)) + a + ct + \varepsilon_t$ | Weak dependency |
| Model | Univariate LLT | Bivariate LRT with deterministic risk trend | Bivariate LRT with stochastic risk trend |

When the correlation is negligible, there is no relationship between fatalities and exposure, meaning that knowledge of exposure does not bring any additional information to predict the number of fatalities. In that case, it can be proven that the trend for the exposure can be ignored altogether and that a univariate Local Linear Trend model (LLT) applied to the fatalities yields very similar results. When the correlation between the slope disturbances is equal to 1 and the level components are deterministic, both time series share the same stochastic slope and are called trend stationary with a deterministic linear trend for the risk. In that case, we obtain a LRT model by constraining the b coefficient for exposure to 1 (and modifying the risk component accordingly). Otherwise, there is a weak correlation between the two time series, and a LRT model provides a solution through the estimation of a covariance between the slope disturbances of risk and exposure. A systematic overview of all the possibilities is given in Table 1.

4. Model selection

Irrespective of whether the LRT model (3) and (4) or the SUTSE model (5) is applied, various types of models can be obtained depending on whether the unobserved components (the level and the slope components) are treated stochastically or deterministically, i.e., as random or as fixed. In order to select the best model for the country data at hand, the analysis was started with a full model where all the components were treated stochastically. Then, for each disturbance variance, the log-likelihood of the full model was compared with the log-likelihood of the model where the disturbance variance had been fixed at zero, and a likelihood ratio test was used to decide whether to treat the component stochastically or deterministically.
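The comparison of log-likelihoods can be sketched as follows. The paper does not report the reference distribution or significance level that was used, so this example assumes a chi-square distribution with one degree of freedom at the 5% level; because a variance of zero lies on the boundary of the parameter space, this reference is generally regarded as conservative. The function name `keep_stochastic` and the log-likelihood values are hypothetical.

```python
# Assumed 5% critical value of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_5PCT = 3.841


def keep_stochastic(loglik_full, loglik_restricted, critical_value=CHI2_1DF_5PCT):
    """Likelihood ratio test of a full model against one with a variance fixed at zero.

    Returns (decision, statistic): decision is True when the restriction is
    rejected, i.e. when the component should stay stochastic.
    """
    lr_statistic = 2.0 * (loglik_full - loglik_restricted)
    return lr_statistic > critical_value, lr_statistic


# Hypothetical log-likelihoods, for illustration only.
stay_random, lr_large = keep_stochastic(loglik_full=52.4, loglik_restricted=49.1)
make_fixed, lr_small = keep_stochastic(loglik_full=52.4, loglik_restricted=52.0)
```

In the first case the restriction costs too much likelihood and the component stays stochastic; in the second the loss is small and the component can be treated as deterministic.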
When a stochastically treated component was thus found to yield a better model, it was next also explicitly tested whether its covariance term with the other corresponding component in the model (which would be either a level or a slope) should be fixed at zero or not. The software for all the analyses presented in this paper was written in R version 2, with the dlm package of Petris et al. (2009) as its central routine for filtering and smoothing.

5. Interventions

Interventions are implemented using variables that model the effect of a known particular event on the level or on the slope of the time series (e.g., the introduction of a law, the beginning of a crisis, a change in counting methods, etc.). Interventions that are assumed to have an immediate and permanent effect are coded 0 for all time points prior to the event and 1 for the time point of the event and those following. These variables allow testing whether a significant change indeed took place at the assumed time of the intervention.

Several types of interventions can be defined depending on the particular model equation they are inserted into. The intervention is included in the measurement equation when it is suspected that some change in the series reflects a change in the way it has been measured and not a change in the phenomenon itself. An example is the change in definition where fatalities were first defined as “victims who died within 24 h after the accident” and later defined as “victims who died within 30 days after the accident”. An intervention is included in the level equation if it is thought to have caused a permanent reduction (or increase) in either the fatality risk (e.g., the seat-belt law) or in the exposure (e.g., the introduction of taxes, an economic crisis). A level intervention considered here takes the form of a step: the fatality risk, for example, increases or decreases at the moment of the intervention and it remains at that level afterwards. An intervention is included in the slope equation when something is suspected to have caused a change in direction, or steepness, of the development of either fatality risk or exposure. This could, for example, be an increased commitment to road safety improvement in a country, due to which the fatality risk decreases at a faster rate than before.

The selection of “candidates for interventions” should be based on actual knowledge about past developments (for example, the moments at which laws have been introduced or the method for the estimation of traffic volume has been adapted), as well as on the results of the analyses of the data (large, visible changes in the series).

6. Forecasts

Once an appropriate model is identified, forecasts are obtained by projecting the trends that are observed in the past into the future. This does not necessarily mean that the forecasts will correctly predict what is going to happen. But when assessing the quality of the model, we can check the accuracy of the forecasts by forecasting a certain number of years (5–10) and then comparing these forecasts with the observations.
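The idea of checking forecasts against held-back observations can be sketched in a few lines. Note that this example swaps in a much simpler technique than the one used in the paper: instead of a state space model, it fits a deterministic straight line to the log fatalities by ordinary least squares (the special case of the local linear trend model without any random disturbances). The function names and the data series are hypothetical.

```python
import math


def fit_line(y):
    """Ordinary least squares fit of y on t = 0, 1, ..., len(y) - 1."""
    n = len(y)
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = sxy / sxx
    return y_mean - slope * t_mean, slope


def holdout_errors(fatalities, horizon):
    """Fit on all but the last `horizon` years, forecast them, return forecast minus observed."""
    log_y = [math.log(v) for v in fatalities]
    train = log_y[:-horizon]
    intercept, slope = fit_line(train)
    forecasts = [math.exp(intercept + slope * t)
                 for t in range(len(train), len(log_y))]
    return [f - obs for f, obs in zip(forecasts, fatalities[-horizon:])]


# Hypothetical series: fatalities declining by a constant rate without noise.
fatalities = [1000.0 * math.exp(-0.05 * t) for t in range(20)]
errors = holdout_errors(fatalities, horizon=5)
```

Because the hypothetical series follows an exact log-linear decline, the forecast errors are zero up to rounding; with real data, the size of these errors indicates how well the chosen model extrapolates.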
The implementation of interventions in the model at time points where the series displays extreme values or changes reduces the error variance and consequently the confidence interval around the forecasts. When one knows that a change occurred because of a particular event, one can also be confident that the change in the value of the relevant component is not part of the random variations in the series, and that such changes are thus unlikely to happen again in the future. However, when the reasons for the changes in the past are not really understood, one may expect that similar changes can happen in the future as well. In the latter case, correcting the models for past unexplained disturbances by introducing exclusively data-driven interventions artificially reduces the confidence intervals for the forecasts; the implementation of such exclusively data-driven interventions in the models should therefore be avoided.

To summarise, time series models give us an estimate of the future development, under the assumption that the model is adequate and that the past development is continued. Moreover, they quantify the uncertainty of the forecasts: the weaker the predictive performance in the past, the larger the uncertainty in the forecasts will be, as reflected in the corresponding confidence intervals.
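The 0/1 coding of an intervention with an immediate and permanent effect, as described in Section 5, can be sketched as follows. The function names, the event year, and the coefficient are hypothetical; the conversion to a percentage simply follows from the model being specified in logarithms.

```python
import math


def step_intervention(years, event_year):
    """Return 0 for years before the event and 1 from the event year onwards."""
    return [1 if year >= event_year else 0 for year in years]


def percent_change(log_shift):
    """Convert a shift in a log-scale level into a percentage change."""
    return 100.0 * (math.exp(log_shift) - 1.0)


years = list(range(1990, 2000))
seat_belt_law = step_intervention(years, event_year=1995)  # hypothetical event year

# A hypothetical estimated coefficient of -0.10 on the level of the log risk
# corresponds to a permanent drop of about 9.5% in the fatality risk.
effect = percent_change(-0.10)
```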


Table 2
Type of exposure indicator selected and length of the data series for the different countries, along with the types of correlations identified between the development of the exposure and fatality series. Unless otherwise specified, the exposure and fatality series are of identical length.

| Exposure indicator | Countries (observation period) | Correlations identified |
|---|---|---|
| Vehicle kilometres (20 countries) | Austria (1990–2010); Belgium (1970–2010); Czech Republic (1990–2010, exp.: 1995–2010); Denmark (1980–2010); Finland (1975–2010); France (1957–2010); Germany (1991–2010); Hungary (1993–2010); Iceland (1975–2010, exp.: 1980–2010); Ireland (1970–2010); Italy (1980–2010); Norway (1973–2009); Poland (1975–2010, exp.: 1996–2008); Romania (1990–2010); Slovenia (1970–2010); Spain (1961–2009); Sweden (1970–2009); Switzerland (1975–2010); The Netherlands (1950–2010); UK (1983–2010) | 5 “common slopes”, 4 correlated series |
| Vehicle fleet (seven countries) | Bulgaria (2001–2010); Estonia (1991–2010, exp.: 1997–2008); Greece (1960–2010); Latvia (1975–2010, exp.: 1996–2009); Luxembourg (1975–2010); Portugal (1979–2008); Slovakia (1990–2010, exp.: 1991–2002) | 2 “common slopes” |
| Fuel consumption (one country) | Cyprus (1991–2010) | No correlation |
| None available | Lithuania (2001–2010); Malta (1990–2010) | – |

7. Data

Road safety fatalities, although by no means the only interesting measure, are the key measurement to analyse and compare the development of road safety across countries, because they are less susceptible to underreporting than other measures. The yearly number of road traffic fatalities in the different European countries is available in the CARE database. CARE includes harmonised fatality data, which conform to the European definition of fatalities within 30 days from the accident. The EU-15 Member States have fatality data that span the entire period 1991–2008. For countries that became a member of the European Union later on, however, data availability is limited to a smaller number of years (e.g., from 2005 onwards for Estonia and Slovakia, from 2003 onwards for Hungary, from 2000 onwards for Slovenia, etc.).

Various exposure indicators were investigated, including the number of vehicle kilometres travelled, the vehicle fleet, the fuel consumption and the road length. The analyses could not be carried out on the basis of vehicle kilometres data for all European countries, and in these cases alternative exposure indicators have been used. In 20 countries, the most preferable exposure indicator was available, which is vehicle kilometres. In seven countries the size of the vehicle fleet was the only available exposure indicator. Fuel consumption has been used as exposure indicator in the case of Cyprus. Finally, for two countries (Lithuania and Malta) no exposure indicator was available at all.

8. Results

A relationship between the exposure and fatality series could not be identified in all countries. Table 2 summarises the different types of exposure indicators that have been used for each country, and also indicates the countries for which correlated series or even common slopes could be established on the basis of the SUTSE model (5).² For each country, the length of the series of observations available and included in the analysis is also mentioned. It is clear from this table that a correlation was identified more often when vehicle kilometres could be used as exposure indicator than for other types of exposure indicators.

In all instances where a non-zero correlation was established between the two series, this correlation applied to the slope components (not to the level components) and was positive. Given that the values of the slope component for the exposure tend to be positive (indicating that exposure is always increasing) while those for the risk are most often negative (indicating that the risk decreases over time), positively correlated slope components imply that the decrease in the annual fatality numbers slows down when the increase in the annual number of vehicle kilometres becomes stronger. Common slopes (i.e., a correlation close to 1) were observed in five of the nine countries where a relationship could be identified on the basis of vehicle kilometres (i.e., Denmark, Finland, France, the Netherlands, and the UK). Figs. 1–4 contain the results of the SUTSE model for the Netherlands. The common slopes identified for this country are illustrated in Figs. 2 and 4. Common slopes were also observed for Portugal and Estonia, where vehicle fleet was used as the exposure indicator.

² The correlations range in the interval [0.1, 0.9]. This interval is explained by the fact that the correlations between the slope and level disturbances of the two series have been tested in two steps: (1) a test of whether $H_0: \rho(\xi^e, \xi^f) = 0$, $H_0: \rho(\zeta^e, \zeta^f) = 0$ and $H_0: \rho(\varepsilon^e, \varepsilon^f) = 0$ can be rejected and (2) a test of whether $\rho \cong 1$. This testing strategy consequently only informs whether the correlation coefficient(s) are significantly different from 0 and from 1, without providing information on their exact estimated value(s).

The identification of a satisfactory relation between the exposure indicator and fatality series was one condition that determined whether the LRT model or the LLT model was used to capture the past developments in the annual fatality numbers of a country, and to project these developments into the future. In all eleven countries where a relationship between the exposure indicator and the fatality series could be demonstrated with the SUTSE model (5),


Figs. 1–4. Developments of the state components for the exposure (upper graphs) and the fatalities (lower graphs), as estimated on the basis of the SUTSE model for The Netherlands. The trend (level) developments are shown in the left-hand graphs, the slope developments in the right-hand graphs, including their 95% confidence intervals. Both are shown in anti-logs: a value of 1.05 on the slope scale means an increase of the series of 100 × ln(1.05) ≈ 5% per year. Note the similarity between the slope components, which have a correlation of 1 (and are therefore common).

the development of the annual fatality numbers was modelled and defined as the result of the combined developments of the risk and of the exposure according to the LRT model (3). When no significant relationship between the exposure indicator and fatality series could be detected, on the other hand, the forecasting accuracy for the fatalities obtained with the bivariate LRT model was compared with the forecasting accuracy of the univariate LLT model applied to the fatalities series only, thereby disregarding the development of the exposure indicator. For five of the 19 countries for which no significant relation between the exposure indicator and the fatality series was established, the LRT model still resulted in better forecasts than the LLT model, and these five countries were therefore analysed with the LRT model. The remaining fourteen countries with no significant correlation between exposure indicator and fatalities were all analysed with the LLT model. In addition, as already mentioned in the section on Model selection, various types of LRT and LLT models can be obtained depending on whether the unobserved components (the level and the slope components) are treated stochastically or deterministically, i.e., as random or as fixed. Table 3 provides an overview of the temporal structure of the optimal models that were identified for the 16 European countries to which the LRT model was applied. Whenever common (or highly correlated) slopes were observed


Table 3
Overview of the optimal LRT model types for the different countries.

Most frequently selected models
  Exposure trend: Level fixed, slope random
  Risk trend: Level random, slope fixed
  Countries: Denmark, France, The Netherlands, Spain, Switzerland, Norway, Portugal, Estonia, Belgium, Germany (⇒ 10/16 countries)

Other models
  Exposure trend: Level fixed, slope random; Level fixed, slope random
  Risk trend: Level fixed, slope fixed; Level fixed, slope random
  Countries: Austria (no component fixed); Cyprus; UK, Italy; Finland (only slope risk fixed); Slovenia (only level exposure fixed)

for the fatality and exposure indicator series, the slope component of the risk was treated deterministically (i.e., as fixed). As Table 3 indicates, the most common type of LRT model is one where the slope component of the exposure is random and its level is fixed, and where the level component of the risk is random and its slope is fixed. A typical example is given in Fig. 5–8, which show the results of the LRT model obtained for the Netherlands. For the exposure, the slope changes reflect the fact that the rate of change is decreasing over the years, i.e., the exposure continuously increases, but not as fast as it did in the past. In contrast, the risk is characterised by ups and downs (due to random variations in the level component), but, as is illustrated in the bottom graph on the right of Fig. 8, the general direction of the year-to-year changes in the number of fatalities per unit of exposure is constant. For the

Table 4
Overview of the optimal univariate LLT model types identified for the different countries.

Fatality trend               Countries
Level random, slope fixed    Bulgaria, Greece, Luxembourg, Lithuania, Ireland, Poland, Sweden, Latvia, Slovakia (⇒ 9/14 countries)
Level and slope fixed        Hungary, Iceland, Malta
Level fixed                  Czech Republic, Romania
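The choice between the LRT and LLT models described in this section can be summarised as a simple decision rule. The sketch below is an illustration only: the function name `choose_model` and the use of a single forecast-error number per model are assumptions, and the comparison of forecasting accuracy in the paper may rely on a different measure. The figure of 11 countries with a significant relationship is derived here as 30 minus the 19 countries without one, and the error values are made up.

```python
def choose_model(related: bool, lrt_error: float, llt_error: float) -> str:
    """Return 'LRT' or 'LLT' following the selection rule in the text.

    related   -- True if exposure and fatalities are significantly related
    lrt_error -- hypothetical forecast error of the bivariate LRT model
    llt_error -- hypothetical forecast error of the univariate LLT model
    (smaller error = better forecasts; the error measure is an assumption)
    """
    if related:
        return "LRT"
    # Unrelated series: keep the LRT model only if it forecasts better
    return "LRT" if lrt_error < llt_error else "LLT"


# Reproduce the counts reported in the text with made-up error values
countries = (
    [(True, 1.0, 1.0)] * 11      # related series: LRT regardless of errors
    + [(False, 0.8, 1.0)] * 5    # unrelated, but LRT forecasts better
    + [(False, 1.2, 1.0)] * 14   # unrelated, LRT does not forecast better
)
choices = [choose_model(*c) for c in countries]
print(choices.count("LRT"), choices.count("LLT"))  # 16 14
```

The resulting split matches the 16 countries of Table 3 and the 14 countries of Table 4.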

Table 5
Overview of the intervention variables identified for the different countries, including their type and cause.

Country    Year    Type of intervention    Reason

Greece

1986         Fatalities level
1991         Fatalities level
1996         Fatalities slope
2002         Fatalities level
2008         Fatalities level
1991         Exposure level
1999         Exposure level
2009         Risk level

1989         Fatalities level
1974         Exposure level
1974         Risk level
2004         Risk level
1989         Fatalities level
1990         Exposure level
2008         Fatalities slope
1983         Risk level
2003         Exposure level
1975         Fatalities level
1982         Risk slope
1984         Exposure slope
1989         Risk slope
1993         Risk level
1994         Risk slope
2004         Risk slope
2007         Exposure slope

2008         Risk level

1993         Exposure level
1991–1993    Exposure slope
1991–1993    Risk slope
2008–2012    Exposure slope
2008–2012    Risk slope

Economic recession
Change in the vehicle fleet (car exchange scheme)
30-days definition introduced for fatalities
Increase of motorway length by 19%
Economic recession, large set of road safety measures introduced
Fatalities registration form adapted
30-days definition introduced
Change in the vehicle fleet (trailers and semitrailers
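One common way to represent a break in a level or in a slope is with a step or a ramp regressor. The sketch below assumes this construction purely for illustration: the break year, the function names and the observation period are hypothetical, and this section does not state how the intervention variables of Table 5 were coded.

```python
def level_step(years, break_year):
    """Step dummy: 0 before break_year, 1 from break_year onwards."""
    return [1 if y >= break_year else 0 for y in years]


def slope_ramp(years, break_year):
    """Ramp: 0 up to and including break_year, then 1, 2, 3, ..."""
    return [max(0, y - break_year) for y in years]


years = list(range(2005, 2012))   # hypothetical observation period
print(level_step(years, 2008))    # [0, 0, 0, 1, 1, 1, 1]
print(slope_ramp(years, 2008))    # [0, 0, 0, 0, 1, 2, 3]
```

A step regressor shifts the level permanently from the break year onwards, whereas a ramp regressor changes the yearly rate of change after the break.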
