Road Fatality Model Based on Over-Dispersion Data Along Federal Route F0050

The World Health Ranking 2011 placed Malaysia 20th in its list of countries with the most deaths caused by road accidents. Road accidents have also been identified as a prime cause of death in Malaysia, after heart disease, stroke, influenza and pneumonia. To date, previous research from the Malaysian Institute of Road Safety Research (MIROS) has reported that an average of 18 people are killed on Malaysian roads daily. Many kinds of models have been developed to model the circumstances of accidents. The most widely applied are the Poisson and Negative Binomial regression models, while the Zero-inflated Poisson and Zero-inflated Negative Binomial models are modifications of them. This study focuses on route F0050 because 2014 statistics from the Royal Malaysian Police list F0050, from kilometre 0 to kilometre 58, as one of the high-accident roads in Malaysia. R was chosen to analyse the relationship between road fatality and its factors (annual average daily traffic (AADT), speed, shoulder width and lane width). The Negative Binomial and Zero-inflated Negative Binomial (ZINB) models were shown to be the preferred modelling methods for this study. Significant positive relationships were also identified between road fatality and both AADT and lane width. These relationships can support decision making in accident management for route F0050.


Introduction
Traffic accidents have been recognized as one of the major causes of human and economic losses in both developed and developing countries. In 2014, Malaysia launched a new road safety plan, aligned with the Global Decade of Action for Road Safety, which aims at halving the predicted number of fatalities by 2020 [1]. The most recent safety data from the Royal Malaysian Police show that there were 6674 road deaths in 2014, 3.5% fewer than in 2013. This is the largest reduction in the number of road fatalities in the past 10 years [2]. Traffic accidents represent a significant cost for Malaysian society: based on a willingness-to-pay estimation, traffic crashes cost around 1.6% of the national gross domestic product each year. Follow-up studies [3,4] found that the statistical value of a life in Malaysia is RM 1.3 million, around EUR 330 000.
Malaysia has its own plan and target, set in line with the Global Decade of Action for Road Safety 2011-2020, to reduce the predicted number of road deaths in 2020 by 50%. In its 2015 annual report, MIROS projected that, in the absence of a comprehensive road safety programme, there would be 8760 road fatalities in 2015 and 10716 in 2020. This shows an effective reduction of 22% in the number of road fatalities when compared to the 2010 level.
Over-dispersion in Poisson models occurs when the response variance is greater than the mean [16]. Zero-inflated models play a pivotal role in predicting the probability of crashes and identifying the most relevant contributory factors at hazardous sections, as these models can properly handle over-dispersed data with a high proportion of excess zeros [5-15]. The main issue needing further effort is testing the reliability of these statistical models (Poisson, Negative Binomial and Zero-inflated) in yielding better results in traffic safety. Although these types of model are well established in many other fields such as economics and sociology, it is less than 10 years since traffic safety analysts first reported relevant results in well-known publications. In particular, while most applications are limited to types of vehicles and types of crashes, varying-intercept and varying-slope models deserve more effort because of the complexity of model estimation and result interpretation. Therefore, this study takes a new approach to quantifying road safety performance under the heterogeneous traffic conditions on route F0050.
Three categories of road accident are currently used in Malaysia: fatal, serious injury and slight (minor) injury. A road fatality is defined as a death resulting from a road crash within 30 days of the crash. A serious injury is when a person is injured as a result of a road crash. Lastly, a slight injury is any injury that does not fall under death or serious injury. This paper considers road fatality only and investigates its relationship with the factors that determine it. According to [2], F0050 is listed among the highest-accident roads in Malaysia. Figure 6 and Figure 7 show the accident hotspot locations. F0050 is a four-lane, two-way divided road that runs from Batu Pahat through Ayer Hitam to Kluang. Most of the hazardous locations have a high number of access points and high annual average daily traffic (AADT).
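As a minimal sketch of the over-dispersion condition cited in this section (response variance greater than the mean [16]), the following Python snippet computes the variance-to-mean ratio of a set of counts and compares the observed share of zero counts with the share that a Poisson distribution of the same mean would predict. The counts are hypothetical per-kilometre values invented for illustration, not the F0050 data, and the function name is an assumption; the study itself used R.

```python
import math
from statistics import mean, variance


def dispersion_summary(counts):
    """Summarise over-dispersion and zero share for a list of crash counts."""
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    zero_share = sum(1 for c in counts if c == 0) / len(counts)
    return {
        "mean": m,
        "variance": v,
        "dispersion_ratio": v / m,           # above 1 suggests over-dispersion
        "observed_zero_share": zero_share,
        "poisson_zero_share": math.exp(-m),  # P(Y = 0) for a Poisson with mean m
    }


# Hypothetical fatal-crash counts for 15 one-kilometre sections (illustrative only).
counts = [0, 0, 1, 0, 3, 0, 0, 2, 0, 7, 0, 1, 0, 0, 4]
summary = dispersion_summary(counts)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

As a rough screening rule, a ratio well above 1 points towards a Negative Binomial model, while an observed zero share far above the Poisson value points towards a zero-inflated model.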

Poisson, Negative Binomial, Zero-inflated Poisson, and Zero-inflated Negative Binomial
Crash data are discrete, sporadic and random. Stochastic regression models, rather than deterministic models, were therefore applied to describe the occurrence of accidents [11]. Four stochastic regression models were used to analyse the collected data: the Poisson model, the Negative Binomial model, the Zero-inflated Poisson model and the Zero-inflated Negative Binomial model. Table 7 presents the estimation results of each model, obtained using R. To assess model performance, the estimated coefficients of the variables were examined, and models with logical algebraic signs of the variables were selected. Table 7 also summarises how the dependent variable is associated with the independent variables. In most of the models, speed (SP) and shoulder width (SW) have a negative association with fatal crashes, while AADT (LN.AADT) and lane width (LW) have a mostly positive association. Taken together, these results provide preliminary evidence that, on route F0050, sections with many access points and increasing AADT contribute more to fatal crashes. This is supported by the fact that the most hazardous sections along the route are at locations heavily used by local people, mostly around business and residential areas, such as kilometres 2, 4, 7, 19, 22, 23, 48, 57 and 58. High AADT leads to lower speeds and prompts drivers to overtake the vehicle in front, which increases accidents on federal route F0050. It is interesting to note that most studies have found lane width (LW) to be negatively associated with fatality, which is inconsistent with the result in Table 7.
On route F0050, greater lane width (LW) increases the likelihood of fatality: although traffic flows more conveniently as lane width increases, traffic conflicts and driving maneuvers such as lane changing also increase, and hence so does the probability of an accident. By comparison, the negative sign for shoulder width suggests that the wider the shoulder, the fewer the accidents, which is logically acceptable.
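To illustrate how the coefficient signs discussed here translate into expected fatalities, the sketch below assumes the standard log link of Poisson and Negative Binomial regression, under which a coefficient b multiplies the expected count by exp(b) for each one-unit increase in its variable. The coefficient values, units and section attributes are hypothetical, chosen only to mirror the reported signs (positive for LN.AADT and LW, negative for SP and SW); they are not the Table 7 estimates.

```python
import math

# Hypothetical coefficients that only mirror the reported signs (not Table 7 values).
COEFS = {"intercept": -6.0, "LN.AADT": 0.6, "LW": 0.4, "SP": -0.01, "SW": -0.3}


def expected_fatalities(section, coefs=COEFS):
    """Expected count under a log link: mu = exp(b0 + sum(b_j * x_j))."""
    eta = coefs["intercept"] + sum(coefs[name] * x for name, x in section.items())
    return math.exp(eta)


def rate_ratio(name, coefs=COEFS):
    """Multiplicative change in the expected count per one-unit increase."""
    return math.exp(coefs[name])


# Hypothetical road section: AADT of 20000, 3.25 m lanes, 80 km/h, 1.5 m shoulder.
base = {"LN.AADT": math.log(20000), "LW": 3.25, "SP": 80.0, "SW": 1.5}
wider_shoulder = dict(base, SW=base["SW"] + 1.0)

print(f"baseline expected fatalities: {expected_fatalities(base):.3f}")
print(f"with 1 m wider shoulder:      {expected_fatalities(wider_shoulder):.3f}")
print(f"rate ratio for SW:            {rate_ratio('SW'):.3f}")
```

A rate ratio below 1 corresponds to a negative coefficient (fewer expected fatalities as the variable increases), and a ratio above 1 to a positive one.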
Table 8 presents the results of model evaluation and comparison between the models. The best log-likelihood is the one with the maximum value. In Table 8, negative binomial (NB) has the highest value, -127.69, followed by zero-inflated negative binomial (ZINB) (-127.7), zero-inflated Poisson (ZIP) (-131) and Poisson (P) (-133.57). For the Akaike information criterion (AIC), the lower the value, the better the fit. ZINB has the lowest AIC value (263.4), followed by NB (267.39), ZIP (270) and Poisson (P) (275.14). Comparisons of the best performance of the models are set out in Table 9. Data from this table can be compared with the results in Table 8 to choose the best-fitting model. Interestingly, NB has the most points on the criteria for the best-fitting model, followed by ZINB, ZIP and lastly Poisson (P). Although NB has the most points, the difference between the NB and ZINB performance in this model is just 0.01. These results indicate that the data are over-dispersed but do not have a high proportion of excess zeros. In the end, both the NB and ZINB models were chosen as the best models for this study.

Conclusions
This study set out to determine a model that can properly handle the highly stochastic nature of crash events and to find the most relevant contributory factors at the hazardous road sections of a Johor federal route. In general, zero-inflated and negative binomial regression were found to establish an empirical association between fatal road accidents, route geometry and traffic parameters. The results offer valuable insight into the underlying association between the probability factors and vehicle crashes.

Table 2.
Variables in the final model
Poisson and Negative Binomial regression models are the most widely applied, while the Zero-inflated Poisson and Zero-inflated Negative Binomial models are modifications of them, proposed to handle over-dispersion and excess zeros in the data set.

Table 3.
Results of Pr(>|z|) modeling accidents for model

Table 4.
Summary statistics of model

Table 5.
Correlation summary F0050 for model

Table 6.
Over-dispersion test for model

Table 7.
Estimation results for model

Table 8.
Results of model evaluation and comparison model

Table 9.
Comparison for best performance for model
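
The ranking logic behind Tables 8 and 9 can be sketched in Python as follows. The log-likelihood and AIC figures are those quoted in the text; the dictionary layout and function names are assumptions made for illustration, and the original analysis was carried out in R. The usual definition is AIC = 2k - 2 ln L, with k the number of estimated parameters, but k is not reported in the text, so the sketch ranks the quoted values rather than recomputing them.

```python
# Log-likelihood and AIC values as quoted in the text for the four models.
REPORTED = {
    "NB":   {"loglik": -127.69, "aic": 267.39},
    "ZINB": {"loglik": -127.70, "aic": 263.40},
    "ZIP":  {"loglik": -131.00, "aic": 270.00},
    "P":    {"loglik": -133.57, "aic": 275.14},
}


def rank_by_loglik(models):
    """Order model names from highest (best) to lowest log-likelihood."""
    return sorted(models, key=lambda name: models[name]["loglik"], reverse=True)


def rank_by_aic(models):
    """Order model names from lowest (best) to highest AIC."""
    return sorted(models, key=lambda name: models[name]["aic"])


loglik_order = rank_by_loglik(REPORTED)
aic_order = rank_by_aic(REPORTED)
print("by log-likelihood:", loglik_order)
print("by AIC:           ", aic_order)

# Gap between the two best log-likelihoods (NB versus ZINB).
gap = REPORTED["NB"]["loglik"] - REPORTED["ZINB"]["loglik"]
print(f"NB - ZINB log-likelihood gap: {gap:.2f}")
```

The two criteria disagree on the top model, which is consistent with the text retaining both NB and ZINB as the preferred models.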