A Partial Proportional Odds Model for Pedestrian Crashes at Mid-Blocks in Melbourne Metropolitan Area

Pedestrian crashes accounted for 11% of all reported traffic crashes in the Melbourne metropolitan area between 2004 and 2013. There are very few studies on pedestrian crashes at mid-blocks, even though mid-block crashes account for about 46% of all pedestrian crashes in the Melbourne metropolitan area and about 50% of all pedestrian fatalities occur at mid-blocks. In this research, a Partial Proportional Odds (PPO) model is applied to examine vehicle-pedestrian crash severity at mid-blocks in the Melbourne metropolitan area. The PPO model is a logistic regression model that allows the covariates that meet the proportional odds assumption to affect different crash severity levels with the same magnitude, whereas the covariates that do not meet the proportional odds assumption can have different effects on different severity levels. In this research, vehicle-pedestrian crashes at mid-blocks are analysed for the first time. In addition, factors such as the distance of crashes to public transport stops, average road slope and some social characteristics are considered in developing the model for the first time. Results of the PPO model show that speed limit, light condition, pedestrian age and gender, and vehicle type are the most significant factors that influence vehicle-pedestrian crash severity at mid-blocks.


Background
In order to reduce air pollution and obtain better public health outcomes, efforts to encourage non-motorized transport modes have increased in recent years (Wey & Chiu 2013). To increase the number of walking trips, concerns about pedestrian safety have to be addressed. Pedestrians are particularly vulnerable in traffic crashes: they are 23 times more likely to be killed than vehicle occupants [1], and more than 22% of traffic deaths in the world are pedestrians [2]. Every year, 34 pedestrians are killed in traffic crashes in the Melbourne metropolitan area, representing 24% of total traffic fatalities. Mid-block crashes account for 46% of total pedestrian crashes in the Melbourne metropolitan area, and 49% of pedestrian fatalities occur at mid-blocks.
Many studies have been conducted to examine the factors contributing to the frequency and severity of vehicle-pedestrian crashes [3][4][5][6]. Whereas many of these studies have focused on crashes at intersections [7], there are few studies on vehicle-pedestrian crashes at mid-blocks. Since the factors contributing to vehicle crashes at intersections and at mid-blocks are significantly different [8][9][10][11][12][13], more research is needed to develop a model for vehicle-pedestrian crashes at mid-blocks.
Identifying and predicting trends in crash severity is one of the main steps in improving pedestrian safety. In this approach, crash severity modelling assists in distinguishing the variables that influence pedestrian crashes. A review of the literature shows that many different statistical models have been applied in crash studies. The PPO model, which is a regression model, allows covariates to affect different levels of injury severity differently. Therefore, the proportional odds assumption is relaxed for independent variables that have different effects on different levels of injury.

Objective and scope of study
The main objective of this research is to identify the factors contributing to the severity of vehicle-pedestrian crashes at mid-blocks. Whereas previous studies have mainly focused on pedestrian crashes at intersections, or have applied a model of crash risk at mid-blocks for a special group of pedestrians (e.g. children) or a specific study area (e.g. pedestrian crossings), this research examines all vehicle-pedestrian crashes in the Melbourne metropolitan area. In addition, this research considers different socio-economic variables, such as place of birth, level of pedestrian education and percentage of labour force participation. The distance of the crash location to public transport stops is another variable used in this research.
This paper begins with a review of previous studies on pedestrian crash modelling and then presents the methodology of this research. A description of the variable data follows, and then the model development is presented. Afterward, the results are presented and discussed. Finally, the outcomes are summarized and directions for future research are presented.

Literature review
The published literature on pedestrian crash modelling shows that binary models, ordered discrete models and unordered multinomial discrete models are the three main statistical techniques that have been widely used to study pedestrian crash severity. In binary models of crash severity, the outcomes are injury against non-injury crashes or fatal against non-fatal crashes. These studies used common discrete models such as the binary logit and binary probit models.
In recent studies, Sarkar et al. [14] and Ballesteros et al. [15] developed logistic regression models for pedestrian crashes to identify the variables influencing this type of crash. However, given the ordinal nature of crash severity, ordered probability models are popular in traffic crash studies. Lee and Abdel-Aty used this approach to estimate the likelihood of pedestrian injury severity at intersections [7]. In ordered logit and ordered probit models, it is assumed that the parameter estimates are constant across severity levels [16]. Eluru et al. [6] developed a generalized ordered probability model to examine pedestrian and bicyclist injury severity levels in traffic crashes in the USA.
The limitations of the traditional ordered logit and probit models have led to the development of unordered models for finding important variables in traffic accidents. The multinomial logit (MNL) model, the mixed logit model and the random parameter logit model are the most common unordered models and are used in many pedestrian crash studies [17][18][19][20]. However, these models have limitations similar to those of logit models. For instance, all explanatory variables must be independent of each other.
The ordered nature of traffic crash severity cannot be ignored completely. Furthermore, it is not appropriate to assume that all predictors have the same effect on different levels of crash severity. The PPO model proposed by Peterson and Harrell [21] retains the ordered arrangement of ordinal models while allowing certain independent variables to affect different levels of the dependent variable differently [22]. Because PPO models combine the ordered and multinomial modelling frameworks, the proportional odds assumption is relaxed for independent variables that show different effects on different severity levels. Sasidharan and Menendez [22] compared PPO models, ordered logit models and multinomial logit models for pedestrian crash injury severities in Switzerland. They found that PPO models gave better results than the other two model types on different evaluation criteria.
In summary, many studies have analysed pedestrian crash severity. However, there is no study of pedestrian crashes at mid-blocks. In addition, the independent variables in most studies were limited to personal or traffic characteristics. The objective of this paper is to analyse vehicle-pedestrian crash severity at mid-blocks in the Melbourne metropolitan area using PPO models. Furthermore, to explore the effects of different variables on this type of crash, different socio-economic, traffic and environmental characteristics are analysed in this study.

Data description
CrashStats is a database that contains road crash statistics for Victoria, Australia, from 1987 onwards. The dataset includes crashes in which at least one person was injured. It covers personal characteristics such as driver and pedestrian age and gender; vehicle characteristics such as vehicle type and weight; road and environment conditions, including surface, light and pavement condition; and temporal parameters such as the date, day and time of the crashes. In addition, this study considers the influence of social and economic characteristics on pedestrian crashes. These social and economic variables include religion, income, occupation type, percentage of labour force participation in suburbs, level of education and place of birth, and are extracted from the Australian Urban Research Infrastructure Network (AURIN). AURIN is the largest single resource for accessing diverse types and sources of data, spanning the physical, social, economic and ecological aspects of Australian cities, towns and communities. The AURIN dataset is also used to extract geospatial data, such as the road network, elevation (which is used to find the road slope) and public transport stops, and other data such as traffic volume and land-use categories.
To investigate the variables influencing pedestrian crash severity, data for all pedestrian crashes on public roadways in the Melbourne metropolitan area from 2004 to 2013 are used in this research. A total of 12,279 pedestrian crashes were recorded in the Melbourne metropolitan area over the 10-year period, of which 5,760 occurred at mid-blocks. Traffic volume is an influential variable in accident analysis, and many studies have emphasized its influence on pedestrian crashes [1,23,24]. However, this variable is ignored in many pedestrian crash studies, especially in crash severity models. To consider traffic volume in this research, the locations of accidents are joined to the Melbourne road network using ArcMap GIS.
In CrashStats, crash severity is divided into three levels (based on the Victoria Police data source): fatal crashes, serious injury crashes and other injury crashes. A fatal crash refers to a crash in which at least one person died immediately or died from injuries within 30 days of the collision. In serious injury crashes, at least one person is sent to hospital. In addition, other injury crashes refer to crashes in which at least one person is injured but

Methodology
In this study, vehicle-pedestrian crash severity is categorized into three levels: fatal crashes, serious injury crashes and other injury crashes. The probability of crash severity level $j$ for a given crash $i$ is specified in terms of the following quantities: $X_i$ is a vector of explanatory variables, $\beta_j$ is a vector of estimable parameters, and $\alpha_j$ and $\alpha_{j-1}$ are the upper and lower thresholds for injury severity $j$. The difference between this specification and the general ordered logit model (Gen) is that $\beta_j$ is free to vary across severity levels [16]. In proportional odds (PO) models, the parallel-lines assumption may be violated by one or more variables; in PPO models, these variables are given coefficients $\beta_j$ that differ across equations.
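For orientation, one common way to write a model of this kind with the symbols defined above is sketched below. It is an assumption about the general form, not a reproduction of the equation used in this study; $\Lambda$ (the logistic distribution function), $J$ (the number of severity levels), and the split of $\beta_j$ into a common part $\beta$ and deviations $\gamma_j$ are notation introduced here for illustration only.

```latex
\begin{align}
P(y_i = j) &= \Lambda(\alpha_j - X_i \beta_j) - \Lambda(\alpha_{j-1} - X_i \beta_{j-1}),
  \qquad j = 1, \dots, J, \\
\Lambda(z) &= \frac{1}{1 + e^{-z}}, \qquad
  \Lambda(\alpha_0 - \cdot) \equiv 0, \qquad \Lambda(\alpha_J - \cdot) \equiv 1, \\
\beta_j &= \beta + \gamma_j .
\end{align}
```

Under this notation, setting every $\gamma_j = 0$ gives the proportional odds model, leaving every element of $\gamma_j$ free gives the general model, and the PPO model frees only the elements of $\gamma_j$ that belong to variables violating the parallel-lines assumption.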
The PPO model for this study is fitted using SAS 9.3. When interpreting the results of PPO models, caution is needed because the sign of β does not always determine the direction of the effect on the intermediate outcomes [26].
In these models, the marginal effects are useful for interpreting the variables. The marginal effects show how a change in an independent variable affects the dependent variable. Figure 1 shows the framework for this study. The final outcomes of this model identify the variables that influence vehicle-pedestrian crash severity at mid-blocks in Melbourne.
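A marginal effect of this kind can be illustrated with a small numerical sketch. The code below assumes a three-level cumulative logit form, P(y <= j) = logistic(alpha_j - x . beta_j), with one coefficient vector per cumulative split; the function names, the two covariates and all coefficient values are hypothetical and are not estimates from this study.

```python
import math


def logistic(z):
    """Logistic distribution function."""
    return 1.0 / (1.0 + math.exp(-z))


def ppo_probabilities(x, alphas, betas):
    """Category probabilities for an ordered outcome with J levels.

    alphas: J-1 thresholds (assumed form, not the paper's fitted values).
    betas:  J-1 coefficient vectors, one per cumulative split. A covariate
            that satisfies proportional odds has the same value in every
            vector; a covariate that violates it has different values.
    """
    cumulative = [
        logistic(a - sum(b_k * x_k for b_k, x_k in zip(b, x)))
        for a, b in zip(alphas, betas)
    ]
    cumulative.append(1.0)  # P(y <= highest level) is always 1
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


def discrete_marginal_effect(x, index, alphas, betas):
    """Change in each category probability when a 0/1 covariate
    switches from 0 to 1, holding the other covariates fixed."""
    x0, x1 = list(x), list(x)
    x0[index], x1[index] = 0.0, 1.0
    p0 = ppo_probabilities(x0, alphas, betas)
    p1 = ppo_probabilities(x1, alphas, betas)
    return [after - before for before, after in zip(p0, p1)]


# Hypothetical setup: levels are 0 = other injury, 1 = serious, 2 = fatal.
# Covariates: [daylight (0/1), scaled population density].
ALPHAS = [0.5, 3.0]
BETAS = [
    [-0.4, 0.3],   # split between "other injury" and "serious or fatal"
    [-0.4, -0.1],  # split between "other or serious" and "fatal"
]
x_example = [0.0, 1.0]

probs = ppo_probabilities(x_example, ALPHAS, BETAS)
effects = discrete_marginal_effect(x_example, 0, ALPHAS, BETAS)
print("probabilities:", [round(p, 4) for p in probs])
print("daylight marginal effects:", [round(e, 4) for e in effects])
```

For a binary covariate the effect is computed as a difference in predicted probabilities with the other covariates held fixed, and the three effects sum to zero because the probabilities always sum to one. The effect on the intermediate level depends on both splits, which is one reason a coefficient sign alone does not determine its direction.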

Results
This section shows the results of the general logistic regression and PPO models for pedestrian crash severity at mid-blocks. Moreover, the marginal effects of the PPO model are presented to show the influence of the variables on crash severity. Table 3 shows the results of likelihood ratio tests for the General model (Gen), the Proportional Odds model (PO) and the PPO model. According to this table, the tests reject the proportional odds model in favour of both the non-proportional odds model and the partial proportional odds model. In addition, Table 5 indicates that the partial proportional odds model fits as well as the non-proportional odds model.
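The comparison among the three models rests on likelihood ratio tests between nested models. A minimal sketch of that calculation follows; the log-likelihood values and degrees of freedom are hypothetical placeholders rather than values from this study, and `chi2_sf` is a small helper written here so that the example needs only the standard library.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df = 2m: exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!
        m = df // 2
        total = sum(half ** i / math.factorial(i) for i in range(m))
        return math.exp(-half) * total
    # Odd df = 2m + 1: erfc(sqrt(x/2)) plus a finite correction sum
    m = (df - 1) // 2
    total = sum(
        half ** (i - 0.5) / math.gamma(i + 0.5) for i in range(1, m + 1)
    )
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


def likelihood_ratio_test(loglik_restricted, loglik_full, df):
    """Return (LR statistic, p-value) for two nested models.

    df is the number of extra free parameters in the fuller model.
    """
    stat = 2.0 * (loglik_full - loglik_restricted)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods for the three nested models.
LOGLIK_PO = -3050.0   # proportional odds (most restricted)
LOGLIK_PPO = -3042.0  # partial proportional odds
LOGLIK_GEN = -3038.5  # general, all slopes free

# Assumed: PPO frees 2 extra parameters; Gen frees 8 more than PPO.
stat_po_ppo, p_po_ppo = likelihood_ratio_test(LOGLIK_PO, LOGLIK_PPO, df=2)
stat_ppo_gen, p_ppo_gen = likelihood_ratio_test(LOGLIK_PPO, LOGLIK_GEN, df=8)

print(f"PO vs PPO:  LR = {stat_po_ppo:.2f}, p = {p_po_ppo:.4f}")
print(f"PPO vs Gen: LR = {stat_ppo_gen:.2f}, p = {p_ppo_gen:.4f}")
```

With these placeholder values the first test would reject the PO model in favour of the PPO model, while the second would not reject the PPO model in favour of the general model, which is the pattern described above.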
Table 6 shows the goodness of fit for three statistical tests. The goodness of fit of a statistical model describes how well it fits a set of observations. According to Table 4, the null hypothesis is rejected for all tests, thus the PPO model fits the observations well (p-value < 0.0001). Also, Table 8 shows the results of maximum likelihood estimation and marginal effect analysis for the explanatory variables. This table shows that roads with low speed limits increase the probability of fatal or serious vehicle-pedestrian crashes at mid-blocks more than roads with high speed limits. In addition, roads with speed limits higher than 70 km/h increase the probability of severe pedestrian crashes. Roads with speed limits under 60 km/h are local roads, and on these roads drivers drive at speeds higher than the speed limit. Furthermore, on these roads people jaywalk more often to cross the road, which increases the probability of mid-block crashes. Moreover, on roads with high speed limits, vehicles move faster and the braking distance is longer than on roads with low speed limits. In addition, crash intensity is higher for vehicles travelling at high speed. Thus, the severity of pedestrian crashes on roads with high speed limits is usually high.
Table 8 reveals that daylight conditions decrease the severity of pedestrian crashes (marginal effect -0.06). At night, traffic is lighter and speeds are higher than during the day. In addition, darkness can affect the vision and reaction of pedestrians and drivers. Therefore, vehicle-pedestrian crashes at night are more severe. Furthermore, the results indicate that pedestrian gender and age can be influential parameters in pedestrian crashes. Table 8 shows that the probability of a severe crash is lower for female pedestrians (marginal effect -0.02). Usually females drive carefully and at lower speeds than males, thus the crash severity of female drivers could be lower than that of male drivers. Against vehicle crash could be another reason for high severity crash of these pedestrians. Table 8 reveals that pedestrian crashes with light vehicles (e.g. passenger cars or vans) and heavy vehicles (e.g. trailers or trucks) are more severe, and the probability of these crashes being severe is higher than for other types of vehicles. Light vehicles usually travel at speeds close to the speed limit, therefore a pedestrian-vehicle crash in this situation is more likely to be severe. Furthermore, braking distance and crash intensity could be the reasons for this level of severity in heavy vehicle-pedestrian crashes.
The results of the maximum likelihood estimation analysis show that the social characteristics of suburbs can be influential variables in pedestrian crashes. Culture can affect walking and traffic behaviour. People's traffic behaviour may vary among cultures, and this study is consistent with previous research showing that culture and family can be important factors influencing traffic crashes [27,28].
Furthermore, this result shows that land use is a significant variable in pedestrian crashes at mid-blocks. Table 8 indicates that commercial land use increases the likelihood that an accident is a severe crash. In this land use, pedestrians jaywalk and pay less attention to traffic conditions when they want to cross the road. This lack of attention to traffic conditions and vehicles could be one reason for the increased severity of pedestrian crashes in these areas.
Finally, this table reveals that a suburb's population density is a significant explanatory variable in pedestrian crashes at mid-blocks. Since this variable was significant in the Linear Hypothesis Results (Pr less than 0.1), it is allowed to vary across the response functions in the PPO model. This means that the impacts of this variable on fatal and serious crashes were different. Thus, there are two probability results for population density. Table 5 shows that population density has no significant impact on fatal crashes; however, increasing population density in suburbs can increase the probability of serious injury crashes for pedestrians at mid-blocks. It is clear that increasing the population in suburbs increases traffic and the probability of pedestrian crashes; however, this density can decrease vehicle speeds and the probability of fatal crashes.

Conclusion
The PPO model is a type of logistic regression that allows predictor variables that meet the proportional odds assumption to affect different levels of the response variable with the same magnitude, while predictor variables that do not meet the proportional odds assumption can have different effects on different levels of the response variable. To estimate the effects of explanatory variables on pedestrian crash severity at mid-blocks, pedestrian crashes on public roadways in the Melbourne metropolitan area from 2004 to 2013 were used in this research. Crash severity (fatal, serious injury and other injury crashes) was the target variable, and 38 different socio-economic variables (such as population, income, religion and occupation), environment variables (such as light condition, land use and surface condition), traffic and road characteristics (including road slope, vehicle type, traffic volume and distance from public transport stops), personal characteristics (such as age and gender) and temporal variables (such as the time, day and date of the crashes) were used to develop the model. Results of the proportional odds assumption analysis indicated that a culture variable and population density in suburbs reject the proportional odds assumption. Thus, in the final model these two variables were allowed to vary across the model response. This study showed that speed, light condition, pedestrian gender, pedestrian age, vehicle type, traffic volume, culture, distance of public transport stops to the location of accidents, and suburb population density were the variables that significantly improved the model fit. However, in the maximum likelihood estimation analysis, speed, light condition, pedestrian characteristics, culture, distance of accidents to public transport, land use and population density were the influential variables. This research indicated that the risk of severe crashes on roads with high speed limits, and in day conditions, is higher than on other roads. Furthermore, the personal characteristics of pedestrians had significant impacts on crash severity. This study showed that pedestrians under 15 years old are more likely to be killed or harmed in crashes. In addition, this study revealed that commercial areas increase the probability of a severe crash for pedestrians at mid-blocks. Also, population density in suburbs had two different impacts in the crash severity model. Increasing population density increased the probability of serious crashes; however, this variable decreased the likelihood of fatal pedestrian crashes. This study showed the ability of the PPO model in crash severity analysis for pedestrian crashes. However, it would be useful to compare the results of this model with those of other models and techniques.

Table 1. Time and personal characteristics variables

Table 2. Traffic, road, environmental and social variables

Table 3. Descriptive statistics for continuous variables
a The percentage of each social variable is extracted for each suburb

Table 7 shows the hypothesis tests for all variables in the model individually. The chi-square test statistics and associated p-values shown in the table indicate that light condition, pedestrian gender, pedestrian age, vehicle type, traffic volume, percentage of Islamic affiliation in suburbs, distance of public transport stops to the location of accidents, and suburb population density significantly improved the model fit.

Table 4. Likelihood ratio for the three models (Gen, PO and PPO)

Table 5. Model fit statistics

Table 6. Hypothesis test for each variable in the model

Table 7. Hypothesis test for each variable in the model (continued)

Table 8. Maximum likelihood estimation and marginal effect analysis