Crossing Behaviour of Pedestrians Along Urban Streets in Malaysia

Road crossings are an unavoidable part of walking, in which the desired route of pedestrians intersects with vehicle traffic. These interactions may expose pedestrians to risk or delay. In Malaysia, road accident statistics show that pedestrian casualties are fairly high, and inappropriate gap acceptance when pedestrians cross roads is a main contributing factor. In this context, the purpose of this study was to develop realistic regression models of pedestrian road crossing behaviour at mid-block street crossings. A choice model based on discrete choice theory was produced to capture the pedestrians' decision to reject or accept a vehicular gap. Gap acceptance data under real mixed traffic conditions were collected by video camera on a typical unsignalised two-lane, one-way urban street section in the city center of Kuala Lumpur, Malaysia. The lognormal regression model developed for crossing behaviour shows that traffic speed, pedestrian waiting time, gender, crossing distance, age group, frequency of attempts and pedestrian number are significant factors at the 0.05 significance level and together explain 77.0% of the variance in accepted gap size. Higher traffic speed, shorter waiting time, being female, a longer crossing distance, an older age group, fewer attempts and a larger number of pedestrians were found to lead pedestrians to accept a larger gap. The binary logistic regression developed for the crossing choice model was found to be influenced by traffic speed, driver yield, pedestrian number and age group. Lower traffic speed, willingness of drivers to slow down, more pedestrians crossing at the same time and a younger age group lead to a higher probability of crossing the road.
The model was validated using 100 separate samples, giving an accuracy of 98%, compared with 98.9% for the calibrated model.


Introduction
The concept of "pedestrian" can have many different meanings. The Concise Oxford dictionary defines it as "a person walking in the street and not travelling in a vehicle".
Users of crossing facilities are at risk whenever they make their daily trips. Crossing facilities, roadway designs and traffic control devices must therefore take safety into account to protect pedestrians [1]. In developing countries, pedestrians face difficulty when attempting to cross the street because of drivers' behaviour towards them. Pedestrians usually have to wait a long time for a driver who voluntarily stops to let them cross safely; otherwise they are forced to wait until there are no more vehicles. This may be caused by a lack of awareness of the right-of-way rule in such situations [2]. Pedestrians' choice of suitable gaps in the vehicular stream depends on their demographics, on pedestrian and vehicle behavioural characteristics and on street geometry [3]. Identifying safe gaps between passing cars when crossing a road is a task most pedestrians accomplish successfully on a daily basis [4]. In previous decades, pedestrian behaviour has been placed in the spotlight. Researchers across the world have studied, over different periods, various aspects of the minimum vehicular gap accepted by pedestrians who intend to cross streets at mid-block [5]-[8]. The distance between pedestrians and motor vehicles appears to have the greatest effect on the minimum gap accepted by pedestrians. Moreover, an increase in traffic volume leads to smaller accepted gaps. These gaps are often described by means of linear regression modelling or by probability distributions. Indicatively, the minimum accepted gap has been estimated at 2 seconds and the mean accepted gap at 8 seconds [17].

Literature overview
Many past studies have investigated the behaviour of pedestrians, particularly those crossing the street, and have identified several factors that influence it. These factors can be categorised as pedestrian, road environment and traffic factors. Delays due to long waiting times significantly affect pedestrian behaviour. Pedestrians have a higher tendency to jaywalk after a long wait, because they lose patience while waiting for a safe gap. Instead of waiting for safer gaps (a greater loss of time), a pedestrian may opt to use rolling gaps across several lanes [10], [11]. The time spent waiting for safer gaps depends on whether the pedestrian is alone or in a group. Pedestrians tend to cross illegally (cross on red) if a member of the group initiates the illegal crossing. Male pedestrians were observed to be more likely to cross on red than females [12]. The number of pedestrians waiting in a group also affects the behaviour of pedestrians in that group: smaller groups tend to make more illegal crossings than larger ones [13]. When accepting gaps to cross, each pedestrian has a different perception of which gap is safest. Physical characteristics affect movement, i.e. walking speed. Taller pedestrians accept smaller gaps than shorter pedestrians because they are generally able to walk faster [1].
The distance between vehicles and pedestrians has a significant effect on the acceptance of safe gaps to cross [7]. Interestingly, male pedestrians made the most inaccurate choices, accepting smaller gaps to cross than their female counterparts [14]. Pedestrians' decisions on whether to cross the road, when to stop or wait, and where to cross are very complex. These decisions are influenced by many factors such as comfort level, convenience of crossing and safety. Pedestrian flow, traffic volume, footpath width, road width, surface condition and walkway obstructions were found to significantly affect pedestrian comfort and safety [15].
Pedestrian behaviour is not constant. It often changes according to the surroundings, and pedestrians naturally adapt to their surroundings [16]. The mean gap accepted by pedestrians to make a crossing is estimated to be 8 seconds [17].

Methodology
Jalan Sultan Ismail, located in the center of Kuala Lumpur, Malaysia, was chosen because of its considerable volume of vehicles and pedestrians and the high level of interaction observed between traffic and pedestrians at the site. Pedestrian crossing behaviour was recorded in real traffic situations using a high-resolution video camera. Data analysis was conducted using Multiple Linear Regression and Binary Logit modelling, computing statistics that include the standard error, t-value and p-value. The factors considered include pedestrian waiting time, traffic waiting time, the gap sizes rejected or accepted by pedestrians, the related number of crossing attempts and each vehicle's speed, in addition to individual pedestrian characteristics such as gender. Illegal parking occurred frequently on Jalan Sultan Ismail, and the presence of illegally parked vehicles was captured on video during data collection. Vehicle types in the traffic stream were classified as Malaysian standard cars, motorcycles, vans/medium vehicles and heavy vehicles.

Data analysis
This section explains the structure of the analysis as well as the variables involved in the study. The study data were obtained from recordings of road crossings, and the observed variables are first analysed descriptively. These variables include traffic speed, waiting time, frequency of attempt, driver yield behaviour, crossing distance of the road, size of vehicle, occurrence of illegal parking, pedestrian gender and age group. Next, the road crossing behaviour of pedestrians was analysed and modelled using a lognormal regression model, which explains the accepted gap size for road crossing on the basis of the predictors mentioned above. Subsequently, the pedestrian crossing choice was analysed in order to develop a crossing choice model using binary logistic regression, with 78% of the sample used for model calibration and 22% for model validation. In addition, the relationships between the predictors and the dependent variables (log gap size and gap acceptance) are presented visually to ease understanding.
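The calibration/validation split described above can be sketched as follows. This is a minimal illustration only: the function name `split_sample` and the fixed random seed are assumptions, and the paper does not state how the random split was actually drawn.

```python
import random


def split_sample(n_total=448, n_validation=100, seed=0):
    """Randomly partition sample indices into calibration and validation sets."""
    indices = list(range(n_total))
    rng = random.Random(seed)  # fixed seed (an assumption) so the split is repeatable
    rng.shuffle(indices)
    validation = sorted(indices[:n_validation])
    calibration = sorted(indices[n_validation:])
    return calibration, validation


calibration_idx, validation_idx = split_sample()
print(len(calibration_idx), len(validation_idx))  # 348 100
```

With 448 samples, holding out 100 leaves 348 for calibration, which is the approximately 78%/22% division reported in the study.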
There are 13 recorded variables, as presented in Table 1, of which 2 are dependent variables and 11 are independent variables or predictors. Gap size is a continuous dependent variable that measures the gap, in seconds, between two incoming cars at the crossing spot, while gap acceptance is a discrete dependent variable that records whether a pedestrian chooses to cross the road or not (0: Reject; 1: Accept). Traffic speed is a continuous independent variable that measures the speed of a vehicle at the crosswalk area in km/h. Waiting time is the time, in seconds, that pedestrians spend waiting at the crossing area until they cross the road. Frequency of attempt is a continuous independent variable that records the number of attempts made by pedestrians until they cross the road. Driver yield is a discrete independent variable that shows whether the driver of an approaching car slows down (value of 1) or does not slow down (value of 0). The crossing distance of the road is a continuous independent variable measured in metres; the accepted gap size may be influenced by the crossing distance. Vehicle size is a discrete independent variable, with a value of 0 assigned to small vehicles and a value of 1 to big vehicles. Illegal parking is another discrete independent variable that records the presence of illegally parked cars along the roadside, with a value of 0 when there is no illegally parked car and a value of 1 when there is one. Other discrete independent variables include pedestrian gender (0: Male, 1: Female), gap type, which measures the distance between pedestrians and the gap (0: Near, 1: Far), and pedestrian age (1: Young, 2: Middle, 3: Elder). Lastly, pedestrian number refers to the number of pedestrians waiting in a group to cross.
According to the literature, these factors are considered the most significant ones influencing pedestrian crossing behaviour at mid-block [9].

.1 Descriptive analysis of variables
In this study, 448 samples were collected at Jalan Sultan Ismail, Kuala Lumpur, and the descriptive statistics for the discrete and continuous variables are presented in Table 2 and Table 3. Based on

.2 Pedestrian behavioural model
A multiple linear regression (MLR) model was used to evaluate the accepted gap size for pedestrians. A lognormal regression model was chosen because it assumes that the logarithm of the dependent variable is normally distributed, which was expected to fit the logarithm of the gaps well. Because the model was meant to evaluate the accepted gap size, samples in which the pedestrian did not cross were filtered out before the analysis; only samples in which the gap was accepted (the pedestrian crossed the road) were used. The lognormal regression model was fitted using the stepwise method, which retains only the significant variables, and the results are summarised in Table 4. Based on the table, an F value of 161.352 with a p value close to 0 shows that the model is significant. Moreover, an R square value of 0.770 indicates that the 7 predictors explain 77.0% of the variance in log gap size. The variance inflation factor (VIF) values were lower than 10, suggesting that there are no multicollinearity issues in the model. Pedestrian waiting time and frequency of attempt showed negative regression weights, indicating that both variables have an inverse relationship with accepted gap size: the longer the pedestrian waiting time and the more attempts made, the shorter the gap that pedestrians are willing to accept. Referring to the B values of -0.010 and -0.028, a unit increase in waiting time and in the number of attempts decreases the log of the accepted gap size by 0.010 and 0.028 respectively. In short, the longer pedestrians wait or the more attempts they make, the more likely they are to take a risk by accepting a shorter gap.
In addition, traffic speed, pedestrian gender, crossing distance, age group and pedestrian number showed a positive effect on the log accepted gap size, with positive regression weights, B. A positive coefficient indicates that an increase in the predictor increases the log accepted gap size. For the continuous variables, each unit increase in the predictor increases the log of the accepted gap size by 0.013 (traffic speed), 0.036 (crossing distance) and 0.016 (pedestrian number) respectively. As for the discrete or categorical variables (refer back to the coding), a positive coefficient for pedestrian gender suggests that female pedestrians tend to accept longer gaps than male pedestrians, and a positive coefficient for age group indicates that older pedestrians tend to accept longer gaps.
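Because the coefficients act on the logarithm of the gap, their effect on the gap itself is multiplicative rather than additive. The sketch below converts the B values quoted above into approximate percentage changes in the accepted gap per unit increase of each predictor. It assumes a natural logarithm, which the text does not state; the `log_base` parameter and the dictionary keys are illustrative names, not the authors' notation.

```python
import math

# Unstandardised coefficients (B) quoted in the text for the continuous predictors.
COEFFICIENTS = {
    "traffic_speed_kmh": 0.013,
    "crossing_distance_m": 0.036,
    "pedestrian_number": 0.016,
    "waiting_time_s": -0.010,
    "frequency_of_attempt": -0.028,
}


def percent_change_in_gap(b, log_base=math.e):
    """Percentage change in the accepted gap for a one-unit predictor increase.

    If log(gap) changes by b, the gap is multiplied by log_base ** b.
    The natural-log default is an assumption; pass 10 for a base-10 model.
    """
    return (log_base ** b - 1.0) * 100.0


for name, b in COEFFICIENTS.items():
    print(f"{name}: {percent_change_in_gap(b):+.2f}% per unit")
```

For small coefficients under a natural-log model, the percentage change is close to 100 times B, so a B of 0.013 corresponds to roughly a 1.3% longer accepted gap per additional km/h.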
In regression modelling, it is important to assess the relationship strength of predictors to understand their impact on the dependent variable (log gap size). The absolute value of the standardised regression weight (BETA) was examined, and traffic speed and pedestrian waiting time showed a relatively strong relationship with log gap size, with absolute BETA values of 0.373 and 0.247 respectively. These are followed by pedestrian gender and age group, which have a moderate impact on log gap size. Frequency of attempt and pedestrian number are both relatively weak in terms of their relationship with log gap size. Note that lognormal regression assumes a normal distribution for the logarithm of the dependent variable. The general model framework is given below:

$$\text{Log-Gap} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n \qquad (1)$$

where Log-Gap = logarithm of accepted gaps; $X_1, \dots, X_n$ = explanatory variables; $\beta_1, \dots, \beta_n$ = parameters estimated from the model; $\beta_0$ = constant.

To better illustrate the relationships, the effect of each predictor was modelled using the regression model obtained earlier while the other variables were held constant at their corresponding mean values (Table 5). For example, to illustrate the relationship between traffic speed and gap size, the mean values of the other variables in Table 5 were substituted into Equation 1. The regression equation for each variable is presented in Table 6.

Table 5. Mean statistic for variables

Table 6. Relationship between independent variables and the dependent variable (gap size)
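The procedure of varying one predictor while holding the others at their means can be sketched as below. The two coefficients are the B values quoted earlier for traffic speed and waiting time, but the intercept and the mean values are hypothetical placeholders for illustration only; they are not the Table 4 or Table 5 values, and the function name `partial_relationship` is likewise an assumption.

```python
def partial_relationship(intercept, coefficients, means, varied, values):
    """Predicted log-gap as one predictor varies and the rest stay at their means."""
    # Contribution of every predictor except the varied one, fixed at its mean.
    baseline = intercept + sum(
        b * means[name] for name, b in coefficients.items() if name != varied
    )
    return [baseline + coefficients[varied] * v for v in values]


# Hypothetical intercept and means, for illustration only.
coefficients = {"traffic_speed": 0.013, "waiting_time": -0.010}
means = {"traffic_speed": 40.0, "waiting_time": 10.0}
curve = partial_relationship(0.5, coefficients, means, "traffic_speed", [20, 40, 60])
print(curve)
```

Each entry of `curve` is a point on a straight line in log-gap whose slope is the coefficient of the varied predictor, which is the form a single-predictor regression equation takes once the other terms are folded into the constant.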

.3 Choice behaviour model
A Binary Logit Model was employed to analyse the choice opportunities for pedestrian road crossing behaviour at an uncontrolled mid-block location. The data were randomly divided into approximately 78% (n=348) and 22% (n=100): 78% of the data were used to analyse and calibrate the Binary Logit model, while 22% were used to validate the prediction model. The probability of selecting an alternative (accept/reject) is based on a linear combination function (utility function) expressed as:

$$U_i = C + \sum_{j=1}^{n} \beta_j X_j$$

where $U_i$ = the utility of choosing alternative $i$, $i$ = the alternative (accept/reject), $n$ = number of independent variables, $X_j$ = the independent variables, $C$ = constant, and $\beta_j$ = coefficients. The utility of alternative $i$ has to be transformed into a probability in order to predict whether a particular alternative will be chosen or not. The probability of choosing alternative $i$, in this case crossing the road, is calculated using the following function:

$$P(i) = \frac{e^{U_i}}{1 + e^{U_i}}$$

From the regression weight results in Table 6, traffic speed and age group produced negative coefficients for the decision of road crossing or gap acceptance (0: reject, 1: accept). The negative B value for traffic speed shows that if vehicles approach at higher speed, pedestrians are more likely to reject the gap or choose not to cross than when vehicles approach at lower speed. For age group (1: young, 2: middle, 3: elder), a negative B value means that pedestrians who look older are more likely to reject a gap than pedestrians who look younger or more energetic. Driver yield and pedestrian number (in the group) have a significant positive effect on pedestrian crossing choice. Driver yield (0: no, 1: yes) occurs when drivers slow down their vehicle, and a positive B value indicates that if a driver slows down, pedestrians are more likely to cross the street.
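The transformation from utility to probability described above can be sketched as follows, using the standard binary logit (logistic) form. The constant and coefficient values in the usage example are hypothetical and only follow the signs reported in the text; they are not the calibrated estimates, and the function and key names are illustrative.

```python
import math


def utility(constant, coefficients, values):
    """Linear utility U = C + sum(beta_j * x_j)."""
    return constant + sum(coefficients[k] * values[k] for k in coefficients)


def crossing_probability(u):
    """Logistic transform of the utility into P(accept), written to avoid overflow."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


# Hypothetical parameters: negative for speed and age group, positive for the rest.
beta = {"traffic_speed": -0.8, "driver_yield": 7.0, "pedestrian_number": 2.8, "age_group": -6.9}
case = {"traffic_speed": 37.0, "driver_yield": 1, "pedestrian_number": 2, "age_group": 1}
u = utility(30.0, beta, case)
print(round(crossing_probability(u), 4))
```

Both branches of `crossing_probability` compute the same quantity, $e^{U}/(1 + e^{U})$; the split only keeps the exponent non-positive so that very large utilities do not overflow.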
Similarly, a positive B value for pedestrian number shows that pedestrians moving in a larger group are more likely to cross the street than those crossing alone. This could be due to the fear of crossing alone or to the fact that drivers tend to slow down for a larger group of pedestrians, which promotes higher gap acceptance. The exponentiated B values, Exp(B), are odds ratios and are therefore interpreted here in terms of the odds, rather than the probability, of crossing. If traffic speed decreases by 1 unit (1 km/h), the odds of pedestrians crossing the road increase by a factor of about 2.3. If drivers slow down (driver yield), the odds of crossing increase by a factor of about 1700, which makes driver yield a very strong predictor in the Crossing Choice Model. Each additional person in the pedestrian group increases the odds of gap acceptance by a factor of about 17. Lastly, if the pedestrian age group decreases by one category, for instance from the elderly group to the middle-aged group, the odds that a pedestrian will cross or accept the gap increase by a factor of about 1000.
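In a binary logit model, the exponentiated coefficient Exp(B) is an odds ratio, so the resulting change in probability depends on the starting probability. The short sketch below illustrates this for the factor of 2.3 associated with a 1 km/h decrease in traffic speed; the baseline probabilities are arbitrary illustrative values and the function name is an assumption.

```python
def probability_after_odds_ratio(p_before, odds_ratio):
    """Apply an odds ratio to a baseline probability and return the new probability."""
    odds_before = p_before / (1.0 - p_before)
    odds_after = odds_before * odds_ratio
    return odds_after / (1.0 + odds_after)


# Arbitrary baseline probabilities, chosen only to show the non-linear effect.
for p in (0.05, 0.30, 0.80):
    print(p, round(probability_after_odds_ratio(p, 2.3), 3))
```

A baseline crossing probability of 0.30, for example, rises to about 0.50 rather than to 0.69, because the multiplier applies to the odds and the probability is bounded by 1.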
With the calibrated model, this study validates the model by calculating the probability of crossing as follows:

$$P(\text{cross}) = \frac{e^{U}}{1 + e^{U}}, \qquad U = C + \beta_1 (\text{traffic speed}) + \beta_2 (\text{driver yield}) + \beta_3 (\text{pedestrian number}) + \beta_4 (\text{age group})$$

where $C$ and $\beta_1, \dots, \beta_4$ are the calibrated constant and coefficients. If the calculated probability of crossing is greater than 0.5, the sample is classified or predicted as choosing to cross the road. The predictive accuracy of the model is summarised in Table 8. Based on the table, the calibrated model showed an overall accuracy of 98.9%, while the validation sample showed an accuracy of 98%. This shows that the model can be used to predict the crossing choice of pedestrians based on 4 significant predictors, namely traffic speed (of the vehicle), driver yield (slow down), pedestrian number (number in the group crossing) and physical age group (by observation).
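The classification rule and the accuracy measure described above can be sketched as follows. The probabilities and observed choices in the usage example are made-up toy values, not data from the study, and the function names are illustrative.

```python
def classify(probabilities, threshold=0.5):
    """Predict 1 (cross) when the probability is strictly greater than the threshold."""
    return [1 if p > threshold else 0 for p in probabilities]


def accuracy(predicted, observed):
    """Share of samples whose predicted choice matches the observed choice."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    correct = sum(1 for a, b in zip(predicted, observed) if a == b)
    return correct / len(observed)


# Toy values for illustration only.
toy_probabilities = [0.9, 0.2, 0.6, 0.4, 0.5]
toy_observed = [1, 0, 1, 1, 0]
toy_predicted = classify(toy_probabilities)
print(toy_predicted, accuracy(toy_predicted, toy_observed))  # [1, 0, 1, 0, 0] 0.8
```

Under this measure, the reported 98% accuracy on the 100 validation samples corresponds to 98 of them being classified correctly.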
When the predictors were altered one at a time, the remaining predictors were held constant at 37 km/h for traffic speed, 0 for driver yield, 1 for pedestrian number and 3 for age group. Based on Table 7, traffic speed and age group have an inverse effect on the crossing probability, while driver yield and pedestrian number have a direct effect on the pedestrian crossing probability.

Summary
This study found seven factors that influence pedestrian behaviour at Jalan Sultan Ismail, Kuala Lumpur, in terms of accepted gap size in seconds. Traffic speed, pedestrian waiting time, pedestrian gender, crossing distance, age group, frequency of attempts and pedestrian number were found to have a significant impact on the gap size accepted by pedestrians. Pedestrian waiting time and frequency of attempt are the two variables that showed a negative relationship with log gap size, while the rest showed a positive relationship. In short, a pedestrian who is female, is older, crosses in a larger group and experiences higher traffic speed, a shorter waiting time, a longer crossing distance and fewer attempts is more likely to accept a longer gap. In terms of crossing choice, traffic speed, driver yield, pedestrian number and age were found to influence pedestrian choice. The results also showed that lower traffic speed, driver yield (slowing down), crossing in larger groups and younger age promote a higher probability of crossing the road.