The application of random regret minimization on commuter's mode choice behaviour: model-fit comparisons with RUM modelling (case: comparison between Matsuyama and Yogyakarta)

Modelling the mode choice behaviour of travellers is key to designing effective transport management policies, particularly for shifting travellers to public transport. Many studies have analysed the impact of level of service on mode choice preferences through Random Utility Maximization (RUM), but the possibility that travellers minimise regret has been overlooked. This paper discusses the possibility of using the generalised Random Regret Minimization (G-RRM) model for the choice of transportation mode. The study is performed in two cities for comparison: Yogyakarta in Indonesia and Matsuyama in Japan. A stated preference (SP) survey is conducted, in which respondents choose Bike or Bus under hypothetical situations. The RUM results reveal that travellers prefer the transportation mode with the more assured level of service. As an empirical proof of concept, the G-RRM model is estimated on a stated mode choice dataset, and its outcomes are compared with those of its RUM and RRM counterparts.


Introduction
Many modern cities are facing traffic problems due to rapid urbanisation. A growing population increases mobility demand, which results in transportation problems. In developing Asian cities in particular, traffic problems have existed for many years. Growing economies have enabled residents to own private vehicles, such as motorbikes and cars. This change makes individuals more dependent on private modes of travel and causes serious traffic jams on a daily basis. Traffic congestion leads to disrupted traffic conditions, such as traffic accidents, and to delays in the mobility of people and goods, which cause huge economic losses every year.
Public transportation could offer an outstanding solution to the above problems by attracting the users of private vehicles to mass transportation modes. However, the benefit of public transportation systems is not easily attained. In general, private vehicles are the more attractive mode of transportation because of their comfort and speed. To devise an effective public transportation policy, it is essential to understand the relationship between travellers' mode choice decisions and the factors affecting them.
Many studies have analysed the impact of level of service on mode choice preferences using Random Utility Maximization (RUM). However, the possibility that travellers minimise regret has been overlooked. To the best of the authors' knowledge, Chorus et al. [6] have empirically compared the application of RUM and Random Regret Minimization (RRM) in consumer choice modelling. They identified that RUM and RRM perform equally well in some respects, but that a compromise effect is possible when using RRM. Although that study presents promising findings, it does not quantitatively evaluate the impact of using RRM when modelling consumer choices.

Yogyakarta, Indonesia
Yogyakarta is one of the fastest-growing cities in Indonesia, and its mobility demand is increasing dramatically. To meet this demand, the government introduced a public transportation system, the Transjogja Bus, in 2008. However, since its start, the number of passengers has been decreasing [2]. According to the Department of Transportation of Yogyakarta, the designed time headway of the Transjogja Bus is 16 minutes, and the targeted punctuality tolerance is 5 minutes at an average speed of 25 km/hour. In reality, many scheduled services fall outside this tolerance because of obstacles in the road lane, which drives passengers away.

Matsuyama, Japan
Matsuyama, as the prefectural capital, has a well-developed transport network. It has a network of suburban railways run by the Iyo Railway. Iyo Railway also operates a system of trams and buses, which serve as the main modes of public transportation in the city. The public transportation network in Matsuyama is similar to that in Yogyakarta. Nevertheless, the majority of travellers tend to use private vehicles, such as bicycles, motorbikes or private cars.

Questionnaire individual attribute
The first section asks for the individual attributes of the respondents. The questions are designed to capture their socio-demographic characteristics, daily travel behaviours and past experiences in using public transportation. The list of questions includes the following:
1. Age
2. Gender
3. Occupation
4. Disposable income per month

The questions related to travel behaviours cover the daily commuting characteristics and past experiences, as listed below: Frequency of traffic accidents while using the motorbike (Acc.)

Experimental design
Scenarios are created by assigning two levels of values to each of the six attributes. According to Rushing, H., Karl, A., et al. [8], since each of the six attributes has two levels, there are 64 scenarios ($2^6$) in total. To ease the answering load for respondents, the number of scenarios is reduced using a fractional factorial design $2^{k-p}$ [3], where $k$ represents the number of attributes. The fractional factorial design is constructed by first listing a basic design consisting of the full factorial in $k - p = 4$ attributes. The remaining $p = 2$ attributes are then added. Thus there are 16 scenarios ($2^{6-2}$).
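As an illustration of the construction described above, the following minimal sketch builds a 16-run two-level design for six attributes. The generators E = ABC and F = BCD are an assumption (a common textbook choice for this design size); the source does not state which generators were used, and the attribute labels A to F are placeholders.

```python
from itertools import product


def fractional_factorial_2_6_2():
    """Build a 16-run two-level design for six attributes (coded -1 / +1).

    The basic design is the full factorial in four attributes (A, B, C, D).
    The remaining two attributes are added through generators.
    ASSUMPTION: E = A*B*C and F = B*C*D; the paper does not give its generators.
    """
    runs = []
    for a, b, c, d in product((-1, 1), repeat=4):
        e = a * b * c  # hypothetical generator for the fifth attribute
        f = b * c * d  # hypothetical generator for the sixth attribute
        runs.append((a, b, c, d, e, f))
    return runs


design = fractional_factorial_2_6_2()

if __name__ == "__main__":
    for number, run in enumerate(design, start=1):
        print(number, run)
```

Each row is one scenario; a value of -1 or +1 selects the low or high level of the corresponding attribute. With these generators every attribute appears at each level in eight scenarios, and any two attribute columns are orthogonal.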

The RUM model
According to Hensher, D.A. [7], logistic regression is the form of regression analysis used when the dependent variable is dichotomous, that is, when it represents whether an event occurs or not. The regression expresses the dependent variable through a linear combination of the independent variables. This linear function, the utility function, is then transformed into a probability using the logit function. To predict whether an alternative is chosen under the model, the value of its utility function is contrasted with that of the alternative choice and transformed into a probability between 0 and 1.
Based on Issac K. et al. [3], citing Ben-Akiva et al. [4], the model assumes that respondents choose the travel mode with the greatest utility under the given conditions, and that the utility function is composed of a fixed term and a random term. The formulas are as follows:

$$U_i = V_i + \varepsilon_i \qquad (1)$$

$$V_i = \sum_{m=1}^{M} \beta_m X_{im} \qquad (2)$$

where $U_i$ is the utility function of travel mode $i$ chosen by the respondent; $V_i$ is the fixed term of the utility function; $\varepsilon_i$ is the random term of the utility function; $M$ is the number of attributes (also called characteristic variables) affecting the mode choice of respondents; $\beta_m$ are the undetermined coefficients; and $X_{im}$ is the factor of the travel mode chosen by the respondent. If the random term follows the Gumbel distribution and all variables are independent of each other, the probability of the respondent choosing travel mode $i$, $P_i$, is defined by the following equation:

$$P_i = \frac{\exp(V_i)}{\sum_{j} \exp(V_j)} \qquad (3)$$
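The logit probability described above can be sketched numerically as follows. This is only an illustration under stated assumptions: the coefficient values and the cost and time levels are hypothetical and are not taken from the survey or from the estimated models.

```python
import math


def utility(beta, x):
    """Fixed (deterministic) utility: linear combination of attribute values."""
    return sum(b * value for b, value in zip(beta, x))


def logit_probabilities(utilities):
    """Multinomial logit probabilities for a list of fixed utilities."""
    highest = max(utilities)  # subtract the maximum for numerical stability
    exps = [math.exp(v - highest) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]


# HYPOTHETICAL coefficients for (travel cost, travel time); not estimates from the paper.
beta = (-0.02, -0.05)
# HYPOTHETICAL attribute levels (cost, time) for the two modes.
bus = (10.0, 30.0)
bike = (15.0, 20.0)

v_bus = utility(beta, bus)
v_bike = utility(beta, bike)
p_bus, p_bike = logit_probabilities([v_bus, v_bike])

if __name__ == "__main__":
    print(f"V_bus={v_bus:.2f}, V_bike={v_bike:.2f}")
    print(f"P_bus={p_bus:.4f}, P_bike={p_bike:.4f}")
```

With two alternatives the expression reduces to the binary logistic form, in which the probability of Bus depends only on the utility difference between Bus and Bike.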

The RRM model
The RRM model has been designed to incorporate the notion of regret-based decision making in non-risky choice models. The RRM model hypothesizes that, when confronted with a choice set, the decision-maker chooses the alternative from the set that has minimum regret. The regret of alternative i is described by the sum of binary regrets, where alternative i is compared to every other alternative in the choice task on each attribute. Overall regret increases with the number of attributes on which alternative i is outperformed, with the number of alternatives by which alternative i is outperformed (as denoted by the summation over j ≠ i), and with the importance of the attribute (as denoted by βm).

$$RR_i = R_i + \varepsilon_i = \sum_{j \neq i} \sum_{m} \ln\left(1 + \exp\left[\beta_m \left(X_{jm} - X_{im}\right)\right]\right) + \varepsilon_i \qquad (4)$$

RRi: denotes the random (or total) regret associated with a considered alternative i
Ri: denotes the 'observed' regret associated with i
εi: denotes the 'unobserved' regret associated with i
βm: denotes the estimable parameter associated with attribute Xm
Xim, Xjm: denote the values associated with attribute Xm for, respectively, the considered alternative i and another alternative j

Results

Attributes of respondents
The socio-demographic attributes of the respondents are summarised in Table 3. In both cities, the majority of respondents are male students aged 18-24 years. The disposable monthly income varies, but lies within the range of an average student in both cities. Among the travel behaviour factors, those that significantly distinguish the Yogyakarta respondents from the Matsuyama respondents are shown in Table 4.

RUM-Model
To create valid input data, a test of significance is conducted to identify the attributes that affect the preference between the choices. The affecting factors are identified with a statistical test, the Wald test, for each variable. In the Wald test, the null hypothesis is $H_0: \beta_m = 0$ (not significantly affecting), while the alternative hypothesis is $H_1: \beta_m \neq 0$ (significantly affecting). According to Ben-Akiva et al. [5], the Wald test with α = 0.05 and DF = 20 has a critical t-value of 1.96. Tables 5 and 6 show the significant affecting factors in both cities.

The goodness of fit also differs between Yogyakarta and Matsuyama because of the limited number of respondents, but the models in both cities fulfil the criteria of the Omnibus test, whose significance is lower than the α-value of 0.05, and of the Hosmer-Lemeshow test, whose value is larger than 0.05, which means that the chosen model fits. The model explains 54.7% of the respondent variance in Yogyakarta, but only 47.0% in Matsuyama. The overall percentage in both cities exceeds the 50% of the null hypothesis: it is 64.8% in Matsuyama and 83.9% in Yogyakarta.

The -2 log-likelihood (-2LL) in Yogyakarta is 686.335. Compared with the chi-critical value (α = 0.05 and DF = 20) of 422.695 (chi-square of the Omnibus test), the -2LL value is larger and lies in the rejection region, which means that H0 is rejected and that the unrestricted model is better than the restricted model. In Matsuyama, the -2LL value of 1000.201 is likewise higher than the chi-critical value (108.835). In both cities, therefore, the unrestricted model is better than the restricted model. The restricted model estimates only the constant parameter and does not use the scenario parameters, whereas the unrestricted model involves the socio-demographic factors, the travel behaviour factors and all scenario attributes.
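The two kinds of test used above can be sketched as follows. This is a hedged illustration rather than the SPSS procedure itself: the Wald statistic is treated as asymptotically standard normal, the chi-square tail probability uses a closed form that holds only for even degrees of freedom, and the standard error in the example is hypothetical. Reading the reported Omnibus chi-square as the drop in -2LL from the restricted to the unrestricted model is also an assumption of this sketch.

```python
import math


def wald_test(coef, std_err):
    """Return the Wald z-statistic and its two-sided p-value (normal approximation)."""
    z = coef / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable; valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(neg2ll_restricted, neg2ll_unrestricted, df):
    """Return the likelihood-ratio statistic and its p-value."""
    statistic = neg2ll_restricted - neg2ll_unrestricted
    return statistic, chi2_sf_even_df(statistic, df)


if __name__ == "__main__":
    # HYPOTHETICAL standard error; the coefficient value is only an example.
    z, p = wald_test(-0.049, 0.010)
    print(f"Wald z={z:.2f}, p={p:.2g}, significant={abs(z) > 1.96}")

    # ASSUMPTION: restricted -2LL = unrestricted -2LL + Omnibus chi-square.
    stat, p_lr = likelihood_ratio_test(686.335 + 422.695, 686.335, 20)
    print(f"LR statistic={stat:.3f}, p={p_lr:.3g}")
```

A coefficient is treated as significantly affecting the choice when the absolute Wald statistic exceeds 1.96, and the unrestricted model is preferred when the likelihood-ratio p-value is below α = 0.05.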

RRM-Model
The respondents' characteristics and the SP survey results are analysed to estimate the mode choice model. The results of the RRM model differ from those of the RUM model because the RRM estimation does not involve the socio-demographic and travel behaviour attributes; only the SP scenario attributes are processed, since the RRM model does not fit the dummy variables of the socio-demographic and travel behaviour attributes. The RRM results from Biogeme consist of the coefficient parameters, t-test statistics and p-values, as mentioned below:

Comparison of the RUM and RRM models
The 0-LL values in Yogyakarta and Matsuyama differ only slightly, but the rho-square values differ considerably between the two cities. The significant factors are the same in Yogyakarta and Matsuyama. In terms of model fit, the RUM model generally fits the data significantly. In the context of the SP survey, the "opt out" or "none of these" option seems to perform better in RUM. The model is also able to provide correct choice probability predictions for the no-choice option.
The RRM model also generally fits the data significantly. In the context of the SP survey, the "I am indifferent" option seems to perform better in RRM models. The model is not able to provide correct choice probability predictions for the no-choice option, but it can capture the compromise effect: it recognizes alternatives with mediocre performance. The model-fit test compares the choices at the actual site with the choices in the observation; the proportions are shown in Table 11 for the Yogyakarta model and in Table 12 for the Matsuyama model. The difference in goodness of fit in Tables 11 and 12 is, as expected, in favour of the RUM model, which is a sign that RRM indeed has some difficulty handling imputed choice sets. In the site comparison, the approaching value for Bus is 82.63% in Yogyakarta and 65.87% in Matsuyama.
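The compromise effect mentioned above can be illustrated with a small numerical sketch. It assumes the classical RRM regret function, in which the observed regret of alternative i is the sum over all other alternatives j and attributes m of ln(1 + exp[βm(Xjm − Xim)]), together with logit choice probabilities on the negative regret. The three alternatives, their attribute scores and the parameter values are hypothetical and are not taken from the survey.

```python
import math


def softmax(values):
    """Logit-type probabilities for a list of scores."""
    highest = max(values)
    exps = [math.exp(v - highest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def rum_probabilities(alternatives, beta):
    """Linear-additive RUM: logit probabilities on the utilities."""
    utilities = [sum(b * x for b, x in zip(beta, row)) for row in alternatives]
    return softmax(utilities)


def rrm_regrets(alternatives, beta):
    """Observed regret of each alternative under the classical RRM form."""
    regrets = []
    for i, x_i in enumerate(alternatives):
        regret = 0.0
        for j, x_j in enumerate(alternatives):
            if j == i:
                continue
            for b, own, other in zip(beta, x_i, x_j):
                regret += math.log1p(math.exp(b * (other - own)))
        regrets.append(regret)
    return regrets


def rrm_probabilities(alternatives, beta):
    """RRM: logit probabilities on the negative regrets."""
    return softmax([-r for r in rrm_regrets(alternatives, beta)])


# HYPOTHETICAL alternatives scored on two attributes (higher is better).
# The middle alternative is the compromise: average on both attributes.
alternatives = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
beta = (1.0, 1.0)  # HYPOTHETICAL attribute weights

p_rum = rum_probabilities(alternatives, beta)
p_rrm = rrm_probabilities(alternatives, beta)

if __name__ == "__main__":
    print("RUM:", [round(p, 4) for p in p_rum])
    print("RRM:", [round(p, 4) for p in p_rrm])
```

In this constructed case all three alternatives have the same utility, so RUM assigns them equal probabilities, whereas RRM assigns the largest probability to the in-between alternative because it is never strongly outperformed on any attribute.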

Conclusion
In a transportation system, every aspect that may affect the preference should be considered. This model has 12 variables: 4 socio-demographic factors (age, gender, occupation, income per month) and 8 travel behaviour factors (daily mode of transportation, travel distance, travel duration, travel cost, vehicle ownership, availability of motorbike, and self-experience of criminality and accidents).
For the RUM and RRM models that involve dummy variables, estimated using SPSS, the factors that significantly affect the preference in Yogyakarta are Travel Cost, Travel Time, Daily Mode of Transportation, Travel Distance, Availability of using Motorbike, Self-experience of Criminality on public transportation and Self-experience of Accident on Private Vehicle. In Matsuyama, the significant factors are Travel Cost, Travel Time, Safety while using the mode of transportation, Travel duration to campus, Private Vehicle Ownership, and Availability of using Motorbike.
The comparison between RUM and RRM using Biogeme shows slight differences in the model-fit parameters and test statistics, and also in the goodness of fit, owing to the ability of RUM to predict the probability of the "no choice" option. The differences might become larger if the attribute levels of the SP survey were more varied and the number of respondents were larger. Further statistical checks, such as comparisons of t-tests and p-values, are needed to determine whether the RRM and RUM models are fit for use in transportation mode choice.
Where, for the Yogyakarta model:
Pbus: Probability that Bus is chosen
TC: Travel Cost of Bus in a Scenario
TT: Travel Time of Bus in a Scenario
Mode_1: Motorbike as the respondent's daily mode of transportation
Dist_2: Daily travel distance of about 3-6 km
Dist_3: Travel distance of more than 6 km
Bike available (yes): The respondent has a motorbike and/or could use it
The experience of traffic accidents
ExCrime: The experience of encountering a crime on public transportation

Based on the tables above, the utility function equation of the binary logistic regression model that suits Matsuyama is:

$$\ln\left(\frac{P_{bus}}{1 - P_{bus}}\right) = -0.015\,TC - 0.049\,TT - 0.521\,Acc + 0.499\,Dur_1 - 0.456\,OwnVeh_1 - 2.101\,BK + 2.923$$

Where:
Pbus: Probability that Bus is chosen
TC: Travel Cost of Bus in a Scenario
TT: Travel Time of Bus in a Scenario
Acc: Number of accidents on private vehicle in the scenario
Dur_1: Travel duration of more than 5 minutes
OwnVeh_1: One vehicle ownership
BK: The respondent has a motorbike and/or could use it
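To show how the fitted Matsuyama equation translates into a choice probability, the sketch below evaluates it for one scenario. It assumes the equation is read as the usual binary-logit linear predictor, so that the probability of Bus is the logistic transform of the right-hand side. The coefficients are those reported above; the scenario values and their units are hypothetical, since the attribute levels are not reproduced here.

```python
import math

# Coefficients as reported for the Matsuyama model.
COEFFICIENTS = {
    "TC": -0.015,
    "TT": -0.049,
    "Acc": -0.521,
    "Dur_1": 0.499,
    "OwnVeh_1": -0.456,
    "BK": -2.101,
}
INTERCEPT = 2.923


def bus_probability(attributes):
    """Probability that Bus is chosen, given a dict of attribute values.

    ASSUMPTION: the fitted equation is the logit (log-odds) of choosing Bus.
    Attributes missing from the dict are treated as zero.
    """
    linear = INTERCEPT + sum(
        coef * attributes.get(name, 0.0) for name, coef in COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))


if __name__ == "__main__":
    # HYPOTHETICAL scenario; values and units are illustrative only.
    scenario = {"TC": 100.0, "TT": 20.0, "Acc": 1.0, "Dur_1": 1.0, "OwnVeh_1": 1.0, "BK": 1.0}
    print(f"P_bus = {bus_probability(scenario):.4f}")
```

The signs of the coefficients can be read directly from such evaluations: a higher bus cost or travel time, or access to a motorbike, lowers the probability of choosing Bus.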

Table 1. Levels of attributes for Yogyakarta

Table 2. Levels of attributes for Matsuyama

Table 4. Mode choice behaviours

Table 11. Site comparison in Yogyakarta