Research on Expressway Travel Time Prediction Based on Exclusive Disjunctive Soft Set

To address expressway travel time prediction under the ambiguity and uncertainty inherent in the road traffic system, a travel time prediction model is established based on exclusive disjunctive soft set theory. The main influence factors are extracted through the parameter reduction theory of soft sets, and the mapping relationship between the influence factors and the travel time is obtained through the exclusive disjunctive soft set decision system. The travel time is then calculated through this mapping relationship. The experimental results show that, compared with the BPR function model, the travel time model based on exclusive disjunctive soft set theory reduces the prediction error and improves the calculation accuracy of the travel time.


Introduction
With the rapid development of intelligent transportation systems, the traffic information that can be obtained and provided is increasingly abundant, and travel time prediction is an indispensable part of advanced highway traveller information systems. Facing increasingly prominent traffic problems, dynamic traffic control and traveller behavior decision-making both require highway travel time prediction. Improving the accuracy of the estimate is important, and quantitatively describing the uncertainty of the predicted value is of great practical significance.
Research on travel time mainly focuses on the construction of prediction models and the form of travel time functions, and has produced rich results. More Rohan applied artificial neural networks to road traffic flow prediction, which reduced the difficulty of establishing accurate mathematical models while ensuring high prediction accuracy [1]. Mori et al. reviewed the literature on travel time models and thoroughly classified existing techniques, with most attention given to travel time estimation and travel time prediction [2]. Allström et al. combined parametric and nonparametric traffic state prediction techniques through assimilation in aggregating Kalman filtering. The nonparametric prediction used a neural network, and the parametric prediction used the cell transmission model with velocity as the state [3]. Celikoglu et al. investigated the performance of the flow model in providing efficient travel time estimation for varying flow patterns of freeway traffic by adopting the two-phase fundamental diagram [4]. Zhong et al. proposed an online travel time prediction model with emphasis on capturing the effects caused by anomalies. A Functional Principal Component Analysis (FPCA) framework was adopted to forecast link travel times based on historical data and real-time measurements [5]. Sehyun Tak et al. proposed a multi-level k-Nearest Neighbor (k-NN) algorithm designed to predict travel time with higher computational efficiency and prediction accuracy, together with a data fusion method that combines traffic data from ILDs and DSRC to overcome data errors [6]. Lu et al. proposed a piecewise function based on traffic flow and vehicle type to estimate the travel time function of heterogeneous flow, and used a VISSIM model calibrated with actual traffic data to generate the data for parameter estimation [7]. Chen et al. used data from the highway charge system to establish a Bootstrap-based prediction model that improves the reliability of point prediction of travel time [8]. Wang et al. used the K nearest neighbour algorithm to predict the short-term travel time of the expressway based on the location and time at which vehicles enter and leave toll stations, as recorded in highway toll data [9]. The experimental results showed that the algorithm could effectively improve the prediction accuracy of the model, but the impact of weather conditions on traffic conditions was not considered, and the historical database was not classified according to different weather conditions.
However, studies of travel time rarely consider the following problem: the judgment of traffic participants on the traffic state is inherently vague rather than precise. Provided that information is supplied effectively, traffic participants choose different paths by judging road congestion conditions, but these judgments are qualitative, not quantitative. The ambiguity should therefore be fully considered in the estimation and prediction of travel time. Soft set theory is a powerful tool for dealing with ambiguity and uncertainty, and it supports qualitative as well as quantitative analysis [10][11][12].
However, existing soft set research mainly addresses complete information systems, while incomplete information systems still need to be studied. From the point of view of data mining, this paper introduces the exclusive disjunctive soft set to describe the incomplete information system, and quantifies the data in the incomplete information system by redefining the characteristic functions of the exclusive disjunctive soft set.
The rest of the paper is organized as follows. Section 2 introduces the basic concepts. Section 3 describes the prediction model and algorithms used in the study. Section 4 uses numerical examples to verify the validity of the model and compares it with the prediction results of the BPR function model. Finally, conclusions are discussed in Section 5.

Soft set
Molodtsov proposed soft set theory as a new mathematical tool for uncertainty [13]. He was the first to systematically introduce the basic theory and applications of soft sets, and used the soft set as a general mathematical tool for dealing with uncertain, fuzzy and imprecise objects. The soft set offers a way to solve uncertainty problems that traditional methods cannot handle. Its biggest difference from traditional data methods is that parameters of different forms can be set according to the decision maker's own preferences, which handles ambiguity and uncertainty well.
Definition 2.1 $(F, E)$ is a soft set on the domain $U$ if and only if $F$ is a mapping of $E$ into the set of all subsets of $U$ [13].
Definition 2.2 Let $\xi(U)$ be the set of fuzzy sets on the domain $U$ and $A \subset E$. The pair $(F, A)$ is called a fuzzy soft set defined on the universe $U$, where $F$ is a mapping $F: A \to \xi(U)$ [14]. ( ) can be interpreted as: a) for each is unknown, but ∑ ( = 1); b) and is called an indeterminate element of the soft set ( , ) [15].
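As a minimal illustration of Definition 2.1, a soft set can be stored as a dictionary from parameters to subsets of the universe. The symbol names (F for the mapping, E for the parameter set, U for the universe), the road-section labels, and the parameter names below are assumptions made for this sketch and are not taken from the paper's data.

```python
from typing import Dict, FrozenSet

# Hypothetical universe U of road sections (illustrative labels only).
U: FrozenSet[str] = frozenset({"s1", "s2", "s3", "s4"})

# F maps each parameter e in E to a subset of U; the keys form E.
F: Dict[str, FrozenSet[str]] = {
    "heavy_flow": frozenset({"s1", "s2"}),
    "low_speed": frozenset({"s2", "s3"}),
    "rain": frozenset(),  # an empty approximate set is allowed
}


def is_soft_set(mapping, universe):
    """True if every F(e) is a subset of the universe."""
    return all(subset <= universe for subset in mapping.values())


def parameters_of(mapping, element):
    """Sorted parameters e whose approximate set F(e) contains the element."""
    return sorted(e for e, subset in mapping.items() if element in subset)
```

Here `parameters_of(F, "s2")` lists the parameters that hold for section `s2`, which is the usual way to read a soft set row by row.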

Exclusive disjunctive soft set
Both the bijective soft set and the exclusive disjunctive soft set are developed on the basis of the soft set, and the bijective soft set is a special case of the exclusive disjunctive soft set. The bijective soft set contains no uncertain elements, whereas the exclusive disjunctive soft set can be used to describe an incomplete information system.
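The distinction can be sketched in code under a simplifying assumption: an element is treated as determined when it belongs to exactly one approximate set, and as indeterminate otherwise. This reading, and the function names, are assumptions for illustration. The paper's own characteristic functions for indeterminate elements are not reproduced here.

```python
def classify(mapping, universe):
    """Split the universe into determined and indeterminate elements.

    An element is 'determined' here if exactly one approximate set contains it.
    """
    counts = {u: sum(u in subset for subset in mapping.values()) for u in universe}
    determined = {u for u, c in counts.items() if c == 1}
    indeterminate = set(universe) - determined
    return determined, indeterminate


def is_bijective(mapping, universe):
    """True if the approximate sets partition the universe (no indeterminate elements)."""
    _, indeterminate = classify(mapping, universe)
    return not indeterminate


# Hypothetical complete and incomplete descriptions of three observations.
complete = {"free": {"a", "b"}, "jam": {"c"}}
incomplete = {"free": {"a"}, "jam": {"c"}}  # the state of "b" is missing
```

With these inputs, `complete` behaves like a bijective soft set, while `incomplete` leaves `"b"` indeterminate, which is the situation an incomplete information system produces.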

Exclusive disjunctive soft set decision system
Soft set theory has developed rapidly as a new mathematical method for solving uncertainty problems. This paper introduces the exclusive disjunctive soft set and proposes to organize incomplete information into an exclusive disjunctive soft set decision system, so that decision analysis can be carried out on an incomplete information system. The quantification process can compensate for information that may be missing and comprehensively reflect the overall information of the system. Using actual traffic data, road traffic conditions and expert experience, data and natural language are transformed into a fuzzy vector map over the domain of travel time. The parameter reduction theory of soft sets is then used to eliminate redundant information and preserve the mapping features that cannot be removed [16].

ICTTE 2019
According to the definition of the soft set, a mapping relationship is first established between the influence factors of road traffic (including qualitative and quantitative factors) and the travel time, so as to quantify the fuzzy factors [17]. A fuzzy soft set $(F, E)$ on the domain $U = (u_1, u_2, \dots, u_n)$ represents the power set of travel time under different traffic conditions. The parameter set is $E = E_1 \cup E_2 \cup \dots \cup E_m$, where $E_i$ represents the $i$th factor that affects travel time.

Construction of the travel time prediction model based on exclusive disjunctive soft set
On the basis of the above, the following definition is given.
Definition 3.1 The universe $U = \{u_1, u_2, u_3\}$ indicates the formula mode of travel time under different traffic conditions. There is currently no uniform regulation on the division of road traffic states, but the following division is considered reasonable. When the number of vehicles is less than or equal to the maximum value that still allows vehicles to travel freely, where this maximum value indicates the actual traffic capacity of the road section, the travel time is generally stable and small, and it is denoted as $u_1$. When the number of vehicles exceeds the threshold that represents traffic jam on the road section, the travel time gradually changes and increases, and it is denoted as $u_2$. When there is severe congestion on the road, few vehicles can enter the network and the required travel time is longer, and it is denoted as $u_3$. $E$ indicates the parameter set, $E = \{e_1, e_2, \dots, e_n\}$, where $e_i$ represents the $i$th factor affecting travel time. Here, the soft set $(F, E)$ describes the prediction of the travel time.
The specific steps to establish the travel time prediction model are as follows:
Step 1 Taking the uncertain and fuzzy factors into account, apply the parameter reduction theory of soft sets to retain the important influence factors and eliminate the secondary factors.
Step 3 Establish the membership function and the mapping relationship between the influence factors and travel time according to the decision rules, and then quantify the fuzzy factors to calculate the dependence $\kappa = \gamma\left(\bigwedge_{i=1}^{n}(F_i, E_i), (G, B)\right)$ of the exclusive disjunctive soft set decision system $\left(\bigcup_{i=1}^{n}(F_i, E_i), (G, B), U\right)$.
Step 4 Establish the parameter form of the soft set between the travel time and the influence factors.
Step 5 Predict the travel time according to the mapping relationship.
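The dependence computed in Step 3 can be illustrated with a small sketch. It assumes the definition commonly used for bijective soft set decision systems: the condition soft sets are combined by AND (pairwise intersection of their approximate sets), and the dependence is the fraction of the universe covered by the resulting blocks that lie entirely inside a single decision class. Whether the paper uses exactly this form for the exclusive disjunctive case is an assumption, and all data and names below are hypothetical.

```python
from itertools import product


def soft_and(*soft_sets):
    """AND of soft sets: intersect one approximate set from each, keep non-empty blocks."""
    blocks = []
    for combo in product(*(s.values() for s in soft_sets)):
        block = frozenset.intersection(*map(frozenset, combo))
        if block:
            blocks.append(block)
    return blocks


def dependency(condition_blocks, decision, universe):
    """Fraction of the universe covered by blocks contained in one decision class."""
    covered = set()
    for block in condition_blocks:
        if any(block <= frozenset(cls) for cls in decision.values()):
            covered |= block
    return len(covered) / len(universe)


# Hypothetical universe of six observations and two condition soft sets.
universe = {1, 2, 3, 4, 5, 6}
flow = {"low": {1, 2, 3}, "high": {4, 5, 6}}
speed = {"fast": {1, 2, 4}, "slow": {3, 5, 6}}
# Hypothetical decision soft set: three travel time modes.
decision = {"mode1": {1, 2}, "mode2": {3, 4, 5}, "mode3": {6}}

kappa = dependency(soft_and(flow, speed), decision, universe)
```

In this toy case the blocks {1, 2}, {3} and {4} each fall inside one decision class, while {5, 6} straddles two classes, so four of the six observations are classified consistently.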
According to the mapping relationship, an expression of the travel time is obtained. In it, the maximum value indicates the actual traffic capacity of the road section and the threshold value indicates the onset of traffic jam on the road section. These two parameters can be obtained by consulting relevant experts or by calculating the arithmetic mean of the historical record of the section. $\kappa$ denotes the dependence of the exclusive disjunctive soft set decision system. Usually $\alpha = 0.15$ and $\beta = 4.0$, as in the BPR function, and the remaining coefficients are positive constants.
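For reference, the BPR baseline used for comparison has the standard form $t = t_0\,(1 + \alpha (q/C)^{\beta})$ with the usual defaults $\alpha = 0.15$ and $\beta = 4.0$. The sketch below implements only this baseline and not the soft-set expression of this paper. The function and argument names are illustrative assumptions.

```python
def bpr_travel_time(free_flow_time, flow, capacity, alpha=0.15, beta=4.0):
    """Standard BPR link travel time.

    free_flow_time: travel time at zero flow (e.g. seconds)
    flow:           traffic volume on the section
    capacity:       practical capacity of the section, same unit as flow
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return free_flow_time * (1.0 + alpha * (flow / capacity) ** beta)
```

At zero flow the function returns the free-flow time, and at a flow equal to capacity it returns the free-flow time increased by 15 percent.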

Experimental data source
The data in this paper are derived from the UK Highway Network data published through data.gov.uk, a service for finding and using open government data. The data set estimates traffic flow parameters such as travel time and speed from a variety of sources, including automatic number plate recognition (ANPR) cameras, on-board global positioning systems (GPS), and induction circuits built into the road, and it includes average travel time, vehicle speed, road occupancy, traffic flow, etc. To facilitate data selection, the data are first preprocessed. The road segment A1 northbound at a minor junction between A167 near Washington Birtley and A692 (120014301) is chosen as the research object, and the average travel time between the two stations is calculated. Data were collected for 10 consecutive days at an interval of 15 minutes, giving a total of 500 sets of data. The data collected in the first 9 days are taken as the learning sample, the data of the 10th day are taken as the test sample, and the newly established prediction model is used to predict the travel time.
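The learning and test split described above can be sketched as follows. The record layout (day index, interval index, travel time) and the function name are assumptions for illustration, and the synthetic values are not taken from the data set.

```python
def split_by_day(records, train_days=9):
    """Split records into learning and test samples by day.

    records: iterable of (day, interval, travel_time) tuples, with day starting at 1.
    Days 1..train_days form the learning sample; day train_days + 1 is the test sample.
    """
    records = list(records)
    train = [r for r in records if r[0] <= train_days]
    test = [r for r in records if r[0] == train_days + 1]
    return train, test


# Synthetic example: 10 days with 4 intervals each (placeholder travel times).
synthetic = [(day, k, 60.0 + day + k) for day in range(1, 11) for k in range(4)]
learning_sample, test_sample = split_by_day(synthetic)
```

Splitting by whole days keeps the test day strictly after the learning period, which avoids leaking later observations into the learning sample.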
According to the relevant information, expert experience, and the parameter reduction theory of soft sets [16][17], this paper considers objective factors (traffic flow, vehicle speed), weather conditions and traffic events as the main factors affecting travel time. $E$ represents the factor parameter set that affects the travel time, $E = E_1 \cup \dots$ … traffic sections for certain safety reasons, such as road and bridge construction. The membership functions of each set are formulated as shown in Table 1, Table 2, Table 3 and Table 4 [18].
In summary, the dependence of the exclusive disjunctive soft set decision system was calculated. Through the mapping relationship [19] and the decision rule, the corresponding travel time prediction model was obtained from the interval in which the dependency degree falls. To verify the validity of the travel time prediction model established on exclusive disjunctive soft set theory, it was compared with the value calculated by the BPR function model. When only factors such as traffic flow and speed were considered, the result of the new model was called the predicted value and was compared with the BPR function. When weather factors and road traffic emergencies were added, the result of the prediction model was called the modified predicted value. The results of the three calculation models for the road section travel time are compared in Fig. 1, where the X-axis represents the test sample number and the Y-axis represents the travel time (t/s).

Analysis of results
Three sets of prediction results were output, and evaluation indicators were used to analyze them. Two indicators were selected. MAPE (Mean Absolute Percentage Error) is the average of the absolute relative error between the true value and the predicted value, and reflects the actual magnitude of the prediction error. RMSPE (Root Mean Square Percentage Error) measures the deviation between the predicted value and the observed value. Both indicators are computed from the predicted values, the actual values, and the number of predictions. Table 5 compares the indicators of the two methods. The travel time comparison curves show that both models (the BPR function model and the travel time model based on the exclusive disjunctive soft set) estimate the actual travel time well. However, the error analysis table shows that, under the same conditions, predicting the travel time with exclusive disjunctive soft set theory reduces MAPE and RMSPE by 0.0107 and 0.001 respectively. When the weather and sudden traffic events are taken into account, the modified prediction model performs better than the original model, reducing MAPE and RMSPE by 0.0131 and 0.0019, so the prediction accuracy is clearly improved.
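The two indicators can be computed as in the sketch below, which uses their standard definitions expressed as fractions. Whether the paper multiplies by 100% is not stated, so the fractional form is an assumption, chosen because the reported reductions (for example 0.0107) are small decimals. The function names are illustrative.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, as a fraction."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def rmspe(actual, predicted):
    """Root mean square percentage error, as a fraction."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum(((p - a) / a) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))
```

Both functions divide by the actual value, so they assume every actual travel time is non-zero, which holds for measured travel times.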

Conclusions
In this paper, the influence of ambiguity and uncertainty on travel time is studied based on traffic flow data. The exclusive disjunctive soft set decision system and soft set theory are used to establish the corresponding prediction model, which is verified with actual data. The experimental results show that, compared with the BPR function, the model established in this paper improves the accuracy of prediction. The results are only a preliminary study, and many aspects still need to be discussed in depth. For example, dealing with fuzzy factors inevitably involves some subjectivity, which leads to some deviation in the prediction results. Reducing it requires multiple observations, consultation with multiple experts, and repeated correction of the relevant data to improve the prediction process and results. In addition, for massive data inflows, it remains to be examined whether the calculation speed can meet the requirements of real-time applications in real projects.