Assessment on the Cohesion Pattern of Rail Transit: Integration of Rough Set and Grey Correlation Theory

Most existing project assessments rely on expert scoring, whose precision can be degraded by personal subjectivity. This paper presents an assessment method that compares the advantages and disadvantages of three cohesion patterns between suburban and urban rail transit networks while reducing the influence of subjective scoring. A modified rough set-grey correlation model, which integrates rough set theory and grey correlation analysis, forms the core of the assessment, and an index system is set up for the model calculation. A case study of the network in Ningbo demonstrates the effectiveness of the method. The results show that the method is more effective for discretely distributed data that are sensitive to sample size. The consistency of the results with marginal cost analysis provides a preliminary verification of the model.


Introduction
Scheme comparison is commonly used in rail transit planning, and its results are sensitive to the assessment method. Currently, most assessments are based on expert scoring, whose subjectivity can cause uncertainty and inaccuracy in the results. Therefore, it is necessary to develop a data mining method that reduces the error introduced by subjective judgement.
In rail transit network planning, many cities have had difficulty selecting the cohesion pattern that connects the urban and suburban networks. The patterns can be categorized as (i) core-through, (ii) periphery-access and (iii) core-terminate. For instance, a deeper extension of the suburban line gives better accessibility but also increases the cost of the project. To help rail transit builders maximize the benefit, this paper develops an assessment method based on a combined rough set-grey correlation model.

Literature review
Great efforts have been made to quantify scheme evaluation. Many popular methods now overcome the drawbacks of the traditional qualitative approach [1][2][3], such as marginal cost analysis, AHP (Analytic Hierarchy Process), fuzzy logic, grey system theory, clustering analysis and neural networks. Some studies combine several methods into a comprehensive assessment in order to compensate for the flaws of a single method in complicated engineering system design, such as fuzzy AHP [4], fuzzy clustering analysis [5], neural AHP [6] and grey fuzzy analysis [7]. These methods still have some problems: 1) the travel utility is calculated simply as the product of total travel time and the unit value of time, which does not make sense in some cases; 2) the estimation of weight factors relies on expert scores, which makes it difficult to eliminate the influence of personal subjectivity and uncertainty. To reduce the dependency on expert scoring, some scholars have extended rough set models to project scheme evaluation and decision support, because these models are suited to processing data with small sample size, uncertainty and incompleteness.
Current research on grey rough sets can be divided into two categories. The first grafts grey system theory onto rough set theory [8][9][10][11]; it combines grey assessment models with (i) the attribute reduction and attribute dependence of rough sets, (ii) grey clustering and rough sets, or (iii) grey relational analysis and rough sets, and is mainly concerned with practical application. The second, which contains four sub-categories, is based on models of grey information systems, including their attribute analysis and application and the integration of grey system theory with rough set models [12][13][14].
At present, the fusion of grey correlation analysis and rough sets mainly focuses on attribute reduction, whereas practical applications are more concerned with scheme ranking. Therefore, this study puts forward a modified grey rough set model that accepts flexible data input to calculate the significance index and assign weights (as an alternative to expert scoring). The model outputs not only the score of each scheme but also its rank.

Index system
An index system is set up as the input for the rough set calculation. Since rough set theory can only deal with discrete data, all attributes must be discretized to fit the model. In addition, a decision table without decision attributes is set up for the weight calculation; it contains only status variables rather than decision variables.
Considering the conditions in China, where local government is in charge of rail construction and operation and where costs and benefits are sensitive to decisions on the development of the public transit system, we select attributes in three aspects: passenger utilization (mainly reflecting differences in travel time and in the number of passengers), construction and operation (mainly reflecting differences in construction costs and operating costs), and level of service (ultimately reflected in the number of passengers). The assessment indices are defined as follows.

The total in-vehicle time
Unit: $10^4$ min. The calculation uses $T_{yc}$, the in-vehicle time of the suburban rail network, and $T_{qc}$, the in-vehicle time of the urban rail network. Note: the in-vehicle time can be obtained from the forecast.

The total transfer time
Unit: $10^4$ min. The calculation uses $Q_{yh}$, the transfer volume of the suburban rail network; $Q_{qh}$, the transfer volume of the urban rail network; and $T_h$, the average transfer time.

The total waiting time
Unit: $10^4$ min. The calculation uses $Q_y$, the passenger volume of the suburban rail network; $B_y$, the average waiting time of the suburban rail network; $Q_q$, the passenger volume of the urban rail network; and $B_q$, the average waiting time of the urban rail network. To illustrate: if there is only one operation cross, the waiting time can be reduced by using short-marshalling trains or by raising the departure frequency, which improves the service level. If the passenger flow is non-uniform across sections and the line is long, multiple operation crosses are needed; the departure frequency of the long cross is then restricted and the passengers' waiting time increases.

Construction cost of cohesion program
Unit: $10^4$ yuan. The calculation uses $D_{j1}$ and $C_{j1}$, the length and average construction cost of the underground line; $D_{j2}$ and $C_{j2}$, the length and average construction cost of the over-ground line; and $D_{j3}$ and $C_{j3}$, the length and average construction cost of the elevated line.

Construction cost of core-outside line
Unit: $10^4$ yuan. The calculation uses $C$, the construction cost of the suburban line without the cohesion scheme, and $\Delta C$, the construction cost increment brought by the cohesion scheme. The increment involves $N_e$, the number of stations affected by the different vehicle marshalling; $\Delta L$, the length increment of a station; $C'$, the unit mileage cost of a station; and $\Delta C_c$, the vehicle purchase cost increment. To illustrate: under different cohesion patterns the high passenger flow sections of the suburban rail differ, so the vehicle types and vehicle marshalling differ. If the vehicle type is held constant, the station length will differ, which in turn affects the construction cost of the stations. At the same time, the total number of trains may differ, which affects the vehicle purchase cost.

Operation cost of suburb rail transit
Unit: $10^4$ yuan. The calculation uses $D_y$, the operating length of the suburban rail, and $C_y$, the operating cost per vehicle-kilometre. To illustrate: the operation costs of rail transit are composed of running cost, management cost, financial expense and non-operating expense. Because the scale and operation mode of the urban network are fixed, the index can be calculated as the product of the operating cost per vehicle-kilometre and the operating mileage.
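The six indices defined above can be sketched in code. This is only a minimal illustration: every formula below is an assumption inferred from the symbol definitions and units (plain sums and products), not the paper's stated equations, and the function and argument names are hypothetical. Inputs are assumed to be already expressed in consistent units.

```python
# Hypothetical sketch of the assessment indices. Each formula is an ASSUMPTION
# inferred from the symbol definitions, not an equation taken from the paper.

def total_in_vehicle_time(t_yc, t_qc):
    """Assumed: suburban in-vehicle time plus urban in-vehicle time."""
    return t_yc + t_qc


def total_transfer_time(q_yh, q_qh, t_h):
    """Assumed: transfer volume of both networks times the average transfer time."""
    return (q_yh + q_qh) * t_h


def total_waiting_time(q_y, b_y, q_q, b_q):
    """Assumed: passenger volume times average waiting time, summed over networks."""
    return q_y * b_y + q_q * b_q


def cohesion_construction_cost(segments):
    """Assumed: sum of length * unit cost over (length, unit_cost) pairs,
    one pair each for the underground, over-ground and elevated sections."""
    return sum(length * unit_cost for length, unit_cost in segments)


def core_outside_construction_cost(c_base, n_e, delta_l, c_station, delta_c_vehicle):
    """Assumed: base cost plus an increment made of station lengthening
    (n_e stations, each longer by delta_l) and extra vehicle purchases."""
    delta_c = n_e * delta_l * c_station + delta_c_vehicle
    return c_base + delta_c


def suburban_operation_cost(d_y, c_y):
    """Operating mileage times operating cost per vehicle-kilometre."""
    return d_y * c_y
```

Only the last function follows a relation the text states explicitly (cost per vehicle-kilometre times operating mileage); the others should be checked against the original equations before use.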

Assumption
Model assumptions are as follows:
1) The urban rail network plan of the city is stable in the assessment year, and the different cohesion patterns have no impact on the structure or operation of the urban rail network.
2) The suburban rail network plan in the city's downtown area is the same for each cohesion pattern; the different cohesion patterns change the urban section of the scheme, which affects the total construction and operation costs.
3) The city has the economic strength and engineering conditions to realize any of the three patterns; each cohesion pattern should achieve the desired traffic goal, and its capacity should be adapted to the traffic demand.
4) All cohesion schemes can be realized as planned in the assessment year, and the index values of the assessment system can be collected accurately.

Weight definition based-on rough set
The rough set concepts used in the paper are as follows. Definition 1: knowledge base. The pair $K = (U, R)$ is called a knowledge base, where $U$ is the universe of objects and $R$ is a set of attributes.
The objects in $U$ can be classified in different ways according to different $R$. In this way, the knowledge base expresses the basic classification methods of one or a group of intelligent organizations.

Definition 2: information table
Knowledge is commonly expressed in the form of an information table, or information system, which can be written as the ordered quadruple $S = (U, A, V, f)$, where $U$ is the nonempty finite set of objects, called the universe of discourse; $A$ is the nonempty finite set of attributes; $V$ is the set of attribute value ranges; and $f$ is an information function.
$\underline{P}(X)$ is called the positive region of $X$ according to $A$, denoted by $POS(X)$.
Here $|U|$ is the cardinality of the set (i.e., the total number of its elements), and knowledge $Q$ is said to depend on knowledge $R$ to a degree $k$ ($0 < k < 1$).
Definition 7: discretization method
Because rough sets can only deal with discrete data, and because the decision table in this paper has no decision attribute (a no-decision system is a set of attributes without decision attributes), the isometric (equal-width) discretization method is used:
Step 1: Calculate the interval length of attribute $i$ as $(z_{i\max} - z_{i\min})/n$, where $z_{i\max}$ is the maximum of attribute $i$, $z_{i\min}$ is the minimum of attribute $i$, and $n$ is the number of intervals.
Step 2: Determine the interval ranges.
Step 3: Calculate the attribute values. Each attribute has $n$ intervals in total; if an attribute value $z$ lies in interval $j$, its discretized value is $j$.
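The isometric discretization steps can be sketched as follows. This is a minimal illustration rather than the paper's implementation: the function name is hypothetical, and the handling of values that fall exactly on an interval boundary (assigned to the upper interval, with the maximum clamped into the last interval) is an assumption, since the text does not specify it.

```python
def equal_width_discretize(values, n_intervals):
    """Map each value of one attribute to an interval number 1..n_intervals."""
    if n_intervals < 1:
        raise ValueError("n_intervals must be at least 1")
    z_min, z_max = min(values), max(values)
    # Step 1: interval length of the attribute.
    width = (z_max - z_min) / n_intervals
    if width == 0:
        # All values are identical, so they share a single interval.
        return [1 for _ in values]
    codes = []
    for z in values:
        # Steps 2 and 3: locate the interval containing z. Flooring the offset
        # would put the maximum value in interval n_intervals + 1, so the
        # result is clamped into the last interval (an assumed convention).
        code = int((z - z_min) / width) + 1
        codes.append(min(code, n_intervals))
    return codes
```

Applied column by column to the decision table, this turns every continuous index into a small set of integer codes that the rough set calculation can handle.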
Based on the theory above, the paper further studies how to determine the weights: remove an attribute from the table and then examine how the classification changes without it. If the corresponding classification changes greatly, the importance of the attribute is high. The importance of an index can thus be obtained by calculating the dependence degree.
The weight of each condition attribute index can then be determined from these importance values.
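The attribute-removal idea can be sketched in code. The weighting below is one common rough set formulation for tables without a decision attribute, and it is an assumption here because the paper's weight formula is not reproduced: the significance of an attribute is taken as one minus the dependence of the full attribute set on the remaining attributes, and the weights are the normalized significances. All function names are hypothetical.

```python
from collections import defaultdict


def partition(table, attrs):
    """Indiscernibility classes: group row indices by their values on attrs."""
    blocks = defaultdict(list)
    for idx, row in enumerate(table):
        blocks[tuple(row[a] for a in attrs)].append(idx)
    return [frozenset(block) for block in blocks.values()]


def dependency(table, cond, target):
    """Dependence degree |POS_cond(target)| / |U| for a discretized table."""
    target_blocks = partition(table, target)
    positive = 0
    for block in partition(table, cond):
        # A cond-class belongs to the positive region when it fits entirely
        # inside one class of the target partition.
        if any(block <= t for t in target_blocks):
            positive += len(block)
    return positive / len(table)


def rough_set_weights(table):
    """Assumed weighting: normalized significance of each attribute."""
    n_attrs = len(table[0])
    all_attrs = list(range(n_attrs))
    significance = []
    for a in all_attrs:
        rest = [b for b in all_attrs if b != a]
        significance.append(1.0 - dependency(table, rest, all_attrs))
    total = sum(significance)
    if total == 0:
        # No single attribute changes the classification: fall back to equal weights.
        return [1.0 / n_attrs] * n_attrs
    return [s / total for s in significance]
```

With only a few schemes and many indices, removing a single attribute often leaves the classification unchanged, so many significances can be zero; the equal-weight fallback is a design choice of this sketch, not something the paper prescribes.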

Ranking based on grey correlation
The ranking is performed using grey correlation theory. The correlation coefficients are the critical intermediate variables that clarify the system's structure and relationships based on the combination of intentions, opinions and demands.
Step 4: Calculate the correlation degree.
Step 5: Rank the correlation order. To evaluate how closely each subsequence correlates with the reference sequence, the correlation degrees are arranged in order of magnitude.
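The correlation coefficient and correlation degree can be sketched as follows. The sketch assumes the standard grey relational coefficient of Deng with a distinguishing coefficient of 0.5 and an unweighted mean for the degree; the paper's own coefficient formula is not reproduced in this section, so these choices and the function names are assumptions.

```python
def grey_relational_coefficients(reference, sequences, rho=0.5):
    """Deng's grey relational coefficients of each sequence against the reference.

    rho is the distinguishing coefficient; 0.5 is a conventional default and
    an assumption here.
    """
    # Difference sequences: absolute gap to the reference, index by index.
    deltas = [[abs(r - x) for r, x in zip(reference, seq)] for seq in sequences]
    # Two-level minimum and maximum differences over all sequences and indices.
    d_min = min(min(row) for row in deltas)
    d_max = max(max(row) for row in deltas)
    if d_max == 0:
        # Every sequence coincides with the reference.
        return [[1.0] * len(reference) for _ in sequences]
    return [[(d_min + rho * d_max) / (d + rho * d_max) for d in row]
            for row in deltas]


def correlation_degrees(coefficients):
    """Unweighted correlation degree: the mean coefficient of each sequence."""
    return [sum(row) / len(row) for row in coefficients]
```

Sorting the schemes by their correlation degree in descending order then gives the correlation order described in Step 5.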

Solving Flow Chart
To solve the model, the solution approach can be divided into five steps.
Step 1: Establish the original index matrix. Assume m cohesion schemes and select n indices for assessment, then establish the matrix $(U_{ik})_{m \times n}$, where $i$ is the number of the scheme, $i = 1, 2, \dots, m$; $k$ is the number of the index, $k = 1, 2, \dots, n$; and $U_{ik}$ is the value of index $k$ in scheme $i$.
Step 2: Obtain the decision table of the assessment index system. Select the maximum value of each index (column) as the reference array. Compare the original matrix with the reference array, carry out non-dimensional treatment, and obtain the multi-scheme assessment matrix, which forms the decision table.
Step 3: Discretize the decision table and calculate the weights. Apply the discretization method described above to the data of the decision table, then use the rough set weight determination method to calculate the weight $w_k$ of each index.
Step 4: Calculate the correlation of the index system. Use grey correlation analysis to calculate the initial image of the columns, the difference sequences of the decision table, and the two-level maximum and two-level minimum differences. Because the indices differ in importance, the correlation value is obtained by multiplying each correlation coefficient by the corresponding weight.
Step 5: Sort the schemes. The resulting $\gamma$ is the assessment result matrix; the greater the value, the better the scheme, so the highest-scoring scheme is ultimately the recommended scheme.
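The non-dimensional treatment, the weighted correlation value and the final ranking can be sketched as follows. This is a minimal illustration under stated assumptions: the non-dimensional treatment is taken to be division by the column maximum (which must be non-zero), the correlation value is taken to be the weighted sum of the correlation coefficients, and all function names are hypothetical. The coefficients and weights are passed in as plain lists, however they were obtained.

```python
def normalize_by_column_max(matrix):
    """Divide every entry by the maximum of its column (the reference array).

    Assumes every column maximum is non-zero.
    """
    n_cols = len(matrix[0])
    reference = [max(row[k] for row in matrix) for k in range(n_cols)]
    return [[row[k] / reference[k] for k in range(n_cols)] for row in matrix]


def weighted_scores(coefficients, weights):
    """Assumed correlation value: weight times coefficient, summed over indices."""
    return [sum(w * c for w, c in zip(weights, row)) for row in coefficients]


def rank_schemes(scores):
    """Return 1-based scheme numbers ordered from highest to lowest score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [i + 1 for i in order]


# Illustrative numbers only; they are not data from the case study.
normalized = normalize_by_column_max([[2.0, 10.0], [4.0, 5.0]])
scores = weighted_scores([[1.0, 0.5], [0.5, 1.0]], [0.6, 0.4])
ranking = rank_schemes(scores)
```

The first entry of the ranking is the recommended scheme under these assumptions.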

Original data
The suburban and urban rail network plan of Ningbo, a port city in Zhejiang Province, China, is selected for the case study. We generate 4 sim-schemes based on the three cohesion patterns and carry out a passenger flow forecast for each one; the data year in this paper is the forecast year in which the scheme is assumed to be realized. We collected the forecast data, calculated the index data according to Eq. (1) to Eq. (7), and then established the original data matrix.

Comparison with marginal cost analysis
To test the calculation result, marginal cost analysis is applied here. A non-cohesion scheme is assumed as the base scheme, serving as the reference for the 4 sim-schemes. The performance of each sim-scheme is measured by comparing its difference in cost and its difference in value of time relative to the base scheme, as formulated in Eq. (21). According to the original data, we assume that the total passenger flow volume is 1 million, the total in-vehicle time is 82 million minutes, the total transfer time is 3.75 million minutes, the total waiting time is 7.5 million minutes, the vehicle marshalling is 2B, the peak departure frequency is 20 pairs/h, the normal departure frequency is 12 pairs/h, and the time value is T.
The construction costs, operation costs and travel time value of the assumed base scheme can be regarded as fixed costs and those of the cohesion schemes as variable costs, with the passenger travel time saving taken as the benefit brought by the cohesion schemes [15]. The results show that, to save the same time value, scheme 2 costs the least. This is consistent with the conclusion calculated by the grey rough set model, so the rationality and feasibility of the assessment method are verified to a certain extent.

Conclusion
This paper presents an assessment method that compares the advantages and disadvantages of three cohesion patterns between suburban and urban rail transit networks. The method reduces the dependency on expert scoring and outputs more comprehensive information, such as the ranking of the schemes and the significance of the selected attributes. A modified rough set-grey correlation model is developed as the core of the assessment. An index system is set up for the rough set calculation, and all variables need to be discretized before they are input to the rough set model. A case study of the network in Ningbo demonstrates the effectiveness of the method. The consistency of the results with marginal cost analysis provides a preliminary verification of the model.
In addition, this method can also be applied to other types of scheme analysis. To enhance its capability, several issues will be studied further in the future: 1) systematic methods for selecting attributes in the index system; 2) optimization of the discretization method for the variables.

Table 1. Assessment index system.
Definition 5: upper and lower approximation. Let $X$ be a nonempty subset of $U$, with $P \subseteq A$ and $P \neq \emptyset$; the upper approximation is $\overline{P}(X)$ and the lower approximation is $\underline{P}(X)$.

Table 4. Knowledge presentation system.

Table 5. Decision table.
$M$ is the performance measure (similar to the marginal cost), $\Delta C$ is the difference in cost, and $\Delta E$ is the difference in value of time. Calculating with the original data: