Multi-agent ant colony optimization for vehicle routing problem with soft time windows and road condition

In this paper we consider two important objectives of transportation: cost and customer satisfaction. The latter mainly depends on the vehicle's arrival time and the time expected by the customer. In reality, however, road conditions vary across time periods and affect vehicle travelling speed. Meanwhile, transport cost, including fuel consumption, is related to vehicle load. Accordingly, a mathematical model of the vehicle routing problem with soft time windows and road factor (VRPSTWRF) is established, in which transport cost, fuel consumption and customer satisfaction are considered. A multi-agent ant colony optimization algorithm is proposed, which exploits the agents' ability to perceive and react to the environment. An adaptive mechanism for changing the information heuristic factor and the pheromone expectation heuristic factor is used to improve global convergence. Pheromone is updated adaptively, taking the fuel consumption rate into account, to ensure convergence speed. A 3-opt strategy is introduced to improve local search ability. The resulting multi-agent ant colony optimization (MACO) algorithm is used to solve a 40-customer VRPSTWRF model. Experiments show that the proposed MACO is feasible and effective.


Introduction
Due to traffic management, traffic volume, traffic accidents and rush hours, the travelling speed of a vehicle is time-varying. At different times, the cost of crossing the same stretch of the road network changes accordingly, which ultimately changes the total transportation cost. In many studies, however, the transportation cost between any two points is assumed to depend only on distance, and the road network is regarded as a static model. The disadvantage of solving logistics distribution problems with a static road network model is that vehicles may be slowed down or jammed in real situations, which leads to a large deviation between the computed results and the actual results. In real life, driving costs differ under different road conditions, and delivery time greatly influences customer satisfaction with distribution services. The vehicle routing problem with soft time windows and road condition factor is therefore of practical significance.
In recent years, many scholars have studied the time-varying VRP (vehicle routing problem) [1][2][3][4][5]. Li Yanfeng et al. [1] developed a method to solve the vehicle routing problem in a dynamic urban traffic network with real-time traffic information by combining initial route arrangement with real-time route adjustment. The initial route plan was obtained by solving a time-dependent vehicle routing problem under recurrent congestion conditions with a genetic algorithm. A new mechanism, updating at the critical node, was proposed to adjust the routes with real-time traffic information in response to unexpected traffic accidents in a non-recurrent congestion environment. Li Feng et al. [2] proposed a computational simulation model to represent time-dependent vehicle routing problems for perishable goods delivery. A heuristic search algorithm based on an adaptive genetic algorithm, implemented on a multi-agent simulation platform, was presented to find the optimal solution. Yang Shanlin et al. [3] treated speed as a time-dependent piecewise function and used a simulated annealing algorithm to solve the time-dependent VRPTW. WANG Zhengguo et al. [4] constructed a mathematical model and a reactive tabu search algorithm, a two-phase heuristic method, to solve the time-dependent vehicle routing problem with backhauls. Wang Zhenguo et al. [5] presented a model for time-dependent vehicle routing with time windows and backhauls that accounts for the conflict between minimizing total travel time and minimizing total tardy time; the bi-objective problem was converted into a single-objective problem by a weighting method. Considering time-dependent travel information and dynamic customer requests, TANG Jian et al. [6] proposed a dynamic vehicle routing model based on a genetic algorithm and achieved a better re-optimization result.
These studies have achieved good results, but the relation between fuel consumption and the total transportation cost was not considered. Xiao Yiyong et al. [7] treated the Fuel Consumption Rate (FCR) as a load-dependent function and presented a mathematical optimization model to formally characterize the FCR-considered CVRP (FCVRP), as well as a string-based version for calculation. A simulated annealing algorithm with a hybrid exchange rule was developed to solve the FCVRP. ZHANG Jun et al. [8] considered a weighted vehicle routing problem (WVRP) for transporting one kind of product, whose objective was to arrange reasonable routes so as to minimize the total of travelling cost, cost related to cargo weight and dispatching cost; a scatter search (SS) algorithm was designed. Tang Jiafu [9] built a weighted VRP model, in which the weight loaded in a vehicle is considered an important factor of system cost, and proposed PMMAS (partition-based max-min ant system), an ant colony optimization algorithm combined with a special partition method. VRP studies that jointly consider road conditions, load constraints and the relation between fuel consumption and load have rarely been reported. Many publications [10][11][12][13] have demonstrated the application of the ant colony algorithm and its improved variants to the VRP; in these, ant colony optimization is used as an optimization process of single individuals, without collaboration among individuals. The algorithm proposed in this paper uses the multi-agent features of perceiving and reacting to the environment, together with cooperation and competition among individuals [14][15][16], to improve the evolution of the ant colony. The multi-agent ant colony optimization algorithm improves through competition and self-learning, and can be used to solve the vehicle routing problem with soft time windows and road factor.

Problem Formulation
Assume that $N = \{1, \ldots, n\}$ is the set of customers to be served. Let $g_i$ be the demand of customer $i$. All customers are served from one depot. The route graph is illustrated in Figure 1.
$M$ is a large positive real value. The time of unloading is ignored.
Overtime fees are paid to drivers when they work outside normal working hours. The driver's overtime fee is given in Eq. (2-2), where $t_k$ is the total working time of driver $k$ on returning to the depot, $t_a$ is the normal working hours, and $p$ is the overtime fee per hour.
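Since Eq. (2-2) itself is not reproduced here, the following is only a minimal sketch of one plausible form of the overtime fee: the hours worked beyond the normal working time, charged at the hourly rate. The function name and the exact form `p * max(0, t_k - t_a)` are assumptions made for illustration.

```python
def overtime_fee(t_k: float, t_a: float, p: float) -> float:
    """Overtime pay for one driver (assumed form, not necessarily Eq. (2-2)).

    t_k: total working time of driver k on returning to the depot (hours)
    t_a: normal working hours
    p:   overtime fee per hour
    """
    # Only the time beyond normal working hours is charged.
    return p * max(0.0, t_k - t_a)
```

For example, a driver who works 10 hours against 8 normal hours at 20 yuan per overtime hour would be paid `overtime_fee(10.0, 8.0, 20.0)`, while a driver finishing within normal hours incurs no fee.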
This paper assumes that all distribution tasks are finished within one day and that unloading time is ignored. The objective function minimizes the total transportation cost while achieving higher customer satisfaction, subject to constraints on traffic factor, fuel consumption, load level, mileage and other conditions when serving all customers.
The number of vehicles needed is calculated according to Eq. (2-3), where the operator [ ] denotes the largest integer not greater than the value in the brackets, and the coefficient is an estimate reflecting the time consumed by loading and the number of constraints.
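Eq. (2-3) is not reproduced here. The sketch below assumes a form of the vehicle-count estimate that is common in the VRP literature: total demand divided by the discounted vehicle capacity, rounded down, plus one. The function name, the parameter `coeff` and its default value are hypothetical.

```python
import math


def estimate_vehicle_count(demands, capacity, coeff=0.85):
    """Estimate the number of vehicles needed (assumed form of Eq. (2-3)).

    demands:  iterable of customer demands g_i
    capacity: load capacity of one vehicle
    coeff:    value in (0, 1]; a smaller value models tighter loading
              and more constraints (hypothetical default)
    """
    if not 0 < coeff <= 1:
        raise ValueError("coeff must lie in (0, 1]")
    # Floor corresponds to the [ ] operator described in the text.
    return math.floor(sum(demands) / (coeff * capacity)) + 1
```

A smaller coefficient reserves more slack per vehicle and therefore yields a larger fleet estimate.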
Following [8], the fuel cost is calculated as shown in Eq. (2-4).
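As Eq. (2-4) is not reproduced here, the following sketch only illustrates the idea of a load-dependent fuel cost. It assumes that the fuel consumption rate grows linearly from an empty-vehicle rate to a full-load rate, and that the load decreases as customers are served. All names and the linear form are assumptions.

```python
def fuel_consumption_rate(load, capacity, rate_empty, rate_full):
    """Fuel used per unit distance, linear in the carried load (assumption)."""
    if not 0 <= load <= capacity:
        raise ValueError("load must lie between 0 and capacity")
    return rate_empty + (rate_full - rate_empty) * load / capacity


def route_fuel_cost(route, demand, dist, capacity,
                    rate_empty, rate_full, fuel_price):
    """Fuel cost of one route that starts and ends at the depot.

    route:  list of node indices, e.g. [0, 3, 1, 0] with depot 0
    demand: dict mapping customer index to demand
    dist:   distance matrix, dist[a][b]
    """
    # The vehicle leaves the depot carrying the demand of all its customers.
    load = sum(demand.get(node, 0.0) for node in route[1:-1])
    cost = 0.0
    for a, b in zip(route, route[1:]):
        rate = fuel_consumption_rate(load, capacity, rate_empty, rate_full)
        cost += fuel_price * rate * dist[a][b]
        # Unload at b; the clamp guards against floating-point round-off.
        load = max(0.0, load - demand.get(b, 0.0))
    return cost
```

Under this assumption, serving heavy customers early in a route reduces the fuel cost of the remaining legs, which is why the visiting order matters beyond pure distance.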
The road condition factor is the probability of abnormal conditions; it is larger when traffic congestion occurs, for example during rush hours or in bad weather.
The mathematical model is then set up.
Eq. (2-7) is the objective function including the road condition factor. It minimizes the total transportation cost, which comprises the load-dependent fuel cost, the fixed cost of each vehicle put into service, the driver overtime payment and the penalty cost. Eq. (2-8) and Eq. (2-9) require that each customer is served by exactly one vehicle. Eq. (2-10) requires that every customer is visited only once and that the depot is visited by all vehicles. Eq. (2-11) is the vehicle load constraint. Eq. (2-12) imposes the continuity of the path. Eq. (2-13) requires that, for each vehicle k, at least one arc leaves each vertex set S visited by k that does not contain the depot.
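The structural constraints described above can be checked on a candidate solution with a few lines of code. The sketch below is illustrative only: it assumes a solution is stored as a list of routes, each a list of node indices that starts and ends at depot 0, and it covers the single-visit, depot and load constraints but not time windows or mileage.

```python
def is_feasible_solution(routes, demand, capacity, depot=0):
    """Check single-visit, depot and load constraints for a set of routes.

    routes:   list of routes, each like [0, 2, 5, 0]
    demand:   dict mapping every customer index to its demand
    capacity: load capacity of one vehicle
    """
    visited = []
    for route in routes:
        # Every vehicle must leave from and return to the depot.
        if len(route) < 2 or route[0] != depot or route[-1] != depot:
            return False
        customers = route[1:-1]
        # The depot must not appear in the middle of a route.
        if depot in customers:
            return False
        # Vehicle load constraint.
        if sum(demand[c] for c in customers) > capacity:
            return False
        visited.extend(customers)
    # Each customer is served exactly once, by exactly one vehicle.
    return sorted(visited) == sorted(demand)
```

Because every route is a single depot-to-depot sequence, path continuity and the absence of subtours that avoid the depot hold by construction in this representation.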

Algorithm design
In the Multi-agent Evolutionary Algorithm (MAEA), the initial feasible solutions are used as an agent network, on which an evolution mechanism is constructed. Each agent learns by itself to increase its energy according to the objective, and competes within its adjacent domain. Agents that fail in competition are eliminated by the agent with the largest energy in that domain. The multi-agent ant colony optimization (MACO) algorithm combines the multi-agent system (MAS) with the ant colony optimization (ACO) algorithm; each agent is equivalent to an ant, and its aim is to minimize the adaptive value.

Initial agents generating
Build an $m \times m$ agent grid. Generate $n$ numbered agents. Select the agents that satisfy the constraints and place them into the agent grid. If the selected agents do not fill the grid, repeat the generation and selection until $m \times m$ feasible agents are obtained. Set the initial status to 0.
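The generate-and-select loop above can be sketched as follows. This is a minimal illustration that assumes an agent is encoded as a permutation of customer indices and that feasibility is supplied as a callback; `batch_size`, `max_rounds` and the function name are hypothetical.

```python
import random


def generate_agent_grid(m, customers, is_feasible,
                        batch_size=50, max_rounds=1000, seed=None):
    """Fill an m-by-m grid with feasible agents (random permutations).

    customers:   list of customer indices
    is_feasible: callable taking an agent and returning True or False
    """
    rng = random.Random(seed)
    feasible = []
    for _ in range(max_rounds):
        # Generate a batch of candidate agents and keep the feasible ones.
        for _ in range(batch_size):
            agent = list(customers)
            rng.shuffle(agent)
            if is_feasible(agent):
                feasible.append(agent)
        if len(feasible) >= m * m:
            break
    else:
        raise RuntimeError("could not generate enough feasible agents")
    # Lay the first m*m feasible agents out row by row.
    return [feasible[r * m:(r + 1) * m] for r in range(m)]
```

The `max_rounds` guard is a design choice of this sketch: it prevents an endless loop when the constraints are so tight that random generation rarely produces a feasible agent.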

Competition operator
Each agent can compete with other agents in the grid. The area consisting of an agent and its four neighboring agents in the four directions is called the adjacent domain. The survival rule is as follows: agent $L$ competes, through the competition operator, only with the agents in its adjacent domain, so each agent has four competitors. Each agent has a certain energy; the energy of agent $L$ is denoted as $E(L)$, and the agent with the largest energy in the adjacent domain of $L$ is denoted as $L'$. When $L$ fails in the competition, it is infected by $L'$.
Infection works as follows: take a code segment of random length at a random position from agent $L'$ and insert it at the corresponding position of agent $L$; remove the duplicated codes in $L$ so that the code length is preserved and each code remains unique; reorder the remaining section of the code of $L$ to satisfy the constraints, repeating the reordering until all constraints are satisfied and agent $L$ is feasible.
There are two ways of self-learning. One is exchanging the positions of codes to increase energy; the other is moving a code segment to increase energy. The learning process of agents is shown in Figure 2.
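The competition and infection steps can be sketched as below. The sketch assumes that agents are permutations of customer indices, that the grid wraps around at its borders so every agent has exactly four neighbors, and that a larger energy is better. The constraint-repair step after infection is omitted, and all names are hypothetical.

```python
import random


def neighbors(r, c, m):
    """Grid positions of the four neighbors of (r, c) on a wrap-around grid."""
    return [((r - 1) % m, c), ((r + 1) % m, c),
            (r, (c - 1) % m), (r, (c + 1) % m)]


def infect(loser, winner, rng):
    """Copy a random segment of `winner` into the same position of `loser`."""
    n = len(loser)
    length = rng.randint(1, n - 1)
    start = rng.randint(0, n - length)
    segment = winner[start:start + length]
    in_segment = set(segment)
    # Drop duplicated codes so the result stays a permutation.
    rest = [code for code in loser if code not in in_segment]
    return rest[:start] + segment + rest[start:]


def compete(grid, r, c, energy, rng):
    """Return the agent occupying (r, c) after one competition step."""
    agent = grid[r][c]
    m = len(grid)
    best = max((grid[i][j] for i, j in neighbors(r, c, m)), key=energy)
    if energy(agent) >= energy(best):
        return agent          # the agent survives unchanged
    return infect(agent, best, rng)
```

In a VRP setting the energy could be taken as the negative of the route cost, so that the cheapest solution in the adjacent domain spreads part of its code to weaker neighbors.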

Ant colony optimization algorithm
The number of ants is denoted as $m$, and the distance from $i$ to $j$ as $d_{ij}$. The pheromone increment depends on the model of the pheromone updating rule. We adopt the ant-cycle system, as shown in Eq. (3-4), where the pheromone intensity is denoted by $Q$, the remaining coefficient is a constant, $Gen$ is the current iteration number and $Gen_{max}$ is the maximum number of iterations.
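For reference, the classical ant-cycle update can be sketched as follows: after all ants have completed their tours, the pheromone on every edge evaporates and each ant deposits an amount inversely proportional to the cost of its own tour. This is the textbook rule, not necessarily the exact form of Eq. (3-4); the evaporation rate `rho`, the symmetric deposit and the function name are assumptions.

```python
def ant_cycle_update(tau, tours, costs, rho, Q):
    """Return the pheromone matrix after one ant-cycle update.

    tau:   square pheromone matrix, tau[i][j]
    tours: list of tours, each a list of node indices
    costs: cost of each tour (length, or a cost that includes fuel)
    rho:   evaporation rate in (0, 1)
    Q:     pheromone intensity
    """
    n = len(tau)
    # Evaporation on every edge.
    new_tau = [[(1.0 - rho) * tau[i][j] for j in range(n)] for i in range(n)]
    for tour, cost in zip(tours, costs):
        deposit = Q / cost  # cheaper tours deposit more pheromone
        for a, b in zip(tour, tour[1:]):
            new_tau[a][b] += deposit
            new_tau[b][a] += deposit  # symmetric network assumed
    return new_tau
```

Passing a tour cost that includes the load-dependent fuel cost, rather than the pure length, is one way to let fuel consumption influence the pheromone trail.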

Major improvement strategies
(i) Adaptive pheromone trail factor $\alpha$ and visibility factor $\beta$. Commonly, a small value of one factor combined with a large value of the other achieves a reasonable convergence speed, whereas the opposite combination yields a larger search space and avoids falling into a local optimum in the later stage of the optimization. Therefore, adjusting $\alpha$ and $\beta$ adaptively helps to improve the optimization ability of the algorithm. An adaptive, nonlinear model that improves the global search ability of the algorithm is given in Eq. (3-5) and Eq. (3-6), where $t$ is the current iteration number and $Gen_{max}$ is the maximum number of iterations.
(ii) Pheromone updating. Reducing the pheromone update coefficient improves the global search ability of the algorithm but lowers the rate of convergence, so its value needs to be adapted. Starting from its initial value, the coefficient is adjusted whenever the best value obtained over N cycles of the algorithm shows no obvious improvement.
(iii) 3-opt local search. A 3-opt local search is used to enhance the local search ability of the algorithm. It works as follows. First, remove 3 edges from the path and add 3 new edges that reconnect the parts, ensuring that the new path remains a single connected path. Next, if the path is shorter after the exchange, keep the result; otherwise, select other edges to exchange. Repeat these steps over all exchange possibilities until the quality of the solution can no longer be improved. Output the best path found and exit.
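The 3-opt procedure described above can be sketched as follows. This is a simple, unoptimized illustration: it assumes a route stored as a list that starts and ends at the depot, recomputes the full route length for every candidate, and restarts the scan after each improvement. The function names are hypothetical.

```python
import itertools
import math


def route_length(route, dist):
    """Total length of a route given a distance matrix."""
    return sum(dist[a][b] for a, b in zip(route, route[1:]))


def three_opt(route, dist):
    """Improve a depot-to-depot route by 3-opt moves until none helps."""
    best = list(route)
    best_len = route_length(best, dist)
    improved = True
    while improved:
        improved = False
        n = len(best)
        # Cutting before positions i < j < k removes three edges.
        for i, j, k in itertools.combinations(range(1, n), 3):
            a, b, c, d = best[:i], best[i:j], best[j:k], best[k:]
            # The seven ways to reconnect; reversals cover 2-opt-style moves.
            candidates = [
                a + b[::-1] + c + d,
                a + b + c[::-1] + d,
                a + b[::-1] + c[::-1] + d,
                a + c + b + d,
                a + c + b[::-1] + d,
                a + c[::-1] + b + d,
                a + c[::-1] + b[::-1] + d,
            ]
            for cand in candidates:
                cand_len = route_length(cand, dist)
                if cand_len < best_len - 1e-12:
                    best, best_len = cand, cand_len
                    improved = True
                    break
            if improved:
                break  # restart the scan from the improved route
    return best, best_len
```

Because the first and last segments always keep the depot at both ends, every candidate remains a valid depot-to-depot route. For larger instances, evaluating only the changed edges instead of the whole route would be considerably faster.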

The process of multi-agent ant colony algorithm
The flow chart of the multi-agent ant colony algorithm is shown in Figure 3.

Parameter settings and results analysis
Our experiments were run in Matlab R2010b on a PC with an Intel(R) Core(TM) i5 3.0 GHz CPU, 8.0 GB of memory and Windows 7. The AGA, ACO and MACO programs were each run 20 times to solve the model.
The parameters of the multi-agent ant colony optimization algorithm, including the ant colony size, are set first. For the genetic operators, we adopt the best-preserved method for selection, together with multipoint crossover and uniform mutation.
MACO reaches its best solution, 156.31 yuan, in the 30th generation. AGA reaches its best solution, 180.63 yuan, in the 20th generation. ACO reaches its best solution, 169.62 yuan, in the 24th generation. The solution obtained by MACO is superior to those of ACO and AGA, although MACO needs slightly more iterations to reach its best solution. The specific distribution plan is shown in Table 2; 6 vehicles are put into service.

Conclusion
This paper proposes a multi-agent ant colony algorithm with an adaptive pheromone stimulating factor, an adaptive expectation factor, and a pheromone update method combined with the fuel consumption rate. Considering the influence of fuel consumption on logistics distribution, we model the vehicle routing problem with soft time windows and road condition factor. We solve a problem of moderate scale with the MACO algorithm, and the results demonstrate the superiority of the algorithm. Future research will expand the scale of the problem and model further variants of the VRP (multi-depot, multi-vehicle, order-relating, asymmetric network, random demand, multi-cyclical, demand-separable, simultaneous pickup and delivery, open VRP, service priority, etc.) together with their solving algorithms.