Collective decision making in dynamic vehicle routing problem

The paper presents collective decision making in the dynamic vehicle routing problem. In contrast to the static problem, some or all of the customers' companies are revealed dynamically during the design or execution of the routes. The problem is modelled using the algebraic-logical meta-model (ALMM) methodology, which enables making collective decisions in successive process stages, not separately for individual vehicles. A general rule is considered according to which the availability of companies is determined. The steps and schematics of the general algorithm that takes into account the dynamic appearance of new companies are shown. The proposed approach belongs to trajectory-based metaheuristic methods.


Introduction
The aim of the paper is to present an ALMM-based approach for collective decision making in the dynamic vehicle routing problem. In dynamic problems, some or all of the input is revealed dynamically during the design or implementation of a statically planned solution. The invention of new ways of working in real factories (e.g. new complex production lines and automation of production) and logistic systems (e.g. mobile phones and global positioning systems) creates the need for new solutions and the development of existing methods. Moreover, a holistic approach is becoming increasingly popular, in place of approaches focusing on individual executors and other resources.
The proposed approach is based on the algebraic-logical meta-model (ALMM) [4], [7], which enables making collective decisions in successive process stages, not separately for individual objects or executors [8], [14]. This approach may take into account the basic and specific limitations of the problem (e.g. deadlines, resources dependent on the process state). Moreover, the ALMM approach allows reconstructing the process of decision making as well as monitoring and tracking decisions during the manufacturing or logistic process. It should be emphasised that the decision is made jointly and takes into account the current situation (current process state). Modelling and simulation of collective decision making in the logistic process belong to artificial intelligence methods.

Motivation
Real-world scheduling problems are among the most challenging because most of them are known to be NP-complete or NP-hard in both their general and dynamic forms (i.e. when problem data change while the optimisation is in progress). For example, in the Dynamic Vehicle Routing Problem (DVRP), the aim is to design the optimal set of routes for a fleet of vehicles in order to serve a given set of customers. However, new customer orders arrive while the working day plan is in progress. In this case, routes must be reconfigured dynamically while executing the current simulation [12]. Dynamic problems are much harder to solve than static combinatorial optimisation problems, in which all data are known in advance, i.e. before the optimisation process has started. For that reason, the dynamic VRP is the subject of many research studies. Moreover, dynamic problems should be considered in the context of increasingly popular modern urban systems and smart cities [13].
The paper [19] is a survey that classifies routing problems from the perspective of information quality and evolution. A number of variants have been studied: the Capacitated VRP (CVRP), where each customer has a demand for a good and vehicles have finite capacity; the VRP with Time Windows (VRPTW), where each customer must be visited during a specific time frame; the VRP with Pick-up and Delivery (PDP), where goods have to be picked up and delivered in specific amounts at the vertices; and the Heterogeneous fleet VRP (HVRP), where vehicles have different capacities. Routing problems that involve moving people between locations are referred to as the Dial-A-Ride Problem (DARP) for land transport or the Dial-A-Flight Problem (DAFP) for air transport.
In the paper [12] the authors present the dynamic VRP, its practical applications and its specific characteristics, together with a review of common representations of the problem and the major classes of metaheuristics proposed to solve it. The authors distinguish two major classes of metaheuristics: trajectory-based and population-based. They assign Tabu Search [21], the Greedy Randomized Adaptive Search Procedure (GRASP) and Variable Neighbourhood Search (VNS) [1], [20] to the first class. The second class includes Ant Colony [18], Evolutionary Algorithms and Particle Swarm Optimization. It should be emphasised that all these metaheuristics have been appropriately adapted to the types of dynamic variables in the considered problems, or combined into hybrids of two or more metaheuristics [10], [16]. Approaches based on dynamic programming could also be classified as trajectory-based simulations [3], [17]. In the literature, other classes of approaches to the dynamic VRP are also considered, such as Scenario Based Planning [2] and Logic and Graph Theory-based approaches [11], [22]. The approach presented in this paper is a trajectory-based simulation. Previously, the problem of supply routes for multi-location companies was considered [9]; it is a special case of the problem considered here. This paper presents an approach based on the ALMM methodology, which takes into account the possibility of modelling the dynamic appearance of new locations to visit during the simulation by adding an appropriate general rule.

Problem Description
Let us consider a supplier company that provides goods to several customers' companies, where the goods are delivered by company cars. The supplier employs a number of sales representatives (vehicles) to sell and deliver goods to customers' offices. A dynamic problem is considered, i.e. some or all of the customers' companies are revealed dynamically during the design or execution of the routes. There may be different reasons for this. In the simplest case, the disclosure times are given in advance. In another case, the availability of companies depends on the availability of other companies or on other conditions resulting from arrangements between companies. For example, in the case of multi-location companies, the head office has to be visited first (in order to negotiate, to establish the transaction conditions, etc.) and only then is delivery to the branches possible. The article considers a general rule that determines from when it is possible to deliver to a given company.
The problem under consideration can be stated as follows. A supplier company is given, together with a team of the supplier's travelling vehicles and a set of locations to visit. Each location represents a customer company (head offices and branches are not distinguished in the case of a multi-location company). There is also one initial location for the vehicles (i.e. the supplier company office). Each location (except the initial one) must be visited exactly once, and any vehicle from the team can do it. Vehicles start from the initial location and return to it at the end. A location can be visited only if the conditions resulting from arrangements between companies are met (they define the rules for making the company available to visit). The distances between locations are known. The aim of the optimisation is to plan vehicle routes so that the whole set of locations is visited in the shortest time.
The following notation is used: $M = \{m_1, m_2, \ldots, m_{|M|}\}$ is a set of vehicles and $N = \{n_0, n_1, n_2, \ldots, n_{|N|}\}$ is a set of locations, where the initial location is denoted as $n_0$ and $n_i$ for $i = 1, \ldots, |N|$ denotes the locations to visit. In addition, let $a_i$ denote an available location to visit and let $A \subset N \setminus \{n_0\}$ be the set of all available locations to visit. The number of elements of the set $A$ changes (increases) over time. The locations available in the initial state are known. The availability dates of subsequent locations are calculated according to a predetermined rule $r(s)$, which means that the availability dates of individual locations depend on the state of the system; for the location $n_i$ this date is denoted as $r_i(s)$.
The distances between locations are presented in the distance matrix $D$, where the individual elements $d_{ij}$ are the distances between the location $n_i$ and the location $n_j$, where $i, j = 0, 1, 2, \ldots, |N|$ and $i \neq j$. All diagonal elements $d_{ii}$ are infinite. Two kinds of vehicles are considered: a car or a truck. A vehicle travels with the speed $v$, where a car travels with the speed $v_1$ and a truck travels with the speed $v_2$, because of traffic regulations and technical capabilities. In addition, the time spent at every location is omitted. The optimisation criterion is the makespan $C_{\max}$. Thus, the dynamic vehicle routing problem with a heterogeneous fleet (D-HVRP) is examined.
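As an illustration, the data of such a problem can be held in a small structure. The sketch below is only a minimal example in Python: the class name `Instance`, the method `travel_time`, the zero-based vehicle indices and all numeric values are assumptions made for illustration and are not part of the model.

```python
import math
from dataclasses import dataclass
from typing import List

INF = math.inf


@dataclass(frozen=True)
class Instance:
    """Hypothetical container for the data of one D-HVRP instance."""
    distances: List[List[float]]  # matrix D: distances[i][j] is d_ij, diagonal is infinite
    speeds: List[float]           # speeds[k] is v_1 for a car or v_2 for a truck

    def travel_time(self, k: int, i: int, j: int) -> float:
        """Time vehicle k needs to travel from location i to location j."""
        return self.distances[i][j] / self.speeds[k]


# Illustrative values only: location 0 plays the role of the initial location n_0.
example = Instance(
    distances=[[INF, 10.0, 20.0],
               [10.0, INF, 15.0],
               [20.0, 15.0, INF]],
    speeds=[2.0, 1.0],  # vehicle 0 is a car (v_1 = 2), vehicle 1 is a truck (v_2 = 1)
)

print(example.travel_time(0, 0, 1))  # car travelling from n_0 to n_1
print(example.travel_time(1, 1, 2))  # truck travelling from n_1 to n_2
```

Keeping the diagonal infinite means that a journey from a location to itself never looks attractive to a rule that prefers short travel times.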
Note that the well-known vehicle routing problem is NP-hard, thus, the considered problem also belongs to the class of NP-hard problems.

Algebraic-logical Model of Problem
The described problem is modelled as a multistage decision process.
The algebraic-logical meta-model of a multistage decision process is a process defined by the sextuple $P = (U, S, s_0, f, S_N, S_G)$ where: $U$ is a set of decisions, $S = X \times T$ is a set of generalised states ($X$ is a set of proper states and $T \subset \mathbb{R}^+ \cup \{0\}$ is a subset of nonnegative real numbers representing the time instants), $f: U \times S \to S$ is a partial function called a transition function (it does not have to be defined for all elements of the set $U \times S$), $s_0 = (x_0, t_0)$ is an initial generalised state, $S_N \subset S$ is a set of non-admissible generalised states, and $S_G \subset S$ is a set of goal generalised states, i.e. the states in which we want the process to be at the end.
The transition function is defined by means of two functions, i.e. $f = (f_x, f_t)$ where: $f_x: U \times X \times T \to X$ determines the next state and $f_t: U \times X \times T \to T$ determines the next time instant. As a result of the decision $u$ that is taken at some proper state $x$ and a moment $t$, the state of the process changes to $x' = f_x(u, x, t)$, which is observed at the moment $t' = f_t(u, x, t) = t + \Delta t$.
Since not all decisions defined formally make sense in certain situations, the transition function is defined as a partial one. Consequently, all limitations concerning the control decisions in a given state $s$ can be defined in a convenient way by means of the so-called sets of possible decisions $U_p(s) = \{u \in U : (u, s) \in \mathrm{Dom}\, f\}$. The values of particular co-ordinates of a state and decision do not have to be only numerical, but may be the names of elements (symbols) as well as certain objects (e.g. a finite set, sequence etc.).
To define a particular optimisation problem in the ALMM methodology, one should build an algebraic-logical model of the problem $P$ and give a specified optimisation criterion $Q$. The optimisation task is to find an admissible decision sequence $\tilde{u}$ that optimises the criterion $Q$. Thus, the optimisation problem is defined by the pair $(P, Q)$.
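The role of the partial transition function can be illustrated with a toy process. The following sketch is not taken from the ALMM literature: the function names, the use of `None` to mark pairs outside the domain of the transition function, and the counter process itself are assumptions chosen only to show how a set of possible decisions follows from the domain of a partial function.

```python
from typing import Callable, Iterable, List, Optional, Tuple

State = Tuple[int, float]  # generalised state s = (x, t)
Transition = Callable[[int, State], Optional[State]]


def possible_decisions(decisions: Iterable[int], f: Transition, s: State) -> List[int]:
    """U_p(s): the decisions u for which the partial function f is defined at s."""
    return [u for u in decisions if f(u, s) is not None]


def toy_transition(u: int, s: State) -> Optional[State]:
    """Toy partial transition function: x is a counter that may not exceed 3."""
    x, t = s
    if x + u > 3:
        return None            # the pair (u, s) lies outside the domain of f
    return (x + u, t + 1.0)    # x' = f_x(u, x, t) and t' = t + Δt with Δt = 1


print(possible_decisions([1, 2, 3], toy_transition, (2, 0.0)))
```

In the state with counter value 2 only the decision 1 remains possible, because the other two decisions would leave the domain of the transition function.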

Algebraic-logical model of the problem
In this section an algebraic-logical model of the problem in question is described: the state of the system, the set of non-admissible states and the set of goal states, the decision, set of possible decisions and the transition function.

Process state
The process state $s = (x, t)$ at a particular moment $t$ can be described by the current state of all vehicles and a set of visited locations. The proper state is defined as $x = (x^0, x^1, \ldots, x^{|M|}, x^{|M|+1})$, where:
• $x^0$ - set of locations that have been visited until the moment $t$,
• $x^k$ - state of the $k$-th vehicle, for $k = 1, 2, \ldots, |M|$,
• $x^{|M|+1}$ - set of locations available at the current moment $t$.
At any given time a vehicle can either go to some location or remain at the last visited location. The state of the $k$-th vehicle at a time $t$ is $x^k = (n, l)$, where the values of the individual variables are:
• $n \in N \cup \{n_0\}$ - location the vehicle is going to or location at which the vehicle stays,
• $l \in \mathbb{R}^+ \cup \{0\}$ - length of road which the vehicle has to travel to reach the assigned location. If the vehicle is staying at some location, the value of $l$ is equal to 0.
The initial generalised state of the process is $s_0 = (x_0, t_0)$, where $x_0 = (x_0^0, x_0^1, \ldots, x_0^{|M|}, x_0^{|M|+1})$. In the initial state (when $t_0 = 0$), all vehicles are at the initial location $n_0$, therefore $x_0^k = (n_0, 0)$ for $k = 1, 2, \ldots, |M|$. No location has been visited at the time $t_0$, so the set $x_0^0$ is empty: $x_0^0 = \emptyset$.
The $k$-th vehicle is said to be idle (not working) in the state $s$ when it is staying at a location and the next location to visit can be assigned to it; the state of the $k$-th vehicle is then $x^k(s) = (n, 0)$. A vehicle is working (busy) in a given state when it is going to a designated location, i.e. $x^k(s) = (n, l)$ with $l > 0$, or when it has finished the journey and returned to the initial location $n_0$.

Decisions
In a given state $s$ a decision $u$ must be taken. It is not determined separately for one particular vehicle but collectively for all vehicles. For that reason, the decision is defined as a vector $u \in U = U^1 \times U^2 \times \ldots \times U^{|M|}$, instead of separate values for particular vehicles. The particular coordinate $u^k \in U^k$ represents a separate decision and refers to the $k$-th vehicle ($k = 1, \ldots, |M|$).
It should be emphasised that, because the problem is dynamic, the values of the possible decisions differ between states; that is why the set of decisions in the state $s$, denoted $U(s)$, is considered.
Therefore, during the collective decision making, the joint operation of all vehicles is assessed, not the sum of the activities of individual vehicles. The decision chosen is the best with respect to the common goal. In addition, not all decisions are possible at every stage due to the existing constraints. The decision is picked from the set of possible decisions $U_p(x, t)$, i.e. decisions that can be taken in the particular state $s$. In the problem under consideration, the decision is to determine to which locations particular vehicles should go at the moment $t$. The location to visit must be available in this state. The time of location availability is not known in advance because it depends on the state of the whole system. Therefore, a rule $r(s)$, checking whether the set of available locations should be extended by a new available location, should be evaluated in each state.
It was assumed that the decision assigning a vehicle the next location to visit can be taken only after it reaches the previously assigned location. The decision cannot be changed for a vehicle which has not yet reached its assigned location (a decision can be only to continue the travel). For a vehicle staying at a certain location, a decision can be to visit another location, to return to the initial location or remain at the current location.
Thus, the decision in the state $s$ is a vector $u(s) = (u^1(s), \ldots, u^{|M|}(s))$ and the value of a particular coordinate $u^k(s) \in x^{|M|+1}(s)$ is as follows:
• if the $k$-th vehicle is working and location $n_j$ is assigned to it (i.e. $x^k = (n_j, l)$, where $l > 0$), then the only possible decision is to continue travelling, $u^k = n_j$,
• if the $k$-th vehicle is idle (is staying at location $n_i$, i.e. $x^k = (n_i, 0)$) and there is no location that could be assigned ($x^{|M|+1} = \emptyset$), but not all locations have been visited or assigned to visit, i.e. $N \neq x^0(s) \cup \{n_j : \exists\, p = 1, \ldots, |M| : x^p(s) = (n_j, l),\ l > 0\}$, then the only possible decision is not to assign any new location; the $k$-th vehicle remains at location $n_i$, $u^k = n_i$,
• if the $k$-th vehicle is idle (is staying at location $n_i$, i.e. $x^k = (n_i, 0)$) and there is an available location that could be assigned ($x^{|M|+1} \neq \emptyset \wedge n_j \in x^{|M|+1}$), then the possible decision is to assign a new location, $u^k = n_j$,
• if the $k$-th vehicle is idle (is staying at location $n_i$, i.e. $x^k = (n_i, 0)$), there is no location that could be assigned ($x^{|M|+1} = \emptyset$) and all locations have been visited or assigned to visit, i.e. $N = x^0(s) \cup \{n_j : \exists\, p = 1, \ldots, |M| : x^p(s) = (n_j, l),\ l > 0\}$, then the only possible decision is to return to the initial location, $u^k = n_0$.
The complete set of possible decisions $U_p(s)$ consists of all decision vectors whose coordinates satisfy the conditions above, excluding the decisions that assign the same location to more than one vehicle at the same time. The only exception is the initial location $n_0$: once all locations have been visited, several vehicles may be directed to return to it simultaneously.
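The rules for possible decisions can be sketched as a short enumeration procedure. This is only an illustrative Python sketch under stated assumptions: locations are represented by integer indices with 0 standing for $n_0$, the set `to_visit` holds the indices of the locations other than $n_0$, and an idle vehicle is additionally allowed to remain where it is even when locations are available (otherwise two idle vehicles competing for a single available location would have no possible decision). The function name and signature are hypothetical.

```python
from itertools import product
from typing import List, Set, Tuple

VehicleState = Tuple[int, float]  # x^k = (location index, remaining length)


def possible_decisions(visited: Set[int], vehicles: List[VehicleState],
                       available: Set[int], to_visit: Set[int]) -> List[Tuple[int, ...]]:
    """Enumerate collective decisions that respect the rules described above."""
    assigned = {loc for loc, remaining in vehicles if remaining > 0}
    all_served = (visited | assigned) >= to_visit
    options = []
    for loc, remaining in vehicles:
        if remaining > 0:
            options.append([loc])                      # working: continue travelling
        elif available:
            options.append(sorted(available) + [loc])  # assign a location, or stay (assumption)
        elif all_served:
            options.append([0])                        # return to the initial location
        else:
            options.append([loc])                      # wait at the current location
    decisions = []
    for u in product(*options):
        # Locations newly assigned to idle vehicles by this decision vector.
        new_targets = [target for target, (loc, remaining) in zip(u, vehicles)
                       if remaining == 0 and target in available]
        if len(new_targets) == len(set(new_targets)):  # no location assigned twice
            decisions.append(u)
    return decisions


# Two idle vehicles at n_0, locations 1 and 2 available, location 3 not yet available.
print(possible_decisions(set(), [(0, 0.0), (0, 0.0)], {1, 2}, {1, 2, 3}))
```

The Cartesian product makes the collective nature of the decision explicit: each element is one vector for the whole fleet, and vectors that send two vehicles to the same location are filtered out afterwards.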

Transition function
Based on the current state $s = (x, t)$ and the decision $u$ taken in this state, the subsequent state $(x', t') = f(u, x, t)$ is generated by means of the transition function $f$. The transition function is defined for each possible decision $u(s) \in U_p(s)$ and consists of two stages.
First, it is necessary to determine the moment $t'$ when the subsequent state occurs. It is the nearest moment at which at least one vehicle reaches its previously assigned location or at which a new location becomes available. The subsequent state $s'$ will occur at the moment $t' = t + \Delta t$, where $\Delta t$ equals the lowest value of the established completion times:
• if the $k$-th vehicle, staying at location $n_i$, is idle and has been assigned a location $n_j$ to visit, the completion time is equal to $d_{ij} / v$, where $v = v_1$ for a car or $v = v_2$ for a truck,
• if the $k$-th vehicle is going to a previously assigned location $n_j$ and the length of the remaining part of the road which the vehicle has to travel to reach the assigned location is equal to $l$ (i.e. $x^k = (n_j, l)$, $0 < l < d_{ij}$), then the completion time is equal to $l / v$, where $v = v_1$ for a car or $v = v_2$ for a truck,
• if the $k$-th vehicle is idle and no new location is assigned, the completion time is equal to the minimum time after which a new location becomes available, as determined by the rule $r(s)$.
Once the moment $t'$ is known, it is possible to determine the proper state of the process at that time.
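The first stage of the transition function can be sketched as follows. The code is a minimal Python illustration, not the authors' implementation: the function name `time_step`, the zero-based indexing and the parameter `availability_times` (assumed to hold already evaluated absolute availability dates of locations that are not yet available) are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

VehicleState = Tuple[int, float]  # x^k = (location index, remaining length)


def time_step(t: float, vehicles: List[VehicleState], decision: Sequence[int],
              distances: List[List[float]], speeds: List[float],
              availability_times: List[float]) -> float:
    """Return Δt, the lowest of the established completion times."""
    candidates = []
    for k, ((loc, remaining), target) in enumerate(zip(vehicles, decision)):
        if remaining > 0:
            candidates.append(remaining / speeds[k])               # finish the current journey
        elif target != loc:
            candidates.append(distances[loc][target] / speeds[k])  # newly assigned journey
    # Time left until the next location becomes available.
    candidates.extend(r - t for r in availability_times if r > t)
    return min(candidates) if candidates else math.inf


INF = math.inf
D = [[INF, 10.0, 20.0], [10.0, INF, 15.0], [20.0, 15.0, INF]]
# Vehicle 0 (speed 2) is sent from n_0 to n_1; vehicle 1 (speed 1) still has 6 units to go.
print(time_step(0.0, [(0, 0.0), (2, 6.0)], (1, 2), D, [2.0, 1.0], [4.0]))
```

An infinite result signals that no event can occur any more, which a trajectory generator would have to treat as a dead end.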
The state of vehicle
The values of the subsequent coordinates $x^k$ for $k = 1, 2, \ldots, |M|$ represent the states of vehicles. The state $x^k = (n, l)$ goes into ${x'}^k = (n', l')$, for $k = 1, 2, \ldots, |M|$, and is determined as follows, where $v$ is the speed of the $k$-th vehicle and $\Delta t = t' - t$:
• if the $k$-th vehicle is going to location $n_j$ (i.e. the vehicle is working, $x^k = (n_j, l)$, where $l > 0$) and the decision is to continue the travel, $u^k = n_j$, then the destination is unchanged and the remaining length decreases by the distance travelled: ${x'}^k = (n_j, l - v \Delta t)$,
• if the $k$-th vehicle is staying at location $n_i$ (i.e. the vehicle is idle, $x^k = (n_i, 0)$), there is no location that could be assigned ($x^{|M|+1} = \emptyset$), but not all locations have been visited or assigned to visit, i.e. $N \neq x^0(s) \cup \{n_j : \exists\, p = 1, \ldots, |M| : x^p(s) = (n_j, l),\ l > 0\}$, and the decision is to stay at the location, $u^k = n_i$, then the state is unchanged: ${x'}^k = (n_i, 0)$,
• if the $k$-th vehicle is staying at location $n_i$ (i.e. $x^k = (n_i, 0)$), there is an available location that could be assigned ($x^{|M|+1} \neq \emptyset \wedge n_j \in x^{|M|+1}$) and the decision is to assign a new location, $u^k = n_j$, then ${x'}^k = (n_j, d_{ij} - v \Delta t)$,
• if the $k$-th vehicle is staying at location $n_i$ (i.e. $x^k = (n_i, 0)$), there is no location that could be assigned ($x^{|M|+1} = \emptyset$), all locations have been visited or assigned to visit, i.e. $N = x^0(s) \cup \{n_j : \exists\, p = 1, \ldots, |M| : x^p(s) = (n_j, l),\ l > 0\}$, and the decision is to assign the initial location, $u^k = n_0$, then ${x'}^k = (n_0, d_{i0} - v \Delta t)$.
The set of available locations
The set of locations available in a given state (coordinate $x^{|M|+1}$) is modified by removing the locations which have just been visited or assigned to a vehicle, and by adding each location whose availability time, calculated from the rule $r(s)$, falls within the time window between the current state and the next state; the initial location $n_0$ is always available.
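The update of a single vehicle state can be sketched in a few lines. This is an illustrative Python sketch that follows directly from the definition of the vehicle state as a pair (destination, remaining length); the function name `advance_vehicle` and the zero-based location indices are assumptions.

```python
import math
from typing import List, Tuple

VehicleState = Tuple[int, float]  # x^k = (location index, remaining length)


def advance_vehicle(state: VehicleState, target: int, dt: float, speed: float,
                    distances: List[List[float]]) -> VehicleState:
    """Return the state of one vehicle after the time dt has elapsed."""
    loc, remaining = state
    if remaining > 0:                 # working: the destination cannot be changed
        return (loc, max(remaining - speed * dt, 0.0))
    if target == loc:                 # idle and told to stay where it is
        return (loc, 0.0)
    # Idle and sent to a new location (or back to the initial location 0).
    return (target, max(distances[loc][target] - speed * dt, 0.0))


INF = math.inf
D = [[INF, 10.0, 20.0], [10.0, INF, 15.0], [20.0, 15.0, INF]]
print(advance_vehicle((0, 0.0), 1, 4.0, 2.0, D))  # still 2 units away from location 1
```

A remaining length of 0 after the update means that the vehicle has arrived, so the location can be added to the set of visited locations.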
The general algorithm for generating a single trajectory for the problem under consideration is as follows. The trajectory is generated from the initial state $s_0 = (x_0, t_0)$. Each next state $s' = (x', t')$ depends on the previous state $s = (x, t)$ and the decision $u(s)$ made in that state. The decision $u(s)$ is chosen from the set of possible decisions in the given state, using a choosing technique. Then, for the given state and the chosen decision, a new process state $s' = (x', t')$ is determined. The proper state of the process and the corresponding moment of time are calculated using the transition function $f(u, x, t)$. Firstly, the moment $t'$ is determined as the minimum of the times at which the vehicles complete their activities and the times of availability of new locations, calculated by the rule $r(s)$. Secondly, the proper state of the process $x'$ is determined, i.e. the set of visited locations, the states of the particular vehicles and the set of available locations. Then it is checked whether the new system state belongs to the set of non-admissible generalised states; if so, the trajectory generation is stopped (the trajectory is non-admissible). The next step is to check whether the new state belongs to the set of goal states. If it does, the generation of the trajectory is completed successfully and the solution can be identified. Figure 1 shows the schema of the algorithm. A very important element of the algorithm is the technique for choosing the decision $u(s)$. The decisions can be determined by local decision-making procedures. The literature presents different heuristic methods for discrete optimisation problems using local optimisation, which are based on minimisation (maximisation) of the local increase of the quality criterion.
Moreover, based on the properties of the ALMM methodology, it is possible to create sophisticated local optimisation criteria, which take into account much more information than merely the increase of the criterion (for example unfavourable states or distinguished favourable states). Another way to determine the decision in a given process state is to use specially constructed rules, which determine only one or a few (usually not all) of the decisions belonging to the set of possible decisions $U_p(s)$. A specific algorithm requires establishing the decision-making technique. Based on the ALMM methodology, a special type of metaheuristics and algorithms has been developed ([6], [7], [8], [9], [14]), which can be used to determine local decision-making procedures for this specific problem.
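The generation of a single trajectory can be illustrated by the following sketch. It is a simplified Python example and not the algorithm from Figure 1: the availability rule is reduced to release times known in advance (the simplest case mentioned in the problem description), the choosing technique is a greedy nearest-location rule selected only for illustration, the set of non-admissible states is reduced to a dead-end check, and all names are hypothetical.

```python
import math
from typing import List, Tuple


def generate_trajectory(distances: List[List[float]], speeds: List[float],
                        release: List[float]) -> Tuple[float, List[List[int]]]:
    """Generate one trajectory with a greedy nearest-location rule (an assumption).

    distances[i][j] is d_ij, speeds[k] is the speed of vehicle k and release[i]
    is the time from which location i may be visited (release[0] is unused).
    Returns the makespan and the route of every vehicle.
    """
    to_visit = set(range(1, len(distances)))
    t = 0.0
    visited = set()                                 # coordinate x^0
    vehicles = [(0, 0.0) for _ in speeds]           # x^k = (location, remaining length)
    available = {i for i in to_visit if release[i] <= t}
    routes = [[0] for _ in speeds]

    while True:
        # Goal state: every location visited and every vehicle back at n_0.
        if visited == to_visit and all(v == (0, 0.0) for v in vehicles):
            return t, routes

        # Collective decision for all vehicles, based on the current state only.
        assigned = {loc for loc, rem in vehicles if rem > 0}
        all_served = (visited | assigned) == to_visit
        free = set(available)
        decision = []
        for loc, rem in vehicles:
            if rem > 0:
                decision.append(loc)                # continue travelling
            elif free:
                target = min(sorted(free), key=lambda j: distances[loc][j])
                free.discard(target)
                decision.append(target)             # nearest available location
            elif all_served:
                decision.append(0)                  # return to n_0
            else:
                decision.append(loc)                # wait

        # Stage 1: the next moment t' = t + dt.
        events = [release[i] - t for i in to_visit if t < release[i] < math.inf]
        for k, ((loc, rem), target) in enumerate(zip(vehicles, decision)):
            if rem > 0:
                events.append(rem / speeds[k])
            elif target != loc:
                events.append(distances[loc][target] / speeds[k])
        if not events:
            raise RuntimeError("no further event: the trajectory is non-admissible")
        dt = min(events)

        # Stage 2: the next proper state.
        new_vehicles = []
        for k, ((loc, rem), target) in enumerate(zip(vehicles, decision)):
            if rem == 0 and target != loc:          # a new journey starts
                rem, loc = distances[loc][target], target
                available.discard(target)
                routes[k].append(target)
            if rem > 0:
                rem -= speeds[k] * dt
                if rem <= 1e-9:                     # the vehicle has arrived
                    rem = 0.0
                    if loc != 0:
                        visited.add(loc)
            new_vehicles.append((loc, rem))
        vehicles = new_vehicles
        available |= {i for i in to_visit if t < release[i] <= t + dt}
        t += dt


INF = math.inf
D = [[INF, 10.0, 20.0], [10.0, INF, 15.0], [20.0, 15.0, INF]]
# Location 2 is revealed only at time 12; vehicle 0 is a car, vehicle 1 a truck.
print(generate_trajectory(D, [2.0, 1.0], [0.0, 0.0, 12.0]))
```

The greedy rule generates only one trajectory and gives no guarantee of optimality; replacing the decision step with another local optimisation criterion or rule changes the trajectory without affecting the rest of the loop.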

Conclusion
The paper presents an approach based on the ALMM methodology, which takes into account the possibility of modelling the dynamic appearance of new locations to visit during the simulation by adding an appropriate general rule. It has been shown that this approach enables defining flexible rules to model a dynamic process. Future research plans include adding predictive methods to the model in order to support the decision-making process with additional knowledge.