Setting the Configuration Parameters of the Algorithm for the Periodic Vehicle Routing Problem by HPC Power

The quality of a solution of the Vehicle Routing Problem depends strongly on the setting of the configuration parameters of the algorithm. This paper introduces a hyperparameter search for solving the Vehicle Routing Problem using HyperLoom, a platform for defining and executing scientific pipelines in a distributed environment. As a concrete example, we focus on the Periodic Vehicle Routing Problem for waste collection. The HyperLoom platform was used to define and execute the hyperparameter sweep pipeline. The heuristic algorithm was tested on a real benchmark of waste collection in Ostrava, Czech Republic. The aim of our case study was to effectively combine the minimization of the total travelled distance with the optimization of the fairness of the routes, measured by the standard deviation of tour length. The waste collection problem is extensive and computationally demanding, so a high performance computing architecture was needed to test a large number of different settings of the configuration parameters. The experiments were run on the Salomon supercomputer operated by the IT4Innovations National Supercomputing Center in the Czech Republic.


Introduction
In 2017, the Czech Republic produced 34.5 million tonnes of waste, of which 1.5 million tonnes was hazardous waste. The Czech Republic is intensively engaged in waste recycling. Of the 34.5 million tonnes of waste, 84% was recycled, of which 80.5% as material and 3.6% as energy [1]. Every resident of the Czech Republic produces 3,259 kg of waste of all types per year, of which about 531 kg is mixed waste. Ostrava is the third largest city in the Czech Republic, so the problem of waste collection is very important there.
The waste management process is very complex and involves a range of technological methods and disciplines. Collection problems in transport planning are usually solved using graph theory. One well-known combinatorial optimization problem is the Traveling Salesman Problem (TSP). An extension of the TSP is the Vehicle Routing Problem (VRP) [2], which aims to create routes for several vehicles with limited capacity that serve a given number of customers.
In this paper, we use the Periodic Vehicle Routing Problem (PVRP). In the basic formulation of the VRP, the typical planning period is one day. In the PVRP, the classical planning is generalized to M days. The difference is that a vehicle does not return to the depot on the same day (for example, the day it left the depot). In this type of problem, the goal is to minimize the number of vehicles and the time needed to serve the customers (for example, the cost of all rounds). Each customer has a given daily demand, which must be satisfied by one vehicle and by one route. If M = 1, we are dealing with a typical delivery problem. In the PVRP, the demand of individual customers can be clustered into groups, creating an individual route for each day. Each customer must be served k times, where 1 ≤ k ≤ M. The PVRP has received considerable attention in the literature. The first work on the PVRP was presented in [3], followed by further papers on this type of problem [4,5]. Periodic deliveries are used in many applications, including courier services, elevator maintenance and repair, vending machine replenishment, the collection of waste, and the delivery of interlibrary loan material.
Heuristic and metaheuristic algorithms are used to solve the problem of waste collection. These algorithms have configuration parameters whose setting affects the quality of the solution they provide. In this work, we use the heuristic algorithm that was applied to an exhausted oil collection problem in the area of Bali, Indonesia [6,7]. The formulations of the two problems are very similar. The heuristic algorithm requires nine configuration parameters to be set for each type of waste. The datasets were generated in cooperation with the waste collection company in Ostrava. In total, 28,800 combinations of configuration parameters are used for testing the algorithm.
The aim of this article is to introduce an implementation of a distributed hyperparameter search for the vehicle routing problem solver based on HyperLoom [8]. Using a HyperLoom pipeline, we are able to sweep through large parameter spaces and discover the best-performing configuration of the heuristic algorithm.
The waste collection problem is extensive and computationally demanding, so it is necessary to use a high performance computing (HPC) architecture to test a large number of different settings of the configuration parameters. The experiments were run on the Salomon supercomputer operated by IT4Innovations.

Periodic VRP for waste collection
Beltrami and Bodin formulated the PVRP for waste collection for the cities of New York and Washington, DC [3]. They were the first to present the concept of the PVRP for the daily planning of collection routes. The aim of Beltrami and Bodin's concept is to minimize the travel time and to select a feasible number of vehicles, rather than to minimize the distribution costs. The solution was based on a modification of the Clarke-Wright algorithm, a heuristic algorithm combined here with randomized search procedures [3].
Russell and Igo addressed the PVRP for waste collection and developed a heuristic algorithm that assigns points to days of the week in order to minimize the weekly travel distance (time). The problem presented there is extended to more than 750 points [9].
A four-phase approach to the PVRP for waste collection was presented by Russell and Gribbin in [10], and a number of different solution techniques by Chao, Golden, and Wasil in [11]. In this article, the aim is to effectively combine the minimization of the total travelled distance with the optimization of the fairness of the routes, measured by the standard deviation of tour length.

Setting the configuration parameters of PVRP algorithm
The authors of [3,12,13] used heuristic or metaheuristic algorithms for the PVRP. The Clarke-Wright algorithm is one of the classic heuristic algorithms used to solve the VRP and is based on the notion of savings. The Clarke-Wright algorithm solves the task iteratively, in incremental steps. The algorithm first constructs the least effective solution, which it then improves at each successive step. At each step, the solution is checked against the defined limiting conditions. Local Search (LS) is an algorithm that moves iteratively from solution to solution within the set of acceptable solutions through local changes, until it reaches a solution that is considered optimal or until the specified search time expires. Tabu Search (TS) is a metaheuristic algorithm that uses LS with additional rules. LS tends to end up in a local optimum or on a plateau, where several solutions are of equal value. TS therefore improves LS by changing the basic rules: if no step in the neighbourhood would improve the solution, a worsening step is taken (in this way the search can escape from the local optimum), and bans on certain moves are introduced (hence the notion of tabu). The genetic algorithm (GA) is a metaheuristic algorithm that attempts to imitate the process of natural selection. GA belongs to a broader set of so-called evolutionary algorithms, which mimic techniques from nature such as inheritance, mutation, selection, and crossover.
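The savings idea described above can be illustrated by a minimal sketch. It assumes planar coordinates with Euclidean distances, a single vehicle capacity, and no limit on route duration; the function name clarke_wright and its arguments are chosen here for illustration only, and this is the textbook savings heuristic, not the algorithm tested in this paper.

```python
import math

def clarke_wright(coords, demand, capacity):
    """Savings heuristic sketch. coords[0] is the depot; demand[0] is ignored."""
    n = len(coords)
    dist = lambda i, j: math.dist(coords[i], coords[j])
    # Start from the least effective solution: one round trip per customer.
    routes = {i: [i] for i in range(1, n)}    # route id -> ordered customers
    route_of = {i: i for i in range(1, n)}    # customer -> route id
    load = {i: demand[i] for i in range(1, n)}
    # Saving gained by serving i and j consecutively instead of separately.
    savings = sorted(
        ((dist(0, i) + dist(0, j) - dist(i, j), i, j)
         for i in range(1, n) for j in range(i + 1, n)),
        reverse=True,
    )
    for s, i, j in savings:
        if s <= 0:
            break                              # no further improvement possible
        ri, rj = route_of[i], route_of[j]
        # Limiting conditions: different routes and capacity not exceeded.
        if ri == rj or load[ri] + load[rj] > capacity:
            continue
        a, b = routes[ri], routes[rj]
        # Merge only when i and j are end points of their routes.
        if a[-1] == i and b[0] == j:
            merged = a + b
        elif b[-1] == j and a[0] == i:
            merged = b + a
        elif a[0] == i and b[0] == j:
            merged = a[::-1] + b
        elif a[-1] == i and b[-1] == j:
            merged = a + b[::-1]
        else:
            continue
        routes[ri] = merged
        load[ri] += load[rj]
        for c in b:
            route_of[c] = ri
        del routes[rj], load[rj]
    return list(routes.values())

# Invented toy instance: depot at the origin, two customers on each side.
toy_routes = clarke_wright(
    [(0, 0), (1, 0), (2, 0), (-1, 0), (-2, 0)], [0, 1, 1, 1, 1], capacity=2
)
```

In the toy instance, the customers on the same side of the depot end up on a common route, because merging them yields a positive saving while merging across the depot yields none.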
Optimization algorithms for solving the VRP involve a large number of design choices and algorithm-specific parameters that need to be set to reach their best performance. The maximum performance of optimization algorithms may depend on the setting of tens to hundreds of parameters. There are many methods for setting the configuration parameters of a VRP algorithm. The authors of [12] used heuristic search techniques to set the configuration parameters of metaheuristic algorithms. Another group of parameter tuning methods for optimization algorithms is focused on statistical modelling approaches [13]. The authors of [12] proposed machine learning approaches.
In machine learning, such settings are called hyperparameters. Hyperparameters are configuration variables that are external to the model and whose values cannot be estimated from data. Hyperparameter tuning can be done using grid search, randomized search, or Bayesian optimization.
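As an illustration of the simplest of these strategies, the following sketch enumerates a full grid of parameter values and keeps the configuration with the lowest objective value. The parameter names (alpha, iterations) and the toy objective are invented for the example and do not correspond to the nine parameters of the heuristic algorithm.

```python
from itertools import product

def grid_search(param_grid, evaluate):
    """Evaluate every combination in param_grid and return the best one.

    param_grid maps a parameter name to the list of values to try;
    evaluate(params) returns a score to be minimized.
    """
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical grid and objective, for illustration only.
grid = {"alpha": [0.1, 0.5, 1.0], "iterations": [10, 100]}
toy_objective = lambda p: (p["alpha"] - 0.5) ** 2 + 1.0 / p["iterations"]
best, score = grid_search(grid, toy_objective)
```

The number of evaluations is the product of the list lengths, which is why a grid over nine parameters quickly reaches thousands of runs.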

Test case for PVRP waste collection
In this work, we used mixed waste, paper waste, plastic waste, and glass waste. We prepared a set of 3,000 different locations for the mixed waste, a set of 2,000 different locations for the plastic waste, a set of 1,500 different locations for the paper waste, and a set of 1,000 different locations for the glass waste. For each location, we know the expected amount of waste that accumulates within one week. The travel times between any two locations are known. Every waste collection tour starts at the depot with an empty vehicle. The vehicle picks up waste at the stops until it is full and then unloads it at a dump site. The dump site and the depot are at the same GPS location. Each GPS location is one collection point for the vehicle. The travel time of each route is not allowed to exceed a given bound on working hours, which corresponds to eight hours. Metric distances were computed for each dataset in Ostrava. The situation is displayed in Figure 1. The planning period for the waste collection was 14 days, including weekends. The type of collection vehicle depends on the type of waste.

HyperLoom for heuristic algorithm
We explored 7,200 combinations of the configuration parameters for each type of waste, resulting in a total of 28,800 unique parameter configurations. We developed a HyperLoom pipeline (see Figure 2) that executes individually configured instances of the heuristic algorithm in parallel within a distributed system and dynamically balances the workload between the nodes as instances finish. Table 1 contains the settings of the configuration parameters for the particular types of waste. The experiments were performed on the Salomon supercomputer, where each node has two twelve-core Intel Xeon processors and 128 GB RAM (2x Intel Xeon E5-2680 v3, 2.5 GHz, 12 cores). We carried out all the experiments on 56 identical physical computational nodes. Figure 3 illustrates the CPU and memory utilization of a subset of the computational nodes during the pipeline execution.
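The structure of such a sweep can be sketched on a single machine with the Python standard library. This is not the HyperLoom API: the sketch uses a thread pool only to show how independent runs are queued and picked up by whichever worker becomes free, and run_solver, the parameter names p1 and p2, and their values are hypothetical placeholders.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from itertools import product

def run_solver(waste_type, params):
    """Hypothetical stand-in for one run of the heuristic algorithm."""
    # A real run would return the tour-length standard deviation and the
    # total travelled distance obtained with this configuration.
    return {"waste": waste_type, "params": params}

def sweep(waste_types, grid, workers=4):
    """Run every (waste type, configuration) pair and collect the results."""
    names = list(grid)
    configs = [dict(zip(names, values)) for values in product(*grid.values())]
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # All tasks are queued at once; a worker takes the next task as soon
        # as it finishes the previous one, which balances uneven run times.
        futures = [pool.submit(run_solver, w, c)
                   for w in waste_types for c in configs]
        for future in as_completed(futures):
            results.append(future.result())
    return results

waste_types = ["mixed", "paper", "plastic", "glass"]
grid = {"p1": [1, 2, 3], "p2": [0.1, 0.2]}   # invented names and values
results = sweep(waste_types, grid)            # 4 waste types x 6 configurations
```

With 7,200 configurations per waste type instead of 6, the same enumeration yields the 4 x 7,200 = 28,800 runs reported above. Threads are used here only for brevity; CPU-bound runs need separate processes or, as in this work, separate cluster nodes.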

Test case results
The goal of our work is to optimize the fairness of the routes, measured by the standard deviation of tour length, and to minimize the total travelled distance. This means that we are looking for a combination of input parameters for which the resulting standard deviation of tour length is minimal. If there are several such combinations, we choose the one with the minimum total travelled distance. Figure 4 contains the results of the calculations. For each type of waste, the desired combinations of standard deviation and total travelled distance lie as close as possible to the bottom left corner. As can be seen from Figure 4, the results for different types of waste show different behaviour. For mixed waste, unlike paper and plastic, the total travelled distance increases when the standard deviation decreases. This behaviour is also reflected in the correlation coefficient ρ = -0.79. Plastic and paper waste show the same type of dependency: when the standard deviation increases, the total travelled distance also increases, with correlation coefficients ρ = 0.43 and ρ = 0.59, respectively. For the glass waste, there is no linear dependence between the standard deviation and the total travelled distance, with a correlation coefficient ρ ≈ 0.
The best configuration of input parameters is listed in Table 2. Only those configuration parameters whose values affected the results are listed. Because we deal with a real dataset, the values in Table 2 can help waste collection companies reduce their overall waste collection costs.
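The selection rule and the correlation coefficient described above can be sketched as follows. The result records and their values are invented for illustration and are not taken from Figure 4; the field names std_dev and distance are assumptions of this sketch.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)

def select_best(results):
    """Minimize the standard deviation first, then the travelled distance."""
    # Tuples compare lexicographically, so distance only breaks ties.
    return min(results, key=lambda r: (r["std_dev"], r["distance"]))

# Invented results of three configurations for one waste type.
runs = [
    {"id": "a", "std_dev": 2.0, "distance": 120.0},
    {"id": "b", "std_dev": 1.0, "distance": 150.0},
    {"id": "c", "std_dev": 1.0, "distance": 140.0},
]
best_run = select_best(runs)   # "b" and "c" tie on std_dev; "c" is shorter
rho = pearson([r["std_dev"] for r in runs], [r["distance"] for r in runs])
```

With measured floating-point values, exact ties are unlikely, so a practical implementation might treat standard deviations within a small tolerance as equal.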

Conclusion
The goal of our work was to optimize the fairness of the routes, measured by the standard deviation of tour length, and to minimize the total travelled distance. We solved this problem by taking the standard deviation as the priority parameter to be minimized. In practice, companies can put more emphasis on minimizing the total travelled distance, or on combining both parameters with different weights. In this case, the problem could be solved by the Euclidean distance approach.
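One possible reading of the Euclidean distance approach is sketched below: both criteria are scaled to the interval [0, 1], and the configuration closest to the ideal point (0, 0) under a weighted Euclidean distance is selected. The scaling, the weights w_std and w_dist, and the sample values are assumptions of this sketch and are not specified in this paper.

```python
import math

def closest_to_ideal(results, w_std=1.0, w_dist=1.0):
    """Pick the result nearest to the ideal point in normalized criteria space."""
    def scale(values):
        lo, hi = min(values), max(values)
        # Min-max scaling; a constant criterion contributes nothing.
        return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]

    std = scale([r["std_dev"] for r in results])
    dist = scale([r["distance"] for r in results])
    scores = [math.sqrt(w_std * s * s + w_dist * d * d)
              for s, d in zip(std, dist)]
    best_index = min(range(len(results)), key=scores.__getitem__)
    return results[best_index]

# Invented values for three configurations.
candidates = [
    {"id": "a", "std_dev": 1.0, "distance": 200.0},
    {"id": "b", "std_dev": 5.0, "distance": 100.0},
    {"id": "c", "std_dev": 2.0, "distance": 120.0},
]
balanced = closest_to_ideal(candidates)                  # equal weights
fairness_only = closest_to_ideal(candidates, w_dist=0.0) # ignore distance
```

With equal weights the compromise configuration "c" is chosen, while setting w_dist to zero reproduces a pure standard deviation criterion and selects "a".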