Improvement of the Optimal Design Procedure Using Randomized Algorithm and Process Simulators

Though a global optimization procedure using a randomized algorithm and a commercial process simulator is relatively easy to implement for complex design problems (e.g., intensified processes), a dominant problem is its heavy computational load. Because the process simulation is executed repeatedly to evaluate the objective function, long computation times are inevitable before the optimal solution is found. Moreover, randomized algorithms treat all variables as continuous; thus, reducing the number of iterations is crucial for optimization procedures that include integer variables. In this work, a procedure for estimating the objective function in the presence of integer design variables is proposed. In the proposed procedure, the values of the objective function at the nodes of the hyper-triangle that encloses the suggested next search point are used to estimate the objective function; at the same time, normalization of the design variables is recommended. The procedure was implemented on the simulated annealing stochastic algorithm with a trivial case of a binary mixture, for which the optimal solution is known, in order to compare the traditional optimization procedure with the proposed one. The proposed procedure shows improvement not only in reducing the number of iterations, but also in the accuracy of finding the optimal solution.


Introduction
With the new tendencies toward intensified processes, many designs have been proposed. These new intensified process designs pose a big challenge in seeking the optimal design, and many strategies have been developed and applied to the optimization problem. Some works apply gradient-based approaches (i.e., deterministic) and others randomization approaches (i.e., stochastic) (Cavazzuti, 2013); their advantages and disadvantages have been widely discussed. Both approaches have tackled global optimization problems for intensified processes (Floudas and Gounaris, 2009; Ricardez-Sandoval et al., 2009; Yuan et al., 2012; Segovia-Hernández et al., 2015). In particular, randomization approaches are gaining popularity due to their relatively easy implementation in combination with software frameworks (Cabrera-Ruiz et al., 2012; Vazquez-Castillo et al., 2009; Gitizadeh et al., 2013; Santaella et al., 2014). As the advantages of each approach offset the disadvantages of the other, hybrid optimization has been suggested as a way to complement them (Grossmann and Westerberg, 2000). Many works have proposed hybrid methods, whether among gradient-based methods (Floudas and Gounaris, 2009; Goryachih and Rachinskaya, 2017), among randomized methods (Sayyaadi et al., 2010; Gitizadeh et al., 2013), or between the two approaches (Renders and Flasse, 1996; Popović et al., 2014). Munawar and Gudy (2005) have pointed out that the "advantage" of taking all values as continuous can be a great "disadvantage" in terms of efficiency. Despite the large number of works on randomized optimization, only a few have focused on the main issue of increasing the efficiency of design process problems. Nevertheless, it is necessary to improve the efficiency of these algorithms, because their use in solving optimization problems has increased in the search for new intensified processes.
This work focuses on improving the randomized algorithms used in the design of chemical engineering processes, which became highly popular thanks to new software frameworks that allow optimizing a process through a commercial process simulator (e.g., Aspen Plus). The way these randomized algorithms are handled is based on their original development through the solution of well-known mathematical problems; however, the optimization criteria of these algorithms may not be the best for design process problems, because of the several restrictions imposed by the process, most of them physical rather than mathematical.

Methodology
This section explains the main issues that considerably affect the efficiency of randomized algorithms used for process optimization, namely the excessive and unnecessary iterations they cause. Three main factors are tackled: purity restrictions, the differences in range between variables, and discrete variables handled as continuous.

Definition of the problem
Formally, for randomized algorithms it is necessary to establish the ranges of the variables as large as possible to allow the algorithm to move freely through the whole value region. This can be true for most mathematical problems, but, as is known, mathematical models of chemical processes show high stiffness caused by the complexity of the model itself, the combination of discrete and continuous values, and their restrictions. Figure 1 shows two discrete variables N1 and N2, where only the green dots correspond to real feasible points and the rest of the shaded area is unfeasible. In this work, the following notation is used: Ni,j represents element j within the bounds of variable i, where Ni,j < Ni,j+1. Nevertheless, typical randomized algorithms take these discrete values as continuous, as shown in Figure 2, where i is the discrete variable and j the feasible element of the variable; therefore, any real value can be selected (N1,R, N2,R). These values are then forced to take integer values by rounding them up (ceil), rounding them down (floor), or regular rounding. In any case, a plane surface solution is generated over all the values circumscribing the feasible points. Among the four feasible combinations, the point (N1,1, N2,2) has the lowest value of the objective function z, yet it will never be taken, because the rounding-up and rounding-down steps evaluate the objective function at (N1,2, N2,2) or (N1,1, N2,1), respectively. Only proposals in the area corresponding to the green surface will take the point (N1,1, N2,2). Searching for the best point can therefore consume a lot of computing time moving through this area; when the optimization algorithm finally moves to other points, and thus another area, the calculations spent on the previous area, with all its unfeasible points, amount to lost time.
The situation is the same for any of the three rounding options; although regular rounding may be considered the better choice, it is not well suited to moving around the bounds.
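The effect of the three rounding options can be sketched in a few lines of Python; the grid values and the proposal point below are illustrative assumptions, not values from the paper:

```python
import math

def snap(x, mode):
    """Force a continuous proposal for a discrete variable onto the integer grid."""
    if mode == "floor":
        return math.floor(x)
    if mode == "ceil":
        return math.ceil(x)
    return round(x)  # regular rounding

# A proposal between grid points: each rounding rule commits the whole
# point to one corner of the cell, so a mixed corner such as
# (floor(N1), ceil(N2)) is reached only from a narrow region of proposals.
proposal = (12.7, 31.2)
for mode in ("floor", "ceil", "round"):
    print(mode, (snap(proposal[0], mode), snap(proposal[1], mode)))
```

With this proposal, floor yields (12, 31), ceil yields (13, 32), and regular rounding yields (13, 31); the corner (12, 32) is never produced, which is the missed-corner effect described above.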
Also, owing to this freedom of the variables, all design variables are often chosen as optimization variables. Nevertheless, we must consider that some of these variables have a direct influence on the restrictions; furthermore, the behavior of these variables is mostly convex (e.g., reflux ratio versus distillate component purity), so using them as optimization variables only puts extra effort on the algorithm.

Figure 2. Typical rounding treatment of randomization approach for discrete variables
There are two ways to handle process constraints: 1) explicitly define them in the optimization algorithm and take the associated variables as free variables, or 2) implicitly define them in built-in "design specifications" and take them as dependent (fixed) variables. Both cases are discussed in this work. Additionally, in randomized algorithms, a restriction is handled with an arbitrary value that punishes the objective function when the restrictions (e.g., purities) are not satisfied. Since the punishment value is arbitrary, an inadequate selection can make it impossible for the optimization algorithm to find the optimal solution.
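The arbitrary-punishment scheme described above can be sketched as follows; the penalty magnitude of 1e6 and the linear violation measure are illustrative assumptions, not values from the paper:

```python
def penalized_objective(z, purities, required=0.99, penalty=1e6):
    """Return the objective z, punished when purity restrictions are violated.

    `penalty` is the arbitrary punishment value discussed in the text: too
    small and infeasible designs still look attractive, too large and the
    search landscape is flattened into a plateau of punished points.
    """
    violation = sum(max(0.0, required - p) for p in purities)
    return z if violation == 0.0 else z + penalty * violation
```

For example, a feasible design keeps its value (`penalized_objective(1000.0, [0.995, 0.995])` returns 1000.0), while a 2%-short purity inflates the objective by roughly 20000.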

Normalization
The bounds of the optimization variables differ from each other, and for some variables the system of units used can change the range between tens and thousands. In most works on global stochastic optimization, it is not clear whether the variables are handled at their real values or not; nevertheless, typical randomized algorithms use real values for the bounds of the variables. These differences in the magnitude of the variables also add stiffness to the mathematical model. This work proposes a normalization of the variables. Normalization allows the algorithm to move freely and fairly across the value ranges of the variables; it can also reduce the numerical effort associated with the precision of the continuous variables.
A normalization between 0 and 1 can be obtained as described in Equation 1, where UL and LL are the upper and lower limits of the variable x, respectively:

x_norm = (x - LL) / (UL - LL) (1)
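Equation 1 and its inverse (needed to map normalized search values back to real simulator inputs) can be implemented as a minimal sketch:

```python
def normalize(x, LL, UL):
    """Equation 1: map a variable x with limits [LL, UL] onto [0, 1]."""
    return (x - LL) / (UL - LL)

def denormalize(u, LL, UL):
    """Inverse of Equation 1: recover the real value before calling the simulator."""
    return LL + u * (UL - LL)
```

For instance, a reflux ratio of 5.5 with bounds [1, 10] normalizes to 0.5, and `denormalize(0.25, 10.0, 60.0)` recovers 22.5; the optimizer then always proposes moves on the uniform [0, 1] scale regardless of each variable's units.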

Discrete variables treatment
Taking two discrete variables, if we have a surface as shown in Figure 3, the discrete point can move adequately along the surface toward the feasible point with the minimum value. However, the value at the real point cannot be calculated as an average of the four vertexes, because these vertexes can generate a non-planar surface instead. Figure 3 shows the surfaces that can be generated (Cases A and B); the "real mathematical surface" can be somewhat more complex, as shown by the blue surface in the figure, but it is not of interest because it contains infeasible solutions. Therefore, we only need to generate planar surfaces that emphasize the slope toward the minimum value. To guarantee that a planar surface is generated, the dimensional space is reduced to three vertexes. The combination of two discrete variables implies four hyper-triangles, and the three vertexes are selected as the three nearest vertex points (Figure 5). The algorithm then requires the evaluation of three function values instead of four, and the estimate is a distance-weighted combination of these values (Equation 2), where j indexes the distances between the real node and the feasible nodes. Figure 7 shows these points graphically for the discrete variables.
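A possible reading of this estimation step is sketched below. Since the exact form of Equation 2 is not reproduced in this text, the inverse-distance weighting used here is an assumption, as are the helper names `nearest_triangle` and `estimate`:

```python
import itertools
import math

def nearest_triangle(x, y):
    """Three nearest integer vertices around a continuous proposal (x, y):
    the four corners of the enclosing cell, minus the farthest one.
    Assumes (x, y) is a non-integer proposal strictly inside a cell."""
    corners = [(fx(x), fy(y))
               for fx, fy in itertools.product((math.floor, math.ceil), repeat=2)]
    corners.sort(key=lambda c: math.dist(c, (x, y)))
    return corners[:3]

def estimate(x, y, z):
    """Estimate the objective at (x, y) from the three nearest feasible
    vertices, weighting each simulated value z[vertex] by the inverse of
    its distance d_j to the real point (an assumed reading of Equation 2)."""
    tri = nearest_triangle(x, y)
    weights = [1.0 / max(math.dist(v, (x, y)), 1e-12) for v in tri]
    return sum(w * z[v] for w, v in zip(weights, tri)) / sum(weights)
```

Only three simulator evaluations are needed per proposal instead of four, and because three points always define a plane, the convex-surface ambiguity of the four-vertex average disappears.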

Study case
For evaluating the proposed improvements to optimization algorithms, the case study is a conventional distillation column that separates a binary mixture. This case is chosen because the trivial optimal solution of the separation is known: the maximum number of stages with the minimum reflux ratio gives the minimum heat duty consumption. The solution that must be obtained is therefore known in advance, which allows the traditional algorithms to be compared with the proposed one. The optimizations are defined by Equations 6 and 7, where NT is the total number of stages, NF is the feed stage, P is the pressure of the column, and R the reflux ratio (see Figure 8), and the purity of each component must be at least 99% mol:

min Q = f(NT, NF, P, R) (6)
min TAC = f(NT, NF, P, R) (7)
subject to: x⃗ ≥ 0.99 (8)

As mentioned above, the solution for this case is already known because the total number of stages will be at its upper bound. However, the algorithm can be extended to the optimization of intensified processes (e.g., reactive distillation). The binary case is an equimolar benzene/toluene mixture, simulated in Aspen Plus®, and the optimization algorithm used is the Simulated Annealing of the Matlab® toolbox. The framework is implemented through an Excel® macro. The Simulated Annealing parameters were kept at their defaults, except for a reduced stall-generation stopping criterion of 400.
Notice that the reflux ratio disappears as a variable in the cases where the simulation uses a design specification. For the estimation of the total annual cost (TAC), the Guthrie methodology as reported by Turton et al. (2018) is used, with a CEPCI of 607.2 (July 2018). The TAC is calculated considering a payback period of 5 years for the equipment investment and 8600 operating hours per year. The optimizations covered the combinations ranging from the traditional framework implementation (without normalization, without design specifications, and with upper rounding) to the combined use of normalization, design specifications, and the discrete-value approach.
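The combined procedure, a normalized search space, snapping of the discrete variables, and a simulator-evaluated objective, can be sketched with a toy annealing loop. Everything here is a simplified stand-in: `column_heat_duty` is a made-up surrogate for the Aspen Plus model (duty falls with more stages and rises with reflux, mimicking the known trivial optimum), and the bounds, step size, and cooling schedule are arbitrary assumptions, not the Matlab® toolbox settings used in the paper:

```python
import math
import random

def column_heat_duty(NT, NF, R):
    """Made-up surrogate for the simulator: cheaper with more stages,
    costlier with higher reflux, best fed near the middle of the column."""
    return 100.0 / NT + 50.0 * R + abs(NF - NT / 2)

def anneal(seed=0, iters=2000):
    random.seed(seed)
    LL, UL = (10, 2, 1.0), (60, 59, 10.0)        # bounds for (NT, NF, R)

    def real(u):
        # Denormalize from [0, 1] and snap the discrete variables NT, NF.
        x = [LL[i] + u[i] * (UL[i] - LL[i]) for i in range(3)]
        return round(x[0]), round(x[1]), x[2]

    u = [random.random() for _ in range(3)]       # normalized start point
    fu = column_heat_duty(*real(u))
    best, fbest, T = u[:], fu, 1.0
    for _ in range(iters):
        # Gaussian move in normalized space, clipped to [0, 1].
        cand = [min(1.0, max(0.0, ui + random.gauss(0, 0.1))) for ui in u]
        fc = column_heat_duty(*real(cand))
        # Metropolis acceptance: always downhill, sometimes uphill.
        if fc < fu or random.random() < math.exp(-(fc - fu) / T):
            u, fu = cand, fc
            if fu < fbest:
                best, fbest = u[:], fu
        T *= 0.995                                # geometric cooling
    return real(best), fbest
```

Because all three variables are perturbed on the same [0, 1] scale, the step size is fair to each of them despite their very different real ranges, which is exactly the point of the normalization recommendation.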

Results and Observations
Tables 1 to 4 show most of the combinations of design specifications, normalization of values, and the discrete-value approach. It can be noticed how the performance of the optimization improves as these options are combined, and there is also an improvement in the efficacy of finding the optimal solution. For these cases, the design specification of the process appears to have the biggest influence on finding the optimal solution; nevertheless, the situation changes for the TAC cases. In any case, it is evident that using the design specification not only removes one design variable; the function value is also not restrained by arbitrary punishment values, which prevents the optimization from wasting effort finding the reflux ratio that satisfies the restriction instead of searching over the other variables. Furthermore, it cannot be denied that the combination of the three elements improves the efficiency and efficacy of finding the optimal solution. Tables 5 to 8 show the same combinations with the TAC as objective function. For these cases, it is clearer that using a design specification plays an important role in finding the optimal solution; nevertheless, it generates a bigger numerical effort. The implementation of normalization helps to reduce this numerical effort and, at the same time, slightly improves the efficacy of the optimization. The design specification can be considered the most important part of the optimization, since all the options with a design specification reach the optimal solution. This is not a trivial coincidence, because the algorithm is relieved of the arbitrary punishment. A further contribution of normalization is that it avoids many Aspen simulation runs, because the introduced values can be restricted dynamically; this avoids proposing unfeasible designs that would be rejected by the simulator and would eventually mark an area as inappropriate to optimize.
Finally, the discrete-value approach enormously reduces the numerical effort by generating a smoother surface that reduces the stiffness of the system.

Conclusions
The proposed improvements for the optimization of process designs have shown an evident gain in the efficiency and efficacy of the optimization, that is, a reduction of the number of iterations (up to 80% fewer) and the localization of the optimal design closer to the feasible discrete values. To guarantee the effectiveness of this approach, the three recommendations, (1) normalization, (2) design specifications, and (3) the discrete-value-point treatment, must be used at the same time for the sake of better performance.