Development of the Multifactor Computational Models of the Solid Propellants Combustion by Means of Data Science Methods. Propellant Combustion Genome Conception.

The results of applying data science methods, in particular artificial neural networks, to the creation of new multifactor computational models of solid propellant (SP) combustion that solve the direct and inverse tasks are presented. The Loginom analytical platform was used to create the models. The models of combustion of double-base SP with nano additives such as metals, metal oxides, and thermites were created from experimental data published in the scientific literature. The goal functions of the models were the burning rate (direct tasks) and the propellant composition (inverse tasks). The basis (script) for creating a Data Warehouse of SP combustion was developed. The Data Warehouse can be supplemented with new experimental data and metadata in an automated mode and can serve as a basis for creating generalized combustion models of SP, and thus as the starting point of a new direction in combustion science, which the authors propose to call "Propellant Combustion Genome" (by analogy with the well-known Materials Genome Initiative, USA). "Propellant Combustion Genome" opens wide possibilities for accelerating the development of advanced propellants.


Introduction
There is a great deal of experimental data on the combustion characteristics of various solid propellant compositions. Combustion characteristics are closely associated with the composition of the combustible mixture and the kind of its components, with the kind of catalytic and other additives and the additive particle size, and with the pressure, the initial temperature, etc.
However, at present there is no generalized multifactor model for predicting the effect of these factors. There is no multifactor model that allows the prediction of the burning rate of a new solid propellant mixture for different ranges of pressure and initial temperature (direct problem). There is no multifactor model that allows the determination of a new solid propellant that can provide a desirable burning rate at the required pressure and initial temperature (inverse problem). Therefore, in general, a large number of expensive and hazardous experiments must be conducted to obtain the burning rate information or to determine the optimal solid propellant composition.
The burning rate of solid propellants can be altered and tailored by additives. The application of nano-size additives in energetic material compositions is a relatively young but very promising field of research, and the expected results could lead to new achievements in the development of advanced explosives and propellants. There have been many studies in this direction worldwide [1]. The common advantages of nano-size additives are their high reactivity and their ability to promote very fast chemical transformations.
The combustion of propellant systems containing nano-size additives is extremely complex because of the heterogeneity of the solid propellant grain. The hazards and cost involved in developing these systems through elaborate experiments could be reduced by developing proper combustion models. In this work, we present results of using artificial neural networks (ANN) for the creation of generalized

Experimental, modelling, results and discussion
The results of experimental research on the combustion of double-base propellants with nano-size additives were taken from [1]. These results concern the effects on the combustion of double-base SP of nano additives such as metals, metal oxides, metal salts, metal composite materials, organometallic compounds, thermites, and carbon nanomaterials. They were used to create multifactor computational combustion models that solve the direct and inverse tasks. All models were obtained by means of ANN. An ANN can be considered a universal tool for multidimensional approximation [2,3]. The Kolmogorov-Arnold theorem, which concerns the representation of a function of several variables as a superposition of functions of a smaller number of variables, is the basis of ANN applications.
In its final form, the theorem states that a continuous function of several variables can be represented as a superposition of functions of only one variable. This provides the theoretical grounds for using ANN to approximate functions of several variables. In practice, a computer emulator of an ANN is like an ordinary computer program. The difference is that it is created through a training procedure based on a set of examples (a database of examples).
The basics of ANN and the techniques of using ANN for experimental data modeling are described in [4-11].
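For reference, the single-variable form of the Kolmogorov-Arnold theorem mentioned above is commonly written as follows. The symbols (f, n, the outer functions Φ_q, and the inner functions φ_{q,p}) are standard textbook notation and are not taken from the source.

```latex
% Any continuous f on the unit cube [0,1]^n can be written with
% continuous one-variable functions \Phi_q and \varphi_{q,p}:
f(x_1, \dots, x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \varphi_{q,p}(x_p) \right)
```

The structure resembles a network with one inner summation layer and one outer layer, which is why the theorem is often cited as a motivation for ANN approximation. A trained ANN does not construct these particular functions; it fits its own weights from examples.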
Here we should briefly explain how an ANN layout is selected. At present, there are only some general recommendations.
1. It is believed that a problem of any complexity can be solved using a two-hidden-layer neural network, therefore setting a configuration with more than 2 hidden layers can hardly be justified.
2. A single-hidden-layer neural network is quite good for solving many problems.
3. The following rule should be adhered to when selecting the number of neurons: "The number of connections between the neurons must be considerably less than the number of samples in the training set". The number of connections is calculated as the number of each neuron's connections with all the neurons in the adjacent layers, including the connections in the input and the output layers.
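Rule 3 above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes fully connected layers and counts one connection per pair of neurons in adjacent layers, without bias terms. The layer sizes and the row count are those reported later in this paper for the direct and inverse tasks. The function names are hypothetical.

```python
def count_connections(layer_sizes):
    """Count connections between adjacent, fully connected layers.

    Assumption: every neuron is linked to every neuron in the next layer;
    bias terms are not counted.
    """
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))


def samples_per_connection(layer_sizes, n_samples):
    """Ratio of training samples to connections (larger is safer)."""
    return n_samples / count_connections(layer_sizes)


direct_layers = [7, 7, 1]      # input, hidden, output (direct task)
inverse_layers = [6, 4, 3, 2]  # input, two hidden layers, output (inverse task)
n_rows = 1049                  # rows in the table used for modeling

print(count_connections(direct_layers))   # 7*7 + 7*1 = 56
print(count_connections(inverse_layers))  # 6*4 + 4*3 + 3*2 = 42
print(round(samples_per_connection(direct_layers, n_rows), 1))
```

In both cases the number of connections is far smaller than the number of rows, which is what the rule asks for.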
A database for ANN training (a known discrete set of input information and a known discrete set of output information corresponding to each other) can be formed by various techniques. For example, it can be formed from real experimental data or from data obtained in numerical experiments. ANN technologies do not negate, but rather include and extend statistical methods, methods of experimental design, and methods of solution regularization. A main advantage of ANN modeling technologies is that they can be used to solve problems that have no obvious algorithmic solution. With ANN, regularities of system behavior may be revealed on the basis of "inexact or incomplete data" about the characteristics of the system. This is especially important when the system is characterized by many parameters and the determination of some of them is technically unfeasible.
Our models were created as follows.
First, all graphs in [1] were digitized and supplemented with metadata on the kinds of additives and their physicochemical properties.
Second, all the results obtained were organized as a Data Warehouse.
The Data Warehouse makes it possible to obtain any set of experimental data and metadata as a table for solving any kind of modeling task. An example of a small part of a table obtained from the Data Warehouse is presented in Figure 1 in the Supplementary Materials (SM). These data were used to solve the direct and inverse tasks dealing with the effects of metals and thermites on the combustion of double-base SP.
The table in Figure 1 in SM consists of 5 columns of experimental data and 1049 rows (the Data Warehouse as a whole consists of 20 variables: directly measured experimental data and metadata). The column headings are the names of the factors and the name of the goal function, the burning rate. Every row is a set of concrete values of the factors and the corresponding value of the burning rate. The ANN structure for solving the direct task, that is, the creation of a multifactor computational model revealing the dependence of the burning rate of SP on the kind of SP, the pressure, and the kind of additive, is presented in Figure 2 in SM. It was determined by means of the rules for ANN usage [5]. The ANN consists of one input layer (7 neurons), one hidden (inner) layer (7 neurons), and one output layer (1 neuron).
* Corresponding author: abrukov@yandex.ru
Once trained on the experimental data, this computational structure makes it possible to calculate instantly the burning rate for any set of factor values. We used the well-known training procedure called error backpropagation [12].
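To make the training procedure concrete, the following is a minimal sketch of error backpropagation for a 7-7-1 network written with NumPy. It is not the Loginom implementation used in this work. The tanh activation, the learning rate, the number of epochs, and the synthetic data set are all assumptions chosen only for illustration; the real inputs would be the encoded factors from the Data Warehouse table.

```python
import numpy as np


def train_mlp(X, y, hidden=7, lr=0.05, epochs=1000, seed=0):
    """Full-batch gradient descent with backpropagation, one hidden layer."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    losses = []
    for _ in range(epochs):
        # Forward pass: hidden activations and network output.
        h = np.tanh(X @ W1 + b1)
        pred = h @ W2 + b2
        err = pred - y
        losses.append(float(np.mean(err ** 2)))
        # Backward pass: propagate the error from the output to the input.
        g_pred = 2.0 * err / len(X)             # d(MSE)/d(pred)
        g_W2 = h.T @ g_pred
        g_b2 = g_pred.sum(axis=0)
        g_z = (g_pred @ W2.T) * (1.0 - h ** 2)  # through tanh
        g_W1 = X.T @ g_z
        g_b1 = g_z.sum(axis=0)
        # Gradient descent step.
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2
    return (W1, b1, W2, b2), losses


def predict(params, X):
    """Evaluate the trained network on new factor values."""
    W1, b1, W2, b2 = params
    return np.tanh(X @ W1 + b1) @ W2 + b2


# Synthetic stand-in data: 7 inputs scaled to [0, 1] and a made-up target.
data_rng = np.random.default_rng(1)
X = data_rng.uniform(0.0, 1.0, size=(200, 7))
y = (np.sqrt(X[:, 0]) * (1.0 + X[:, 1])).reshape(-1, 1)

params, losses = train_mlp(X, y)
print("first and last training MSE:", losses[0], losses[-1])
```

Once the weights are fitted, `predict` is a single pass of matrix products, which is why a trained model returns a burning rate estimate almost instantly.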
The obtained ANN model was verified automatically as follows. 95% of the experimental data were used to train the model, and the remaining 5% were withheld from training. The model uses these 5% to check its accuracy automatically. Thus, during training the ANN tests itself against real experimental data and estimates its own quality.
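The 95%/5% verification scheme can be sketched as below. This is an illustration under assumptions, not the platform's own procedure: the source does not say how the 5% were selected, so a random shuffle with a fixed seed is assumed here, and the function names are hypothetical.

```python
import numpy as np


def holdout_split(n_rows, test_fraction=0.05, seed=0):
    """Return (train_idx, test_idx) after a random shuffle of row indices."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_rows)
    n_test = int(round(n_rows * test_fraction))
    return idx[n_test:], idx[:n_test]


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# 1049 rows, as in the table used for the direct task.
train_idx, test_idx = holdout_split(1049)
print(len(train_idx), len(test_idx))  # 997 training rows, 52 held-out rows
```

The model is fitted on the rows in `train_idx` only, and `rmse` is then evaluated separately on the training rows and on the held-out rows to compare the two errors.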
The set of factor values (pressure, kind of propellant, kind of additive, and additive size), the calculated value of the burning rate, and the graph of the dependence of the burning rate on pressure are shown in Figures 3 and 4 in SM. The results show the influence of the additive size (micro and nano) on the burning rate and on its dependence on pressure for the Al/PbO additive. The ability to display the graph together with the input and output data is a very useful feature of the Loginom analytical platform.
The set of factor values (pressure, kind of propellant, kind of additive, and additive size), the value of the burning rate, and the graph of burning rate values for various additives at P = 3 MPa are shown in Figure 5 in SM. The estimate of the root-mean-square error of the ANN model shows that, for 87% of the data, the error does not exceed 1.3 × 10⁻⁴ (for the training set) and 1. In the last case, the result of modeling is an extrapolation of the revealed dependences far beyond the experiments conducted.
The possibility of extrapolating the results of a multifactor experiment is a valuable advantage of ANN techniques.
The possibilities of ANN techniques for solving the inverse task are illustrated in Figures 10-12 in SM. The ANN structure for solving the inverse task is shown in Figure 13 in SM. In this case, the input data are the burning rate, the pressure, and the kind of additive; the output data are the NG/NC quantity (two possible values: 0.89 for the DB propellant and 0.66 for the MDB propellant with RDX (hexogen)) and the additive size (two possible values: micron size and nano size). The ANN consists of one input layer (6 neurons), two hidden (inner) layers (4 and 3 neurons), and one output layer (2 neurons).
Once trained on the experimental data, this computational structure makes it possible to calculate the NG/NC quantity and the additive size (i.e., the composition of the SP) that provide the required burning rate at the desired pressure.
The result of modeling shown in Figure 13 is an extrapolation of the revealed dependences beyond the experiments conducted. The possibility of extrapolating the results of a multifactor experiment for the inverse task, as for the direct task, is a unique advantage of ANN techniques.

Conclusions
The results of using artificial neural networks (ANN) to create new multifactor computational models of solid propellant (SP) combustion that solve the direct and inverse tasks are presented. Analysis of the results shows that ANN offer wide possibilities for solving the direct and inverse tasks of SP combustion research. The use of such modern data science methods as deep learning neural networks, multivariate adaptive regression splines, decision trees, and self-organizing Kohonen maps could widen the possibilities of multifactor modeling of SP combustion still further. It could allow work to begin in a new direction of combustion science, which the authors propose to call "Propellant Combustion Genome" (PCG).
By the PCG we propose to understand the totality of multifactor computational models that contain the links between all the variables of the combustion process and make it possible to determine the propellant composition that provides the required burning rate at a given pressure.
In the general classification of the problems of combustion physics, this is an inverse problem.
The basis of the PCG is data on the burning rate under various conditions. The PCG will include multifactor computational models that make it possible to determine the burning rate from data on propellant composition and pressure. In the general classification of the problems of combustion physics, these are direct problems. The models for solving direct problems also make it possible to solve inverse problems by enumerating the values (sets of values) of the input factors (propellant composition and pressure).
This can open wide possibilities for accelerating the development of advanced propellants.
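The enumeration approach to the inverse problem described above can be sketched as follows. The `direct_model` function is a hypothetical stand-in for a trained direct model: it uses the classical burning rate law r = a·Pⁿ with made-up coefficients, not values from this work. Only the candidate NG/NC values (0.89 and 0.66) and the two additive sizes come from the text.

```python
from itertools import product


def direct_model(ng_nc, additive_size, pressure_mpa):
    """Hypothetical direct model: burning rate from composition and pressure.

    Assumption: r = a * P**n with invented coefficients; a trained ANN
    would take the place of this function.
    """
    a = 2.0 + 3.0 * ng_nc
    if additive_size == "nano":
        a *= 1.3
    return a * pressure_mpa ** 0.6


def solve_inverse(target_rate, pressure_mpa, ng_nc_values, size_values, tol):
    """Enumerate compositions and keep those within tol of the target rate."""
    hits = []
    for ng_nc, size in product(ng_nc_values, size_values):
        rate = direct_model(ng_nc, size, pressure_mpa)
        miss = abs(rate - target_rate)
        if miss <= tol:
            hits.append((miss, ng_nc, size, rate))
    return sorted(hits)  # best match first


candidates = solve_inverse(
    target_rate=10.0,            # desired burning rate, arbitrary units
    pressure_mpa=3.0,
    ng_nc_values=[0.89, 0.66],
    size_values=["micro", "nano"],
    tol=2.0,
)
for miss, ng_nc, size, rate in candidates:
    print(f"NG/NC={ng_nc}, size={size}, rate={rate:.2f}, miss={miss:.2f}")
```

Enumeration needs only forward evaluations of the direct model, so it may return several compositions for one target; the cost grows with the number of candidate values per factor.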