Applying Machine Learning Algorithms to Solve Inverse Problems in Electrical Tomography

The article presents four selected supervised machine learning methods that can be successfully used in the tomography of flood embankments, walls, tanks, reactors and pipes. The following methods were compared: Artificial Neural Networks (ANN), Support Vector Machine (SVM), K-Nearest Neighbour (KNN) and Multivariate Adaptive Regression Splines (MAR Splines). All analysed methods concern regression problems. The performed analysis quantified the differences between the methods using indicators such as the correlation coefficient and the mean squared error. Moreover, an innovative method of denoising tomographic output images with the use of convolutional auto-encoders is presented. Thanks to a convolutional structure composed of two auto-encoders, a significant improvement in the quality of the output image from ECT tomography was achieved.


Introduction
The issue of tomographic imaging is an important element of modern life, not only because of the role of tomography in medicine, but also because of the potential of this technology, which has not yet been fully exploited [1][2][3][10][11][12][13][14][15][16][17].
The use of tomography for imaging the internal cross-sections of large embankment structures, flood dams or building masonry is still relatively limited. The main reason is the lack of methods ensuring sufficiently high output image resolution. The barrier is that the amount of input data is smaller than the image resolution required for an efficient, detailed display of the interior of a large object. Hence the need to develop methods that overcome this barrier to the development of large-scale object tomography [6,8,12].
This article presents a number of methods that can solve the above problems in relation to both EIT (electrical impedance tomography) and ECT (electrical capacitance tomography). In addition, a way to improve imaging quality through the use of convolutional auto-encoders, whose task is to denoise the output images, is presented.

Machine Learning in tomographic image reconstruction
This chapter presents four machine learning methods that can be used in tomographic image reconstruction processes: Neural Networks with Multilayer Perceptron, Support Vector Machine, K-Nearest Neighbour and Multivariate Adaptive Regression Splines (MAR Splines) [4,5,7,9].
In order to analyse their effectiveness, the same data set was used for all of them. Thanks to this, it was possible to compare them and evaluate their effectiveness in reconstructing images. The discussed concept of tomographic processing of electrical signals into images using the EIT method is applicable to different kinds of buildings. The ECT method is used in the reconstruction of objects such as tanks, reactors and pipelines.
In the discussed cases, an intelligent EIT-type tomographic system was designed. The system includes 16 electrodes connected to a power source and a multiplexer. Thanks to this, during one measurement an input vector is generated, containing 208 values of voltage drops between different electrode pairs placed in series across the flood embankment. The cases discussed in the article concern supervised machine learning, which means that the input variables are assigned corresponding output variables. Ultimately, the resolution of the output image depends on the density of the grid of points that make up the model of the reconstructed object. Fig. 1 presents a diagram of an intelligent system for the reconstruction of tomographic images, the output of which is a cross-section of a flood embankment. The cross-section grid consists of 2012 points. The AI system consists of 2012 separate computing units, each of which generates the colour of a single point of the output image.
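The stated 208-element measurement vector is consistent with the common adjacent ("neighbouring") EIT measurement protocol, in which each of the 16 adjacent injection pairs allows voltage readings across the 13 adjacent pairs that do not share an electrode with it. The article does not name the protocol explicitly, so this is an assumption; a minimal sketch:

```python
def adjacent_protocol_measurements(n_electrodes: int) -> int:
    """Number of voltage readings in the adjacent EIT protocol:
    one injection per adjacent electrode pair, with voltages read
    across the remaining adjacent pairs that do not share an
    electrode with the injecting pair (n_electrodes - 3 of them)."""
    return n_electrodes * (n_electrodes - 3)

print(adjacent_protocol_measurements(16))  # 208, matching the input vector length
```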
In order to compare the four intelligent methods of image reconstruction, a single computing unit generating one pixel of the image was used as an example. The input is a 208-element voltage-drop vector. The data set contains 500 historical cases placed in a 500×209 matrix. The first 208 columns are input (independent) variables, while the last column contains the output variables (patterns).
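The data layout for one computing unit can be sketched as follows. The 500×209 shape follows the description above; the synthetic values and the train/test split proportions are illustrative assumptions, not taken from the article:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((500, 209))      # 500 historical cases: 208 inputs + 1 pattern

X = data[:, :208]                  # independent variables: voltage-drop vector
y = data[:, 208]                   # dependent variable: one pixel of the image

# hold out part of the cases for testing, as required in supervised learning
X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]
print(X_train.shape, X_test.shape)
```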

Neural Networks with Multilayer Perceptron
In the discussed case, the computing unit was an Artificial Neural Network with the structure MLP 208-23-1, i.e. a Multilayer Perceptron with 208 inputs, one hidden layer with 23 neurons and one output neuron. Table 1 presents the correlation coefficients for the output. They are close to 1, so the quality of the obtained neural network is high. The coefficient for the test set is of particular importance; it is the lowest of the three coefficients, but the differences are insignificant. Table 2 presents the main information about the trained neural network and its effectiveness. Table 4 presents a comparison of the pattern and output images generated by the intelligent system for the reconstruction of tomographic images. Thanks to this comparison, the impact of the neural network errors on the visual effect can be seen.
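The MLP 208-23-1 structure can be sketched with scikit-learn's MLPRegressor. The library choice and the synthetic training data are assumptions (the article does not name its tooling); only the layer sizes come from the text:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
X = rng.random((500, 208))             # stand-in for the voltage-drop vectors
y = X[:, 0] + X[:, 1]                  # synthetic target for illustration only

# one hidden layer with 23 neurons and a single output neuron: MLP 208-23-1
mlp = MLPRegressor(hidden_layer_sizes=(23,), max_iter=2000, random_state=1)
mlp.fit(X, y)
print(mlp.predict(X).shape)            # one value per case: (500,)
```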

Support Vector Machine
The Support Vector Machine (SVM) method performs regression (or classification) tasks by constructing nonlinear decision boundaries. Because of the specifics of the feature space in which these boundaries are found, Support Vector Machines can exhibit a large degree of flexibility in handling regression problems of varied complexity.
In a regression SVM, we estimate the functional dependence of the dependent variable y on a set of independent variables x. Like other regression approaches, it assumes that the relationship between the independent and dependent variables is given by a deterministic function f plus some additive noise: y = f(x) + ε (1).
The problem is then to find a functional form for f that can correctly predict new cases that the SVM has not been presented with before. This can be achieved by training the SVM model on a sample set, i.e. a training set, a process that involves, as in other regression methods, the sequential optimization of an error function. Depending on the definition of this error function, two types of SVM models can be distinguished: Regression SVM Type 1 and Regression SVM Type 2.
The learning results of the supervised SVM model are shown below. Support Vector Machine results:
- SVM type: Regression Type 1 (capacity = 10.000, epsilon = 0.100)
- Kernel type: Radial Basis Function (gamma = 0.005)
- Number of support vectors = 62 (24 bounded)
- Mean squared error = 0.005 (train)
- S.D. ratio = 0.104 (train)
- Correlation coefficient = 0.995 (train)

Tables 5 and 5a present the characteristics of the SVM model.
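The reported setup (Type 1 regression SVM with an RBF kernel) can be sketched with scikit-learn's SVR, whose C parameter plays the role of the "capacity" above; epsilon and gamma map directly. The data here is synthetic and only the hyperparameter roles reflect the article:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(2)
X = rng.random((500, 208))
y = np.sin(3 * X[:, 0]) + 0.01 * rng.standard_normal(500)   # noisy target

# epsilon-SVR corresponds to "Regression SVM Type 1"; C is the capacity
svm = SVR(kernel="rbf", C=10.0, epsilon=0.1, gamma=0.005)
svm.fit(X, y)
print(len(svm.support_))          # number of support vectors retained
```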

K-Nearest Neighbour
The K-Nearest Neighbour (KNN) method stores the training examples as prototypes; new, unseen cases are predicted by majority vote (for classification) or by averaging (for regression) over the K nearest prototypes, hence the name. In the case of regression, the average of the K exemplary points is taken as the result. Choosing K is very important because it is the basic parameter of the method and determines the quality of the prediction. This parameter can be treated as a measure of the degree of data smoothing. With a small K we will have large prediction variability, while with a large K we can obtain significant, systematic shifts of the prediction values. Thus, K should be large enough to minimize the probability of incorrect predictions, but small enough that the K nearest neighbours are sufficiently close to the new point. As always with smoothing, there is some optimum: a golden mean between excessive variability and systematic, significant deviations. The K-nearest neighbours method proposes an optimal K value based on cross-validation. KNN results for the analysed data set:
- Number of nearest neighbours = 1
- Distance measure: Euclidean
- Input standardization: on
- Averaging: uniform

Table 6 presents the characteristics of the K-Nearest Neighbour model. Fig. 5 shows the observed vs. predicted KNN scatter plot for K = 1. Despite the regressive nature of the problem, the KNN method tends to classify the results, which leads to predictions that are either ideal or definitely incorrect. This is a disadvantage of this method compared to other regression methods.

Multivariate Adaptive Regression Splines
Multivariate Adaptive Regression Splines (MAR Splines) is an implementation of the generalization of the technique introduced for wide use by Friedman (1991), used to solve both regression and classification problems, the purpose of which is to find the values of the output (dependent) variables on the basis of the input (predictive) variables. There are many methods for fitting a model to quantitative variables, such as linear regression (e.g. multiple regression, the general linear model), nonlinear regression (generalized linear and nonlinear models), regression trees (classification and regression trees), CHAID, neural networks, etc.
Multivariate Adaptive Regression Splines is a nonparametric procedure that does not require assumptions about the functional relationship between the dependent and independent variables. MAR Splines models this relationship using a set of coefficients and basis functions derived only from the data. A "divide and conquer" strategy is used: the input space is divided into regions in which separate regression or classification functions are determined. This approach makes MAR Splines particularly useful with a larger number of input dimensions (more than two variables), where other techniques begin to suffer from the curse of dimensionality.
The MAR Splines technique is particularly well known in the field of data mining, because there is no need to assume any type of dependency (linear, logistic, etc.) between the dependent and independent variables. Very good models (giving accurate predictions) are obtained in situations where the dependencies are very complicated, nonmonotonic and difficult to capture with any parametric modelling. MAR Splines results for the analysed data set:
- Number of terms = 18
- Number of basis functions = 17
- Order of interactions = 1
- Penalty = 2.000000
- Threshold = 0.000500
- GCV error = 0.000308
- Prune = yes

Table 7 presents the characteristics of the MAR Splines model. Table 8 shows the regression statistics for the analysed data set. It can be seen that the standard deviations for both the observed and predicted cases are similar. This is evidenced by the low value of the standard deviation of the residuals. A high (near 1) R-square regression value is also a positive signal. The near-zero mean of the residuals allows the conclusion that the developed model is characterized by high prediction efficiency. Fig. 6 shows a comparison of the observed and predicted outputs for the MAR Splines method. Comparing this graph with the analogous graph presented for the SVM method (Fig. 3), it should be stated that both methods give similar results.
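The building blocks of a MARS model are mirrored hinge basis functions max(0, x − t) and max(0, t − x) placed at knots t. A minimal single-variable sketch (not the article's 17-basis-function model; the knot and target are chosen purely for illustration) combines such a hinge pair with ordinary least squares:

```python
import numpy as np

def hinge(x, t, sign=1.0):
    """MARS hinge basis function: max(0, sign * (x - t))."""
    return np.maximum(0.0, sign * (x - t))

rng = np.random.default_rng(4)
x = rng.uniform(-1.0, 1.0, 200)
y = np.abs(x)                       # piecewise-linear target with a kink at 0

# design matrix: intercept plus the mirrored hinge pair at knot t = 0
t = 0.0
B = np.column_stack([np.ones_like(x), hinge(x, t, +1.0), hinge(x, t, -1.0)])
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
print(np.round(coef, 3))            # the two hinges together recover |x|
```

A full MARS fit additionally searches over variables and knot locations and prunes terms by the GCV criterion, as reflected in the settings listed above.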

Image deblurring using convolutional auto-encoders
The discussed case concerns the problem of noise reduction in an image obtained as a result of tomographic reconstruction using the ECT (electrical capacitance tomography) method.
Inside the reactor there is a liquid substance that crystallizes under the influence of the processes taking place. The aim of the research is the tomographic reconstruction of the crystals inside the reactor. The tank is cylindrical in shape; its horizontal cross-sections are shown in Figs. 10-12. 16 ECT electrodes were mounted on the circular wall of the reactor. From them, a 120-element vector of electric capacitances, measured between the various electrode pairs, was obtained. The output image resolution is 4146 pixels.
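The 120-element capacitance vector matches the number of unordered electrode pairs among 16 electrodes, C(16, 2) = 120:

```python
from itertools import combinations
from math import comb

n_electrodes = 16
pairs = list(combinations(range(n_electrodes), 2))  # all unordered pairs
print(len(pairs), comb(n_electrodes, 2))            # both equal 120
```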
As a result of the use of the previously discussed intelligent system for reconstruction of tomographic images, in which each pixel of the output image is generated by a separate computing unit, low quality images were obtained. The quality of the images was evaluated in comparison to the reference images. In order to improve the image quality, an auto-encoder was implemented.
An auto-encoder is a neural network which is taught to generate at its outputs the same values as it receives at its inputs. Internally it is built around a hidden layer that forms a code representing the input values. The network consists of two parts: an encoder and a decoder, which creates the reconstruction. Such an auto-encoder architecture is shown in Fig. 7. If auto-encoders were allowed to easily learn the identity y = x, they would not be very useful. For this reason, they are designed so that they cannot duplicate the input values perfectly but must omit part of the information. In order to match the input values as closely as possible despite these limitations, they are forced to find the most important information in the input vector. This feature of auto-encoders is exploited in practice.
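An undercomplete dense auto-encoder of this kind can be sketched by training a multilayer perceptron to reproduce its own input. This is a simplification of the article's 4146-10-4146 networks: the library choice (scikit-learn), the synthetic data, and the reduced input size (64 instead of 4146, so the example runs quickly) are all assumptions:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(5)
X = rng.random((300, 64))            # stand-in for the 4146-pixel images

# undercomplete auto-encoder: a 10-neuron bottleneck cannot copy the input
# perfectly, so the network must keep only the most important information
autoencoder = MLPRegressor(hidden_layer_sizes=(10,),
                           max_iter=2000, random_state=5)
autoencoder.fit(X, X)                # targets equal inputs: learn y ≈ x

reconstruction = autoencoder.predict(X)
print(reconstruction.shape)          # same shape as the input: (300, 64)
```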
In the present case, two auto-encoders, each with 4146 inputs and outputs and one hidden layer of 10 neurons, were used. The number of encoder inputs and outputs equals the number of pixels in the image grid of the ECT tomograph. Fig. 8 presents a graph of the learning process of one of the encoders. The shape of the line is free of sudden fluctuations; after the fall of the MSE there is no situation in which the error value increases again. All these facts indicate that the network has not been overtrained. The structure of the convolutional neural network included two convolutional layers generated by the encoders and one fully connected regression layer, constituting the output layer (Fig. 9). The default loss function for a regression layer is the mean squared error (MSE). Fig. 11 is an output image obtained from a system consisting of 4146 separate neural networks. Compared to the pattern, the amount of crystals is much higher here. Due to the regressive nature of the prediction, the colours of the image are also much more varied than in the reference image. Fig. 12 shows the image denoised using the convolutional auto-encoder. The number of crystals and their location are much more like the reference image. It can be seen that the numeric values (colours) of the output pixels differ from those in the output image. From a practical point of view, this is not very important, because the primary purpose of electrical tomography reconstruction is to determine the presence of crystals in the reactor; the numeric pixel values only change the picture's colours.

Conclusions
In assessing the KNN method, Fig. 5 is helpful. This method gives extreme results that are either ideal or significantly different from the reference values. This fact significantly reduces the efficiency of the KNN method in regression problems.
An interesting alternative is the MAR Splines method. It is a relatively new approach to solving complicated regression problems, which include tomographic imaging. The Generalized Cross Validation (GCV) error is very small and amounts to 0.000308. The analysis of Fig. 6 also confirms the great potential of the MAR Splines method with respect to tomographic imaging issues.
The last chapter presented an original method of deblurring the image with the use of a convolutional double stack of auto-encoders. This approach seems particularly useful in the case of ECT. Capacitance tomography has serious limitations, which include a small number of electrodes and noise in the measurement data caused by the influence of the environment (neighbouring electrodes, walls of the investigated object, etc.). For this reason, it is particularly important to develop methods that reduce interference in the input data vector or in the output image. The presented example confirms the efficiency of convolutional auto-encoders in reducing noise in ECT tomographic images.