Transformer fault diagnosis based on chemical reaction optimization algorithm and relevance vector machine

The power transformer is one of the most important pieces of equipment in a power system. In order to predict potential faults of power transformers and identify the fault types correctly, we propose an intelligent transformer fault diagnosis model based on the chemical reaction optimization (CRO) algorithm and the relevance vector machine (RVM). RVM is a powerful machine learning method that can solve nonlinear, high-dimensional classification problems with a limited number of samples. The CRO algorithm has good global optimization ability and is computationally simple, so it is suitable for parameter optimization problems. In this paper, a multi-layer RVM classification model is first built using a binary tree recognition strategy. Second, the CRO algorithm is adopted to optimize the kernel function parameters, which enhances the performance of the RVM classifiers. Compared with the IEC three-ratio method and the RVM model, the CRO-RVM model not only overcomes the coding defect of the IEC three-ratio method, but also has higher classification accuracy than the RVM model. Finally, the new method is applied to analyze a transformer fault case, and its predicted result accords well with the real situation. The research provides a practical method for intelligent transformer fault diagnosis and prediction.


Introduction
The power transformer is the core equipment in a power system, so it is very important to analyze its operating parameters and make a correct judgment on internal latent faults. Research on intelligent transformer fault diagnosis is key to ensuring power system safety [1].
Power transformers operate in a complicated environment and are subjected to both thermal and electrical stress. Under this stress, the transformer insulating oil decomposes into characteristic gases. The gas types and their volume fractions can distinguish the fault types effectively. Dissolved gas analysis (DGA) is now a convenient way to analyze and diagnose potential transformer faults [2,3]. The classical IEC three-ratio method is widely used because of its simple principle. However, its coding scheme is too rigid, which leads to missed diagnoses. Therefore, how to improve diagnostic accuracy has become a research focus in the field of transformer fault diagnosis [4].
In recent years, with the rapid development of artificial intelligence, many scholars have studied transformer fault diagnosis based on artificial intelligence algorithms combined with DGA. For example, the artificial neural network (ANN) has been widely used in transformer fault diagnosis [5]. However, it requires many training samples and its convergence rate is slow. The support vector machine (SVM) has advantages in dealing with nonlinear and small-sample classification problems [6,7], and it has achieved some success in transformer fault diagnosis. Its deficiencies are that the number of support vectors increases sharply as the sample size grows, and that its kernel function is restricted by the Mercer condition [8].
The relevance vector machine (RVM) is a supervised learning method proposed by Michael E. Tipping [9]. RVM is a sparse learning algorithm based on a Bayesian framework. Compared with SVM, the number of relevance vectors is greatly reduced, and the kernel function is no longer constrained by the Mercer condition. At present, RVM has been applied to signal processing and power load forecasting, but it is rarely used in transformer fault diagnosis. This paper presents a multi-layer classifier model based on the relevance vector machine, combined with the chemical reaction optimization algorithm to optimize the kernel function parameter of each layer. We collected 300 sets of measured data and tested them with the IEC three-ratio method, the standard RVM model and the CRO-RVM model. The results show that the diagnostic model presented in this paper has obvious advantages over the other models in diagnostic accuracy. A diagnosis case further indicates that it is suitable for practical application.

Introduction of relevance vector machine multi-classification model
2.1 Relevance vector machine (RVM)
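Hyperparameters are introduced into the RVM learning algorithm to ensure its sparsity: each weight is assigned a zero-mean Gaussian prior distribution governed by its own hyperparameter. The hyperparameters are estimated by maximizing the marginal likelihood function, and this automatic regulation mechanism avoids the difficulty of determining a regularization coefficient [10,11]. Let $\mathbf{t} = (t_1, \dots, t_N)^{T}$ be the target vector and $\{\mathbf{x}_i\}_{i=1}^{N}$ be the input vectors; then the RVM classification model can be expressed as equation (1):

$$y(\mathbf{x}; \mathbf{w}) = \sum_{i=1}^{N} w_i K(\mathbf{x}, \mathbf{x}_i) + w_0 \tag{1}$$

where $\mathbf{w} = (w_0, w_1, \dots, w_N)^{T}$ is the weight vector and $K(\mathbf{x}, \mathbf{x}_i)$ is the kernel function. For a two-class problem the target value is only 0 or 1, so formula (2) can be used to characterize the likelihood function of the whole data set:

$$P(\mathbf{t} \mid \mathbf{w}) = \prod_{n=1}^{N} \sigma\{y(\mathbf{x}_n; \mathbf{w})\}^{t_n} \left[1 - \sigma\{y(\mathbf{x}_n; \mathbf{w})\}\right]^{1 - t_n} \tag{2}$$

where $\sigma(x) = 1/(1 + e^{-x})$ is the sigmoid function. The sparse Bayesian method gives the weight vector $\mathbf{w}$ a zero-mean Gaussian prior distribution, which avoids over-fitting and keeps the model sparse:

$$p(\mathbf{w} \mid \boldsymbol{\alpha}) = \prod_{i=0}^{N} \mathcal{N}(w_i \mid 0, \alpha_i^{-1}) \tag{3}$$

where $\boldsymbol{\alpha} = (\alpha_0, \alpha_1, \dots, \alpha_N)^{T}$ is an (N+1)-dimensional hyperparameter vector. Because the marginal likelihood

$$P(\mathbf{t} \mid \boldsymbol{\alpha}) = \int P(\mathbf{t} \mid \mathbf{w})\, p(\mathbf{w} \mid \boldsymbol{\alpha})\, d\mathbf{w} \tag{4}$$

in formula (4) cannot be solved by integration, the approximate solution proposed by MacKay needs to be used. The specific steps of this method are as follows.
1) Assign a value to the current hyperparameter $\boldsymbol{\alpha}$; the "most likely" weight $\mathbf{w}_{MP}$ is then estimated by maximizing the weight posterior probability $p(\mathbf{w} \mid \mathbf{t}, \boldsymbol{\alpha}) \propto P(\mathbf{t} \mid \mathbf{w})\, p(\mathbf{w} \mid \boldsymbol{\alpha})$. Up to a constant that does not depend on $\mathbf{w}$, equation (5) can be used to estimate $\mathbf{w}_{MP}$:

$$\log\{P(\mathbf{t} \mid \mathbf{w})\, p(\mathbf{w} \mid \boldsymbol{\alpha})\} = \sum_{n=1}^{N} \left[t_n \log y_n + (1 - t_n) \log(1 - y_n)\right] - \frac{1}{2} \mathbf{w}^{T} A \mathbf{w} \tag{5}$$

where $y_n = \sigma\{y(\mathbf{x}_n; \mathbf{w})\}$ and $A = \mathrm{diag}(\alpha_0, \alpha_1, \dots, \alpha_N)$. $\mathbf{w}_{MP}$ is solved by the second-order Newton method; the gradient vector with respect to $\mathbf{w}$ can be represented as equation (6):

$$\mathbf{g} = \nabla_{\mathbf{w}} \log\{P(\mathbf{t} \mid \mathbf{w})\, p(\mathbf{w} \mid \boldsymbol{\alpha})\} = \Phi^{T}(\mathbf{t} - \mathbf{y}) - A \mathbf{w} \tag{6}$$

Similarly, the Hessian matrix $H$ of formula (5) and the covariance matrix $\Sigma$ are given by equations (7) and (8):

$$H = \nabla_{\mathbf{w}} \nabla_{\mathbf{w}} \log\{P(\mathbf{t} \mid \mathbf{w})\, p(\mathbf{w} \mid \boldsymbol{\alpha})\} = -(\Phi^{T} B \Phi + A) \tag{7}$$

$$\Sigma = (-H)^{-1} = (\Phi^{T} B \Phi + A)^{-1} \tag{8}$$

where $\mathbf{y} = (y_1, \dots, y_N)^{T}$ and $B = \mathrm{diag}(\beta_1, \dots, \beta_N)$ with $\beta_n = y_n (1 - y_n)$; $\Phi$ is the $N \times (N+1)$ design matrix, with $\Phi = [\boldsymbol{\phi}(\mathbf{x}_1), \dots, \boldsymbol{\phi}(\mathbf{x}_N)]^{T}$ and $\boldsymbol{\phi}(\mathbf{x}_n) = [1, K(\mathbf{x}_n, \mathbf{x}_1), \dots, K(\mathbf{x}_n, \mathbf{x}_N)]^{T}$.
2) The posterior distribution of the weights is approximated by the Gaussian $\mathcal{N}(\mathbf{w}_{MP}, \Sigma)$, where $\mathbf{w}_{MP}$ is the posterior mean of the weights and $\Sigma$ is the covariance matrix, both evaluated at $\mathbf{w} = \mathbf{w}_{MP}$.
3) Use formula (9) to update the hyperparameter $\boldsymbol{\alpha}$:

$$\alpha_i^{new} = \frac{\gamma_i}{w_{MPi}^{2}}, \qquad \gamma_i = 1 - \alpha_i \Sigma_{i,i} \tag{9}$$

where $\Sigma_{i,i}$ is the $i$-th diagonal element of $\Sigma$, and $w_{MPi}$ is the $i$-th element of the weight vector $\mathbf{w}_{MP}$.
Once the hyperparameter $\boldsymbol{\alpha}^{new}$ has been updated, the posterior mean of the weights is re-estimated and the covariance matrix is recalculated. Steps 1) to 3) are repeated until the convergence condition is satisfied. After the updates complete, most $\alpha_i$ will approach infinity and the corresponding $w_i$ will be 0. The remaining $\alpha_i$ take finite values, and the set of corresponding $\mathbf{x}_i$ is called the relevance vectors (RV).
x is called gas deterioration degree.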
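The iteration in steps 1) to 3) can be sketched in NumPy as follows. This is only an illustrative sketch of a sparse Bayesian classifier trained with the Newton/Laplace procedure, not the authors' implementation: the function names, the RBF kernel form exp(-||x - x'||^2 / (2 width^2)), the step-halving safeguard, the pruning threshold `alpha_cap` and the numerical floors are all assumptions made here.

```python
import numpy as np

def sigmoid(z):
    # Clipping keeps exp() and the logarithms below finite.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def design_matrix(X, centers, width):
    """Bias column followed by one RBF kernel column per training point."""
    sq = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.hstack([np.ones((len(X), 1)), np.exp(-sq / (2.0 * width ** 2))])

def log_posterior(P, t, w, a):
    """Log-likelihood plus log-prior, up to a constant independent of w."""
    y = sigmoid(P @ w)
    return np.sum(t * np.log(y) + (1 - t) * np.log(1 - y)) - 0.5 * np.sum(a * w * w)

def fit_rvm(X, t, width=1.0, outer_iters=30, newton_iters=25, alpha_cap=1e6):
    Phi = design_matrix(X, X, width)
    active = np.arange(Phi.shape[1])      # indices of basis functions still in the model
    alpha = np.ones(Phi.shape[1])
    w = np.zeros(Phi.shape[1])
    for _ in range(outer_iters):
        P, a, w_a = Phi[:, active], alpha[active], w[active]
        # Step 1: Newton iterations for the most probable weights w_MP.
        for _ in range(newton_iters):
            y = sigmoid(P @ w_a)
            grad = P.T @ (t - y) - a * w_a
            neg_hess = P.T @ (P * (y * (1 - y))[:, None]) + np.diag(a)
            step = np.linalg.solve(neg_hess, grad)
            scale, base = 1.0, log_posterior(P, t, w_a, a)
            # Assumed safeguard: halve the step while it lowers the objective.
            while scale > 1e-4 and log_posterior(P, t, w_a + scale * step, a) < base:
                scale *= 0.5
            w_a = w_a + scale * step
            if np.max(np.abs(scale * step)) < 1e-6:
                break
        # Step 2: covariance of the Gaussian approximation at w_MP.
        y = sigmoid(P @ w_a)
        Sigma = np.linalg.inv(P.T @ (P * (y * (1 - y))[:, None]) + np.diag(a))
        # Step 3: hyperparameter update alpha_i = gamma_i / w_i^2.
        gamma = 1.0 - a * np.diag(Sigma)
        a_new = np.where(gamma > 1e-12, gamma / (w_a ** 2 + 1e-12), alpha_cap)
        a_new = np.maximum(a_new, 1e-6)
        keep = a_new < alpha_cap          # very large alpha means the weight is pruned
        if not keep.any():
            break                         # degenerate case: keep the previous model
        w = np.zeros_like(w)
        w[active[keep]] = w_a[keep]
        alpha[active] = a_new
        active = active[keep]
    return w, active

def predict_proba(X_new, X_train, w, width=1.0):
    return sigmoid(design_matrix(X_new, X_train, width) @ w)
```

The entries of `active` other than index 0 (the bias) point at the training samples that act as relevance vectors; all other weights are exactly zero after training.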

Chemical reaction optimization (CRO) algorithm
CRO is a meta-heuristic algorithm developed in recent years. It was put forward by Albert Y.S. Lam and Victor O.K. Li from the University of Hong Kong [12]. It simulates molecular motion and energy conversion processes in chemical reactions.
Compared with the genetic algorithm and the particle swarm optimization algorithm [13,14], it is characterized not only by a dynamically changing population size, but also by rich modes of information exchange among individuals. There are two basic units in the CRO algorithm: the molecule ω and the container wall (buffer). A molecule has two kinds of energy, kinetic energy (KE) and potential energy (PE). The buffer represents the reaction environment. PE is the quality measure in the CRO algorithm: when CRO is applied to a real problem, PE represents the objective function value. KE is the quantity used to judge whether a molecular reaction can be carried out. There are four basic reaction operators in CRO: single-molecule collision, single-molecule decomposition, inter-molecular collision and molecular synthesis [15,16].
The CRO algorithm first uses the intensification strategy formed by ineffective collisions to search the neighboring region. When a lower energy state cannot be found in the neighborhood, it uses the diversification strategy formed by decomposition and synthesis to jump to a relatively remote region and continue the search. During the search, energy is redistributed among the molecules and the buffer in various ways.
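The neighborhood search performed by a single-molecule collision with the container wall can be sketched as follows. This is a minimal sketch assuming the usual CRO energy rule: a move is accepted only if PE + KE covers the new PE, a random fraction of the surplus is kept as KE, and the rest goes to the buffer. The dictionary layout of a molecule, the Gaussian neighborhood move and the `step` parameter are illustrative assumptions, not details taken from the paper.

```python
import random

def on_wall_collision(molecule, objective, ke_loss_rate, buffer, step=0.1, rng=random):
    """Try one local move; return the updated buffer energy.

    molecule: dict with keys 'x' (list of floats), 'pe' and 'ke'.
    """
    # Assumed neighborhood operator: small Gaussian perturbation of every coordinate.
    candidate = [xi + rng.gauss(0.0, step) for xi in molecule["x"]]
    new_pe = objective(candidate)
    surplus = molecule["pe"] + molecule["ke"] - new_pe
    if surplus >= 0.0:                       # enough energy to reach the new state
        q = rng.uniform(ke_loss_rate, 1.0)   # fraction of the surplus kept as KE
        molecule.update(x=candidate, pe=new_pe, ke=surplus * q)
        buffer += surplus * (1.0 - q)        # the remainder is lost to the buffer
    return buffer
```

Because KE shrinks with every accepted move, a molecule gradually loses the ability to climb to worse solutions, which is what makes this operator behave as a local search.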
The choice of the initial values of the algorithm has a great influence on the results. In order to give the algorithm good convergence speed, search precision and optimization ability, the initial molecular population is generated randomly. The related parameters include the molecular population size (Popsize), the central energy buffer (Buffer), the initial kinetic energy of a molecule (initialKE), the kinetic energy loss rate (KelossRate), etc. The specific initialization steps are as follows:
1) Generate a random molecule within the range of the molecular properties.
2) Judge whether the random molecule meets the constraints. If it does, go to step 3); otherwise, return to step 1).
3) Calculate the potential energy of the molecule, set its initial kinetic energy, and save the molecule into the reaction container. Then check whether the molecular population in the container has reached the preset initial size. If it has, finish the initialization process; otherwise, return to step 1).
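The three initialization steps map directly onto a short loop. The sketch below is illustrative only: `bounds`, `is_feasible` and the dictionary layout of a molecule are assumed names, while `pop_size` and `initial_ke` correspond to the Popsize and initialKE parameters mentioned above.

```python
import random

def init_population(objective, bounds, pop_size, initial_ke,
                    is_feasible=lambda x: True, rng=random):
    """Fill the reaction container with pop_size random feasible molecules."""
    population = []
    while len(population) < pop_size:
        # Step 1: draw a random molecule inside the allowed parameter ranges.
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        # Step 2: discard it and draw again if it violates the constraints.
        if not is_feasible(x):
            continue
        # Step 3: evaluate PE, set the initial KE and store the molecule.
        population.append({"x": x, "pe": objective(x), "ke": initial_ke})
    return population
```

For kernel parameter optimization, each molecule would hold one candidate kernel width per classifier, and `objective` would return the classification error obtained with those widths.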

The diagnosis steps of transformer fault based on CRO-RVM model
Transformer fault diagnosis based on the CRO-RVM model uses the CRO algorithm to optimize the kernel function parameters of the RVM classifiers. The flowchart is shown in Figure 2, and the specific steps are as follows:
1) Select the sample data and normalize them to obtain the feature vectors. Divide the sample data into a training set and a test set according to the ratio of 2:1.
2) Use the binary tree method shown in Figure 1 to establish a multi-level transformer fault diagnosis model based on RVM, initialize a set of kernel function parameters, and then input the training set to train the model.
3) Initialize the chemical reaction optimization algorithm, and select the mean square error between the actual output and the expected output of the RVM as the objective function.
4) Calculate the initial potential energy and initial kinetic energy of each molecule. Optimize the kernel function parameters by iterative calculation.
5) When the iteration ends, the molecule with the smallest PE is taken as the global optimal solution. Assign its kernel function parameters to the corresponding RVM classifiers.
6) Input the test data set into the CRO-RVM model and test its diagnostic accuracy.
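Steps 1) and 3) can be illustrated with a short standard-library sketch: min-max normalization, a 2:1 split, and a potential-energy function defined as the mean square error for a given kernel width. The paper does not specify its normalization, so min-max scaling is an assumption, and `stand_in_output` is a deliberately simple kernel-weighted average used here only as a placeholder for a trained RVM classifier. All function names are hypothetical.

```python
import math
import random

def min_max_normalize(rows):
    """Scale every feature column to [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [[(v - lo) / sp if sp > 0 else 0.0 for v, lo, sp in zip(row, lows, spans)]
            for row in rows]

def split_two_to_one(samples, targets, rng):
    """Shuffle and split into (training, test) parts with a 2:1 ratio."""
    order = list(range(len(samples)))
    rng.shuffle(order)
    cut = (2 * len(order)) // 3
    pick = lambda idx: ([samples[i] for i in idx], [targets[i] for i in idx])
    return pick(order[:cut]), pick(order[cut:])

def rbf(a, b, width):
    """RBF kernel; this parameterization of the width is an assumption."""
    return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)) / (2.0 * width ** 2))

def stand_in_output(x, train_x, train_t, width):
    """Placeholder for a trained RVM output: kernel-weighted mean of the targets."""
    weights = [rbf(x, xi, width) for xi in train_x]
    return sum(w * t for w, t in zip(weights, train_t)) / sum(weights)

def potential_energy(width, train_x, train_t):
    """Objective for CRO: mean square error between actual and expected output."""
    errors = [(stand_in_output(x, train_x, train_t, width) - t) ** 2
              for x, t in zip(train_x, train_t)]
    return sum(errors) / len(errors)
```

In the full procedure, `potential_energy` would retrain the RVM classifier for each candidate width and the CRO search would minimize it once per binary classifier in the tree.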

Comparison of different diagnostic models and example analysis
4.1 Comparison of different diagnostic models
In this paper, we collected 300 sets of measured DGA data from 220 kV, 330 kV and 500 kV transformers. The 15 sets of measured data samples shown in Table 1 are part of them. We divided all of them into a training set and a testing set according to the ratio of 2:1, and used the IEC three-ratio method, the RVM model and the CRO-RVM model to diagnose these data samples. The kernel functions of the RVM model and the CRO-RVM model are both RBF, and the kernel function parameters of the RVM model are all 0.5. The parameters of the four classifiers in the CRO-RVM model are optimized by the CRO algorithm; they are [8.625, 0.264, 0.516, 1.246]. Figure 3 shows how the mean square error of the CRO-RVM model changes with the iteration number. The results of the three diagnostic methods are shown in Table 2. We can draw the following conclusions from the diagnostic results of the different models:
1) The IEC three-ratio method has the problem of missing codes. For example, [354, 387, 65.8, 101.2, 189] is a group of high-energy discharge data. According to the coding rules shown in Table 3, it is obviously out of the coding range, so the IEC three-ratio method cannot diagnose it. The RVM model and the CRO-RVM model
2) The kernel function parameters in the RVM model are fixed, so it is difficult to ensure that the classifiers at different levels achieve their best classification results at the same time. As a result, the diagnostic accuracy of the RVM model is only about 78%.
3) After the CRO algorithm is used to optimize the kernel function parameters of the RVM classification model, the diagnostic accuracy improves markedly and reaches more than 94%. This recognition rate meets the requirement of practical application.

Transformer fault diagnosis case
The No.1 main transformer in a 330 kV substation broke down on June 25, 2014. The heavy gas protection of the transformer operated, and the switches on its three sides tripped. The main transformer was made by Xi'an XD Transformer Co., Ltd, and its model number is OSFPSZ9-M-360000/330. It had been in operation since 2006. The oil chromatographic test data are shown in Table 4. The CRO-RVM model proposed in this paper was applied to analyze these DGA test data. The output diagnosis result is HD, which means that the fault type of the No.1 transformer is high-energy discharge. During the disassembly inspection, we found discharge marks at the bottom of the transformer bushing and at the end of the current-carrying rod, as well as an insulation breakdown in the B-phase winding. So the real fault type of this transformer is high-energy discharge, which agrees with the prediction of the CRO-RVM model. The disassembly picture of the No.1 transformer is shown in Figure 4. The analysis of this case indicates that the proposed transformer fault diagnosis method is not only theoretically feasible, but also well suited to practical application. The new method can be applied to identify multiple fault types of large power transformers.

Conclusions
The relevance vector machine (RVM) is one of the emerging algorithms in the machine learning field, and its use in transformer fault diagnosis has attracted more and more attention. Based on the standard RVM classification model, this paper proposed a CRO-RVM model for transformer fault diagnosis, in which the CRO algorithm is used to optimize the RVM kernel function parameters and improve the classification accuracy. Through a comparative analysis of the diagnosis results of different methods, we found that the CRO-RVM model can take the individual differences of different transformers into account and make a relatively accurate diagnosis. Compared with the IEC three-ratio method, it does not have the problem of missing codes, and its diagnostic accuracy is higher than that of the standard RVM model. A fault diagnosis case shows that the new method is suitable for practical application: its predicted result accords well with the real situation. The research indicates that it is a useful tool for transformer fault diagnosis. In the future, we will try to combine DGA data with routine electrical test data as fault features to further improve the diagnostic performance of the CRO-RVM model. In addition, we will study how to apply it to fault diagnosis of other oil-filled electrical equipment.

Figure 3. Mean square error variation trend of CRO-RVM model.

Table 1. DGA data samples of power transformer.

Table 2. Comparison of fault diagnosis results.

Table 3. Coding rules of the IEC three-ratio method.

Table 4. DGA test data.