A mathematical model for estimating and forecasting the state of a digital substation based on the wavelet transform method

This article considers the possibility of applying the wavelet transform method in combination with a neuro-fuzzy approach to forecasting the state of digital substations. The optimal level of wavelet decomposition of a time series corresponding to the daily change in phase voltage is determined on the basis of the Hurst index. The influence of the sample size and of the type of mother wavelet on the Hurst index is investigated. It is shown that the Daubechies wavelet is an effective mother wavelet for the decomposition, since it filters noise more smoothly than the Haar wavelet. Analysis of the original series does not allow the optimal level of wavelet decomposition to be estimated if the noise level of the time series is low (less than 10%), since the Hurst index remains unchanged. However, using the logarithms of the changes in the time series takes small fluctuations into account, which makes it possible to determine the optimal decomposition level for smoothing them.


Introduction
Digital substation technology represents a new level of automation of electrical substations. Because this technology is relatively new, there is a need to develop new algorithms for diagnosing complex technical systems. This need is confirmed by numerous studies and developments in the field [1][2][3][4][5][6][7][8][9].
The new algorithm for diagnosing complex technical systems (for example, a digital substation), proposed earlier in [2], is based on the method of full-scale model testing [10]. For diagnostics, a computer model based on National Instruments technologies has been developed [11]. During program operation, the model allows the user to select the monitored parameter (current, phase or line voltage, frequency, etc.) and the monitored value (the numerical value of the monitored parameter that is dangerous to exceed). The model also makes it possible to estimate the change of the monitored parameters over time. Thus, the current state of a digital substation can be diagnosed more accurately on the basis of an analysis of the parameter values.
However, diagnosing the current state of the substation does not always make it possible to assess developing defects in the equipment, so it is difficult to prevent emergency operation of the equipment. To solve this problem, it is proposed to use methods of intelligent analysis to predict the substation state. Because the data of a digital substation are stored permanently, a large amount of information from the sensors of the power equipment is available. It therefore becomes possible to use intelligent analysis (modern methods of signal processing, wavelet analysis, artificial intelligence, machine learning) to diagnose equipment. Intelligent analysis makes it possible to assess the current state of the equipment and to predict pre-emergency states on the basis of an analysis of the values of the parameters that characterize the state of the equipment and of the relationships between the system parameters [3].

The wavelet transform method
To create a predictive model, time series of the parameters describing the state of the substation are taken from the C2C power quality monitoring project. Data about a digital substation can be regarded as random time series. Analysis of such time series reflects the behavior of the system in the past and makes it possible to predict its future behavior. The primary task in forecasting complex non-stationary time series is to decompose them into components. The structure of a time series is a set of components: the systematic component, the oscillations around the systematic component, and the random component.
Neural networks are among the most effective methods of intelligent data processing. However, this method has drawbacks [12,13], which are partially eliminated by combining neural networks with fuzzy logic in a neuro-fuzzy approach [14]. According to numerous studies [15][16][17][18][19][20][21][22], the use of a neuro-fuzzy network minimizes the forecast error. At the same time, selecting the architecture of a neuro-fuzzy network that accounts for the dynamics of both the rapidly changing and the slowly changing components of the time series is a laborious process. In addition, accurate accounting for changes in the rapidly changing component of the time series leads to significant prediction errors. These problems can be solved, and the forecast accuracy further increased, by applying a wavelet transform. Forecasting is carried out according to the scheme shown in Fig. 1.
To solve the problem of predicting discrete time series, the redundant (undecimated) discrete wavelet transform algorithm is chosen [23,24]. The choice is due to the fact that the other types of transform have disadvantages that prevent them from solving the current task:
• continuous wavelet transform: large redundancy, which makes the analysis time-consuming,
• fast wavelet transform: the presence of the decimation operation, i.e. when the time series is shifted by even one sample, the results of the whole decomposition change, which makes forecasting impossible.
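The effect of decimation mentioned above can be illustrated with a minimal sketch. It assumes a single decomposition level, the Haar wavelet and circular boundary handling; the function names are hypothetical and are not taken from [23,24].

```python
import math


def haar_detail_decimated(x):
    """One level of the decimated (fast) Haar transform: one coefficient per pair of samples."""
    return [(x[2 * k] - x[2 * k + 1]) / math.sqrt(2) for k in range(len(x) // 2)]


def haar_detail_undecimated(x):
    """One level of the undecimated Haar transform (circular boundary): one coefficient per sample."""
    n = len(x)
    return [(x[i] - x[(i - 1) % n]) / 2 for i in range(n)]


def delay(x, k=1):
    """Circularly delay a sequence by k samples."""
    k %= len(x)
    return x[-k:] + x[:-k] if k else list(x)


def circular_shifts(x):
    """All circular shifts of a sequence."""
    return [delay(x, k) for k in range(len(x))]


signal = [1.0, 4.0, 2.0, 8.0, 5.0, 7.0, 3.0, 6.0]
shifted = delay(signal, 1)

# Undecimated: the coefficients of the shifted signal are the shifted coefficients.
undecimated_follows_shift = (
    haar_detail_undecimated(shifted) == delay(haar_detail_undecimated(signal), 1)
)

# Decimated: check whether the new coefficients are any shift of the original ones.
decimated_follows_shift = (
    haar_detail_decimated(shifted) in circular_shifts(haar_detail_decimated(signal))
)

print("undecimated coefficients follow the shift:", undecimated_follows_shift)
print("decimated coefficients follow the shift:", decimated_follows_shift)
```

For this test signal the first check prints True and the second prints False: a one-sample shift merely shifts the undecimated coefficients, whereas the decimated coefficients change completely.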

The optimal level of detail based on the Hurst index
The prediction process (Fig. 1) includes the decomposition of a time series to a specified depth of detail, which is determined either empirically from an analysis of the original time series or on the basis of an analysis of the Hurst exponent [25]. This indicator characterizes the noisiness of the analyzed time series. The best decomposition depth is understood as the one that yields simple quasiperiodic sequences at the output of the wavelet filtering, with no sharp fluctuations or outliers at the approximating decomposition level (the value of the Hurst index differs from 0.5 but is close to it) [26].
As an example, consider the decomposition of a time series that represents the change in phase voltage over one day (1440 values, with the voltage recorded every minute). The change in voltage is shown in Fig. 2. To determine the optimal level of detail, an algorithm for calculating the Hurst index based on R/S analysis was implemented [27,28]. There are two ways of analyzing the time series:
• analysis of the original time series,
• analysis of the variation in the original time series.
By the variation of the original series we mean either the difference between two neighboring terms of the series [29],
$$Y(i) - Y(i-1), \tag{1}$$
or the difference between the logarithms of two neighboring terms of the series [28,29],
$$\ln Y(i) - \ln Y(i-1), \tag{2}$$
where Y(i) is the i-th term of the series. According to the E. Neumann research "Calculation of the Hurst index for the purpose of revealing the trend (persistence) of financial markets and macroeconomic indicators," it is recommended to analyze the original series rather than its variation. The Hurst exponent is calculated by the formula from [30], where H is the Hurst index; S is the standard deviation of the series of observations; R is the range of the accumulated deviation; N is the number of observation periods.
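The two kinds of variation described above can be computed as in the following minimal sketch. The function names and the voltage readings are hypothetical and serve only as an illustration.

```python
import math


def difference_variation(series):
    """Differences between neighboring terms: Y(i) - Y(i-1)."""
    return [b - a for a, b in zip(series, series[1:])]


def log_variation(series):
    """Differences between logarithms of neighboring terms: ln Y(i) - ln Y(i-1).

    Requires strictly positive values, which holds for a voltage magnitude.
    """
    return [math.log(b) - math.log(a) for a, b in zip(series, series[1:])]


# Hypothetical phase-voltage readings in volts (illustrative values only).
voltage = [220.0, 220.4, 219.8, 220.1, 221.0]

print(difference_variation(voltage))
print(log_variation(voltage))
```

For small changes the logarithmic differences are close to the relative changes of the series, so they do not depend on the absolute voltage level.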
The results of calculating the Hurst index for the original series are presented in Table 1. According to the recommendations in [27], the value of the Hurst index should be taken for t ≥ 5. According to [31], at small sample sizes the value of the Hurst index changes significantly, and only from a certain sample size onward does it oscillate about a specific value. For the time series under consideration, we take H = 0.7514, which indicates the presence of long-term memory in the analyzed series.
To check whether the obtained Hurst index can be used further, we apply the test developed by Scheinkman and LeBaron [27]. The test consists in shuffling the original data and then recalculating the Hurst index. If the Hurst index takes a value close to 0.5, we can conclude that the original series had long-term memory and that shuffling the data destroyed its structure. In the case under consideration, after randomization of the original data the Hurst index became 0.4741. This result indicates the presence of long-term memory in the original series and thus indirectly confirms that the Hurst index was determined correctly.
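The R/S estimate and the shuffling test can be sketched together as follows. This is a minimal sketch under stated assumptions: it uses the classical regression form of R/S analysis (the slope of log(R/S) against the logarithm of the window length), which is not necessarily the exact formula of [30], and it is applied to a synthetic random walk rather than to the substation data. All function names are hypothetical.

```python
import math
import random


def rescaled_range(chunk):
    """R/S of one window: range of the accumulated deviation from the mean over the standard deviation."""
    n = len(chunk)
    mean = sum(chunk) / n
    acc = low = high = 0.0
    for value in chunk:
        acc += value - mean
        low, high = min(low, acc), max(high, acc)
    std = math.sqrt(sum((value - mean) ** 2 for value in chunk) / n)
    return (high - low) / std if std > 0 else None


def hurst_rs(series, min_window=8):
    """Estimate H as the least-squares slope of log(R/S) against log(window length)."""
    log_n, log_rs = [], []
    window = min_window
    while window <= len(series) // 2:
        values = [rescaled_range(series[s:s + window])
                  for s in range(0, len(series) - window + 1, window)]
        values = [v for v in values if v is not None]
        if values:
            log_n.append(math.log(window))
            log_rs.append(math.log(sum(values) / len(values)))
        window *= 2  # doubling window lengths: 8, 16, 32, ...
    if len(log_n) < 2:
        raise ValueError("series is too short for the chosen window sizes")
    mean_x = sum(log_n) / len(log_n)
    mean_y = sum(log_rs) / len(log_rs)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_n, log_rs))
    denominator = sum((x - mean_x) ** 2 for x in log_n)
    return numerator / denominator


def shuffle_test(series, seed=0):
    """Return (H of the series, H after random shuffling of the same values)."""
    shuffled = list(series)
    random.Random(seed).shuffle(shuffled)
    return hurst_rs(series), hurst_rs(shuffled)


# Synthetic strongly autocorrelated series (a random walk), for illustration only.
rng = random.Random(1)
walk, level = [], 0.0
for _ in range(4096):
    level += rng.gauss(0.0, 1.0)
    walk.append(level)

h_original, h_shuffled = shuffle_test(walk)
print(f"H original: {h_original:.3f}, H shuffled: {h_shuffled:.3f}")
```

Shuffling keeps the distribution of the values but destroys their order, so a marked drop of the estimate toward 0.5 is evidence that the original estimate reflected the temporal structure of the series. For short windows the R/S estimate of an uncorrelated series is known to lie somewhat above 0.5, so the shuffled value should be read as approximate.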
To obtain the best decomposition depth, the Hurst index is calculated starting from the first level of detail, with the level then increased, using the Haar wavelet [32,33]. The results are shown in Table 2. It can be seen that the value of the Hurst index varies only slightly as the level of detail increases.
Next, the dependence of the Hurst index on the level of detail was tested with the mother wavelet replaced by the Daubechies wavelet with four vanishing moments [32,33], which provides smoother filtering of noise. The results of the wavelet decomposition are presented in Table 3. It can be seen that the value of the Hurst index again does not change significantly.
To study the dependence of the Hurst index on the level of detail further, the sample size was increased to 4,096 values (Fig. 3); the results are given in Table 4. The data show that, to establish a relationship between the level of detail and the Hurst index, it is necessary to analyze the logarithms of the changes in the time series under consideration, because the logarithm makes it possible to take the fine-scale changes in the series into account. The results of calculating the Hurst index are presented in Table 5. According to the E. Neumann research, the obtained value of the Hurst index classifies the series as random, or noisy.
The wavelet decomposition is then carried out using the Haar and Daubechies mother wavelets. The results are shown in Table 6. According to the obtained data, the Daubechies wavelet is more effective for the time series under consideration. The Haar wavelet does not smooth out the existing noise; because of its specific shape, it only removes informative content from the original signal (Fig. 4).
With the Daubechies wavelet, the results are similar to the "rough" estimate of the Hurst index obtained without taking the logarithms of the changes in the original series. It should be noted that the Hurst index increases as the level of detail increases, which indirectly confirms that the Hurst index is a measure of the complexity of the analyzed series.
Thus, the best decomposition depth, obtained by testing different levels of detail for 4,096 values, is 5. At a higher level of decomposition the shape of the signal is changed too coarsely, while at a lower level the noise component is not sufficiently eliminated. As a result of the decomposition, we obtain 6 components: the approximating component (a5) and five detail components (d5, d4, d3, d2, d1), which together give the original signal (Fig. 5):
$$S \cong a_5 + d_5 + d_4 + d_3 + d_2 + d_1, \tag{4}$$
where S is the source signal, a5 is the approximation at level 5, and d5, d4, d3, d2, d1 are the detail signals at levels 5, 4, 3, 2 and 1, respectively.
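The additive structure of the five-level decomposition can be checked with a minimal sketch. It assumes the "à trous" form of the undecimated transform with a Haar smoothing kernel and circular boundary handling, chosen here only for brevity (the Daubechies wavelet is preferred in the text), and it uses a synthetic signal. The names `a_trous_haar` and `roughness` are hypothetical.

```python
import math
import random


def a_trous_haar(signal, levels):
    """Undecimated Haar decomposition with circular boundary handling.

    Returns (approximation, details), where details[j - 1] is d_j and
    signal[i] equals approximation[i] + sum of all d_j[i] up to rounding.
    """
    n = len(signal)
    approx = list(signal)
    details = []
    for level in range(1, levels + 1):
        step = 2 ** (level - 1)  # filter taps are spread apart instead of decimating
        smoother = [(approx[i] + approx[(i - step) % n]) / 2 for i in range(n)]
        details.append([a - s for a, s in zip(approx, smoother)])
        approx = smoother
    return approx, details


def roughness(x):
    """Sum of squared first differences, a simple measure of how jagged a series is."""
    return sum((b - a) ** 2 for a, b in zip(x, x[1:]))


# Synthetic noisy "voltage" of 4,096 samples (illustrative values only).
rng = random.Random(0)
signal = [220.0 + 5.0 * math.sin(2 * math.pi * i / 1024) + rng.gauss(0.0, 1.0)
          for i in range(4096)]

a5, details = a_trous_haar(signal, levels=5)
restored = [a5[i] + sum(d[i] for d in details) for i in range(len(signal))]
max_error = max(abs(r - s) for r, s in zip(restored, signal))

print("components:", 1 + len(details))
print("max reconstruction error:", max_error)
print("roughness of signal vs a5:", roughness(signal), roughness(a5))
```

Each detail level is defined as the difference between two successive approximations, so the sum of all components telescopes back to the original signal, while the approximation a5 is much smoother than the input.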
As can be seen from Fig. 5, the original signal is noisy. Using a neuro-fuzzy approach without the wavelet transform introduces an error; however, forecasting only the approximating series, without taking the rapidly changing components into account, also results in a significant prediction error [34].
Training is carried out separately for each level of decomposition; accordingly, in the case under consideration, 6 neuro-fuzzy networks are formed. After the decomposition of the time series, a prediction is carried out, which can be represented in the form:
$$Y^*(t, l) = \varphi\bigl(Y(1), Y(2), \ldots, Y(t);\, t, l\bigr), \tag{5}$$
where [t, t+l] is the lead time interval, l is the length of the lead time interval, Y*(t, l) is the prediction, and φ is the prediction function, which depends on the number of observations t, the length l and the values Y(1), Y(2), …, Y(t). The obtained predictions Y1*(t, l), …, Yn*(t, l) are summed to obtain the resulting prediction value [35].
As a result of the research, the following was revealed:
- For wavelet decomposition, the Daubechies wavelet is an effective mother wavelet, since it filters noise more smoothly than the Haar wavelet.
- The increase in the sample size does not have a significant effect on the Hurst index.
- Analysis of the original series does not allow us to estimate the optimal level of the wavelet decomposition if the noise level of the time series under consideration is low (less than 10%), since the Hurst index remains unchanged. However, using the logarithms of the changes in the time series takes small fluctuations into account, which makes it possible to determine the optimal level of the wavelet decomposition for smoothing them.
Thus, for predicting the state of a digital substation, the wavelet transform makes it possible to obtain a denoised signal that is effective for further use in neuro-fuzzy networks. This is due to:
• eliminating the need to select a neuro-fuzzy network architecture,
• dispensing with accurate accounting of the detailed components of the original time series, which increases the accuracy of the forecast,
• forecasting the individual series obtained after the decomposition of the original signal, which also increases the accuracy of the forecast.
This article is based on results of the project No. SP 4108.2015.1, implemented under the program "Scholarship provided by the President of the Russian Federation for young scientists and graduate students engaged in advanced research and development in priority areas of modernization of the Russian economy" using equipment of shared facility "Diagnosis and energy-efficient electrical equipment" (NPI).