Multi-level Signal Decomposition for Power Quality Disturbance Classification

The introduction of electric vehicles imposes large disturbances on the grid-level power signal due to their charging and discharging mechanisms. Power signal monitoring in the electrical grid can provide several insights, such as power quality disturbance detection, major power consumption areas, peak power usage periods, and potential catastrophic failure conditions. For preventive maintenance, this paper proposes automatic classification of power quality disturbances using a hybrid method that incorporates the wavelet transform and a deep LSTM network. Multi-level signal decomposition is applied to the input signal to increase its resolution by decomposing it into multiple frequency bands. Subsequently, these multi-level frequency components are fed into deep LSTM layers to extract useful higher-order latent features. The classification performance of the proposed wavelet-based LSTM (WT-LSTM) network is benchmarked against a deep LSTM method. Additive white Gaussian noise (AWGN) with signal-to-noise ratio (SNR) levels between 20 and 50 dB is added during training so that the learned model generalizes to realistic scenarios. The classification performance of both the WT-LSTM and Deep LSTM networks is tested with AWGN at 20, 30, 40, and 50 dB SNR and under noiseless conditions. The WT-LSTM network obtains an overall classification accuracy of 89.77% at 20 dB and 99.21% under noiseless conditions, compared with 88.48% and 98.54%, respectively, for Deep LSTM.


Introduction
Electric vehicles (EVs) are gradually replacing internal combustion engine vehicles due to their higher efficiency and lower greenhouse gas emissions [1]. The growing number of EVs connected to the power grid could affect grid performance through reduced efficiency, overloading, and other power quality issues [2]. The use of fast charging stations also causes power quality issues on the grid [3]. These power quality issues may lead to unreliable power delivery, device malfunction, and reduced lifespan of electrical components [4]. Hence, power quality monitoring is of paramount importance, particularly for the future development of smart grids [5,6]. Classification of power quality disturbances is the main feature required in a power monitoring system. A power quality disturbance (PQD) is defined as a deviation of voltage, current, or frequency from its standard rating.

Wavelet-based LSTM network
A hybrid model composed of multi-level signal decomposition (MSD) and LSTM layers is proposed, as shown in Fig. 1. MSD transforms the signal into the wavelet domain. The coefficient outputs, which have unequal lengths, are aligned using a temporal align layer. The aligned embedded features are then passed to the LSTM for sequential feature extraction, followed by fully connected layers for classification. The components of the proposed architecture are detailed in the following subsections.

Multi-level signal decomposition
MSD yields multiple sets of coefficients, each representing a different frequency component of the original input signal. The varying window sizes of the wavelet transform allow non-stationary signals to be detected and have proved efficient in identifying discontinuities in signals [17]. MSD performs multi-band signal filtering by applying the discrete wavelet transform (DWT) hierarchically over several levels. The decomposed signal consists of detail coefficients and an approximation coefficient. The approximation coefficient, $cA_i(k)$, represents the lower-frequency component, while the detail coefficients, $cD_i(k)$, contain the higher-frequency components. This expands the input resolution into multiple bands, which allows better extraction of PQD signals at different frequency levels. At each decomposition level, the input signal $x(n)$ is passed through a low-pass filter $h(n)$ and a high-pass filter $g(n)$. The two digital filters, $h(n)$, $n \in \mathbb{Z}$, and $g(n)$, $n \in \mathbb{Z}$, are defined by the scaling function $\phi(x)$ and the wavelet function $\psi(x)$, respectively, with $\sum_n h(n)^2 = 1$. Down-sampling by a factor of two is then performed, removing one of every two samples. The wavelet coefficients of the $I$-th level are expressed following [18]. The MSD output is denoted $M$, a collection of $K = I + 1$ coefficient vectors, where $z_k$ is the dimension of the $k$-th vector in $M$.
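The hierarchical filtering and down-sampling described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it swaps in the two-tap Haar filters for brevity (the experiments use db4), the ordering of the vectors in `M` is an assumption, and the function names and the 640-sample test signal are hypothetical.

```python
import math

# Haar analysis filters, chosen only because they are the shortest
# orthonormal pair; the paper's experiments use db4 instead.
H = [1 / math.sqrt(2), 1 / math.sqrt(2)]   # low-pass h(n)
G = [1 / math.sqrt(2), -1 / math.sqrt(2)]  # high-pass g(n)


def dwt_step(x):
    """One level: filter with h and g, keeping one output per two input samples."""
    if len(x) % 2:
        raise ValueError("this sketch assumes an even-length input at every level")
    pairs = list(zip(x[0::2], x[1::2]))
    approx = [H[0] * a + H[1] * b for a, b in pairs]
    detail = [G[0] * a + G[1] * b for a, b in pairs]
    return approx, detail


def msd(x, levels):
    """Return M = [cA_I, cD_I, cD_{I-1}, ..., cD_1] (assumed ordering), I = levels."""
    details = []
    approx = list(x)
    for _ in range(levels):
        # Only the approximation branch is decomposed further.
        approx, detail = dwt_step(approx)
        details.append(detail)
    return [approx] + details[::-1]


# Hypothetical input: 640 samples containing 10 cycles of a unit sinusoid.
signal = [math.sin(2 * math.pi * 10 * n / 640) for n in range(640)]
M = msd(signal, levels=4)
lengths = [len(m) for m in M]  # z_k for each of the K = I + 1 vectors
```

Each level halves the length of the approximation branch, so the K = 5 vectors in `M` have different lengths. With db4 and boundary padding the exact lengths would differ from this Haar sketch, but they would still be unequal, which is what motivates the alignment step.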

Temporal align layer
In this experiment, a four-level MSD ($I = 4$) is used, giving $K = I + 1 = 5$ output coefficient vectors. The temporal align layer is a collection of single perceptron layers, denoted $[f_k]_{k=0}^{K} : \mathbb{R}^{Z} \to \mathbb{R}^{K \times d}$, where each single perceptron layer is a mapping function $f_k : \mathbb{R}^{z_k} \to \mathbb{R}^{d}$. The temporally aligned MSD output, $\tilde{M}$, is then obtained by applying each $f_k$ to its corresponding coefficient vector in $M$ and stacking the results, where $\tilde{M} \in \mathbb{R}^{K \times d}$ and a fixed embedding dimension $d = 256$ is used.
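As a rough sketch of the alignment idea, each coefficient vector can be given its own linear map to a common width d, with the outputs stacked into a K × d array. The code below assumes a plain affine map with no activation and uses randomly initialised weights; the helper names, the coefficient lengths, and the small d = 8 are hypothetical choices for illustration (the paper uses d = 256 and trained weights).

```python
import random


def make_linear(in_dim, out_dim, rng):
    """Create one single-layer perceptron f_k as a (weights, bias) pair."""
    scale = 1.0 / in_dim ** 0.5
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def apply_linear(layer, vec):
    """Compute W @ vec + b for one coefficient vector."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def temporal_align(coeffs, layers):
    """Map K vectors of unequal length z_k to K rows of equal length d."""
    if len(coeffs) != len(layers):
        raise ValueError("need exactly one layer per coefficient vector")
    return [apply_linear(layer, c) for layer, c in zip(layers, coeffs)]


rng = random.Random(0)
z = [40, 40, 80, 160, 320]  # hypothetical coefficient lengths z_k
d = 8                       # small width for illustration only
layers = [make_linear(z_k, d, rng) for z_k in z]
coeffs = [[rng.gauss(0.0, 1.0) for _ in range(z_k)] for z_k in z]
aligned = temporal_align(coeffs, layers)  # K rows, each of length d
```

The K rows of `aligned` can then be treated as a length-K sequence of d-dimensional steps for the LSTM. In PyTorch, each map would presumably be an `nn.Linear(z_k, d)` module.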

LSTM
The LSTM architecture was introduced to efficiently learn temporal information from an input sequence [19]. An LSTM operates through three gates: the forget gate, $f_t$, the input gate, $i_t$, and the output gate. While the forget gate removes unwanted information, the input gate retains useful information. A tanh activation function is used to produce a new candidate, $\tilde{c}_t$, for the cell state. The operations are summarised in Eq. (7) with trainable weights, $W$, and biases, $b$.
At each time step, a new cell state, $c_t$, is produced from the previous cell state, $c_{t-1}$, and the candidate cell state, $\tilde{c}_t$. The hidden state output of the LSTM, $h_t$, is based on the cell state and the output gate. The LSTM layers extract temporal features from the output of the temporal align layer. These temporal features are encoded into higher-dimensional latent features representing the different classes of PQDs. The latent features are then classified using a fully connected dense layer with a softmax activation function.
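For reference, the gate operations described above correspond to the widely used textbook LSTM formulation written out below. Here $\sigma$ is the logistic sigmoid, $\odot$ is element-wise multiplication, and $[h_{t-1}, x_t]$ is the concatenation of the previous hidden state and the current input. The symbol $o_t$ for the output gate, the per-gate subscripts on $W$ and $b$, and the concatenated weight layout are assumptions and may differ from the authors' notation.

```latex
\begin{align}
f_t &= \sigma\left(W_f [h_{t-1}, x_t] + b_f\right) \\
i_t &= \sigma\left(W_i [h_{t-1}, x_t] + b_i\right) \\
o_t &= \sigma\left(W_o [h_{t-1}, x_t] + b_o\right) \\
\tilde{c}_t &= \tanh\left(W_c [h_{t-1}, x_t] + b_c\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{align}
```

The cell-state update shows the roles given in the text: $f_t$ scales down what is kept from $c_{t-1}$, while $i_t$ controls how much of the candidate $\tilde{c}_t$ is written.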

Experiment setup
The experiments are carried out using the PyTorch framework on an AMD Ryzen 7 3800X 8-core processor and an Nvidia P6000 graphics processing unit. In this experiment, 16 classes of PQD, including the normal class, are generated as shown in Table 1. A total of 76,800 samples of 10-period power waveforms sampled at 3200 Hz are used. A three-layer LSTM with 32 hidden units, followed by two fully connected dense layers, is used [16] for both the Deep LSTM and WT-LSTM models. The Daubechies 4 (db4) wavelet with four levels of decomposition is used in the MSD. Four levels are suggested by the literature as providing better classification performance [20,21]. Moreover, four levels of MSD decompose the signal into five frequency ranges: 1600-3200 Hz, 800-1599 Hz, 400-799 Hz, 200-399 Hz, and 0-199 Hz. This separation is sufficient to differentiate between high-frequency and low-frequency disturbances. A fixed embedding dimension of d = 256 is used in the proposed temporal align layer. The PQDs used in this experiment are simulated using mathematical models [22]. AWGN with an SNR of 20-50 dB is used during training. Testing is carried out with AWGN at 20 dB, 30 dB, 40 dB, and 50 dB SNR, and under noiseless conditions. Classification accuracy is used for the performance comparison.
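Two parts of this setup are easy to make concrete: the nominal frequency ranges of a dyadic decomposition and the injection of AWGN at a target SNR. The sketch below is illustrative only; the function names, the unit sinusoid, and the fixed random seed are hypothetical. Passing 3200 Hz as the top frequency reproduces the ranges listed above. If 3200 Hz is instead read as the sampling rate, the Nyquist limit is 1600 Hz and every band edge would be halved.

```python
import math
import random


def dyadic_bands(f_top, levels):
    """Nominal band edges in Hz for an I-level decomposition, highest band first."""
    bands = []
    upper = f_top
    for _ in range(levels):
        bands.append((upper / 2, upper))  # detail band at this level
        upper /= 2
    bands.append((0.0, upper))            # final approximation band
    return bands


def add_awgn(signal, snr_db, rng):
    """Add white Gaussian noise with signal power / noise power = 10^(snr_db / 10)."""
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB of a noisy signal against its clean reference."""
    signal_power = sum(s * s for s in clean)
    noise_power = sum((n - s) ** 2 for s, n in zip(clean, noisy))
    return 10 * math.log10(signal_power / noise_power)


rng = random.Random(0)
# Hypothetical waveform: 10 cycles of a unit sinusoid, 64 samples per cycle.
clean = [math.sin(2 * math.pi * n / 64) for n in range(640)]
noisy = add_awgn(clean, snr_db=20, rng=rng)
bands = dyadic_bands(3200.0, levels=4)
```

Because the noise is random, `measured_snr_db(clean, noisy)` only approximates the 20 dB target on a finite-length signal.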

Results and discussion
The classification performance of the proposed WT-LSTM is compared with that of Deep LSTM [16] in Table 2. From Table 2a, it can be seen that the classification accuracy of the Deep LSTM network is poor for class P10-Sag+Harmonics at all noise levels. The confusion matrix shows that P10 is confused with P0. The harmonics present in classes P10 and P11 might have contributed to the magnitude changes of the signal, as the magnitude levels of P10 and P11 are seriously distorted by the harmonic components. In addition, the classification accuracy for the fast disturbance class, P8-Notch, under high noise (20 dB SNR AWGN) is notably low. This weakness in class P8 shows that the Deep LSTM model has difficulty classifying fast disturbances under high noise. The confusion matrix shows confusion between class P8 and class P14-Flicker+Sag. The additive effect of AWGN might have neutralised the characteristics of notching, which has a smaller magnitude difference. It can also be noticed that most of the combined-disturbance classes (P10-P15) have classification accuracy lower than 90%. This shows that the Deep LSTM model has lower noise immunity, especially when classifying combined-disturbance classes.
On the other hand, the proposed WT-LSTM model shows better overall classification performance, with slightly higher classification accuracy in all SNR tests. From Table 2b, most of the combined-disturbance classes have classification accuracy above 90%. This shows that WT-LSTM has better noise immunity when classifying combined-disturbance classes. From the confusion matrix, class P0-Normal is mutually confused with class P10-Sag+Harmonics. The poor performance of class P0 under high noise might be because the noise is interpreted as harmonics. This confusion is of less concern, as a knowledge-based method can be applied to differentiate between normal signals and signals with disturbances. The ability to identify PQDs other than class P0 under high noise is more important because real-world disturbance signals vary across scenarios. The introduction of high noise allows generalization of the specific disturbance class. It can be noticed that the P10 accuracy improved significantly. This shows that the proposed temporally aligned MSD increases the sensitivity in identifying slight differences between small changes in average magnitude. Class P8, on the other hand, has weaker performance, with the confusion mainly occurring with classes P14 and P15. Different levels of notches combined with high levels of noise cause large fluctuations in the average magnitude of the signal, which causes the confusion. It can be seen that although the temporally aligned MSD improved the average magnitude sensitivity, the mechanism does not help in identifying fast disturbances under high noise.
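The per-class accuracies and confusions discussed above are read off a confusion matrix. The sketch below shows one way to do so, assuming rows index the true class and columns the predicted class. The three-class matrix contains made-up counts for placeholder classes and is not taken from Table 2; the function names are hypothetical.

```python
def per_class_accuracy(confusion):
    """Fraction of each true class (row) that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(confusion)]


def overall_accuracy(confusion):
    """Correct predictions (the diagonal) divided by all predictions."""
    total = sum(sum(row) for row in confusion)
    correct = sum(row[i] for i, row in enumerate(confusion))
    return correct / total


def top_confusion(confusion, true_idx):
    """Index of the wrong class most often predicted for a given true class."""
    row = confusion[true_idx]
    wrong = [(count, j) for j, count in enumerate(row) if j != true_idx]
    return max(wrong)[1]


# Made-up counts for three placeholder classes (rows: true, columns: predicted).
toy = [
    [90, 10, 0],
    [5, 95, 0],
    [2, 18, 80],
]
accuracies = per_class_accuracy(toy)
overall = overall_accuracy(toy)
most_confused_with = top_confusion(toy, true_idx=2)
```

A low diagonal entry together with one dominant off-diagonal entry in the same row is the pattern described in the text as one class being confused with another.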

Conclusion
A hybrid model combining MSD-based signal processing with an LSTM network, namely WT-LSTM, is proposed for automatic PQD classification. WT-LSTM increases the resolution of the input signals using MSD. Five frequency bands of signal components are extracted as wavelet coefficients. The temporal align layer is introduced to remove the need for signal reconstruction while embedding the varying output dimensions of the multi-level signal decomposition into temporally aligned embedding features. The temporal align layer helps improve the classification of combined-disturbance PQDs by providing higher sensitivity to differences in the average magnitude of the signal. As a result, the proposed WT-LSTM network obtains an overall classification accuracy of 89.77% at 20 dB and 99.21% under noiseless conditions, compared with 88.48% and 98.54%, respectively, for Deep LSTM. For future work, an attention mechanism can be added to the model to detect salient features from multiresolution data, improving performance on short-period transient disturbance classes that are observed to be similar to high-frequency noise.