Remaining Useful Life Prediction of Gas Turbine Engine using Autoregressive Model

Gas turbine (GT) engines are known for their high availability and reliability and are used extensively for power generation, marine and aero applications. Maintenance of such complex machines should be done proactively to reduce cost and sustain high availability of the GT. The aim of this paper is to explore the use of autoregressive (AR) models to predict the remaining useful life (RUL) of a GT engine. The Turbofan Engine data from the NASA benchmark data repository are used as a case study. A parametric investigation is performed to check the effect of changing model parameters on modelling accuracy. The results show that data from a single sensor cannot accurately predict the RUL of a GT, and further research needs to be carried out by incorporating multi-sensor data. Furthermore, the predictions made using the AR model seem to give highly pessimistic values for the RUL of the GT.


Introduction
GTs are widely used as prime movers in many applications such as aviation, marine, power generation, gas compression and pumping facilities. Owing to their higher power-to-weight ratio and quick starting characteristic, GTs are preferred over steam turbines. With the recent advancements in renewable energy, GTs are now expected to operate more efficiently under part-load operating conditions. Under both part-load and peak-load conditions, GTs are also required to operate flexibly with a relatively higher number of firing start-ups and shutdowns, which reduces the time between failures [1,2]. Therefore, the maintenance strategy for such systems should be robust and flexible in order to improve reliability and availability, resulting in less downtime and lower operating cost [3]. Improper maintenance may lead to an increased rate of deterioration. As a result, maintenance cost may reach up to 35% of operating cost [4].
Maintenance strategies can be classified into three main categories, namely breakdown maintenance, planned maintenance and condition-based maintenance [2,5]. In recent years, a new maintenance concept called "prognostics-based maintenance", which could minimize risk while maximizing useful life, has been reported [1]. This approach is based on condition monitoring data and involves predicting likely fault cases instead of waiting until a sudden failure occurs. As such, the RUL is predicted after a fault is diagnosed. Accordingly, all the required maintenance planning is completed well ahead of any sudden failure. This approach is reported to have saved the cost associated with unplanned maintenance and is considered superior to diagnostics alone [6].
So far, various techniques have been used for the prognostics of GTs. These include support vector machines (SVM) [1,7-9], Bayesian forecasting [10-12], Kalman filters [10], state-space models [12], artificial neural networks [13-16], independent component analysis [17], regression techniques [18], Dempster-Shafer regression [1] and one-parameter double exponential smoothing [10]. These techniques can be classified into two types: (i) physics-based modelling approaches, and (ii) data-driven models [5,19-21]. Physics-based models are effective in dealing with novel faults but are considered difficult to implement for complex systems [1], since they require detailed knowledge of the system. Under dynamic conditions, they may require real-time parameter optimization, which is a limiting factor since a GT has many interacting parameters [21]. Under such circumstances, a model that relies on historical data from input and output sensors is preferred. The general steps for developing such a model are the selection of a model structure followed by parameter estimation, simulation, and validation [22].
The purpose of this paper is to investigate the use of the autoregressive model, which is one type of historical-data-based model, for GT remaining useful life prediction. A parametric study is carried out by varying the model order and the training algorithm.

Time Series Models
In time series modeling, historical data representing the process are used for prediction. The general procedure is that the first part of the historical data is used to train a model and the remaining data are used to validate it. The split is left to the judgment of the model developer, although 70/30 and 60/40 proportions are widely reported in the open literature. The parametric, time-invariant version of time series models, which is the main target of the present paper, is given by

$$A(q^{-1})\,y(t) = G(q^{-1})\,u(t) + H(q^{-1})\,e(t)$$

where $y(t)$ is the predicted value; $u(t)$ is the input; $q^{-1}$ is the shift operator; $e(t)$ is the modeling and measurement error; and $G(q^{-1}) = B(q^{-1})/F(q^{-1})$ and $H(q^{-1}) = C(q^{-1})/D(q^{-1})$ are the transfer functions for the input and error data, respectively. Depending upon the forms of $A(q^{-1})$, $G(q^{-1})$ and $H(q^{-1})$, the resulting model can be an Autoregressive (AR), Autoregressive Integrated Moving Average (ARIMA), Autoregressive Moving Average with Exogenous Inputs (ARMAX) or Box-Jenkins (BJ) model. In the present paper, the AR model, for which $B(q^{-1}) = 0$ and $C(q^{-1}) = D(q^{-1}) = F(q^{-1}) = 1$, is investigated.
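As a worked form of the AR case, the following sketch uses the usual textbook convention; the order $n$ and the coefficient names $a_1, \dots, a_n$ are notation assumed here and are not taken from the paper. The polynomial acting on the output is

$$A(q^{-1}) = 1 + a_1 q^{-1} + \dots + a_n q^{-n}$$

so that $A(q^{-1})\,y(t) = e(t)$ expands to

$$y(t) = -\sum_{i=1}^{n} a_i\, y(t-i) + e(t)$$

and the one-step-ahead prediction uses only past measurements:

$$\hat{y}(t \mid t-1) = -\sum_{i=1}^{n} a_i\, y(t-i)$$

Under this convention, the model order $n$ is the number of past samples that enter each prediction, which is the quantity varied in the parametric study.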

Model Performance Measures
The AR model parameters are determined using the estimation algorithms of the MATLAB System Identification Toolbox.
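The measures named later in the paper (MSE, FPE, AIC, AICc and BIC) can be illustrated with a short Python sketch. The formulas below are the common textbook Gaussian-likelihood forms and are an assumption here, since the paper's own definitions are not reproduced in this text; constant offsets that some toolboxes add are omitted, so only differences between models are meaningful. The function name `performance_measures` is hypothetical.

```python
import numpy as np

def performance_measures(residuals, n_params):
    """Return MSE, FPE, AIC, AICc and BIC for one-step prediction residuals.

    Assumed textbook forms (not necessarily identical to any toolbox):
      V    = mean(e^2)
      FPE  = V * (1 + d/N) / (1 - d/N)
      AIC  = N*ln(V) + 2d
      AICc = AIC + 2d(d+1)/(N - d - 1)
      BIC  = N*ln(V) + d*ln(N)
    where N is the number of residuals and d the number of parameters.
    """
    e = np.asarray(residuals, dtype=float)
    n = e.size
    d = n_params
    if n <= d + 1:
        raise ValueError("need more residuals than parameters + 1")
    v = float(np.mean(e ** 2))                 # loss function, equal to the MSE
    fpe = v * (1 + d / n) / (1 - d / n)        # Akaike's final prediction error
    aic = n * np.log(v) + 2 * d
    aicc = aic + 2 * d * (d + 1) / (n - d - 1)  # small-sample correction
    bic = n * np.log(v) + d * np.log(n)
    return {"MSE": v, "FPE": fpe, "AIC": float(aic),
            "AICc": float(aicc), "BIC": float(bic)}

# Example with made-up residuals and a one-parameter model.
measures = performance_measures([1.0, -1.0, 1.0, -1.0], n_params=1)
print(measures)
```

All five measures penalize the number of parameters in addition to the fit error, so a lower value indicates a better trade-off between accuracy and model size.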

NASA Benchmark Data
The data used for the case study are degradation data for a Turbofan Engine, made publicly available by the NASA Prognostics Center of Excellence. Details on how the data were generated and the nature of the benchmark problem itself can be found in the publication by Saxena and Goebel [24]. An illustration of the Turbofan engine is provided in Fig. 1. The reader can refer to [25] for the list of sensors installed in the system. As highlighted in Table 2, the available data cover four operating regions.

Effect of Optimization Algorithm
For modelling, we tested five parameter optimization algorithms: Burg's approach (burg), the geometric lattice approach (gl), least squares (ls), Yule-Walker (yw) and the forward-backward approach (fb). To select the best algorithm, a comparison was made using the performance measures provided in Table 1.
The data set for sensor number 2 was used to investigate AR model performance. For this particular sensor, there are 192 cycles before the turbofan is declared to have failed. Of the 192 cycles, the first 134 cycles (70% of the data) are used for model construction, while cycles 135 to 192 (30% of the data) are used for model validation. Table 3 compares the different optimization algorithms based on the performance measures. From the collective values of MSE, FPE and AIC, it can be concluded that the 'burg' and 'gl' methods tend to provide the best results.
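The 70/30 split described above can be sketched in Python as follows. This is a minimal illustration that assumes a plain least-squares AR fit in NumPy rather than the MATLAB routines used in the paper; the function names are hypothetical and the synthetic AR(2) series only stands in for the sensor 2 data.

```python
import numpy as np

def fit_ar_least_squares(y, order):
    """Estimate a_1..a_n in y(t) = -sum_i a_i*y(t-i) + e(t) by least squares."""
    y = np.asarray(y, dtype=float)
    # Column i-1 holds y(t-i) for every t = order .. len(y)-1.
    lags = np.column_stack(
        [y[order - i:len(y) - i] for i in range(1, order + 1)]
    )
    target = y[order:]
    theta, *_ = np.linalg.lstsq(lags, target, rcond=None)
    return -theta  # sign convention of A(q^-1) = 1 + a_1 q^-1 + ...

def one_step_predictions(y, a, start):
    """Predict y(t) for t >= start from the measured past values (start >= order)."""
    y = np.asarray(y, dtype=float)
    order = len(a)
    return np.array(
        [-np.dot(a, y[t - order:t][::-1]) for t in range(start, len(y))]
    )

# Synthetic 192-cycle series (NOT the NASA data): a stable AR(2) process.
rng = np.random.default_rng(0)
n_cycles = 192
signal = np.zeros(n_cycles)
for t in range(2, n_cycles):
    signal[t] = 1.5 * signal[t - 1] - 0.7 * signal[t - 2] + rng.normal(scale=0.1)

n_train = int(0.7 * n_cycles)  # first 134 cycles for model construction
a_hat = fit_ar_least_squares(signal[:n_train], order=2)

# Validate on the remaining 58 cycles (cycles 135 to 192).
pred = one_step_predictions(signal, a_hat, start=n_train)
val_mse = float(np.mean((signal[n_train:] - pred) ** 2))
print(n_train, len(pred), a_hat, val_mse)
```

The model is fitted on the training cycles only, and the validation error is computed on cycles the estimator never saw, which is what makes the comparison between algorithms meaningful.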

Effect of Changing Model Order
After selecting the best parameter optimization algorithms, we sought the best model order, again using the same set of performance measures. This is needed to mitigate inconsistent estimation and increased variance [26]. The AR model was run for model orders from 45 to 65 in increments of 10.
The two optimizers ('burg' and 'gl') demonstrated nearly identical performance (ref. Table 4). The only noticeable differences are in AICc and BIC. It can be concluded that as the model order increases, the performance of the model improves. The implication is that, in the field, the designer needs to trade off the computational cost of the model against its performance.
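A model-order sweep of this kind can be sketched in Python. The sketch assumes the Yule-Walker equations solved by the Levinson-Durbin recursion, which yields every order up to a maximum in one pass; this is a stand-in for illustration and not the 'burg' or 'gl' routines compared in the paper. The function names, the AIC form $N \ln \sigma^2 + 2n$ and the synthetic data are all assumptions.

```python
import numpy as np

def autocovariance(y, max_lag):
    """Biased sample autocovariances r[0..max_lag] of the mean-removed series."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    n = len(y)
    return np.array([np.dot(y[k:], y[:n - k]) / n for k in range(max_lag + 1)])

def levinson_durbin(r, max_order):
    """Yield (order, phi, sigma2) with y(t) = sum_i phi_i*y(t-i) + e(t)."""
    phi = np.zeros(0)
    sigma2 = r[0]
    for k in range(1, max_order + 1):
        # Reflection coefficient for order k.
        kappa = (r[k] - np.dot(phi, r[k - 1:0:-1])) / sigma2
        phi = np.concatenate([phi - kappa * phi[::-1], [kappa]])
        sigma2 = sigma2 * (1.0 - kappa ** 2)  # innovation variance shrinks
        yield k, phi.copy(), sigma2

def order_sweep(y, max_order):
    """Return rows (order, a, sigma2, aic) for orders 1..max_order."""
    n = len(y)
    r = autocovariance(y, max_order)
    rows = []
    for k, phi, sigma2 in levinson_durbin(r, max_order):
        aic = n * np.log(sigma2) + 2 * k
        rows.append((k, -phi, sigma2, float(aic)))  # a_i = -phi_i in A(q^-1)
    return rows

# Synthetic AR(2) series (NOT the NASA data).
rng = np.random.default_rng(1)
series = np.zeros(2000)
for t in range(2, len(series)):
    series[t] = 1.5 * series[t - 1] - 0.7 * series[t - 2] + rng.normal(scale=0.1)

rows = order_sweep(series, max_order=6)
best_order = min(rows, key=lambda row: row[3])[0]
print(best_order)
```

The innovation variance can only decrease as the order grows, so the fit always appears to improve; the penalty term in AIC is what stops the sweep from always selecting the largest order.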

Predicted vs Actual RUL comparison
For the RUL investigation, AR models with 80 model parameters were produced for sensor 2 and sensor 7. The two sensors were chosen because their data follow different trends. Since the full information is known, the RUL from the AR predictions is compared with the actual RUL. The predicted and actual RUL, along with the corresponding errors, are given in Table 5.
Fig. 2. RUL prediction using AR(80) for (a) increasing-trend sensor 2 and (b) decreasing-trend sensor 7.
Plots of the data for sensor 2 and sensor 7 are shown in Fig. 2. The corresponding RUL prediction by quadratic regression (ref. Table 5) indicates that the simple regression model provides optimistic and pessimistic RUL values, with the actual RUL always lying in between. The AR model, on the other hand, tends to provide a longer, pessimistic RUL. Practically, the optimistic RUL urges maintenance preparation to be made as quickly as possible in anticipation of early failure. The RUL from the AR model, on the other hand, is longer, which might lead the maintenance team to believe that there is still ample time for preparation. Overall, the results imply that the AR model has good potential for RUL prediction. One way to improve performance might be to use a multiple-input single-output model structure.
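One way an AR forecast can be turned into an RUL estimate is sketched below in Python. The rule used here, counting the cycles until the multi-step forecast first crosses a failure threshold, is an assumption made for illustration; the text does not state how the RUL was extracted from the AR predictions. The function names, the `threshold` value and the toy coefficients are hypothetical.

```python
import numpy as np

def forecast_ar(history, a, steps):
    """Iterate y(t) = -sum_i a_i*y(t-i) forward, feeding predictions back in."""
    a = np.asarray(a, dtype=float)
    order = len(a)
    # Buffer holds the last `order` values, oldest first.
    buf = list(np.asarray(history, dtype=float)[-order:])
    out = []
    for _ in range(steps):
        nxt = -float(np.dot(a, buf[::-1]))  # buf[::-1] = [y(t-1), ..., y(t-n)]
        out.append(nxt)
        buf = buf[1:] + [nxt]
    return np.array(out)

def remaining_useful_life(history, a, threshold, max_horizon=500, increasing=True):
    """Return the number of cycles until the forecast first crosses `threshold`.

    `increasing=True` suits a sensor whose reading grows with degradation,
    `increasing=False` one whose reading falls. Returns None if the forecast
    never crosses the threshold within `max_horizon` cycles.
    """
    path = forecast_ar(history, a, max_horizon)
    crossed = path >= threshold if increasing else path <= threshold
    hits = np.flatnonzero(crossed)
    return int(hits[0]) + 1 if hits.size else None

# Toy example: a = [-2, 1] continues a straight line, y(t) = 2*y(t-1) - y(t-2).
rul_up = remaining_useful_life([0.0, 1.0, 2.0, 3.0], [-2.0, 1.0], threshold=10.0)
rul_down = remaining_useful_life([10.0, 9.0], [-2.0, 1.0], threshold=5.0,
                                 increasing=False)
print(rul_up, rul_down)
```

Because each forecast step reuses earlier predictions, errors accumulate with the horizon, which is one plausible reason a long-horizon AR forecast can cross the threshold later than the actual failure.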

Conclusion
The purpose of this paper was to investigate the use of AR models for RUL prediction. Accordingly, the Turbofan benchmark problem was studied with the hope of uncovering important observations. Five different optimization algorithms were tested to assess AR model performance. The following conclusions can be drawn:
• The performance of the AR model is weakly dependent on the type of parameter optimization algorithm.
• Increasing the model order increases the model accuracy. However, a high model order is less preferred because it needs more data and has a high computational cost.
• The AR model can be used for RUL prediction. However, the accuracy depends on the type of data. With too much noise in the data, the AR model is highly likely to give a longer, optimistic RUL.
Future work will concentrate on expanding the AR model by considering data preprocessing and a different model structure, perhaps the Wiener model structure.
The authors are thankful to Universiti Teknologi PETRONAS for the facilities provided. The study is partly funded by the Ministry of Higher Education (MOHE), Malaysia under grant number FRGS/2/2014/TK01/UTP/03/2. The authors are also grateful to Saxena and Goebel for providing the Turbofan engine simulation data at NASA Ames Research Center.