Modeling of Main Steam Parameters in Large Thermal Power Plant

To address abnormal main steam temperature in a 300 MW subcritical coal-fired boiler, a model of the main steam temperature was built from operating data. First, a correlation analysis of the data was carried out in SPSS to determine the main factors influencing the main steam temperature. Then, a nonlinear modeling method based on partial least-squares (PLS) regression was proposed, in which the number of extracted components is determined by cross-validation. Finally, the PLS regression model of the main steam temperature was simulated in MATLAB. The PLS regression equations in terms of the standardized variables and the original variables were constructed from the training samples, and the accuracy of the model was verified with the test samples.


Introduction
The development of information technology in power plants provides a good platform for research on data-driven operation optimization. In particular, modeling complex thermal systems from a plant's real-time and historical databases has become an active research topic and a basis for solving key problems in operation, control, performance evaluation, fault diagnosis, and simulation [1]. However, operating data differ from experimental data and have many characteristics that are not conducive to modeling, for example multiple correlations among variables, uneven distribution of working conditions, and process nonlinearities. These problems seriously hinder the development and application of data-based modeling methods [2]. Therefore, a prediction model of the main steam temperature is built here based on the partial least-squares (PLS) regression algorithm.

Review of research methods
The main steam temperature of a boiler is characterized by nonlinearity, large inertia, and long time lag, and in many cases it is difficult to establish an accurate mathematical model. To address this problem, Deng built an on-line fuzzy identification system for steam temperature based on a T-S fuzzy rule model and recursive least squares (RLS) [3]. Zhao developed an advanced Pareto multi-objective optimization method using a genetic algorithm (GA) with a novel vector-module adaptive function and applied it to tuning the PID controller parameters in a boiler superheated steam temperature cascade control system. The simulation results show that the improved multi-objective GA optimization can obtain different optimal controller parameters for the corresponding performance targets [4]. Zhang proposed a superheater steam temperature model that uses on-line data to improve on the conventional testing and physical-law methods. With on-line data, the identifiability condition of the closed-loop superheater steam temperature control system could be satisfied, and dynamic models of the superheater steam temperature were obtained under disturbances of the superheater spray flow. Simulations show that the model reflects the actual behavior of the superheater steam temperature [5]. Luan designed a new cascade control system based on the characteristics of the superheated steam temperature of a boiler [6]. To address abnormal reheat steam temperature (RST) in a 300 MW power station boiler unit, Huang presented a method based on support vector regression (SVR). The RST was analyzed with SVR using data sampled on site. The RST model was based on the statistical characteristics of the operating parameters and could reflect the underlying relationship between the RST and the operating parameters.
For the units considered, the RST was low and the reheater tube wall temperature was high; tilt angles, desuperheater spray, and other quantities were taken as the tuning parameters and as the features of the SVR model. The prediction results of the SVR-RST model on the test data showed a high regression coefficient with low model complexity, indicating that the model was robust, which is important for further optimizing the operating parameters for higher efficiency and safety [7].

Correlation analysis
Many factors can affect the main steam temperature, such as the steam parameters, the spray water temperature, the flue gas temperature and flow rate, and the ash fouling status. Correlation analysis was used here to determine the main influencing factors.
There are three commonly used correlation coefficients: Pearson, Spearman, and Kendall. The Pearson correlation coefficient was selected here. It is defined as
$$r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}}$$
where $r$ is the correlation coefficient, $n$ is the number of observations, $\bar{x}$ and $\bar{y}$ are the mean values of the two variables, and $x_i$ and $y_i$ are the ith observed values.
According to Table 1, the factors whose correlation coefficients were greater than 0.6 were chosen to build the partial least-squares regression model.
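The screening step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source performed the analysis in SPSS, the function names `pearson_r` and `select_factors` are hypothetical, the data are made up, and comparing the absolute value of r with the 0.6 threshold is an assumption (the text does not say how negative correlations were treated).

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def select_factors(candidates, target, threshold=0.6):
    """Names of candidate variables whose |r| with the target exceeds the threshold.

    Using |r| rather than r is an assumption made for this sketch.
    """
    return [name for name, values in candidates.items()
            if abs(pearson_r(values, target)) > threshold]

# Hypothetical illustrative data, not plant measurements.
steam_temp = [535.0, 537.0, 539.0, 541.0, 543.0]
candidates = {
    "main_steam_pressure": [16.0, 16.2, 16.4, 16.6, 16.8],  # linear in the target
    "unrelated_signal": [1.0, -1.0, 0.0, -1.0, 1.0],        # zero correlation with the target
}
selected = select_factors(candidates, steam_temp)
```

With these made-up series, only the first candidate passes the screen; with real operating data the same loop would be run over every logged variable.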

Partial least-squares regression
The PLS method has had tremendous success in chemometrics and the chemical industries for static data analysis. It naturally integrates principal component analysis (PCA) and canonical correlation analysis (CCA) and is convenient for analyzing multidimensional complex systems. In its general form, PLS creates components by using the existing correlations between different sets of variables while also keeping most of the variance of both sets. Some of the formal ingredients of the method are given below [8].
Consider the general setting of the PLS algorithm for modeling the relation between two data sets. Let $X = \{x_1, x_2, \ldots, x_p\}$ denote a p-dimensional vector of variables in the first block of data, and similarly let $Y = \{y_1, y_2, \ldots, y_q\}$ denote a vector of variables from the second block. Given n data samples from each block of variables, PLS decomposes $X = (x_{ij})_{n \times p}$ and $Y = (y_{ij})_{n \times q}$ into the form
$$X = TP^{T} + E, \qquad Y = UQ^{T} + F$$
where $T$ and $U$ are n×r matrices of the r extracted components, the p×r matrix $P$ and the q×r matrix $Q$ are the matrices of projections, and the n×p matrix $E$ and the n×q matrix $F$ are the matrices of residuals. The PLS method, which in its classical form is based on the nonlinear iterative partial least-squares (NIPALS) algorithm, finds projection axes $w$ and $c$ such that
$$[\operatorname{cov}(t,u)]^{2} = \max_{\|w\|=\|c\|=1}\,[\operatorname{cov}(Xw, Yc)]^{2}$$
The solution to this optimization problem is given by the eigenvalue problem
$$X^{T}YY^{T}Xw = \lambda w$$
in which $w$ is the eigenvector belonging to the largest eigenvalue $\lambda$. The X-latent-component vector $t$ is obtained as $t = Xw$; it is also the first eigenvector of the equation $XX^{T}YY^{T}t = \xi t$. The Y-latent-component vector $u$ can be estimated as $u = YY^{T}t$.
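As a numerical check of the relations above, the following NumPy sketch extracts the first weight vector from the eigenvalue problem and confirms that t = Xw is an eigenvector of XX^T YY^T. The random data, the sizes n, p, q, and the variable names are assumptions made for illustration only; the source carried out its computations in MATLAB.

```python
import numpy as np

# Hypothetical centered data blocks: n samples, p predictors, q responses.
rng = np.random.default_rng(0)
n, p, q = 50, 4, 2
X = rng.standard_normal((n, p))
Y = X @ rng.standard_normal((p, q)) + 0.1 * rng.standard_normal((n, q))
X -= X.mean(axis=0)
Y -= Y.mean(axis=0)

# w is the eigenvector of X^T Y Y^T X with the largest eigenvalue.
# The matrix is symmetric, so eigh applies; it returns eigenvalues in ascending order.
M = X.T @ Y @ Y.T @ X
eigvals, eigvecs = np.linalg.eigh(M)
lam = eigvals[-1]
w = eigvecs[:, -1]

t = X @ w            # X-latent component
u = Y @ (Y.T @ t)    # Y-latent component, u = Y Y^T t

# t should satisfy X X^T Y Y^T t = lam * t (same eigenvalue as for w).
lhs = X @ (X.T @ (Y @ (Y.T @ t)))
residual = np.linalg.norm(lhs - lam * t) / np.linalg.norm(lam * t)
```

The relative residual is at the level of floating-point round-off, which reflects the identity X X^T Y Y^T (Xw) = X (X^T Y Y^T X w) = λ Xw.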
Furthermore, the above procedure can be repeated l times with deflation of X and Y (with t scaled to unit length), defined as
$$X - tt^{T}X \rightarrow X, \qquad Y - tt^{T}Y \rightarrow Y$$
Then $T$, $U$, the p×l matrix $W$, and the q×l matrix $C$ can be formed, whose columns are the vectors $t$, $u$, $w$, and $c$, respectively. As a regression model, PLS can be written in matrix form as
$$Y = XB_{pls} + F^{*}$$
where $F^{*}$ is the residual matrix, the coefficient matrix $B_{pls}$ can be calculated as $B_{pls} = W(P^{T}W)^{-1}C^{T}$, and $P$ is the p×l matrix consisting of the loading vectors.
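The deflation loop and the coefficient formula can be illustrated with a compact NumPy sketch. It is not the authors' MATLAB implementation: the function name `pls_fit`, the use of `numpy.linalg.eigh` in place of NIPALS iterations, the loading definition p = X^T t for unit-length t, and the synthetic single-response data are all assumptions made for this example.

```python
import numpy as np

def pls_fit(X, Y, n_components):
    """PLS regression with deflation; X (n x p) and Y (n x q) must be column-centered.

    Returns the coefficient matrix B = W (P^T W)^{-1} C^T and the score matrix T.
    """
    Xh, Yh = X.copy(), Y.copy()
    W, P, C, T = [], [], [], []
    for _ in range(n_components):
        # Weight vector: dominant eigenvector of Xh^T Yh Yh^T Xh.
        S = Xh.T @ Yh
        _, vecs = np.linalg.eigh(S @ S.T)
        w = vecs[:, -1]
        t = Xh @ w
        t /= np.linalg.norm(t)        # unit-length score, so deflation is X - t t^T X
        p = Xh.T @ t                  # X loading vector (assumed definition)
        c = Yh.T @ t                  # Y weight vector
        Xh = Xh - np.outer(t, p)      # X - t t^T X -> X
        Yh = Yh - np.outer(t, c)      # Y - t t^T Y -> Y
        W.append(w); P.append(p); C.append(c); T.append(t)
    W, P, C, T = (np.column_stack(m) for m in (W, P, C, T))
    B = W @ np.linalg.inv(P.T @ W) @ C.T
    return B, T

# Hypothetical synthetic data with a single response.
rng = np.random.default_rng(1)
X = rng.standard_normal((60, 3))
Y = X @ np.array([[1.5], [-2.0], [0.5]]) + 0.05 * rng.standard_normal((60, 1))
X = X - X.mean(axis=0)
Y = Y - Y.mean(axis=0)

B_pls, T = pls_fit(X, Y, n_components=3)
B_ols = np.linalg.lstsq(X, Y, rcond=None)[0]   # reference: ordinary least squares
```

When as many components are extracted as there are predictors and X has full column rank, the PLS coefficients coincide with the ordinary least-squares solution, which gives a simple sanity check; with fewer components the two differ, which is the usual operating regime for collinear data.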

Cross validation
The accuracy of the PLS regression is verified by the cross-validity
$$Q_h^{2} = 1 - \frac{S_{press,h}}{S_{ss,h-1}}$$
where $S_{press,h}$ is the sum of squares of the prediction errors after h components and $S_{ss,h-1}$ is the sum of squares of the errors after h−1 components. If $S_{press,h}$ is smaller than $S_{ss,h-1}$ to a certain extent, adding the component $t_h$ is considered useful for improving the regression accuracy; that is, if $Q_h^{2}$ reaches a preset threshold, the quality of the model can be improved by adding the component. Otherwise, the number of components is already sufficient. The cross-validity $Q_h^{2}$ is thus the criterion for determining the number of components.
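A minimal sketch of this stopping rule follows. The function names and the numbers are hypothetical, and the default threshold of 1 − 0.95² = 0.0975 is a value commonly used in the PLS literature; the source does not state which threshold was applied, so it is exposed as a parameter here.

```python
def cross_validity(s_press_h, s_ss_prev):
    """Q_h^2 = 1 - S_press,h / S_ss,h-1."""
    return 1.0 - s_press_h / s_ss_prev

def select_component_count(s_press, s_ss, threshold=1 - 0.95 ** 2):
    """Number of components to keep under the cross-validity criterion.

    s_press[h-1] holds S_press,h for h = 1..H.
    s_ss[h] holds S_ss,h for h = 0..H-1 (s_ss[0]: no component extracted).
    The default threshold is an assumption, not a value taken from the source.
    """
    count = 0
    for h in range(1, len(s_press) + 1):
        if cross_validity(s_press[h - 1], s_ss[h - 1]) < threshold:
            break          # component h no longer improves the model enough
        count = h
    return count

# Hypothetical sums of squares for three candidate components.
s_press = [40.0, 25.0, 19.5]
s_ss = [100.0, 30.0, 20.0]
n_keep = select_component_count(s_press, s_ss)
```

For these made-up values the cross-validity is 0.6 for the first component, about 0.167 for the second, and 0.025 for the third, so two components are retained.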

Partial least squares regression model
According to the correlation analysis above, the main factors influencing the main steam temperature are the main steam pressure $x_1$, the main steam flow $x_2$, the oxygen content in the flue gas $x_3$, the total air flow $x_4$, the inlet baffle opening of the primary air fan $x_5$, the drum water level $x_6$, the differential pressure across the main water gate $x_7$, and the opening of the regulating valve for fuel-oil atomizing steam $x_8$.
According to the partial least-squares regression algorithm, the modeling process is as follows:
• Extracting the first principal component $t_1$ and its vector of combination coefficients (weights).
• Evaluating the model.
The average relative error of the training samples was 0.3%. The test samples were substituted into the regression equation; Fig. 1 compares the predicted results with the measured results, and Fig. 2 shows the relative error of the test samples. The average relative error of the test samples was 0.28%, which meets engineering requirements.
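The evaluation metric quoted above can be computed as in the short sketch below. The exact definition used in the source is not given, so the mean of |predicted − measured| / |measured| is an assumption, and the function name and the sample values are hypothetical.

```python
def mean_relative_error(measured, predicted):
    """Average of |predicted - measured| / |measured|, returned in percent."""
    errors = [abs(p - m) / abs(m) for m, p in zip(measured, predicted)]
    return 100.0 * sum(errors) / len(errors)

# Hypothetical values for illustration only, not the plant data.
measured = [500.0, 500.0, 400.0, 400.0]
predicted = [501.0, 499.0, 402.0, 398.0]
mre_percent = mean_relative_error(measured, predicted)
```

Here the individual relative errors are 0.2%, 0.2%, 0.5%, and 0.5%, giving an average of 0.35%.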

Conclusion
A model of the main steam temperature was built based on partial least-squares regression. The main factors influencing the main steam temperature were determined through correlation analysis. The prediction results showed that the model has high accuracy and can predict the main steam temperature in a large thermal power plant.