Study of feature extraction method of multi-information source for continuous casting process parameters

Abstract. The control of the segment roll gap is one of the key links in ensuring cast billet quality. In this paper, the big data generated in conventional continuous casting production is studied through an in-depth experimental comparison of linear and nonlinear dimension reduction methods, in order to identify a method suited to reducing the dimension of continuous casting data. The principal component analysis method is improved with a normalized incremental update scheme. After irrelevant data are removed and training time and reconstruction error are accounted for, a faster and more efficient dimension reduction method is obtained. Simulations on actual production data show that this method is more efficient and better suited to continuous casting than the other dimension reduction methods compared.


Introduction
In the modern steel industry, high-efficiency continuous casting technology, including key continuous casting equipment and new processes, has become an internationally competitive core technology. The continuous casting process is a complex, continuous phase-change process, and many control aspects affect slab quality. The key technologies in high-quality continuous casting are mainly the precise control of the mould, the segments and the composition of the liquid steel [1]. Among them, the control of the segment roll gap is one of the key links in ensuring slab quality [2]. Establishing a dynamic, adaptive prediction model for the roll gap value of the segments of the continuous casting machine, and adjusting the roll gap value adaptively in real time according to the actual working conditions, has important theoretical and practical value for improving slab quality.

1.1 Motivation
Due to the complexity of the continuous casting process and the non-intuitive, variable internal condition of the high-temperature slab, many factors affect the roll gap parameter during production, so many measurement points are required for data collection. The sampling frequency is therefore high and the sampling time long, producing massive amounts of data [3,4,5]. For example, Handan Iron and Steel Group installed 6000 measuring points on one continuous casting production line, collecting data every 3 seconds, so the amount of monitoring data is extremely large. Given such a large amount of data, refining the data related to the roll gap parameter and putting it at the service of a roll gap prediction model are the keys to handling these data. How to develop and apply new data mining methods for continuous casting process parameters is therefore the problem faced by this research.
Traditional data processing usually relies on classical dimension reduction methods such as principal component analysis (PCA) and linear discriminant analysis (LDA). However, the process parameters of continuous casting pose several difficulties: multiple information sources, real-time constraints and a large amount of data. The traditional dimension reduction methods therefore cannot meet the requirements on their own.
One study introduces a candid covariance-free incremental PCA (CCIPCA), which avoids computing the covariance matrix; it converges quickly and reduces the computational cost when dealing with high-dimensional data. Another introduces a chunk incremental kernel PCA (CIKPCA), which decomposes bulk data into blocks; performing the eigenvalue calculation on each large data block reduces the computational cost. M. Kaur [26] compared classification methods for ECG signal analysis in terms of accuracy, sensitivity and prediction accuracy using LDA, and found that the isometric rotation method achieved the highest accuracy.

1.2 Contribution
Much previous work on feature reduction addresses real-time processing of large amounts of data, but some of it pays less attention to the accuracy and effectiveness of the reduction. In this paper, the traditional linear dimension reduction method (PCA) and emerging nonlinear dimension reduction methods (t-SNE, LLE, LTSA) are compared against the multi-information-source, real-time and large-data requirements of continuous casting production process parameters. On this basis, the dimension reduction methods suitable for predicting the roll gap parameters in the continuous casting process are analyzed and improved. Finally, a normalization robust incremental principal component analysis (NRIPCA) is obtained. Before reducing the continuous casting data, this method first detects and eliminates outliers and then normalizes the processed data. It considers not only the accuracy of feature reduction but also its validity.

Methodology
Feature reduction removes irrelevant and redundant information from features according to given criteria. It retains the important features and can effectively reduce the data dimension, overcoming the 'curse of dimensionality'. This section compares the traditional linear dimension reduction method with emerging nonlinear dimension reduction methods, in order to find the dimension reduction method best suited to predicting the roll gap parameters of continuous casting.

Principal Component Analysis (PCA)
As the mainstream linear dimension reduction method, PCA has the smallest linear reconstruction error and a complete theoretical foundation, so it is applied very widely [6,7,8,17,18]. PCA looks for a set of orthogonal basis vectors that maximize the variance of the projected data.
Suppose the sample set is X = {x_1, x_2, ..., x_n}, centred so that its mean is zero. If the projections of all sample points are to be separated as much as possible, the variance of the projected points should be maximized, so the optimization goal can be written as

max_W tr(W^T X X^T W), s.t. W^T W = I (1)

Applying the Lagrange multiplier method to equation (1) gives

X X^T W = λ W (2)

where W is the optimal projection matrix and λ is an eigenvalue. Therefore, we only need an eigenvalue decomposition of the covariance matrix X X^T. The eigenvalues are then sorted as λ_1 ≥ λ_2 ≥ ... ≥ λ_d ≥ ..., and the eigenvectors (w_1, w_2, ..., w_d) corresponding to the first d eigenvalues give the solution of principal component analysis.
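The eigendecomposition route above can be sketched in a few lines of numpy (a minimal illustration, not the paper's production code; the synthetic rank-2 data set is an assumption for demonstration):

```python
import numpy as np

def pca(X, d):
    """PCA via eigendecomposition of the covariance matrix.

    X: (n_samples, n_features) data matrix; d: target dimension.
    Returns the projection matrix W (n_features, d) and the projected data.
    """
    Xc = X - X.mean(axis=0)               # centre the data
    C = Xc.T @ Xc / (len(X) - 1)          # covariance matrix X X^T (up to scaling)
    eigvals, eigvecs = np.linalg.eigh(C)  # eigh: C is symmetric
    order = np.argsort(eigvals)[::-1]     # sort eigenvalues in descending order
    W = eigvecs[:, order[:d]]             # top-d eigenvectors
    return W, Xc @ W

# usage: reduce 5-D data that actually lives on a 2-D subspace
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 5))  # rank-2 data in 5-D
W, Y = pca(X, 2)
print(Y.shape)  # (200, 2)
```

For tall data matrices an SVD of the centred data is numerically preferable, but the covariance eigendecomposition mirrors equation (2) directly.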

t-distributed stochastic neighbor embedding (t-SNE)
Step 1 Measure similarity: in the high-dimensional space, the similarity between points x_i and x_j is modeled as a probability p_ij.
Step 2 Dimension reduction: learn a probability distribution in the low-dimensional space. For the d-dimensional embedded points y_i and y_j, the similarity is defined with a Student t-distribution:

q_ij = (1 + ||y_i - y_j||^2)^-1 / Σ_{k≠l} (1 + ||y_k - y_l||^2)^-1

The relative entropy (Kullback-Leibler divergence) is introduced to measure the similarity between Q and P:

C = KL(P || Q) = Σ_i Σ_j p_ij log(p_ij / q_ij)

Gradient descent is then used to minimize this relative entropy.
The advantage of t-SNE is that a small distance produces a large gradient, which pushes dissimilar points apart without driving them too far away. Although t-SNE is hard to surpass as a visualization method, it has some deficiencies. It is mainly used for visualization and is difficult to apply to other purposes. Because it is difficult to map data with a high intrinsic dimension well into a 2-3 dimensional space, t-SNE mainly preserves local features. Moreover, t-SNE solves a probability distribution problem, so it is not possible to reconstruct the data from a model after dimension reduction.
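The two quantities involved, the low-dimensional similarities and the relative-entropy objective, can be sketched as follows (a minimal numpy illustration, not a full t-SNE implementation; the high-dimensional distribution P is assumed to be given, and here a second random embedding stands in for it):

```python
import numpy as np

def tsne_similarities(Y):
    """Student-t low-dimensional similarities q_ij used by t-SNE."""
    D2 = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=2)
    inv = 1.0 / (1.0 + D2)         # heavy-tailed kernel (1 + ||y_i - y_j||^2)^-1
    np.fill_diagonal(inv, 0.0)     # self-similarity is defined as 0
    return inv / inv.sum()

def kl_divergence(P, Q, eps=1e-12):
    """Relative entropy KL(P || Q), the objective t-SNE minimises by gradient descent."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log((P[mask] + eps) / (Q[mask] + eps))))

rng = np.random.default_rng(0)
Q = tsne_similarities(rng.normal(size=(20, 2)))
P = tsne_similarities(rng.normal(size=(20, 2)))  # stand-in for the high-dim P
print(round(Q.sum(), 6), kl_divergence(Q, Q) == 0.0)  # 1.0 True
```

The heavy tail of the Student-t kernel is what creates the large separating gradients for dissimilar points mentioned above.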

Locally linear embedding (LLE)
Locally linear embedding (LLE) was proposed by S. T. Roweis in 2000 [22,23,24]. It is an unsupervised nonlinear dimension reduction method. After reduction to a low dimension, the data maintain the topology of the original data, and the method has been widely used in image classification and clustering, face recognition and natural language processing. Step 1 Find neighbors: select the k nearest neighbors of each sample x_i. Step 2 Compute reconstruction weights: reconstruct each x_i from its neighbors by minimizing ||x_i - Σ_j f_ij x_j||^2, which leads to a linear system in the local Gram matrix G_i = (x_i - x_j)^T (x_i - x_k). The reconstruction factors f_i describe the neighborhood geometry and are invariant to rotation and translation of x_i and its neighbor set X_i.
Step 3 Construct the low-dimensional embedding: the embedding is obtained by preserving, in the low-dimensional data, the reconstruction weights of each neighborhood; the objective function is min_Y Σ_i ||y_i - Σ_j f_ij y_j||^2. The low-dimensional data thus keep the weights f_ij, and the embedding result is completely characterized by the reconstruction weights, which capture the intrinsic geometry. The advantages of the LLE algorithm are its simple computation and few parameters; it can optimize the characteristic parameters, has no local minimum problem, and retains the essential characteristics and internal structure of the data. However, LLE is sensitive to noise, because it is constrained by the local neighborhood parameter k and the target dimension of the low-dimensional data, which affects the dimension reduction of high-dimensional data.
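The three steps can be sketched compactly in numpy (a sketch under the standard LLE formulation, not taken from the paper; the regularization term `reg` is an added numerical safeguard for near-singular Gram matrices):

```python
import numpy as np

def lle(X, k, d, reg=1e-3):
    """Minimal LLE sketch: reconstruction weights + eigen-embedding."""
    n = len(X)
    dist = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    nbrs = np.argsort(dist, axis=1)[:, :k]      # Step 1: k nearest neighbours
    W = np.zeros((n, n))
    for i in range(n):                          # Step 2: solve local Gram systems
        Z = X[nbrs[i]] - X[i]                   # shift neighbourhood to the origin
        G = Z @ Z.T                             # local Gram matrix
        G += reg * np.trace(G) * np.eye(k)      # regularise (G may be singular)
        w = np.linalg.solve(G, np.ones(k))
        W[i, nbrs[i]] = w / w.sum()             # weights sum to 1
    M = (np.eye(n) - W).T @ (np.eye(n) - W)     # Step 3: embedding cost matrix
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, 1:d + 1]                     # skip the constant eigenvector

Y = lle(np.random.default_rng(1).normal(size=(60, 3)), k=8, d=2)
print(Y.shape)  # (60, 2)
```

The sensitivity to k mentioned above shows up directly here: k controls both the size of each Gram system and how local the preserved geometry is.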

Local tangent space alignment (LTSA)
LTSA was proposed by Zhang Zhenyue and others from Zhejiang University in 2004 [25,26]. The method gradually aligns all the tangent planes as the manifold is unfolded. The local geometry of the low-dimensional data is constructed by approximating the tangent space at each sample point; local low-dimensional coordinates are obtained by projecting the observed data points into the local tangent space, and the overlapping local coordinates are then aligned by local transformations to obtain the global low-dimensional embedding coordinates.
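This alignment procedure can be sketched as follows (a sketch of the standard LTSA formulation, assumed rather than taken from the paper; each neighborhood's tangent basis comes from a local SVD, and the aligned embedding is read off an eigendecomposition of the accumulated alignment matrix):

```python
import numpy as np

def ltsa(X, k, d):
    """Minimal LTSA sketch: align local tangent-space coordinates."""
    n = len(X)
    dist = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    nbrs = np.argsort(dist, axis=1)[:, :k]         # neighbourhood incl. the point itself
    B = np.zeros((n, n))
    for i in range(n):
        Ni = nbrs[i]
        Z = X[Ni] - X[Ni].mean(axis=0)             # centre the neighbourhood
        U, _, _ = np.linalg.svd(Z, full_matrices=False)
        G = np.hstack([np.ones((k, 1)) / np.sqrt(k), U[:, :d]])  # tangent basis
        B[np.ix_(Ni, Ni)] += np.eye(k) - G @ G.T   # local alignment matrix
    vals, vecs = np.linalg.eigh(B)
    return vecs[:, 1:d + 1]                        # global embedding coordinates

Y = ltsa(np.random.default_rng(2).normal(size=(60, 3)), k=10, d=2)
print(Y.shape)  # (60, 2)
```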

Comparison of four methods for dimension reduction
In this section, the dimension reduction methods are tested on actual continuous casting production data. As can be seen from the figures, the PCA dimension reduction effect is ordinary: it reduces the dimension of the data, but the classes are not clearly separated after reduction. In contrast, the LTSA dimension reduction effect is better: the reduction is effective, the data classification is obvious, and different types of data aggregate clearly. The reason the two types of method produce different results is that, in principle, linear dimension reduction mainly analyzes the importance of each data dimension, reduces the dimension according to that importance and deletes the less important data. Nonlinear dimension reduction, by contrast, mainly achieves reduction by clustering the data, aggregating highly similar data and projecting them into the low-dimensional space.
In the roll gap prediction model of continuous casting, the multi-information-source stream data determine that feature reduction must serve the roll gap prediction model. When a nonlinear dimension reduction method is used, the original data are transformed, giving better dimension reduction and improved operational efficiency; however, the reduced data cannot be mapped back for evaluation, so prediction of the roll gap parameter cannot be realized. After repeated tests and comparison, the linear dimension reduction approach is adopted for the feature reduction of the continuous casting roll gap prediction model.

Experiments and Research on the Improved PCA
In this section, PCA will be modified to make it suitable for feature reduction of continuous casting process parameters characterized by multi-information source data.

The theoretical basis of NRIPCA
The NRIPCA method updates the PCA model in real time based on sliding window thinking, replacing the oldest data in the window with the latest data.

Figure 4. Principle of the sliding window
The main steps of the algorithm are:
(1) Data pre-processing: use a certain amount of historical data for outlier detection.
(2) Normalize the pre-processed data so that the data are mapped into a common range, which makes the optimization smoother and helps it converge correctly to the optimal solution.
(3) Principal component analysis: establish the principal component analysis model and calculate its parameters.
(4) Collect a new set of real-time data and standardize it with the current mean and variance.
(5) Use the current principal component analysis model to judge whether the data are normal; if so, go to the next step, otherwise return to step (4).
(6) Add the new set of process parameters to the sliding window, delete the oldest set, and update the principal component analysis model and related parameter values. Return to step (4).
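Steps (1)-(6) can be sketched as a sliding-window loop (a hypothetical reconstruction for illustration; the 3-sigma normality test in step (5) and the update-every-`step`-samples rule are assumptions, not the paper's exact implementation):

```python
import numpy as np
from collections import deque

def is_outlier(win, sample, n_sigma=3.0):
    """Step (5): flag a sample that leaves the 3-sigma band of the current window."""
    mu, sigma = win.mean(axis=0), win.std(axis=0) + 1e-12
    return bool(np.any(np.abs(sample - mu) > n_sigma * sigma))

def nripca_stream(history, stream, d, window_size, step):
    """Sliding-window PCA sketch following steps (1)-(6) above."""
    window = deque([row for row in history[-window_size:]], maxlen=window_size)
    models, new_count = [], 0
    for sample in stream:
        if is_outlier(np.asarray(window), sample):  # step (5): reject abnormal data
            continue
        window.append(sample)                       # step (6): newest in, oldest out
        new_count += 1
        if new_count >= step:                       # refit the model every `step` samples
            win = np.asarray(window)
            Z = (win - win.mean(axis=0)) / (win.std(axis=0) + 1e-12)   # step (2)
            vals, vecs = np.linalg.eigh(Z.T @ Z / (len(Z) - 1))        # step (3)
            models.append(vecs[:, np.argsort(vals)[::-1][:d]])
            new_count = 0
    return models

rng = np.random.default_rng(4)
models = nripca_stream(rng.normal(size=(200, 5)), rng.normal(size=(300, 5)),
                       d=2, window_size=200, step=60)
print(len(models), models[0].shape)
```

Updating only every `step` samples is what makes the method cheap online: the eigendecomposition is amortized over the whole step rather than repeated per sample.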

Data source
To verify the validity of the NRIPCA algorithm, one week of process parameters was collected from a randomly chosen continuous casting production line of a steel mill in China. These data comprise 32,154 dimensions, of which 26,001 are switch signals monitored by the control system and 6,153 are valid process parameter measurements. The data set covers the important stages of the continuous casting process, including the tundish, the mould and the segments. To avoid damaging the data sequence, the Pauta (3σ) criterion is used to detect abnormal values, which are then replaced by nearby values.
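The outlier treatment described here can be sketched as follows (a minimal illustration assuming the 3-sigma form of the criterion and replacement by the nearest preceding normal value; the paper does not specify which nearby value is used):

```python
import numpy as np

def replace_outliers_3sigma(x):
    """3-sigma criterion: values outside mean ± 3*std are treated as abnormal
    and replaced with the nearest preceding normal value, preserving order."""
    x = np.asarray(x, dtype=float).copy()
    bad = np.abs(x - x.mean()) > 3 * x.std()
    good = np.flatnonzero(~bad)
    for i in np.flatnonzero(bad):
        prev = good[good < i]
        x[i] = x[prev[-1]] if len(prev) else x[good[0]]
    return x

series = np.concatenate([np.full(20, 1.0), [50.0]])
cleaned = replace_outliers_3sigma(series)
print(cleaned[-1])  # 1.0 — the spike is replaced by the neighbouring value
```

Replacing rather than deleting outliers keeps the time sequence intact, which matters for the sliding-window model updates that follow.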

NRIPCA experiment on continuous casting flow data
Two important parameters of the NRIPCA algorithm are the sliding window size and the sliding step. To study suitable values for this project's data set, the window size (as a fraction of the total amount of data) and the step size were varied, and the number of stable principal components was used as the evaluation criterion. Table 1 reports the sliding window size as a ratio of the total amount of data. It shows that the step size has little effect on the model up to 480 samples (8 minutes), indicating that the change of the continuous casting process parameters is gradual, so the model update interval can be lengthened in subsequent online analysis. In other words, the model parameters need not be updated for every new group of data, which reduces the computational cost.
In this paper, a total of 446,000 samples covering one week are used for sliding window PCA analysis. The window size is 334,500 and the step size is 60 (i.e. the principal component analysis model is updated once every minute). To calculate the reconstruction error, the latest window data are reconstructed with the current model parameters after each model update.
To quantify the reconstruction error, the average relative error between the reconstructed and original data of each parameter is calculated, as shown in Table 2. As can be seen from Table 2, the average error of both PCA and sliding-window PCA with respect to the original data is less than 5%, which is within the acceptable range. Therefore, it is feasible to use sliding window PCA to dynamically update the feature parameters of process stream data online.
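The per-parameter metric reported in Table 2 can be computed as follows (a sketch of the average relative error as described; the synthetic four-parameter data set and the choice to keep three components are illustrative assumptions):

```python
import numpy as np

def mean_relative_error(X, W, mu):
    """Per-parameter average relative error between PCA-reconstructed
    and original data."""
    Y = (X - mu) @ W               # project onto the retained components
    X_hat = Y @ W.T + mu           # reconstruct in the original space
    return np.mean(np.abs(X_hat - X) / (np.abs(X) + 1e-12), axis=0)

# illustrative data: 4 correlated parameters with small measurement noise
rng = np.random.default_rng(3)
X = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 4)) + 100.0
X = X + 0.01 * rng.normal(size=X.shape)
mu = X.mean(axis=0)
vals, vecs = np.linalg.eigh(np.cov(X - mu, rowvar=False))
W = vecs[:, np.argsort(vals)[::-1][:3]]   # keep 3 of 4 components
err = mean_relative_error(X, W, mu)
print(err.shape, bool(np.all(err < 0.05)))  # (4,) True
```

Because the data are strongly correlated, dropping one component loses almost nothing, and every parameter's average relative error stays well inside the 5% band discussed above.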

Discussion
In this paper, fifteen process parameters that influence the roll gap value of the continuous casting segments are reduced using one week of data, with PCA and NRIPCA applied respectively: the former performs feature reduction on the whole seven-day data set, while the latter treats part of the data as real-time data to simulate online feature reduction. Comparing the reconstructed data with the original data shows that both methods fit well. To compare the two methods more intuitively, the average reconstruction error between the reconstructed and original data of the two methods is shown in Table 3. As can be seen from Table 3, the NRIPCA reconstruction error is small for each process parameter, and per parameter each method has its own advantages and disadvantages. Overall, however, the NRIPCA reconstruction error is smaller than that of PCA, meaning that NRIPCA reconstructs changing process parameters with smaller error. From an algorithmic point of view, PCA computes a single globally optimal reconstruction, whereas NRIPCA computes the current locally optimal reconstruction, with an optimal reconstruction error for each sliding window. Therefore, the overall reconstruction error of NRIPCA is smaller than that of PCA.

Conclusion
In this paper, a normalization robust incremental principal component analysis (NRIPCA) is proposed. The new method improves robustness by pre-processing and standardizing real-time data, making it better suited to feature reduction of continuous casting production process parameters with multi-information-source data characteristics. The experimental results prove the effectiveness of the NRIPCA algorithm. The new method provides a reliable data source for predicting the roll gap parameters of continuous casting and will be useful for establishing prediction models for continuous casting parameters. We further hope this feature reduction method can be applied to other industrial data domains.

Table 1
Sliding window size as a ratio of the total amount of data.

Table 3
The average relative error of reconstructed data of each parameter