Real-time recognition of weld defects based on visible spectral image and machine learning

The quality of Tungsten Inert Gas (TIG) welding depends on human supervision, which is unsuitable for automation. This study designed a model for assessing TIG welding quality with the potential for real-time application. The model uses the K-Nearest Neighbors (KNN) algorithm paired with visible-spectrum images captured by a high-dynamic-range (HDR) camera. First, the weld-defect images in the training set were projected into a two-dimensional space using multidimensional scaling (MDS); similar weld defects aggregated into blocks scattered across the space, with some overlap between different defect classes. Second, KNN, CNN, SVM, CART and Naive Bayes (NB) classification models were established to classify and recognize the weld-defect images. The results show that the KNN model performs best, with a recognition accuracy of 98% and an average recognition time of 33 ms per image, and that it runs on common hardware. It can be applied to the image recognition system of an automatic welding robot to improve its level of intelligence.


Introduction
Tungsten inert gas (TIG) welding [1] was invented in the late 1930s, initially for welding magnesium and aluminum, and was later extended to joining copper and steel alloys. Today TIG is used to weld high-value components in the nuclear, automotive and aerospace industries. The separation of heat input and deposition rate in the TIG process facilitates control of welding quality and the production of high-quality weldments. Although the TIG process has many advantages, it is a complex phase-transition process coupled with multiple factors; it is easily disturbed by the external environment and human factors, and requires highly skilled labor to ensure welding quality. Against a background of shrinking labor supply, automatic welding robots have become a research focus. At present, welding robots in industrial use operate in a "teaching-reproduction" mode, which lacks real-time detection and self-adaptive adjustment of welding quality [2]. The very bright arc in the welding process poses a great challenge to real-time evaluation of welding quality, making image acquisition and image processing extremely difficult. Lucas et al. [3] were the first to combine a laser source with video recording to counteract the interference of strong arc light. An additional filter was used to eliminate the ultraviolet (UV) and visible parts of the arc light, allowing only the near-infrared (NIR) spectrum to pass through. The authors then devised a weld-pool monitoring system to improve the uniformity of welding and the repeatability of welding operations. Yu et al. [4] used a plasma optical signal to monitor the laser welding process and predicted welding quality with an algorithm based on the relationship between weld strength and the monitoring signal under the given process parameters.
Gu and Duley [5] used an omnidirectional microphone with a frequency response of 20 kHz-500 kHz to observe resonant acoustic emission during laser welding, and analyzed the acquired acoustic signals with frequency-domain and statistical methods. The results showed that signals collected by sound sensors at different positions could reflect welding quality. J. Mirapeix et al. [6] studied the GMA welding process, obtained the spectral signal of the weld-pool surface with a spectrometer, analyzed the characteristic spectral lines, and found a correspondence between the electron-temperature curves of the Fe and Mn characteristic lines and welding defects. The methods above classify welds through indirect data, such as auxiliary light sources, optical spectra and acoustic spectra; they cannot perform direct end-to-end image recognition of weld defects and are therefore inconvenient.
The state of the art in image acquisition for arc-weld monitoring is cameras with high dynamic range (HDR), which obtain a clear real-time image of the weld pool by using the HDR sensor to counteract the arc light, without any filtering or laser illumination. This paper exploits KNN's strong generalization ability, easy interpretability and high speed: it uses HDR welding-camera images directly, omits the feature-extraction step, and thereby reduces identification time. Compared with CNN, support vector machine (SVM), classification and regression tree (CART) and Naive Bayes (NB) models, the KNN model classifies accurately and quickly on a six-class test set ("good weld", "burn-through", "contamination", "lack of fusion", "lack of shielding gas" and "high travel speed"). It is suitable for common hardware devices and can be applied in the image recognition system of an automatic welding robot to improve its level of intelligence.

Data
The image dataset is the SS304 stainless steel shielded-welding dataset from the Kaggle competition platform (Figure 1). It contains 45,058 images of 304 stainless steel shielded welding, each with a classification label (5 defects + good weld), as illustrated in Figure 1. The images are divided into a training set (24,204), a validation set (9,694) and a test set (11,160) for modeling and testing [7]. The number of images per class is shown in Table 1. All images were taken with a Xiris XVC-1000 camera, which has a dynamic range of over 140 dB. The camera captures enough light to brighten the area around the arc while avoiding overexposure of the arc itself. It is mounted on the robot arm and moves with the welding torch at an angle of 45° to capture the weld pool and the area ahead of the welding arc in real time.

Preprocessing
The original images are 1208×700 pixels, which exceeds the processing hardware's limits. Therefore each image is first converted to grayscale and then scaled to 40×22 pixels. The pixel matrix is converted to FLOAT32 and divided by 255 for normalization. The resulting matrix is used as the input for CNN training. For the KNN, SVM, CART and NB models, the preprocessed 40×22 pixel matrix is flattened into a one-dimensional array before training.

Algorithm modelling

K-nearest neighbor
The KNN algorithm is a non-parametric pattern recognition and classification algorithm based on statistics, which was applied to time series prediction by Yakowitz [8].
The algorithm is simple, generalizes well, and is easy to interpret. The model measures the distance between samples with the Euclidean distance, as shown in Formula 1.
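Formula 1 is not reproduced in the text; the Euclidean distance between two n-dimensional samples x and y that it refers to is the standard

```latex
d(\mathbf{x}, \mathbf{y}) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}
```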
The prediction process calculates the distance between the test sample and each training sample, selects the K nearest samples and their labels, and finally assigns the class by majority vote.
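The preprocessing and KNN steps described above can be sketched with scikit-learn. This is a minimal illustration, not the paper's exact code: the synthetic frames stand in for the Kaggle dataset, the value K = 5 is an assumption (the paper does not state K), and the nearest-neighbour downsampling is one simple way to reach the 40×22 target size.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def preprocess(frame: np.ndarray) -> np.ndarray:
    """Grayscale -> 40x22 downsample -> float32 / 255 -> flat vector."""
    if frame.ndim == 3:
        # Luminance weighting for RGB -> grayscale.
        frame = frame @ np.array([0.299, 0.587, 0.114])
    # Nearest-neighbour downsampling to 22 rows x 40 columns.
    rows = np.linspace(0, frame.shape[0] - 1, 22).astype(int)
    cols = np.linspace(0, frame.shape[1] - 1, 40).astype(int)
    small = frame[np.ix_(rows, cols)]
    return (small.astype(np.float32) / 255.0).ravel()  # length 880

# Synthetic stand-ins for the training images and their 6-class labels.
rng = np.random.default_rng(0)
X_train = rng.integers(0, 256, size=(40, 100, 180)).astype(float)
y_train = rng.integers(0, 6, size=40)  # good weld + 5 defect classes

knn = KNeighborsClassifier(n_neighbors=5, metric="euclidean")
knn.fit(np.stack([preprocess(f) for f in X_train]), y_train)

# Predicting a single frame: distance to all training samples, then voting.
pred = knn.predict(preprocess(X_train[0]).reshape(1, -1))
```

Because KNN stores the training set and defers all computation to prediction time, "training" here is essentially just memorization, which is one reason the model is cheap to build.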

Convolutional neural network
The CNN was proposed by Yann LeCun [9]. The convolutional neural network used in this paper consists of seven layers: two convolutional layers, two pooling layers, two fully connected layers and one output layer. The input is an image of 40×22 pixels. A detailed description of the model is given in Table 2. The Adam algorithm [10] is used for optimization, with hyperparameters of 6 epochs, a batch size of 64 and a learning rate of 0.001.
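A seven-layer network of this shape can be sketched in PyTorch as follows. The filter counts and fully connected widths here are illustrative assumptions, since the paper's exact configuration lives in its Table 2 (not reproduced); only the layer count, the 40×22 input, the six output classes, and the Adam settings come from the text.

```python
import torch
from torch import nn

class WeldCNN(nn.Module):
    """Two conv, two pool, two fully connected, one output layer."""
    def __init__(self, n_classes: int = 6):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                        # 22x40 -> 11x20
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                        # 11x20 -> 5x10
            nn.Flatten(),
            nn.Linear(32 * 5 * 10, 128), nn.ReLU(),  # fully connected 1
            nn.Linear(128, 64), nn.ReLU(),           # fully connected 2
            nn.Linear(64, n_classes),                # output layer
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = WeldCNN()
# Optimizer settings stated in the paper: Adam with learning rate 0.001
# (trained for 6 epochs with batch size 64).
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
logits = model(torch.zeros(2, 1, 22, 40))  # (batch, n_classes)
```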

Model accuracy
The preprocessed 40×22 pixel matrices of the training set were flattened into one-dimensional arrays, and the MDS algorithm was used to project them into a two-dimensional space (Figure 2). Similar weld defects aggregate into blocks distributed in different areas. There is partial overlap between different weld-defect classes; they are not completely separated. Only the "high travel speed" category forms its own separate cluster. The projection results suggest that a nonlinear classifier may perform better. The accuracy of the five models is shown in Table 3. KNN and CNN rank first and second in test-set accuracy. Comparing the confusion matrices of the two models (Table 4), the KNN model is more accurate on "burn-through", with 618 correct predictions among 731 "burn-through" samples, whereas the CNN predicted only 18 correctly. In the other categories, the CNN shows a small amount of confusion between "contamination" and "lack of shielding gas", possibly because "lack of shielding gas" is a form of "contamination".
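The MDS projection used for this visualization can be sketched with scikit-learn's `MDS` estimator; here random vectors stand in for the flattened 880-dimensional training images.

```python
import numpy as np
from sklearn.manifold import MDS

# Synthetic stand-in for the flattened 40x22 (= 880-dim) training vectors.
rng = np.random.default_rng(0)
X = rng.random((50, 880)).astype(np.float64)

# Project into two dimensions with metric MDS, which places points so
# that 2-D distances approximate the original Euclidean dissimilarities.
mds = MDS(n_components=2, random_state=0)
X_2d = mds.fit_transform(X)  # shape (50, 2), one point per image
```

Plotting `X_2d` colored by class label reproduces the kind of scatter shown in Figure 2, where cluster overlap hints at how separable the classes are.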
Comparing the confusion matrices with the projection results shows that the two analyses corroborate each other. Some "good" samples in the confusion matrix are predicted as "lack of fusion", matching the overlap of those sample points in the projection. The "high travel speed" category is predicted most accurately by both models, matching its samples being clustered together in the projection. Comparing the precision, recall and F1-score of the two models (Table 5), KNN outperforms CNN; the clearest differences are in the "burn-through" and "lack of shielding gas" classes. The CNN's poor prediction of "burn-through" defects pulls down its F1-score. The data in this study are not uniformly distributed: "burn-through" and "lack of shielding gas" account for only 6.55% and 0.91% of samples respectively. These minority classes carry little weight in the CNN's optimization and are easily neglected, whereas the KNN algorithm is less affected by sample imbalance and therefore performs better.

Consuming-time
Given the excellent accuracy of the KNN and CNN models, the trained models were loaded to simulate a real-time online evaluation scenario and used to predict every image in the test set. The measured time spans the whole path from reading an image, through preprocessing and prediction, to recording the result. The KNN model takes 33 ms on average to predict a single image, while the CNN model needs 69 ms. The images in this study come from a Xiris XVC-1000 welding camera, which records at 55 frames per second, so the KNN model can evaluate every second frame and can therefore detect weld defects more promptly than the CNN model. All algorithms were run under Windows 10 on an Intel Core i7-9700 CPU with 32 GB of memory and no GPU. The KNN model thus requires fewer hardware resources while maintaining prediction accuracy, and it meets the requirement for real-time online evaluation of weld defects in a production environment.
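The timing protocol described above can be sketched as follows. The toy model and random vectors are stand-ins for the trained KNN and the test images; the point is the measurement pattern, timing the full per-image path and averaging over the set.

```python
import time
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy stand-ins for the trained KNN model and the test frames.
rng = np.random.default_rng(0)
knn = KNeighborsClassifier(n_neighbors=5).fit(
    rng.random((100, 880)), rng.integers(0, 6, 100)
)
frames = rng.random((20, 880)).astype(np.float32)

# Time the whole per-image path (here: prediction and recording the
# result; in the paper it also includes image reading and preprocessing).
t0 = time.perf_counter()
preds = [int(knn.predict(f.reshape(1, -1))[0]) for f in frames]
elapsed_ms = (time.perf_counter() - t0) * 1000 / len(frames)

# At 55 fps the camera delivers a frame roughly every 18 ms, so an
# average per-image latency of 33 ms means about every second frame
# can be scored, as the paper notes.
```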

Conclusion
In this study, we trained a KNN model that can be used in a TIG welding process monitoring system for real-time online evaluation of welding quality. The images of the weld pool and its surroundings were collected by an HDR camera; no manual feature engineering was needed, and the images were fed directly into the model for prediction. Moreover, the hardware requirements are modest: an ordinary PC is sufficient. The model therefore has strong practical value and can be applied to the image recognition system of an automatic welding robot to improve its level of intelligence.