Study on Machine Learning Based Intelligent Defect Detection System

Abstract: This paper proposes a machine learning based intelligent defect detection system for metal products. Common machine vision systems either detect surface defects (stains, shallow pits, shallow tumors, scratches, edge defects, pattern defects) or perform non-contact measurement of numerical parameters such as size, diameter, eccentricity, height and thickness. Considering workpiece quality, defect standards and customized testing requirements, this study develops a metal product defect detection system based on machine vision and machine learning. The system consists of three procedures: (1) Image preprocessing: the machine vision library OpenCV is used to preprocess the product images before detection. (2) Training procedures: a convolutional neural network (CNN) with chunk-max pooling is trained, and a generative adversarial network (GAN) based architecture is used to address the problem of small datasets for surface defects. (3) Testing procedures: the program is written in Python and the testing procedures are implemented on GPU-based embedded hardware. In industry, collecting training datasets is usually costly and related methods are highly dataset-dependent, so most companies cannot provide big data to be analyzed or applied. The experimental results show that the recognition accuracy improves noticeably as the amount of data augmentation by the GAN-Based samples maker increases. Manual inspection is labor intensive, costly and inefficient. Therefore, this study will contribute to technological innovation, industry, national development and other applications: (1) The use of intelligent machine learning technology will make Industry 4.0 technology more sophisticated. (2) Machine learning applications will advance the development of the equipment industry. (3) Machine learning will increase the economic output and productivity of countries with aging populations.


Introduction
Manual inspection is labor intensive, costly and inefficient. Further, the accuracy of defect detection is lowered by harsh industrial environments and human error. Therefore, big data, computer vision and machine learning play important roles in automated defect detection systems. In industry, collecting training datasets is usually costly and related methods are highly dataset-dependent.
In 2016, Y. Shimizu et al. [1] presented a concept of a micro thermal sensor to be used for defect inspection of a smoothly finished surface such as a bare wafer or a hard disk. Defects on a measurement surface are detected automatically by scanning the surface with the micro thermal sensor, which detects the variation of the thermal flow between the sensor surface and the measurement surface. The proposed micro thermal sensor has the potential to detect various types of surface defects automatically. In 2017, R. Ren et al. [2] proposed a generic approach that requires small training data for automated surface inspection. The approach builds a classifier on the features of image patches, where the features are transferred from a pre-trained deep learning network. A pixel-wise prediction is then obtained by convolving the trained classifier over the input image. The experiment involves two tasks: (1) image classification and (2) defect segmentation. The results showed that the proposed method improves accuracy by 0.66%-25.50% in the classification tasks and improves accuracies by 2.29%-9.86% in all seven defect types.
In 2017, Y. C. Samarawickrama et al. [3] proposed an automated inspection system for the ceramic tile industry based on image processing techniques. This system can detect color variations and defects such as corner damage, edge damage and middle cracks on the surface of the tile with high accuracy and efficiency. The experiments achieved a detection accuracy rate of 96.36%, and the processing time for one tile was approximately 2 seconds. These results indicate that the automated system can effectively replace manual ceramic tile inspection with better accuracy and efficiency. In 2016, S. Faghih-Roohi et al. [4] proposed a deep convolutional neural network solution to the analysis of image data for the detection of rail surface defects. The huge amount of data obtained from many hours of automated video recordings makes it impossible to manually inspect the images and detect rail surface defects. Therefore, automated detection of rail defects can help to save time and costs, and to ensure rail transportation safety. The advantage of the deep convolutional neural network solution is that it skips the elaborate feature extraction procedures required in classical learning approaches. The results of different network architectures characterized by different sizes and activation functions were compared. According to the experimental results, the rail defect classes can be classified with almost 92% accuracy.
Wood defect detection has an important influence on the automation of the wood industry. In 2016, Z. N. Ke et al. [5] proposed a hybrid of the genetic algorithm (GA) and the particle swarm optimization (PSO) algorithm in view of the complexity of wood defect segmentation. Firstly, the contrast of the image is enhanced by a linear transformation function. Secondly, the GA and the PSO-GA hybrid algorithm are each applied to the image segmentation, and finally morphological processing is performed. The experimental results show that the PSO-GA hybrid algorithm detects wood defects better and more stably than the GA.
Image technologies nowadays are used not only for keeping personal events safe, but are also widely applied in conjunction with automated electronic systems. Computer vision is widely used for inspection of production quality in industry, and the food industry is no exception. Containers for the food industry are made in very large quantities. In 2016, A. Laucka et al. [6,7] proposed automated computer vision algorithms for the control of PET preparation quality. The algorithms and methods used for the detection of defective products were mainly based on image segmentation, digital production, erosion and smoothing. The most effective filters for the defect detection of the workpieces were determined. The efficiency of the algorithms was found to be close to 100%, and the throughput reached 10,000 workpieces per hour. In 2015, M. Win et al. [8] proposed two new thresholding methods, namely the contrast-adjusted Otsu's method and the contrast-adjusted median-based Otsu's method, for an automated defect detection system for titanium-coated aluminum surfaces. The experiments showed that the proposed contrast-adjusting methods perform similarly to minimum error thresholding (MET) and are generally better than Otsu's method.
In 2015, M. W. Ashour et al. [9] proposed the use of a Support Vector Machine (SVM) classifier with various kernels for the categorization of machined surfaces into the six machining processes of Turning, Grinding, Horizontal Milling, Vertical Milling, Lapping, and Shaping. The effectiveness of the gray-level histogram as the discriminating feature was also explored. The experimental results suggest that the SVM with the linear kernel provides superior performance for a dataset consisting of 72 workpiece images, and its training time is less than that of an ANN. In 2015, C. Huang et al. [10] proposed a method of workpiece recognition and location by Hu moment invariants based on Open Source Computer Vision (OpenCV). Firstly, image preprocessing is applied, including image graying, mean filtering and image binarization by adaptive threshold segmentation. Then the contours are extracted from the binary image. Finally, the target workpiece is identified by matching the extracted contours with the object contour from the template image using Hu moment invariants. A method of determining the workpiece position and direction was also put forward. The experimental results showed that the proposed method can recognize the target workpiece and locate its position effectively, and that the algorithms used are simple and fast in operation.
To sum up, being able to identify the machining processes that produce specific machined surfaces is crucial in modern manufacturing production. Image processing and computer vision technologies have become indispensable tools for automated identification, with benefits such as reduced inspection time and avoidance of human errors due to inconsistency and fatigue. However, collecting training datasets is usually costly in industry. Therefore, it is very important to develop a machine learning based intelligent defect detection system that works with small training data.

Materials and Methods
Common machine vision systems either detect surface defects (stains, shallow pits, shallow tumors, scratches, edge defects, pattern defects) or perform non-contact measurement of numerical parameters such as size, diameter, eccentricity, height and thickness.
Considering workpiece quality, defect standards and customized testing requirements, this study develops a metal product defect detection system based on machine vision and machine learning, mainly composed of three procedures: image preprocessing, training procedures and testing procedures.
The system architecture consists of three parts: (1) Image preprocessing: the machine vision library OpenCV is first used to preprocess the product images before detection. (2) Training procedures: a convolutional neural network (CNN) with chunk-max pooling is trained, and a generative adversarial network (GAN) based architecture is used to address the problem of small datasets for surface defects. (3) Testing procedures: the program is written in Python and the testing procedures are implemented on GPU-based embedded hardware.

Image preprocessing
To improve the efficiency of the operation, a series of integrated image processing steps is carried out in advance to facilitate the learning of the machine learning model. The preprocessing consists of three layers: the first layer is a median filter, the second layer is a Canny edge detector, and the third layer is thresholding. The flowchart of image preprocessing is shown in Figure 1.

Layer 1: Median filter for grayscale images
The median filter is a nonlinear digital filtering technique that is very widely used in digital image processing. It is often used to remove salt-and-pepper noise from an image or signal, and it preserves edges while removing noise. The median filter replaces a pixel by the median, instead of the average, of all pixels in a neighborhood. Thresholding creates binary images from grey-level ones by setting all pixels below some threshold to zero and all pixels above that threshold to one.
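The median filtering and thresholding described above can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the study uses OpenCV, where `cv2.medianBlur` would normally do this work. The function names `median_filter` and `threshold`, the 3x3 window, the border replication and the threshold value are all assumptions made for illustration.

```python
from statistics import median


def median_filter(image, ksize=3):
    """Replace each pixel by the median of its ksize x ksize neighborhood.

    `image` is a list of rows of grayscale values. Border pixels are handled
    by replicating the nearest valid pixel (an assumption of this sketch).
    """
    if ksize % 2 == 0:
        raise ValueError("ksize must be odd")
    height, width = len(image), len(image[0])
    radius = ksize // 2
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            result[y][x] = median(window)
    return result


def threshold(image, level):
    """Binarize: pixels above `level` become 1, all others become 0."""
    return [[1 if pixel > level else 0 for pixel in row] for row in image]


# A flat patch with one salt-noise pixel in the center (hypothetical data).
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
smoothed = median_filter(noisy)
binary = threshold([[5, 200]], level=128)
```

In this toy patch the isolated bright pixel is replaced by the median of its neighbors, which is the behavior that makes the median filter suitable for salt-and-pepper noise.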

Testing procedures
The program is written in Python and the testing procedures are implemented on GPU-based embedded hardware. To identify workpiece surface defects effectively, a convolutional neural network architecture is chosen for the testing procedures. The flowchart of testing procedures for the defect detection system is shown in Figure 5.

Results
The training dataset for the CNN consists of 151 normal surface images and 151 surface defect images. The validation dataset consists of 10 normal surface images and 10 surface defect images.
To make the CNN work well, the network has to be trained and its configuration adjusted several times to obtain a relatively accurate network architecture. Comparing the defect detection accuracy makes it possible to analyze the effect of different hyperparameters of the convolutional neural network.
As shown in Table 1, there are eight configurations (type A ~ H) for the convolutional neural network. The eight configurations differ in the number of convolutional and pooling layers, the number of CNN training epochs and the learning rate. The parameter "layer_position" represents the order in which the layer is located in the CNN. The parameter "conv_nb_filter" represents the number of convolutional filters. The parameter "conv_filter_size" represents the size of the convolutional filters. The parameter "pl_kernel_size" represents the pooling kernel size. The parameter "fc_n_units" represents the number of units of the fully connected layer. In addition, the recognition performance of the CNN with Canny edge detector pre-processing was tested.
The result of image pre-processing is shown in Figure 6. The eight accuracy results with and without image pre-processing are shown in Table 2.
In this study, several experiments are conducted to verify the feasibility of data augmentation using generative adversarial networks. Firstly, existing samples are used to train and adjust the CNN. The best configuration among types A ~ H is then chosen for the GAN-Based samples maker for the defect detection system. The experimental results with data augmentation by the GAN-Based samples maker are shown in Table 3.
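As an illustration of how the parameters named above fit together, the following sketch stores one configuration in plain Python data classes and computes the side length of the feature map that reaches the fully connected layer. The parameter names come from the text. The class names, the numeric values, the 64-pixel input size, and the assumptions of stride-1 convolution without padding and non-overlapping pooling are hypothetical and are not taken from Table 1.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ConvPoolLayer:
    layer_position: int    # order of the layer in the CNN
    conv_nb_filter: int    # number of convolutional filters
    conv_filter_size: int  # side length of each square convolutional filter
    pl_kernel_size: int    # side length of the square pooling kernel


@dataclass(frozen=True)
class CnnConfig:
    name: str
    layers: List[ConvPoolLayer]
    fc_n_units: int        # number of units of the fully connected layer
    epochs: int
    learning_rate: float


def feature_map_side(input_side: int, config: CnnConfig) -> int:
    """Side length of the feature map after all conv/pool layers.

    Assumes stride-1 convolution without padding and non-overlapping
    pooling; both are assumptions of this sketch.
    """
    side = input_side
    for layer in sorted(config.layers, key=lambda item: item.layer_position):
        side = side - layer.conv_filter_size + 1  # convolution shrinks the map
        side = side // layer.pl_kernel_size       # pooling downsamples it
    return side


# Hypothetical configuration; the values are placeholders, not Table 1 entries.
example = CnnConfig(
    name="hypothetical",
    layers=[
        ConvPoolLayer(layer_position=1, conv_nb_filter=32, conv_filter_size=5, pl_kernel_size=2),
        ConvPoolLayer(layer_position=2, conv_nb_filter=64, conv_filter_size=3, pl_kernel_size=2),
    ],
    fc_n_units=128,
    epochs=999,
    learning_rate=1e-5,
)

side = feature_map_side(64, example)
flattened_inputs = side * side * example.layers[-1].conv_nb_filter
```

Keeping each configuration in one immutable record of this kind makes it straightforward to iterate over types A ~ H and log the accuracy obtained for each.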

Discussion
From the experimental results in Table 2 and Table 3, we find the following: (1) The maximum accuracy of defect detection is only 74%, with or without Canny edge detection preprocessing. This means that Canny edge detection preprocessing does not noticeably improve the recognition accuracy.
(2) As the number of convolution layers and pooling layers increases, the recognition accuracy of defect detection increases by at most 2% for type C and type D, and drops from 75% to 74% for type A and type B. This means that increasing the number of convolution layers and pooling layers does not noticeably improve the recognition accuracy.
(3) As the number of epochs increases from 999 to 1999 and the learning rate decreases from 0.00001 to 0.0000001, the recognition accuracy of defect detection improves by 5%~16%. This means that the recognition accuracy can be noticeably improved by increasing the number of epochs and decreasing the learning rate.
(4) The experimental results of data augmentation by the GAN-Based samples maker are shown in Table 3. The recognition accuracy of defect detection increases from 90% to 94% for 10% data augmentation by the GAN-Based samples maker. We therefore find that the recognition accuracy improves noticeably as the amount of data augmentation by the GAN-Based samples maker increases.
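To make the notion of an augmentation percentage concrete, the sketch below mixes a pool of generated samples into the real training set at a given ratio. It assumes that "10% data augmentation" means adding synthetic samples equal to 10% of the real training set size; the paper does not define the term, so this reading, the function name `augment_training_set`, and the placeholder sample labels are all hypothetical. Only the training set size of 302 images (151 normal plus 151 defect) comes from the text.

```python
import random


def augment_training_set(real_samples, synthetic_pool, ratio, seed=0):
    """Return the real samples plus round(ratio * len(real_samples)) synthetic ones.

    Synthetic samples are drawn without replacement from `synthetic_pool`,
    which stands in for the output of a trained GAN generator.
    """
    n_extra = round(len(real_samples) * ratio)
    if n_extra > len(synthetic_pool):
        raise ValueError("not enough synthetic samples for the requested ratio")
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    return list(real_samples) + rng.sample(synthetic_pool, n_extra)


# 151 normal + 151 defect images, represented here by placeholder labels.
real = [f"real_{i}" for i in range(302)]
generated = [f"gan_{i}" for i in range(100)]

augmented = augment_training_set(real, generated, ratio=0.10)
n_synthetic = len(augmented) - len(real)
```

Under this reading, a 10% ratio adds 30 generated images to the 302 real ones.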

Conclusions
In industry, collecting training datasets is usually costly and related methods are highly dataset-dependent, so most companies cannot provide big data to be analyzed or applied. The experimental results show that the recognition accuracy improves noticeably as the amount of data augmentation by the GAN-Based samples maker increases. This suggests that it can be a good solution to the problem of small datasets in the future. To sum up, the development of a machine learning based intelligent defect detection system will contribute to technological innovation, industry, national

Layer 2: Canny edge detector
The Canny edge detector is an edge detection operator that uses a multi-stage algorithm to detect a wide range of edges in images. The Canny edge detector algorithm aims to satisfy three main criteria: (1) low error rate: a good detection of only existent edges; (2) good localization: the distance between detected edge pixels and real edge pixels has to be minimized; (3) minimal response: only one detector response per edge. The steps of the Canny edge detector are: (1) Filter out any noise by using the Gaussian filter. (2) Follow a procedure analogous to Sobel for finding the intensity gradient of the image. (3) Apply non-maximum suppression. (4) Hysteresis: Canny uses two thresholds (upper and lower).

Layer 3: Thresholding
Thresholding is a non-linear operation that converts a gray-scale image into a binary image where the two levels are assigned to pixels that are below or above the specified threshold value.

MATEC Web of Conferences 201, 01010 (2018), ICI 2017, https://doi.org/10.1051/matecconf/201820101010
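The hysteresis step of the Canny edge detector described above, with its upper and lower thresholds, can be sketched in pure Python as follows. This is an illustrative sketch and not the paper's code: in OpenCV the whole detector is available as `cv2.Canny`, and the paper does not report its threshold values. The function name `hysteresis`, the 8-neighbor connectivity, the inclusive comparisons and the sample values are assumptions.

```python
from collections import deque


def hysteresis(magnitude, low, high):
    """Keep strong edge pixels and the weak pixels connected to them.

    `magnitude` is a list of rows of gradient magnitudes. Pixels at or above
    `high` are strong edges. Pixels at or above `low` are kept only if they
    are 8-connected, directly or through other kept pixels, to a strong edge.
    """
    height, width = len(magnitude), len(magnitude[0])
    edges = [[0] * width for _ in range(height)]
    queue = deque()

    # Seed the search with every strong pixel.
    for y in range(height):
        for x in range(width):
            if magnitude[y][x] >= high:
                edges[y][x] = 1
                queue.append((y, x))

    # Grow edges outward through neighboring weak pixels.
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (
                    0 <= ny < height
                    and 0 <= nx < width
                    and not edges[ny][nx]
                    and magnitude[ny][nx] >= low
                ):
                    edges[ny][nx] = 1
                    queue.append((ny, nx))
    return edges


# One row of hypothetical gradient magnitudes: the weak pixels next to the
# strong one are kept, while the isolated weak pixel at the end is discarded.
row_edges = hysteresis([[0, 60, 120, 60, 0, 60]], low=50, high=100)
```

Using two thresholds in this way suppresses isolated noise responses while keeping faint segments that continue a clearly detected edge.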

Figure 1. The flowchart of image preprocessing for intelligent defect detection system

Figure 2. The architecture of convolutional neural network for intelligent defect detection system

Figure 4. The overall architecture of GAN-Based defect detection system

Figure 5. The flowchart of testing procedures for defect detection system

Figure 6. The result of pre-processing image

Figure 7. The process of GAN-Based samples maker

development and other applications. (1) The use of intelligent machine learning technology will make Industry 4.0 technology more sophisticated. (2) Machine learning applications will advance the development of the equipment industry. (3) Machine learning will increase the economic output and productivity of countries with aging populations.

Table 1. The configurations for convolutional neural network

Table 3. The results of accuracy with or without data augmentation by GAN-Based samples maker