Fusion of Multi-Vision of Industrial Robot in MAS-Based Smart Space

Abstract. This paper presents a fusion method for the multi-vision system of an industrial robot in a smart space based on a multi-agent system (MAS). The robotic multi-vision system consists of top-view, side-view, front-view and hand-eye cameras: the moving hand-eye camera provides visual guidance and estimates the robot position, while the other three cameras are used for target recognition and positioning. Each camera is connected to an agent running on an image-processing computer, which analyzes images rapidly to satisfy the real-time requirements of data processing. As the learning strategy of the robotic vision, a three-layer back-propagation neural network (BPNN) is first constructed for each agent and trained independently as a target-recognition classifier using the batch gradient descent method, based on the region features extracted from images of target samples (typical mechanical parts). The outputs of the trained BPNNs in the MAS-based smart space are then fused with Dempster-Shafer (D-S) evidence theory to form a final recognition decision. Experimental results on typical mechanical parts show that the fusion of multi-vision improves the accuracy of robotic vision, and that the MAS-based smart space contributes to the parallel processing of the immense image data in a robotic multi-vision system.


INTRODUCTION
At present, the majority of industrial robots still rely on manual teaching in practical applications. Adding vision to a robot is becoming a trend in intelligent robotics, and robotic vision with a monocular or binocular camera has been widely used in intelligent robot systems. However, one or two cameras cannot deal with complex vision tasks such as workpiece recognition and positioning, visual servoing, robot tracking, and multi-robot cooperation. Using multiple cameras is an effective way to handle the complex vision tasks of an industrial robot [1][2][3], but a subsequent problem of multi-vision is the immense volume of image data, which may be too large for one computer to process. In this paper, a multi-agent system (MAS) based on a computer network is adopted for the image data processing of the multi-vision task, in which each agent is a computer connected to a camera and has independent image processing capability. The MAS forms a smart space surrounding the robot and provides parallel processing of the multi-vision information. Several models exist for the communication and cooperation of agents in a MAS [4][5]; here a blackboard system from artificial intelligence is used for this purpose, and we put the emphasis on the learning and fusion strategy of multi-vision in the industrial robot application.
It is necessary to design a learning strategy for each agent oriented to robotic vision. The BPNN has been widely used for many years owing to its adaptability and flexibility [6][7]; it is therefore chosen as the learning strategy of the agents in the MAS-based robotic vision system in this paper. As an input of a BPNN, each target image is characterized by converting its pixel matrix into a vector, after resizing from its normal size to a small size to reduce the dimensionality of the feature space.
A typical BPNN structure is as follows. For each BPNN, an input vector of the input layer, x = {x1, x2, …, xn1}^T, stands for a target image whose columns are stacked into one vector; y = {y1, y2, …, yn2}^T is the output vector of the hidden layer, and z = {z1, z2, …, zn3}^T is the output vector of the BPNN, where n1, n2 and n3 denote the neuron numbers of the input, hidden and output layers respectively. IW is the matrix of weights between the input layer and the hidden layer, and HW is the matrix of weights between the hidden layer and the output layer. To improve learning efficiency, batch training with N target images is used during BPNN training, and all BPNN computations are carried out as matrix operations instead of element operations. The training procedure of the BPNN for target image recognition is as follows.

Feedforward computing
All columns of each target image are stacked into a single vector, and the vectors of N different target images are arranged to form a matrix X of size n1 × N. To account for the neuron thresholds, the matrix is extended to size (n1+1) × N by prepending a first row of −1 values. The weighted sum of the extended X at the hidden layer is given by

U = IW · X,    (1)

and the hidden-layer output is obtained with the element-wise sigmoid function f:

Y = f(U) = 1 / (1 + exp(−U)).    (2)

Similar to the hidden layer, Y is extended with a first row of −1 values, and for the output layer we have

V = HW · Y,    (3)
Z = f(V) = 1 / (1 + exp(−V)).    (4)
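As a concrete illustration, the batch feedforward pass above can be sketched in NumPy. This is a sketch under the stated assumptions (sigmoid activations, a prepended row of −1 for the thresholds); the names `feedforward`, `Xe` and `Ye` are ours, not the paper's:

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def feedforward(X, IW, HW):
    """Batch feedforward pass of the 3-layer BPNN.
    X  : n1 x N matrix, one stacked target image per column.
    IW : n2 x (n1+1) input-to-hidden weights (first column = thresholds).
    HW : n3 x (n2+1) hidden-to-output weights (first column = thresholds).
    """
    N = X.shape[1]
    Xe = np.vstack([-np.ones((1, N)), X])   # prepend row of -1 for thresholds
    Y = sigmoid(IW @ Xe)                    # hidden-layer output, n2 x N (Eqs. 1-2)
    Ye = np.vstack([-np.ones((1, N)), Y])   # extend Y the same way
    Z = sigmoid(HW @ Ye)                    # network output, n3 x N (Eqs. 3-4)
    return Xe, Y, Ye, Z
```

Because the batch is kept as a matrix, one call processes all N images at once, which matches the matrix-operation training style described above.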

Back-propagation computing
Let D stand for the desired output matrix of the network. From the feedforward computing result, the error of the BPNN output is given by

E_Z = D − Z.    (5)

The local gradient of the output layer, GZ, can be computed by Eq. 6, where the notation .* stands for the element-by-element product of matrices:

GZ = E_Z .* Z .* (1 − Z).    (6)

Deleting its first column (the threshold column), the weight matrix of the hidden layer HW is changed into HW′:

HW′ = HW with its first column deleted.    (7)

The local gradient of the hidden layer, GY, is then given by

GY = (HW′^T · GZ) .* Y .* (1 − Y).    (8)

The weights of the hidden layer are updated with the batch gradient descent method using learning rate r:

HW = HW + r · GZ · Y^T,    (9)

and the weights of the input layer are updated likewise:

IW = IW + r · GY · X^T,    (10)

where Y and X in Eqs. 9 and 10 are the extended matrices with the first row of −1 values.
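The back-propagation equations map directly onto matrix code. The sketch below performs one batch gradient-descent update; it assumes sigmoid activations and the −1 threshold row, and the function name `backprop_step` and default learning rate are illustrative choices:

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def backprop_step(X, D, IW, HW, r=0.2):
    """One batch gradient-descent update of the 3-layer BPNN.
    X : n1 x N input batch, D : n3 x N desired outputs.
    Returns the updated weights and the pre-update output Z.
    """
    N = X.shape[1]
    Xe = np.vstack([-np.ones((1, N)), X])    # extended input
    Y = sigmoid(IW @ Xe)                     # hidden output
    Ye = np.vstack([-np.ones((1, N)), Y])    # extended hidden output
    Z = sigmoid(HW @ Ye)                     # network output
    EZ = D - Z                               # output error (Eq. 5)
    GZ = EZ * Z * (1.0 - Z)                  # output-layer local gradient (Eq. 6)
    GY = (HW[:, 1:].T @ GZ) * Y * (1.0 - Y)  # hidden-layer gradient (Eqs. 7-8)
    HW = HW + r * GZ @ Ye.T                  # hidden-layer weight update (Eq. 9)
    IW = IW + r * GY @ Xe.T                  # input-layer weight update (Eq. 10)
    return IW, HW, Z
```

Note that the threshold column of HW is dropped (`HW[:, 1:]`) before back-propagating the output gradient, exactly as Eq. 7 prescribes.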

Judgement of BPNN training end
The sum of squared errors between the actual output of the BPNN and the desired output,

E = Σ_i Σ_j (D_ij − Z_ij)²,    (11)

is used to evaluate the training performance of the BPNN under the batch gradient descent method. The training process ends when E falls below a given threshold value E_t, for example E_t = 0.0001.
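Putting the feedforward and back-propagation steps together, a minimal training loop that stops when E < E_t (Eq. 11) might look like the following. The hidden-layer size, learning rate, epoch limit and initialization range here are illustrative choices, not values given in the paper:

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def train_bpnn(X, D, n2, r=0.5, E_t=1e-4, max_epochs=20000, seed=0):
    """Train a 3-layer BPNN by batch gradient descent until E < E_t."""
    rng = np.random.default_rng(seed)
    n1, N = X.shape
    n3 = D.shape[0]
    IW = rng.uniform(-0.5, 0.5, (n2, n1 + 1))  # input-to-hidden weights
    HW = rng.uniform(-0.5, 0.5, (n3, n2 + 1))  # hidden-to-output weights
    Xe = np.vstack([-np.ones((1, N)), X])      # extended input batch
    E = np.inf
    for _ in range(max_epochs):
        Y = sigmoid(IW @ Xe)
        Ye = np.vstack([-np.ones((1, N)), Y])
        Z = sigmoid(HW @ Ye)
        E = np.sum((D - Z) ** 2)               # performance measure (Eq. 11)
        if E < E_t:                            # training-end judgement
            break
        GZ = (D - Z) * Z * (1.0 - Z)
        GY = (HW[:, 1:].T @ GZ) * Y * (1.0 - Y)
        HW += r * GZ @ Ye.T
        IW += r * GY @ Xe.T
    return IW, HW, E
```

The same loop is run independently for each agent's BPNN, which is what allows the MAS to train the classifiers in parallel.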

Fusion of Multi-Vision Based on D-S evidence theory
Considering the fusion of the top-view, side-view and front-view images from the multi-vision sensors, three BPNNs are constructed, each composed of an input layer, a hidden layer and an output layer [6,7]. Their outputs are treated as bodies of evidence and combined with Dempster's rule, in which the coefficient K measures the degree of conflict between the different bodies of evidence. In this paper six target classes are discussed; suppose the class space is Ω = {link rod, piston, gear, bolt, hexagon nut, other}. The output vectors of the three BPNNs, corresponding to the top-view, side-view and front-view vision respectively, serve as the bodies of evidence to be combined.
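When each BPNN output is normalized into a basic probability assignment (BPA) over the six singleton classes (a common simplification of D-S theory, assumed here), Dempster's rule of combination reduces to the sketch below. The camera output vectors in the example are made-up values for illustration only:

```python
import numpy as np

FRAME = ["link rod", "piston", "gear", "bolt", "hexagon nut", "other"]

def normalize_bpa(z):
    """Normalize a BPNN output vector into a BPA over the six classes."""
    z = np.asarray(z, dtype=float)
    return z / z.sum()

def dempster_combine(m1, m2):
    """Dempster's rule for two BPAs restricted to singleton hypotheses.
    K measures the degree of conflict between the bodies of evidence."""
    agreement = float(np.dot(m1, m2))   # mass assigned to compatible classes
    K = 1.0 - agreement                 # conflicting mass
    fused = m1 * m2 / (1.0 - K)         # normalized combined BPA
    return fused, K

# Hypothetical outputs of the top-, side- and front-view BPNNs:
m = normalize_bpa([0.80, 0.05, 0.05, 0.04, 0.03, 0.03])
for z in ([0.60, 0.20, 0.05, 0.05, 0.05, 0.05],
          [0.70, 0.10, 0.05, 0.05, 0.05, 0.05]):
    m, K = dempster_combine(m, normalize_bpa(z))
decision = FRAME[int(np.argmax(m))]     # final recognition decision
```

Combining the three views sequentially is valid because Dempster's rule is associative; the class with the largest fused mass is taken as the final recognition decision.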

Experimental Results of Multi-Vision Fusion of Robot in MAS-Based Space
Target classes = {link rod, piston, gear, bolt, hexagon nut, other}. Test samples are acquired from images of mechanical parts captured from the top-view, side-view and front-view directions, and the region features are extracted after morphological pre-processing such as dilation and erosion, as shown in Fig. 4. The recognition results on the test samples are listed in Table 1. The results show that the recognition accuracy of the multi-vision fusion proposed in this paper is higher than that of any single BPNN, which helps to improve the accuracy of robotic vision: the error ratio of the fused decision is 2.5%, against error ratios of 7.5%, 17.5% and 15% for the three single-view BPNNs.

Conclusions
In a MAS-based smart space, multi-vision fusion for an industrial robot has been realized using D-S evidence theory. The BPNN classifier of each agent is trained independently to recognize the various targets (mechanical parts) from different directions: the region features of the top-view, side-view and front-view images are input to their respective BPNNs, and the outputs of the three BPNNs are fused with D-S theory to produce a final recognition decision. Experimental results of target recognition on a variety of mechanical parts have shown that the recognition accuracy of the proposed fusion method is higher than that of the conventional single-BPNN method and is robust to degraded target images, and that the MAS-based smart space contributes to the parallel processing of the immense image data in a robotic multi-vision system.

Acknowledgment
This work is financially supported by the Guangdong Natural Science Foundation (No. S2012010010265) and the Jiangmen Science and Technology Bureau (No.20140060 117111).