Research on target location of unmanned aerial vehicles on parallel paths

This paper studies how to locate targets with a stereo vision system that combines monocular vision with parallel-path search. While an unmanned aerial vehicle (UAV) searches the mission area along a parallel path, an SSD image detection algorithm based on deep learning detects and identifies targets in the area. The imaging coordinates are calculated inversely from the pixel coordinates fed back by machine vision. An auxiliary coordinate system is established according to the angular relationship between the track line of the parallel path and the basic coordinate system. Combining this positional relationship with the UAV's heading information, the transformation of the target position between the imaging coordinate system and the auxiliary coordinate system is solved with a direction cosine matrix. Together with the UAV's own coordinates, the position of the target point in the basic coordinate system is finally obtained through three coordinate conversions. To avoid the error of a single calculation of the target coordinates, a weighted average is applied. Without changing the search route of the parallel path, the target location function is preliminarily realized through the inverse solution and the weighted averaging of the target coordinates.


Introduction
At present, target location is applied extensively across many fields, and machine vision is widely used to implement it in all walks of life [1][2][4]. Compared with multi-camera vision, monocular vision has received less research attention for static target location. Target location with monocular vision has many limitations, but it also has outstanding advantages, such as low computational cost, high speed and economy. Monocular vision can locate static targets through a stereo vision algorithm. A machine-vision target location algorithm is evaluated mainly in three aspects: stability, location accuracy and location time cost. When a stereo vision algorithm is used for location, it is easily affected by noise, illumination changes and other factors, which degrade system stability; at the same time, heavy computation and long run times cannot meet real-time location requirements. Therefore, this paper takes a new approach to target location, constructing a stereo vision system from monocular vision, and uses this system, which combines monocular vision with parallel-path search, to locate the target.

Platform selection
The SSD image detection algorithm based on deep learning is a one-stage method [4][5]. Its design idea is to perform dense sampling at different scales and locations on the target image according to certain characteristics, and then to use a CNN for classification and regression [5][6][7]. Because it is a one-stage method, it has the advantage of higher speed. However, uniform dense sampling on the images is relatively difficult because the samples and backgrounds are unevenly distributed.
The SSD algorithm is used to detect and identify the target, and multi-scale feature maps of the CNN network are used for detection [6][8]. Its basic architecture is shown in Fig.1. A data set is made based on the target style, and labels are added to the training images with the labelImg tool; the four files under the Main folder — test.txt, train.txt, val.txt and trainval.txt — are then generated by code. The implementation of SSD on TensorFlow is based on the TensorFlow version by balancap (https://github.com/balancap/SSD-Tensorflow), which is used to implement the inference process of the SSD method.
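The generation of the four Main-folder files can be sketched as follows. This is a minimal sketch, not the authors' actual code: the directory layout, split ratios, random seed and helper name are all illustrative assumptions following the VOC-style convention.

```python
import os
import random

def write_split_files(image_dir, main_dir, train_ratio=0.8, val_ratio=0.1, seed=0):
    """Split annotated image IDs into the four Main-folder files
    (train.txt, val.txt, trainval.txt, test.txt), VOC style."""
    ids = sorted(os.path.splitext(f)[0] for f in os.listdir(image_dir)
                 if f.lower().endswith(('.jpg', '.png')))
    random.Random(seed).shuffle(ids)          # reproducible shuffle
    n_train = int(len(ids) * train_ratio)
    n_val = int(len(ids) * val_ratio)
    splits = {
        'train.txt': ids[:n_train],
        'val.txt': ids[n_train:n_train + n_val],
        'trainval.txt': ids[:n_train + n_val],  # train + val together
        'test.txt': ids[n_train + n_val:],
    }
    os.makedirs(main_dir, exist_ok=True)
    for name, subset in splits.items():
        with open(os.path.join(main_dir, name), 'w') as f:
            f.write('\n'.join(subset))
    return {k: len(v) for k, v in splits.items()}
```

With a default 80/10/10 split, trainval.txt is simply the concatenation of the train and val ID lists.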

Pixel coordinate conversion
This work implements SSD300. Its output contains both actual targets and phantom targets; in this paper we ignore the phantom targets and discuss in detail the positioning of the actual targets on the parallel path. For each detected actual target, we obtain the pixel coordinates of its center point, together with a height value and a width value, both expressed in pixel coordinates.
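As an illustration, assuming the detector returns boxes in the normalized (ymin, xmin, ymax, xmax) form, the center point and the pixel-coordinate width and height can be recovered as follows; the function name and the box convention are our assumptions, not stated in the paper.

```python
def box_to_center(box, img_w, img_h):
    """Convert a detection box in normalized (ymin, xmin, ymax, xmax)
    form to a pixel-coordinate center point plus width and height."""
    ymin, xmin, ymax, xmax = box
    cx = (xmin + xmax) / 2.0 * img_w   # center column, in pixels
    cy = (ymin + ymax) / 2.0 * img_h   # center row, in pixels
    w = (xmax - xmin) * img_w          # target width, in pixels
    h = (ymax - ymin) * img_h          # target height, in pixels
    return cx, cy, w, h
```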
In this paper, we assume that the flight attitude has no impact on the imaging, and that the camera is attached to the unmanned aerial vehicle (UAV) through a gimbal pointing vertically downward while it searches along a parallel path. The target pixel coordinates are extracted and converted into imaging coordinates. The camera imaging coordinate system takes the center of the image as its origin, while the coordinate origin used during target detection is the upper-left corner of the image, so there is a displacement between the origin of the pixel coordinates and the origin of the UAV camera imaging coordinates, that is

$x' = x - k_x, \quad y' = k_y - y$

Here, pixel coordinates can be simply understood as imaging coordinates: during target detection the images are compressed, restored and then represented in pixel coordinates. $k_x$ is half of the image length of the camera, and $k_y$ is half of the image width of the camera.
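The shift between the two origins can be sketched as follows; the function name is ours, and the sign flip on y assumes the imaging y-axis points upward while the pixel row index grows downward.

```python
def pixel_to_imaging(u, v, img_w, img_h):
    """Shift pixel coordinates (origin: upper-left corner) to imaging
    coordinates (origin: image center), with k_x = img_w/2, k_y = img_h/2."""
    k_x = img_w / 2.0
    k_y = img_h / 2.0
    x = u - k_x        # x' = x - k_x
    y = k_y - v        # y' = k_y - y (pixel rows grow downward)
    return x, y
```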

UAV direction determination
During the parallel-path search, the camera collects image data and the onboard computer detects the target by running the SSD target detection program. When a target is found, the flight parameters of the UAV are first read to obtain its real-time coordinates. The UAV carries out the search along the parallel path.
During the search, the flight direction of the UAV alternates between two relatively fixed directions: along the parallel path towards the positive x-axis and along the parallel path towards the negative x-axis. Here we specify that the direction along the parallel path towards the positive x-axis is positive, and the direction towards the negative x-axis is negative. When the parallel path is perpendicular to the x-axis there is no incremental change along the x-axis; in that case the direction towards the positive y-axis is taken as positive and the direction towards the negative y-axis as negative.
The last two coordinate points of the UAV, $(x_{n-1}, y_{n-1})$ and $(x_n, y_n)$, are acquired, and the increments $k_0 = x_n - x_{n-1}$ and $k_1 = y_n - y_{n-1}$ are computed: the flight direction is positive when $k_0 > 0$, or when $k_0 = 0$ and $k_1 > 0$, and negative otherwise.
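The direction rule above can be sketched as follows (helper name and tuple representation are assumed):

```python
def flight_direction(p_prev, p_curr):
    """Determine the flight direction from the last two UAV coordinates.
    Returns +1 (positive) or -1 (negative) per the sign convention:
    k0 = dx decides; if the path is perpendicular to the x-axis
    (k0 == 0), k1 = dy decides."""
    k0 = p_curr[0] - p_prev[0]
    k1 = p_curr[1] - p_prev[1]
    if k0 > 0 or (k0 == 0 and k1 > 0):
        return 1
    return -1
```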

Coordinate conversion
The target location solution at the detection moment $T_{find}$ mainly includes the conversion from pixel coordinates to imaging coordinates, from imaging coordinates to auxiliary coordinates, and from auxiliary coordinates to basic coordinates. First, we set up the auxiliary coordinate system, taking the current coordinate point of the UAV as the origin and axes $x'$ and $y'$ parallel to the $x$-axis and $y$-axis. There is a translation relationship between the auxiliary coordinate system $o'x'y'$ and the basic coordinate system, that is

$x = x' + x_u, \quad y = y' + y_u$

where $(x_u, y_u)$ is the current coordinate of the UAV in the basic coordinate system.

Construct auxiliary coordinate system
Construct the auxiliary coordinate system $i$ and the imaging coordinate system $j$ so that they are right-handed coordinate systems $o'x'y'z'$ and $o''x''y''z''$ with the same origin. Assume there is a target point P whose coordinates in the auxiliary coordinate system and the imaging coordinate system are $(x', y', z')$ and $(x'', y'', z'')$, respectively. The coordinate transformation of the target point P from the imaging coordinate system to the auxiliary coordinate system is essentially the vector sum that projects the components of the imaging coordinate system onto the auxiliary coordinate system. The formula can be simplified as

$$C_j^i = \begin{bmatrix} \cos(x',x'') & \cos(x',y'') & \cos(x',z'') \\ \cos(y',x'') & \cos(y',y'') & \cos(y',z'') \\ \cos(z',x'') & \cos(z',y'') & \cos(z',z'') \end{bmatrix}$$

The above equation is the rotation matrix in the general form of the transformation from the imaging coordinate system $j$ to the auxiliary coordinate system $i$. With respect to the auxiliary coordinate system, $C_j^i$ describes the attitude of the imaging coordinate system and is mathematically called the attitude matrix. Since all elements of the attitude matrix are cosines of the angles between the coordinate axes, it is also called the direction cosine matrix.
When the direction of the UAV nose is positive, that is, when $k_0 > 0$, or $k_0 = 0$ and $k_1 > 0$, the UAV imaging coordinate system is obtained by rotating the auxiliary coordinate system by a fixed angle $\psi$ around the $z$-axis, and the direction cosine matrix simplifies to

$$C_j^i = \begin{bmatrix} \cos\psi & -\sin\psi & 0 \\ \sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

That is, there is a definite coordinate transformation relationship between the imaging coordinate system and the basic coordinate system. When the direction of the UAV nose is negative, that is, when $k_0 < 0$, or $k_0 = 0$ and $k_1 < 0$, the UAV imaging coordinate system should be regarded as the auxiliary coordinate system rotated by the fixed angle $\pi + \psi$ around the $z$-axis, which gives

$$C_j^i = \begin{bmatrix} \cos(\pi+\psi) & -\sin(\pi+\psi) & 0 \\ \sin(\pi+\psi) & \cos(\pi+\psi) & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

In particular, when $\psi = 0$ and the nose direction is positive, there is no rotation between the imaging coordinate system and the auxiliary coordinate system; when $\psi = 0$ and the nose direction is negative, the angle between the imaging coordinate system and the auxiliary coordinate system is 180 degrees.
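The chain from imaging coordinates to basic coordinates can be sketched as follows. This is a minimal 2-D sketch under stated assumptions: the target point is assumed to be already scaled from pixels to ground units, psi is the fixed mounting angle (0 in the simplest case), and the rotation sign convention is one of the two possible choices.

```python
import math

def imaging_to_basic(p_img, uav_xy, direction, psi=0.0):
    """Rotate a target point from imaging to auxiliary coordinates about
    the z-axis, then translate by the UAV position into basic coordinates.
    direction: +1 or -1 per the nose-direction rule; a negative direction
    adds pi to the fixed angle psi, as in the text."""
    angle = psi if direction > 0 else math.pi + psi
    c, s = math.cos(angle), math.sin(angle)
    x_img, y_img = p_img
    # imaging -> auxiliary (rotation about the z-axis)
    x_aux = c * x_img - s * y_img
    y_aux = s * x_img + c * y_img
    # auxiliary -> basic (translation by the UAV coordinates)
    return x_aux + uav_xy[0], y_aux + uav_xy[1]
```

With psi = 0, a positive direction leaves the point unrotated and a negative direction mirrors it through the UAV position, matching the 180-degree case above.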

Weighted average of the target position
The UAV carries out the target search according to the parallel-path plan. When a target is found, a coordinate value of the target point is preliminarily obtained through the calculation above. During flight, weather conditions have a certain influence on the flight attitude. To reduce the influence of UAV flight attitude on target positioning accuracy, we employ a weighted average of the target position. While detecting the target, the UAV continuously checks the target status in every frame and calculates the target location. Each calculated result is compared with the target information already held in the stack, and a weighted average is made after the same-target decision condition is satisfied, that is

$\lvert t - t_0 \rvert \le c\,(r + m)$

where $t$ is the stored target position, $t_0$ is the newly calculated position, $r$ is the target radius, $m$ is the average target positioning error, and the decision coefficient $c$ can be modified according to the weather situation. After a detection is determined to be the same target, the weighted average of the target position coordinates is carried out. The specific method is to assign a weight value to each piece of position information; the weight of a single piece of target position information is $x_0$.
When a detection is determined to be the same target, each time the position information is averaged, let the position information of the target in the stack be $t$ with weight $x$, and the newly received position information be $t_0$ with weight $x_0$. The weighted result stored back in the stack is

$t' = \dfrac{x\,t + x_0\,t_0}{x + x_0}$

and the weight in the stack increments by the weight of the newly received position information, $x' = x + x_0$. As the target detection program runs, the position information in the stack becomes more accurate as more frames are processed.
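The stack update can be sketched as follows (names and the 2-D tuple representation are our assumptions):

```python
def update_position(stack_pos, stack_weight, new_pos, new_weight):
    """Weighted-average update of a stored target position.
    The stored weight increments by the weight of the new observation,
    so later frames refine rather than replace the estimate."""
    w = stack_weight + new_weight
    x = (stack_weight * stack_pos[0] + new_weight * new_pos[0]) / w
    y = (stack_weight * stack_pos[1] + new_weight * new_pos[1]) / w
    return (x, y), w
```

Applied repeatedly with equal per-frame weights, this yields the running mean of all accepted observations.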

Summary and outlook
This paper completes the process of UAV target detection and recognition as well as target location. By constructing a training data set, the SSD deep learning algorithm is used to process the image information captured by the camera, obtain the target pixel coordinates and convert them into imaging coordinates. Combining the UAV position transmitted by the flight controller with the angular relationship between the parallel search path and the basic axes, and using the direction cosine matrix to implement the coordinate transformations, target location is preliminarily achieved. In addition, the error of a single calculation is reduced through the weighted average of the target location, making the target coordinates more accurate.