A time-encoded approach to detect cooperative targets for UAV landing

To detect visible light cooperative targets during UAV landing, we use a time relay to encode the intensity of the lamps. The change pattern of the lamp intensity is presented in this paper. The frames with strong and weak lamp light are called strong frames and weak frames respectively. In continuous sequence images, we take weak frames as background and subtract the background from strong frames. Geometric transformations are considered when subtracting the background, and a floating threshold value is used to threshold the difference images. We also design several geometric constraints to eliminate interference points. The approaches to find the first weak frame and to track lamps in sequence images are also presented. The experiments show that our method is robust to scale change, interference points and moving objects in the background.


Introduction
The Unmanned Aerial Vehicle (UAV) takes on many military and civilian missions, such as early warning, reconnaissance, surveillance, communication, relay, target indication and damage assessment, which makes it an important tool in modern naval warfare [1]. Because of the complex landing environment and the limited space, UAV landing is one of the most dangerous phases in a flight mission. To solve this problem, several schemes have been proposed, such as net-catch landing, parachute landing, GPS-based landing, IMU-based landing and manual remote control landing [3][4][5]. Among these proposals, GPS navigation is likely to be disabled in wartime, and it is difficult for the other proposals to guarantee security, accuracy, speed and autonomy.
Over the last decades, computer vision has been widely used in UAV navigation and landing, due to the development of videometrics and image processing. Vision navigation acquires the image of the ideal landing site through an airborne camera, detects and recognizes the cooperative targets, and obtains the relative position and attitude between the UAV and the world coordinate system through coordinate transformation. The accuracy and precision of the detection of the cooperative targets are very important in vision-based UAV landing. Although a number of cooperative target schemes have been proposed, infrared lamps are the most popular cooperative targets for fixed wing UAV landing. Gui et al. [2] presented a UAV landing system which consisted of four infrared lamps and an airborne camera with a 940 nm optical filter, and detected the centers of the infrared lamps using the NLOG operator. In this method, the measurement range of the distance between the UAV and the ideal landing point was about 450 m to 5 m. Hao et al. [6] proposed a similar scheme, which consisted of five infrared lamps and an 850 nm optical filter. In their method, the lamps can be detected as far as 80 m away. Using infrared lamps and an optical filter, cooperative target detection is robust to changes in illumination, viewpoint, distance and background. However, an IR filter has to be installed in front of the airborne camera, so that the camera is only sensitive to rays of a specific wavelength. For most UAVs, it is very hard to add a new camera. So, how to use visible light lamps as cooperative targets and how to detect them are valuable engineering problems.
e-mail: wangzitju@163.com
In airborne vision-based landing, visible light lamp detection is different from moving object detection and salient object detection. In moving object detection [7], the background is still or dynamic while the object has a special motion mode; in salient object detection [8], the object is non-cooperative. Compared with the above two tasks, cooperative visible lamp detection has three characteristics:
• The lamps are stationary relative to the background.
• The camera is moving, and the scale becomes larger as the UAV approaches the ideal landing site.
• The spatial position of the lamps is fixed, but the temporal intensity can be adjusted.
In this paper, we present a time-encoded visible lamp detection approach for UAV landing. In our method, the intensity of all lamps in continuous video frames changes at a fixed frequency. The video frames with higher lamp intensity are called strong frames, while those with lower lamp intensity are called weak frames. Because the frequencies of the lamp intensity and the camera video are fixed, the strong frames and the weak frames appear together at a fixed frequency. We get the background from the weak frames, and remove the background in strong frames using the difference image. Considering the disturbance points and the scale change in the video during UAV landing, we design several geometric constraints and a compensation method. Finally, the effectiveness of the approach is verified by experiments.
MATEC Web of Conferences 139, 00049 (2017)

Structure of Proposed Approach
The equipment used in our approach mainly consists of a camera, four visible light lamps and a digital time relay. The parameters of the lamps are shown in Table 1. It should be noted that the visible distance is determined by the camera's focal length and aperture. The real lamp is shown in Figure 1.
We use a time relay to blink the four lamps synchronously at 5 Hz. The frame rate of the video is 25 Hz. Thus a weak frame should appear every 5 frames. We denote by $F$ the set of a period of continuous video frames, by $I_i$ ($1 \le i \le M$) the $i$-th frame, and by $I_{i_0}$ the first weak frame in $F$. It should be noted that $1 \le i_0 \le 5$ and all $I_{i_0+5n}$ ($n \in \mathbb{N}$) are weak frames. The basic idea of this paper is to get the background from the weak frames and subtract it from the following strong frames to get the positions of the lamps. So the first task is to find $i_0$, in other words, the first weak frame.
We assume that the backgrounds in adjacent frames are similar. The assumption is tenable, because the camera images the same scene during UAV landing. But since the camera is moving, adjacent frames cannot have exactly the same background. We therefore refine the assumption: there are translation and scale transformations of the background between adjacent frames. The reasons why we do not use affine or projective transformations are:
• affine and projective transformations have too many parameters;
• it is difficult to get many stable feature points during UAV landing, because of the spurious feature points of the ideal landing site;
• the dissimilarity between adjacent frames is relatively small as long as the UAV's attitude does not change drastically.
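The frame timing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: frames are indexed from 1, the index of the first weak frame is already known, and the helper names `is_weak_frame` and `background_index` are hypothetical rather than taken from our implementation.

```python
FRAME_RATE_HZ = 25  # camera frame rate given in the text
BLINK_RATE_HZ = 5   # lamp blink rate given in the text
PERIOD = FRAME_RATE_HZ // BLINK_RATE_HZ  # one weak frame every 5 frames


def is_weak_frame(i: int, i0: int) -> bool:
    """Return True if frame i (1-based) is a weak frame, given the first weak frame i0."""
    return i >= i0 and (i - i0) % PERIOD == 0


def background_index(i: int, i0: int) -> int:
    """Index of the weak frame that serves as background for strong frame i.

    Hypothetical helper: it returns the closest weak frame preceding frame i.
    """
    if i <= i0 or is_weak_frame(i, i0):
        raise ValueError("frame i must be a strong frame after the first weak frame")
    return i - (i - i0) % PERIOD
```

With `i0 = 2`, frames 2, 7, 12 and so on are weak, and strong frames 8 to 11 all use frame 7 as their background.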
The second task is to estimate the transformation from the weak frame $I_{i_0+5n}$ to each of the following four strong frames $I_{i_0+5n+j}$ ($j = 1, 2, 3, 4$). The transformation matrix is denoted as $S_n^j$, where $n = 0, 1, 2, 3, \ldots$ and $j = 1, 2, 3, 4$. Then the transformed weak frame $I'_{i_0+5n}$ can be obtained using the geometric transformation:
$$I'_{i_0+5n} = T(I_{i_0+5n}, S_n^j) \qquad (1)$$
where $T(\cdot)$ is the geometric transformation. The background can then be removed from the strong frames using the difference image:
$$D_i = I_i - I'_{i-1} \qquad (2)$$
where $D_i$ is the difference of $I_i$ and $I'_{i-1}$. Then we threshold the difference image, extract the 8-connected regions and calculate their centers. The binary image of $D_i$ is denoted as $B_i$. We take all the centers as candidates for the lamp centers. Finally, several geometric constraints on the relative positions of the lamps are presented, to determine the lamp positions from the candidates. We adopt a floating threshold when thresholding the difference image:
• if there are too many candidates, we increase the threshold value;
• if there is no combination of four centers satisfying the constraints, we decrease the threshold value.
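The floating threshold rule can be sketched as a simple search loop. This is a minimal illustration, not our actual implementation: the callbacks `detect_candidates` and `find_valid_combination` are hypothetical placeholders for the thresholding and constraint-checking steps, and the start value, step, bounds and candidate limit are assumed values chosen only for the example.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def floating_threshold_search(
    detect_candidates: Callable[[int], List[Point]],
    find_valid_combination: Callable[[Sequence[Point]], Optional[Sequence[Point]]],
    start: int = 40,          # assumed initial threshold
    step: int = 5,            # assumed adjustment step
    low: int = 5,             # assumed lower bound
    high: int = 250,          # assumed upper bound
    max_candidates: int = 12  # assumed limit for "too many candidates"
) -> Optional[Sequence[Point]]:
    """Adjust the threshold until four centers satisfy the constraints."""
    threshold = start
    visited = set()  # stops the loop if the threshold starts to oscillate
    while low <= threshold <= high and threshold not in visited:
        visited.add(threshold)
        candidates = detect_candidates(threshold)
        if len(candidates) > max_candidates:
            threshold += step  # too many candidates: raise the threshold
            continue
        lamps = find_valid_combination(candidates)
        if lamps is not None:
            return lamps
        threshold -= step      # no valid combination: lower the threshold
    return None
```

The `visited` set is a design choice of this sketch: without it, a frame that alternates between "too many candidates" and "no valid combination" would loop forever.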
The work flow of the proposed method is shown in Figure 3.

Principle and technical implementation
As mentioned above, several tasks have to be tackled to implement the proposed method. In this section, we present the approaches to these problems.

Estimation of transformation
We take each weak frame as the background of the next four strong frames. Because of the motion of the camera, the transformations between weak and strong frames need to be estimated. During UAV landing, only the image of the scene around the ideal landing site or the lamps is stable, because the rest of the background is dynamic. Naturally, we should only use the image of the ideal landing site or the scene around the lamps to estimate the transformation.
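For a long time, SIFT+BFMatch+RANSAC has been the most widely used method for estimating the geometric transformation between image pairs, and it is a general approach. Recently, a new method for feature matching in video applications, Grid-based Motion Statistics (GMS) [9], has been proposed. GMS uses ORB features and provides a real-time, ultra-robust correspondence system. We use ORB+BFMatch+GMS to detect and match the feature points in weak and strong frames. We denote the sets of matched feature points in the weak and strong frames as $\{(x_i, y_i)^T\}$ and $\{(x'_i, y'_i)^T\}$ ($i = 1, \ldots, N$) respectively. The transformation matrix from the weak frame to the strong frame is given as:
$$S = \begin{pmatrix} s & 0 & d_x \\ 0 & s & d_y \\ 0 & 0 & 1 \end{pmatrix} \qquad (3)$$
where $s$ is the scale factor while $d_x$ and $d_y$ are the translation factors. So, we get the equation:
$$\begin{pmatrix} x'_i \\ y'_i \\ 1 \end{pmatrix} = S \begin{pmatrix} x_i \\ y_i \\ 1 \end{pmatrix} \qquad (4)$$
For $N$ pairs of matched points, we actually have $2N$ equations:
$$\begin{pmatrix} x_1 & 1 & 0 \\ y_1 & 0 & 1 \\ \vdots & \vdots & \vdots \\ x_N & 1 & 0 \\ y_N & 0 & 1 \end{pmatrix} \begin{pmatrix} s \\ d_x \\ d_y \end{pmatrix} = \begin{pmatrix} x'_1 \\ y'_1 \\ \vdots \\ x'_N \\ y'_N \end{pmatrix} \qquad (5)$$
Equation (5) can be easily solved by the least squares method.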
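The least squares step can be illustrated with a short Python sketch. It assumes the matched points are already available as coordinate pairs (feature detection and GMS matching are not shown), and it uses the closed-form solution of the normal equations for a scale-plus-translation model. The function name `estimate_scale_translation` is hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def estimate_scale_translation(
    weak_pts: Sequence[Point], strong_pts: Sequence[Point]
) -> Tuple[float, float, float]:
    """Least-squares fit of x' = s*x + dx and y' = s*y + dy over matched pairs."""
    n = len(weak_pts)
    if n < 2 or n != len(strong_pts):
        raise ValueError("need at least two matched point pairs")
    # Centroids of both point sets.
    mx = sum(p[0] for p in weak_pts) / n
    my = sum(p[1] for p in weak_pts) / n
    mxp = sum(q[0] for q in strong_pts) / n
    myp = sum(q[1] for q in strong_pts) / n
    # Scale is the ratio of cross-covariance to the variance of the weak points.
    num = sum(
        (p[0] - mx) * (q[0] - mxp) + (p[1] - my) * (q[1] - myp)
        for p, q in zip(weak_pts, strong_pts)
    )
    den = sum((p[0] - mx) ** 2 + (p[1] - my) ** 2 for p in weak_pts)
    if den == 0:
        raise ValueError("matched points are degenerate (all identical)")
    s = num / den
    # Translation maps the scaled weak centroid onto the strong centroid.
    return s, mxp - s * mx, myp - s * my
```

Setting the derivatives of the squared residual with respect to the two translation factors to zero gives the centroid relation used in the last line; substituting it back and differentiating with respect to the scale gives the ratio `num / den`.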

Geometric constraints
If we find $N$ connected region centers and $N \ge 4$, $C_N^4$ candidate combinations for the positions of the four lamps need to be examined. We design several geometric constraints according to the relative positions of the lamps. Figure 4 shows the binary image of the difference image. To examine the four points in a candidate, the first step is to number them as in Table 3. To illustrate our approach, we denote the coordinates of the numbered centers as $(x_U, y_U)$, $(x_L, y_L)$, $(x_M, y_M)$ and $(x_R, y_R)$ respectively. The second step is to design several relative position constraints, in which $Dis(\cdot)$ denotes the $L_2$ distance. Constraint 1 means that the x coordinate of the top lamp should be smaller than that of the right one and bigger than that of the left one; constraints 2 and 3 mean that the three centers on the left approximately form an equilateral triangle; constraint 4 means that the three centers on the bottom lie on a line; constraint 5 means that the ratio of the distances from the left one and the right one to the middle one is approximately equal to 1.
Table 3. The approach for numbering centers
Number | Position
U | the center with the smallest y
L | the center with the smallest x except U
R | the center with the biggest x except U and L
M | the remaining one
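The numbering rule of Table 3 and the enumeration of four-point candidates can be sketched as follows. This is an illustrative sketch under assumptions: image coordinates are taken with y growing downward, only constraint 1 is checked because it is fully specified in words, and the function names are hypothetical. The threshold-based constraints 2 to 5 are deliberately left out.

```python
from itertools import combinations
from typing import Dict, Iterator, Sequence, Tuple

Point = Tuple[float, float]


def number_centers(four: Sequence[Point]) -> Dict[str, Point]:
    """Label four centers as U, L, R, M following Table 3."""
    if len(four) != 4:
        raise ValueError("exactly four centers are required")
    rest = list(four)
    upper = min(rest, key=lambda p: p[1])  # smallest y
    rest.remove(upper)
    left = min(rest, key=lambda p: p[0])   # smallest x except U
    rest.remove(left)
    right = max(rest, key=lambda p: p[0])  # biggest x except U and L
    rest.remove(right)
    return {"U": upper, "L": left, "R": right, "M": rest[0]}


def satisfies_constraint_1(c: Dict[str, Point]) -> bool:
    """Constraint 1: the top lamp lies between the left and right lamps in x."""
    return c["L"][0] < c["U"][0] < c["R"][0]


def candidate_labelings(centers: Sequence[Point]) -> Iterator[Dict[str, Point]]:
    """Yield the labeled version of every combination of four centers."""
    for four in combinations(centers, 4):
        yield number_centers(four)
```

For five detected centers, `candidate_labelings` yields five labeled candidates, each of which would then be passed through the full set of constraints.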

Track centers of lamps in sequence images
If four center coordinates satisfy the constraints above, we use a method similar to that in [2] to track the lamps in sequence images. The difference is that we only detect lamps in strong frames. When four centers in a strong frame satisfy the constraints above, the maximum distances $D_{xmax}$ and $D_{ymax}$ along the x and y axes among the four centers and their average coordinates are computed. The average coordinates are computed as:
$$\bar{x} = \frac{1}{4}(x_U + x_L + x_M + x_R), \quad \bar{y} = \frac{1}{4}(y_U + y_L + y_M + y_R)$$
In the next strong frame, the search is done in the region centered at $(\bar{x}, \bar{y})$, with $3D_{xmax}$ and $3D_{ymax}$ as width and height respectively. However, it is difficult to detect and match feature points as in Section 3.1 when the search area is very small. So we choose a threshold for the width and height of the search region. We denote the width and height as $W$ and $H$ respectively; they are taken as $3D_{xmax}$ and $3D_{ymax}$, but are not allowed to fall below the threshold. Because of the shaking of the UAV, the lamps are likely to move out of the camera's view during UAV landing. In this case, if no centers satisfy the constraints in four consecutive strong frames, the search region is set to the whole image; otherwise the search region is not updated.
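The search region update can be sketched as below. This is a minimal illustration with assumptions stated explicitly: the minimum region size of 64 pixels is a made-up value (the threshold we used is not restated here), the image size is passed as (width, height), clipping the region to the image is an addition of this sketch, and the names `search_region` and `next_region` are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]
Region = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def search_region(
    centers: Sequence[Point],
    image_size: Tuple[int, int],
    min_size: Tuple[float, float] = (64.0, 64.0),  # assumed threshold
) -> Region:
    """Region centered on the mean lamp position, 3x the lamp extent in size."""
    xs = [p[0] for p in centers]
    ys = [p[1] for p in centers]
    cx, cy = sum(xs) / len(xs), sum(ys) / len(ys)
    width = max(3 * (max(xs) - min(xs)), min_size[0])
    height = max(3 * (max(ys) - min(ys)), min_size[1])
    img_w, img_h = image_size
    # Clip the region to the image bounds.
    return (
        max(0.0, cx - width / 2),
        max(0.0, cy - height / 2),
        min(float(img_w), cx + width / 2),
        min(float(img_h), cy + height / 2),
    )


def next_region(
    prev: Region,
    centers: Optional[Sequence[Point]],
    misses: int,
    image_size: Tuple[int, int],
    max_misses: int = 4,
) -> Tuple[Region, int]:
    """Return the region for the next strong frame and the updated miss count."""
    if centers is not None:
        return search_region(centers, image_size), 0
    misses += 1
    if misses >= max_misses:  # four consecutive failures: search the whole image
        return (0.0, 0.0, float(image_size[0]), float(image_size[1])), misses
    return prev, misses       # keep the previous region unchanged
```

Passing `centers=None` models a strong frame in which no four centers satisfy the constraints.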

Further details
We stated earlier that the first task is to find the first weak frame. Because the two frames adjacent to a weak frame are strong frames, there should be four centers satisfying the constraints in the binary images of the difference images between a weak frame and both its previous frame and its next frame. In other words, if $I_i$ is a weak frame, $I_{i-1}$ and $I_{i+1}$ are strong frames. Therefore, there must be four centers satisfying the constraints above in the binary images $B_i$ and $B_{i+1}$. So our approach to detecting the first weak frame is to find $i_0$ such that $B_{i_0}$ and $B_{i_0+1}$ both contain four centers satisfying the constraints, with the search region being the whole image.
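The search for the first weak frame can be sketched as a scan over consecutive binary images. This is only an illustration: `has_four_lamps` is a hypothetical callback that reports whether the binary image of a given frame index contains four centers satisfying the constraints, and folding the found index back into the range 1 to 5 is an assumption of this sketch.

```python
from typing import Callable, Optional


def find_first_weak_frame(
    has_four_lamps: Callable[[int], bool], num_frames: int, period: int = 5
) -> Optional[int]:
    """Return i0 in 1..period, or None if no weak frame is found.

    has_four_lamps(i) should be True when the binary image B_i (built from
    frames i and i-1) contains four centers satisfying the constraints.
    """
    # B_i needs frame i-1 and B_{i+1} needs frame i+1, so i runs from 2 to num_frames-1.
    for i in range(2, num_frames):
        if has_four_lamps(i) and has_four_lamps(i + 1):
            return (i - 1) % period + 1  # fold the index back into 1..period
    return None
```

Only a weak frame produces lamps in two consecutive difference images, because a difference between two strong frames contains no lamp response.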
After thresholding the difference image, some morphological operations are applied to improve the robustness of our approach. The first operation eliminates connected regions whose area is smaller than 4. Then morphological dilation is applied to connect regions that were disconnected by the image subtraction. Finally, the connected regions are counted and the center of each region is computed. Once $i_0$ is obtained, the positions of the four lamps are also obtained, and the search region for all the following strong frames can be reduced as described in Section 3.3.
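The region extraction step can be sketched in pure Python as follows. This is a simplified illustration under assumptions: the binary image is a list of rows of 0/1 values, the center of a region is taken as the unweighted mean of its pixel coordinates, and the dilation step is omitted, so small regions are simply filtered out while counting. The function name `region_centers` is hypothetical.

```python
from collections import deque
from typing import List, Sequence, Tuple


def region_centers(
    binary: Sequence[Sequence[int]], min_area: int = 4
) -> List[Tuple[float, float]]:
    """Centers (x, y) of 8-connected foreground regions with at least min_area pixels."""
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    centers = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not binary[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first flood fill over the 8-neighbourhood.
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            pixels = []
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and binary[nr][nc] and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            if len(pixels) >= min_area:  # drop regions with area smaller than 4
                cx = sum(p[1] for p in pixels) / len(pixels)
                cy = sum(p[0] for p in pixels) / len(pixels)
                centers.append((cx, cy))
    return centers
```

The returned centers are the candidates that are then numbered and checked against the geometric constraints of Section 3.2.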

Experimental results and discussion
To prove the validity of the method, experiments were carried out. The videos were taken with the equipment mentioned in Section 2. We did not optimize any component of our final method. The program was written in C++ using OpenCV 2.4.13 and run on an Intel 7700K at 4.2 GHz with 16 GB RAM. The distance from the lamps to the camera is about 200 m, measured by a total station. The parameters used in Section 3.2 are shown in Table 4; they were determined by experiments.
Table 4. The parameters of geometric constraints
$th_1$ | $th_2$ | $th_3$ | $th_4$ | $th_5$
1.5 | 0.2 | -0.95 | 1.0 | 2.0
The result of our approach is shown in Figure 5. The original image size is 288 × 352, and the result images in Figures 5 and 6 are cropped in order to demonstrate our method effectively. The intensity of the lamps changes in a fixed pattern, which can be seen visually. The 217th and 222nd frames are weak frames while the four frames between them are strong frames. The scale change, caused by the change of focal length, can also be seen visually, as the person and the shrub become bigger. To illustrate the robustness of our method, the difference images and binary images are also shown. Unfortunately, the detection of the right lamp failed in the 219th frame, because the interference point and the other three lamps in the binary image satisfy the constraints in Section 3.2. However, this failure only affects the position and size of the next search region. The success in the 220th and 221st frames shows that our method is robust to such failures. In the 220th and 221st frames, the binary images also show the effectiveness of the geometric constraints in Section 3.2. In Figure 6, a white car went through four search regions, but the lamps were detected successfully, which shows that the method is robust to such interference. The computation time for one frame ranges from 40 ms to 100 ms, depending on the size of the search region. The average computation time is about 70 ms per frame, which satisfies the processing frequency of the airborne control system proposed in [2].

Conclusion
To utilize visible light lamps as the cooperative targets for UAV landing, we use a time relay to change the intensity of the lamps over time and analyze the pattern of change.
In our experiments, among every five consecutive images there is one in which the intensity of the lamps is weak, while the intensity of the lamps in the other four images is strong. The former frames are called weak frames and the latter are called strong frames. We take weak frames as background and subtract the background from strong frames. The geometric transformations between weak and strong frames are considered and a floating threshold strategy is used. In order to find the lamps, we design an approach to eliminate interference points. The methods for tracking lamps and for finding the first weak frame in sequence images are presented in this paper. The experiments show that the proposed method is robust to moving objects in the background, scale change and interference points, and that the computation time is acceptable. In conclusion, our method is suitable for detecting visible light cooperative targets during UAV landing. Further work can be done on the location accuracy of the lamps and on the constraints used to eliminate interference points.

Figure 1.
The maximum frequency of the time relay is 5 Hz, and the minimum on-off step is 0.1 s. The camera we used is a Sony FCB-EX2700P; its main parameters are shown in Table 2. The four lamps are installed on an iron shelf, which is shown in Figure 2. The shelf has three arms, which are not in the same plane. The length of the upward arm is 2.1 m, and the other two are 2.1 m and 3.1 m respectively.

Figure 2. Four visible light lamps with the iron shelf.

Figure 3. The work flow chart of the visible light lamp detection.

Figure 4. The binary image of the difference image. (a): without interference point; (b): with interference point.

Figure 5. The detection of lamps in scale changed sequence images.
Figure 6.

Table 1. The parameters of the lamp

Table 2. The parameters of the camera