A novel method for SIFT features matching based on feature dimension matching degree

Abstract. We propose a method for fast matching of SIFT feature points based on matching the elements of the SIFT feature descriptor vector. First, we discretize each dimensional feature element into an array address using a fixed threshold and store the corresponding feature point labels at that address. If the same-dimensional feature elements of two descriptor vectors have the same discrete value, their feature point labels fall into the same address. Second, we search the mapped addresses of the descriptor vector elements to obtain the matching state of the corresponding dimensions, and thus the number of matching dimensions between feature points and the feature dimension matching degree. We then use the feature dimension matching degree to obtain the suspect matching feature points. Finally, we use the Euclidean distance to eliminate mismatched feature points and obtain accurate matching feature point pairs. The method is essentially a high-dimensional feature vector matching method based on the matching of local feature vector elements. Experimental results show that the new algorithm preserves both the number of matching SIFT feature points and their matching accuracy, and that its running time is comparable to that of the HKMT, RKDT and LSH algorithms.


Introduction
Image feature matching is essential for computer vision and is employed in image retrieval [1], image stitching [2], virtual reality [3], and scene reconstruction [4][5][6]. In these applications, local features are first extracted from each image, and then an invariant local descriptor is computed. By applying a nearest neighbor search over the features, the most similar feature can be found to achieve image matching. SIFT [7] is currently the most representative feature extraction and matching method, and it has been improved by many researchers [8][9][10]. The feature points extracted with SIFT descriptors are extremely robust, adapting well to illumination change and deformation. Thus, SIFT remains the most popular local descriptor for image matching problems. However, the SIFT descriptor is high-dimensional, namely 128 dimensions, and it is difficult to control the computational complexity of the matching process. The proposed method is a high-dimensional feature vector matching method based on the partial matching of feature vector elements.
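As context for the nearest-neighbor matching step described above, a minimal exhaustive matcher with Lowe's distance-ratio test might look as follows (a sketch in NumPy; the function name and the 0.8 ratio are illustrative assumptions, not taken from this paper):

```python
import numpy as np

def match_ratio_test(desc_a, desc_b, ratio=0.8):
    """Exhaustively match two descriptor sets of shape (m, d) and (n, d).

    A query descriptor is accepted only when its nearest neighbor is
    sufficiently closer than its second-nearest neighbor (Lowe's ratio test).
    """
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)  # distances to every candidate
        j1, j2 = np.argsort(dists)[:2]              # nearest and second nearest
        if dists[j1] < ratio * dists[j2]:
            matches.append((i, int(j1)))
    return matches
```

This O(m·n·d) scan is the baseline that the tree- and hash-based indexing methods discussed in the related work try to accelerate.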
The experimental results show that the performance of our proposed method is similar to that of HKMT, RKDT and LSH.

Related work
SIFT operators are considered highly robust, but their descriptors have 128 dimensions. The baseline matching algorithm is an exhaustive search, which guarantees that the correct matching point pairs are found but at a high time cost; many approximate nearest neighbor (ANN) search algorithms have therefore come into being. These algorithms ensure the search precision to a certain degree while saving much time compared with exhaustive search. Generally speaking, they can be divided into two types: those based on tree indexing and those based on hash indexing. The KD-tree search method [11] is a typical tree-based search algorithm, but it may fail when the number of dimensions exceeds 15 [12] and may even be inferior to the linear exhaustive search. The best-bin-first (BBF) algorithm [12] extends the KD-tree, breaks through the restriction on the number of dimensions, and searches high-dimensional features well. On this basis, Hartley et al. proposed the RKD-tree search algorithm [13], which improves the traditional BBF search. Using a GPU, Hu et al. achieved parallel processing of the RKD-tree [14], greatly enhancing the matching speed; however, experimental data show [15] that its correct matching rate is lower than that of the traditional SIFT feature point matching algorithm. iDistance [16] is another tree-based indexing algorithm; it uses k-means clustering to analyze the data distribution and then builds a B+-tree index. This reduces the search scope to a certain degree and increases the search speed. However, because the algorithm spends some clustering time establishing its index structure, it is not effective for applications based on image feature matching.
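To make the tree-based side concrete, here is a minimal exact nearest-neighbor KD-tree (a didactic sketch, not the BBF or RKD-tree variants cited above; for 128-dimensional SIFT descriptors the backtracking test prunes almost nothing, which is precisely the dimensionality problem described in the text):

```python
import numpy as np

def build_kdtree(pts, idx=None, depth=0):
    """Recursively split point indices along cycling axes (median split)."""
    if idx is None:
        idx = list(range(len(pts)))
    if not idx:
        return None
    axis = depth % pts.shape[1]
    idx.sort(key=lambda i: pts[i][axis])
    mid = len(idx) // 2
    return {"i": idx[mid], "axis": axis,
            "left": build_kdtree(pts, idx[:mid], depth + 1),
            "right": build_kdtree(pts, idx[mid + 1:], depth + 1)}

def nn_search(node, pts, q, best=None):
    """Exact nearest neighbor: descend to the near side, then backtrack
    into the far side only if the splitting plane is closer than the
    current best distance."""
    if node is None:
        return best
    i, axis = node["i"], node["axis"]
    d = float(np.linalg.norm(pts[i] - q))
    if best is None or d < best[1]:
        best = (i, d)
    near, far = ((node["left"], node["right"]) if q[axis] < pts[i][axis]
                 else (node["right"], node["left"]))
    best = nn_search(near, pts, q, best)
    if abs(q[axis] - pts[i][axis]) < best[1]:  # hypersphere crosses the plane
        best = nn_search(far, pts, q, best)
    return best
```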
Unlike tree-based indexing, the locality sensitive hashing (LSH) algorithm [17] is an approximate nearest neighbor search algorithm based on hash indexing. The data points are first projected into the Hamming space, and indexes are established with a set of hash functions satisfying certain constraints, so that the nearest neighbor of a query point is obtained with a rather high probability. To address the rather large space cost of LSH-based algorithms, scholars have proposed many improved LSH variants [18][19] that preserve the accuracy rate and search speed while optimizing the space cost. Ref. [20] applied the LSH algorithm to retrieving SIFT descriptors, reducing search time and obtaining satisfactory results.
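As an illustration of the hashing side, a minimal random-hyperplane LSH table can be sketched as follows (sign patterns of random projections serve as Hamming-space hash keys; the function names are illustrative and this is not the exact scheme of [17] or [20]):

```python
import numpy as np

def lsh_build(descs, planes):
    """Hash each descriptor to the sign pattern of its projections."""
    table = {}
    for i, v in enumerate(descs):
        key = tuple(bool(b) for b in (planes @ v) > 0)
        table.setdefault(key, []).append(i)
    return table

def lsh_query(q, planes, table):
    """Return candidate indices in the query's bucket; a final linear
    scan over these few candidates then picks the nearest neighbor."""
    key = tuple(bool(b) for b in (planes @ q) > 0)
    return table.get(key, [])
```

In practice several tables with independent hyperplanes are used, since a single table misses neighbors whose codes differ in even one bit.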
Starting from the matching of each dimensional element of a feature point, this paper relies on the dimension matching degree of a feature point pair to screen suspect matching pairs, and then purifies them with the Euclidean distance to obtain the final matching pairs. Compared with the exhaustive Euclidean-distance algorithm, the KD-tree algorithm and the LSH algorithm, our algorithm discretizes features into mapped addresses so that each dimensional element can be matched directly, yielding the dimension matching degree between feature points. This screens out a large number of unmatched points, reduces the search scope, guarantees the accuracy rate and achieves fast search.

The feature dimension matching algorithm (FDMA)
In matching two images, let the 128-dimensional SIFT feature descriptor point sets of the two images be P = {Pi} and Q = {Qj}. Taking the minimum element value as the starting point, we discretize the k-th dimensional element of every descriptor according to the scale σ, forming the k-dimensional feature discretized coordinate axis shown in Fig. 1.
If |Pi(k) − Qj(k)| < σ, then on the k-dimensional feature discretized coordinate axis the mappings of Pi(k) and Qj(k) are certainly adjacent or equal. Extended to the other 128-dimensional feature discretized coordinate axes, the matching pair (Pi, Qj) is shown in Fig. 2. The above analysis draws the following conclusion: for the 128-dimensional SIFT feature descriptor point sets, if every dimensional feature of a pair of feature points Pi and Qj, discretized with a given threshold σ, is adjacent or equal, then (Pi, Qj) is taken as a matching pair whose every dimensional distance is smaller than σ and whose Euclidean distance is smaller than √128·σ. Therefore, after the feature coordinate axes are discretized with the threshold σ, each dimensional element of matching descriptors automatically obtains adjacent or equal discrete values. In practical applications, we transform the discrete coordinate of each dimensional element of a feature descriptor into a unit cell array address and store the feature point labels at that address. The diagram for creating the j-dimensional unit cell array is shown in Fig. 3.
In practice, not every dimension of a correct matching pair matches, namely its feature dimension matching degree satisfies D < 1. Therefore, a feature dimension matching degree threshold must be introduced in practical applications. After the feature dimension matching degree between Pi and Qj is calculated, it is compared with the given threshold; if it is larger than the threshold, the pair is assumed to be mutually matching. For SIFT features the threshold is generally set to around 0.7: the larger the value, the fewer the matching points; the smaller the value, the more matching errors.
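The discretization and per-dimension matching test above can be sketched as follows (a simplified illustration; zero-based cell indexing and NumPy are assumptions of this sketch, not of the paper):

```python
import numpy as np

def discretize(desc, sigma):
    """Map every descriptor element to the index of its cell of width sigma."""
    return np.floor(np.asarray(desc, dtype=float) / sigma).astype(int)

def dimension_matching_degree(p, q, sigma):
    """Fraction of dimensions whose discrete cells are equal or adjacent.

    Any dimension with |p[k] - q[k]| < sigma is guaranteed to land in
    equal or adjacent cells, so true matches score close to 1."""
    matched = np.abs(discretize(p, sigma) - discretize(q, sigma)) <= 1
    return float(matched.mean())
```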
The above operations yield the suspect matching point pair sets between all the feature points in P and all the feature points in Q, thus establishing the suspect feature point matching matrix. The matching stage of the algorithm (steps 5–9 in Table 1) is:
5. Extract the feature point labels from the unit cell arrays according to their addresses.
6. Obtain the dimension matching degree of each feature point pair and remove the feature points that do not satisfy the dimension matching degree threshold.
7. Purify with the Euclidean distance to eliminate mismatches and achieve one-to-one matching.
8. Delete the farthest DeleteCounts pairs of feature points.
9. Output the matching point pairs.
However, analysis and experiments show that the suspect matching pairs do not have a one-to-one correspondence: one point in P may correspond to several points in Q, or several points in P may correspond to one point in Q. To solve this problem, we calculate the Euclidean distance of every repeatedly matched feature point pair and keep the nearest point as the match, ultimately obtaining the one-to-one corresponding feature point pairs, whose number is c'. Finally, to purify the results, we introduce the ultimate screening threshold DeleteCounts = floor(c'/beta), where beta is the screening threshold: the Euclidean distances are calculated in the same way and ranked, and the DeleteCounts feature point pairs that are farthest in Euclidean distance, and hence match least, are deleted. The algorithm process and steps are shown in Table 1.
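The whole matching pipeline described above can be sketched end to end (a simplified illustration: the per-dimension bucket lookup, the adjacent-cell handling and the default parameter values are assumptions of this sketch, not the paper's exact address scheme or Table 1):

```python
import numpy as np
from collections import defaultdict
from math import floor

def fdma_match(P, Q, sigma, theta=0.7, beta=10):
    """Match descriptor sets P (m, d) and Q (n, d) by feature dimension
    matching degree, then purify with the Euclidean distance."""
    m, d = P.shape
    Pc = np.floor(P / sigma).astype(int)  # discrete cell index per element
    Qc = np.floor(Q / sigma).astype(int)
    votes = defaultdict(int)              # (i, j) -> number of matching dims
    for k in range(d):
        buckets = defaultdict(list)       # cell address -> labels of Q points
        for j, cell in enumerate(Qc[:, k]):
            buckets[int(cell)].append(j)
        for i, cell in enumerate(Pc[:, k]):
            for c in (cell - 1, cell, cell + 1):  # equal or adjacent cells
                for j in buckets.get(int(c), ()):
                    votes[(i, j)] += 1
    # suspect pairs: dimension matching degree above the threshold theta
    suspects = [(i, j) for (i, j), v in votes.items() if v / d >= theta]
    # resolve duplicates: keep the Euclidean-nearest partner on both sides
    best_p = {}
    for i, j in suspects:
        dist = float(np.linalg.norm(P[i] - Q[j]))
        if i not in best_p or dist < best_p[i][1]:
            best_p[i] = (j, dist)
    best_q = {}
    for i, (j, dist) in best_p.items():
        if j not in best_q or dist < best_q[j][1]:
            best_q[j] = (i, dist)
    pairs = sorted(((i, j, dist) for j, (i, dist) in best_q.items()),
                   key=lambda t: t[2])
    # final screening: drop the floor(c'/beta) farthest pairs
    keep = len(pairs) - floor(len(pairs) / beta)
    return [(i, j) for i, j, _ in pairs[:keep]]
```

Per dimension, a point in P only visits the Q labels stored in its own and the two adjacent cells, so most of the m×n candidate pairs are never touched; this is what lets the dimension matching degree screen out non-matches before any Euclidean distance is computed.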

Experimental results and analysis
We implement the novel feature point matching method proposed in this paper in Visual Studio 2013 and measure the matching time and the number of feature points. The computer configuration is an Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz with 8GB of memory. We first apply scale, rotation, noise, illumination and deformation variations to the Lena image to analyze the matching accuracy and speed, and then use the standard image dataset [21] for evaluation.
The FDMA algorithm uses the following parameters: N, the number of effective discrete segments; the feature dimension matching degree threshold; and beta, the screening threshold. The distance threshold of the exhaustive algorithm used for comparison is 0.2.
To evaluate the effectiveness of the proposed FDMA algorithm, we compare its performance with that of three ANN algorithms, the hierarchical k-means tree (HKMT) algorithm, the randomized KD-tree (RKDT) algorithm and the locality sensitive hashing (LSH) algorithm, as well as the exhaustive algorithm. The three ANN algorithms are implemented with the FLANN library [22] in OpenCV 3.0.

Evaluation of artificial transformation test
This paper carries out artificial transformations of the Lena image (512×512; the number of SIFT feature points is 1998). In this case, the exact positions of the feature points are known, so the matching results can be better analyzed. The experimental results are shown in Fig. 4. In Fig. 4(a), salt-and-pepper noise is added to the Lena image, and the number of SIFT feature points is 1931. In Fig. 4(b), the ratio of the areas of the two matched images after scale variation is 1:0.25, and the number of SIFT feature points is 1192. In Fig. 4(c), the illumination of the Lena image is varied, and the number of SIFT feature points is 2256. In Fig. 4(d), the Lena image is rotated 45 degrees clockwise, and the number of SIFT feature points is 2927. In Fig. 4(e), the Lena image is deformed so that its upper edge shrinks to 3/4 of the original; the number of SIFT feature points is 2105. The running times of the algorithms and their numbers of matched feature point pairs are given in Tables 2 and 3, where the running time covers only feature point matching, not feature point detection.

Evaluation of the benchmark
To further evaluate the search performance of the FDMA algorithm, we use the standard image dataset [21] as our test data set to compare matching accuracy and matching speed. The test data set contains six different scenes, with a total of 164 images and 809,471 SIFT feature points. For each image set, the first image is used as the query image, and its features are matched with those of every other image (reference image) in the set. We record the time consumption and the number of matched features for the Exhaustive, HKMT, RKDT, LSH and our FDMA algorithms. Table 5 shows the average number of points in the 16 sets of images matched by each algorithm. Compared with the other conventional algorithms, our FDMA algorithm performs better in terms of matching speed. Even when an image set has a small number of matching points, the FDMA algorithm finds many more matching points at a faster matching speed than the other algorithms. However, the FDMA algorithm is relatively slower when there are a large number of matching points, because it spends some time on ranking during the final screening. Tables 4 and 5 show that, with suitable parameters, when images vary in zoom, illumination, rotation, blur and viewpoint, the FDMA algorithm does not differ much from the other algorithms in the number of matching points and does not degrade the robustness of the existing algorithms. Even under viewpoint variation, the number of matched feature points it finds is much larger than that of the other algorithms. This is because the selected distance function is different: the other algorithms use the ratio