Hyperspectral image classification based on multi-layer feature extraction

Hyperspectral image classification is an important problem that has been actively pursued in recent years, with applications in many areas of life. Hyperspectral images (HSIs) offer only a limited number of labeled high-dimensional training samples, which limits the performance of many classification methods at the feature-extraction or feature-reduction stage. In this paper, we propose a supervised method for HSI classification based on the principal component analysis network (PCANet) and a linear SVM. PCANet is used to learn discriminative features, and we study the influence of its key parameters by varying them experimentally. Extensive experiments on the Indian Pines dataset demonstrate that our method significantly outperforms the PCA+KNN baseline, reaching a recognition rate of 94.29%. Finally, we compare the results of the same algorithm on different datasets.


Introduction
With the rapid development of remote sensors, the acquisition and collection of hyperspectral data have become much easier and more affordable, making hyperspectral image analysis one of the most promising techniques in many practical applications, including precision agriculture, environmental monitoring, and military surveillance [1]. PCANet, proposed in [2], is a deep learning framework that cascades PCA stages in the manner of a convolutional neural network; it was first applied to face recognition and handwritten digit recognition and has since achieved great success in printed document recognition. A great number of classification methods have consequently been proposed for character recognition [3], especially scene character recognition [4], but only a few works in the last two years have applied PCANet to hyperspectral image classification. HSI classification focuses on extracting effective features by combining the complicated spectral information, spatial information, and the classifier. Many spectral-feature-based methods for HSI classification have been developed and applied successfully. For example, deep-learning-based feature learning methods have drawn increasing interest from the community, while other methods are built on the support vector machine (SVM) [5,6], on multinomial logistic regression [7], on limited samples [8], or on PCANet with a Gaussian-kernel SVM [9]. Shiqi Yu et al. [1] proposed a feature learning method using convolutional neural networks (CNNs) for hyperspectral image classification, which has drawn significant attention recently. Although these classification methods can take full advantage of the spectral information of HSIs, noisy points often appear on the resulting classification maps, which makes classification inconvenient.
Therefore, the spatial information or more advanced features should also be taken into account. Recently, numerous works have combined spatial information with spectral information [10].
However, the high dimensionality of the data and the lack of labeled samples limit the performance of these methods at the feature-extraction or feature-reduction stage. It is therefore critical yet challenging to fully exploit the available information for HSI classification. One characteristic of the PCANet model seems to challenge the common wisdom in building deep learning networks such as ConvNets [11,12]. PCANet extracts image features during training in four steps. The first step is pattern preprocessing: zeros are appended to each pattern so that the padded length is a perfect square, which makes the subsequent steps more convenient.
The second step cascades principal component analysis in the manner of a convolutional neural network; its purpose is to extract pattern features, and it contains two stages. In the first stage, local features of the original input image are selected with a sliding window; from the resulting patch matrix, the leading eigenvectors are computed and taken as the PCA filters of the stage, and each image is then convolved with each of these filters. The second stage is analogous to the first, with the outputs of the first stage as its inputs.
The third step is a hashing operation: the binarized filter responses are encoded and converted into a decimal-valued image.
The fourth step computes block histograms to obtain the final features. Feature extraction is explained in detail below.

The proposed method
PCANet is a deep learning baseline for image classification that extracts highly competitive features. In this paper, we propose a new hyperspectral image classification method based on this multi-layer feature extraction.

PCANet for HSIs
We use the two-level cascaded PCA of PCANet to learn the spectral features. The architecture of PCANet for HSIs is shown in Fig. 1 and Fig. 2 [2]. As illustrated in Fig. 1, the preprocessed hyperspectral pixel vectors are processed twice: the first pass removes noise, selects the useful bands, and transforms the hyperspectral data cube from a three-dimensional space into a two-dimensional matrix. In Fig. 1, the red strip represents an input hyperspectral pixel vector. In step 1, zeros (shown in black) are appended behind the red strip; in step 2, the padded vector is reshaped into a square; and step 3 yields the required input sample I. The proposed preprocessing thus consists of zero-padding, truncation, and reshaping into a square.
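The padding and reshaping described above can be sketched in a few lines of numpy; `pad_to_square` is an illustrative name, and the 200-band example vector is only a stand-in for a real hyperspectral pixel.

```python
import numpy as np

def pad_to_square(pixel_vector):
    """Append zeros to a p-dimensional pixel (spectral) vector so its length
    becomes the next perfect square n*n, then reshape it to an n x n sample."""
    p = pixel_vector.size
    n = int(np.ceil(np.sqrt(p)))   # smallest n with n*n >= p
    q = n * n - p                  # number of zeros to append
    padded = np.concatenate([pixel_vector, np.zeros(q)])
    return padded.reshape(n, n)

# e.g. a 200-band pixel vector becomes a 15 x 15 input sample (q = 25 zeros)
patch = pad_to_square(np.arange(200, dtype=float))
```

With 200 spectral bands, n = 15 and 25 zeros are appended, matching the 15 x 15 sample size used in the experiments below.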
In Fig. 2, I denotes the input hyperspectral image pattern and W denotes the filter parameters. In the input layer, the image I is divided into patches using an overlapping strategy. L1 and L2 are the numbers of principal components selected in the two levels of the cascaded PCA, respectively. In the first stage of the cascaded PCA, the leading L1 PCA filters computed from the patches are convolved with the original image I, producing L1 images that act as the input data of the second stage. The second stage is structured like the first: its leading L2 filters are convolved with its input images, so the resulting L1 x L2 images form the input of the last layer, where histogram blocks are extracted from the second-stage outputs. Finally, the features f_i of I are obtained. A detailed description of PCANet is given in [2] and [9].

HIS Classification
The proposed hyperspectral image classification method based on multi-layer feature extraction contains four main steps: preprocessing, sample selection, spectral feature extraction, and linear SVM classification. The diagram of the proposed PCANet-Linear SVM pipeline for classifying HSIs is shown in Fig. 3.
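The final classification step can be sketched with scikit-learn's `LinearSVC`. The random Gaussian blobs below are only a stand-in for the PCANet features of two ground-object classes; the real pipeline would feed the block-histogram features described later.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
# stand-in for PCANet features: two well-separated classes of 64-d vectors
X_train = np.vstack([rng.normal(0.0, 1.0, (50, 64)),
                     rng.normal(3.0, 1.0, (50, 64))])
y_train = np.array([0] * 50 + [1] * 50)

clf = LinearSVC(C=1.0)        # linear SVM, as in the proposed method
clf.fit(X_train, y_train)
pred = clf.predict(rng.normal(3.0, 1.0, (5, 64)))  # five class-1-like samples
```

A linear kernel keeps training fast on the high-dimensional histogram features, which is one motivation for pairing PCANet with a linear rather than a Gaussian-kernel SVM.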

Main steps
Suppose that a pattern (pixel vector) has p characteristics. We append q zeros to the end of the pattern so that the padded length p + q is a perfect square:

p, q, n ∈ Z,  p + q = n².  (1)

Only in this way can the pixel vector p be reshaped into a square whose side length is n. Suppose that N input training images of size n × n are given and that the patch size is k × k. First, the local features of each training image are selected with the sliding patch, and the patch mean is subtracted from each patch to obtain [3]

X̄_i = [x̄_{i,1}, x̄_{i,2}, ..., x̄_{i,(n-k+1)²}],  (2)

where x̄_{i,j} is a mean-removed patch. Constructing the same matrix for all input images and putting them together gives

X = [X̄_1, X̄_2, ..., X̄_N].  (3)
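The patch extraction and mean removal of Eq. (2) can be sketched as follows; `mean_removed_patches` is an illustrative name, and a stride-1 sliding window is assumed.

```python
import numpy as np

def mean_removed_patches(image, k):
    """Slide a k x k window over an n x n image (stride 1), vectorize each
    patch, and subtract the patch mean, giving the matrix of Eq. (2)."""
    n = image.shape[0]
    cols = []
    for r in range(n - k + 1):
        for c in range(n - k + 1):
            patch = image[r:r + k, c:c + k].ravel()
            cols.append(patch - patch.mean())
    return np.stack(cols, axis=1)   # shape: (k*k, (n - k + 1)**2)

# a 5 x 5 toy image with k = 3 yields 9 mean-removed 9-dimensional patches
X = mean_removed_patches(np.arange(25, dtype=float).reshape(5, 5), k=3)
```

Each column of the result sums to zero, which is exactly the mean-removal property that lets PCA capture patch variation rather than brightness.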

The second stage of PCA
The second stage of the PCA process is similar to the first; note only that the input of the second stage is the output of the first. Let the l-th filter output of the first stage be [3]

I_i^l = I_i * W_l^1,  i = 1, ..., N.  (6)

Then, as in the first stage, we select the local features of each image with the sliding patch and subtract the patch mean from each patch to obtain

Ȳ_i^l = [ȳ_{i,l,1}, ȳ_{i,l,2}, ..., ȳ_{i,l,(n-k+1)²}] ∈ R^{k² × (n-k+1)²},  (7)

next collect the mean-removed patch matrices of the l-th filter output over all images:

Y^l = [Ȳ_1^l, Ȳ_2^l, ..., Ȳ_N^l] ∈ R^{k² × N(n-k+1)²},  (8)

and concatenate the outputs of all L1 filters:

Y = [Y^1, Y^2, ..., Y^{L1}] ∈ R^{k² × L1 N(n-k+1)²}.  (9)

The leading L2 principal eigenvectors of Y Yᵀ are then selected as the second-stage filters. Finally, each image is convolved with each of these filters in turn.
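The filter computation shared by both stages can be sketched as below: the leading eigenvectors of the patch covariance are reshaped into filters and correlated with the image. Function names are illustrative, a single toy 8 x 8 image stands in for the training set, and valid-mode correlation is used for simplicity.

```python
import numpy as np

def pca_filters(X, L, k):
    """Leading L eigenvectors of the patch covariance X @ X.T, each reshaped
    into a k x k filter -- the PCA filter bank of one stage."""
    _, vecs = np.linalg.eigh(X @ X.T)   # eigenvalues in ascending order
    top = vecs[:, ::-1][:, :L]          # keep the L principal components
    return [top[:, i].reshape(k, k) for i in range(L)]

def filter_response(img, W):
    """Valid-mode correlation of an image with one filter."""
    k = W.shape[0]
    out = np.empty((img.shape[0] - k + 1, img.shape[1] - k + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(img[r:r + k, c:c + k] * W)
    return out

# toy stage: mean-removed 3 x 3 patches from one 8 x 8 image
rng = np.random.default_rng(0)
img = rng.normal(size=(8, 8))
k = 3
cols = [img[r:r + k, c:c + k].ravel() - img[r:r + k, c:c + k].mean()
        for r in range(6) for c in range(6)]
X = np.stack(cols, axis=1)              # patch matrix, as in Eq. (2)
filters = pca_filters(X, L=4, k=3)
outputs = [filter_response(img, W) for W in filters]
```

Because they are eigenvectors of a symmetric matrix, the learned filters are mutually orthogonal, which is what makes each one capture a distinct direction of patch variation.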

Hash operation
The matrices produced by the second PCA stage are binarized, giving matrices that consist of zeros and ones. We then view the vector of binary bits at each pixel as a decimal number, so that the L2 binary output maps of each first-stage image are converted into a single integer-valued image. The calculation formula is as follows [3]:

T_i^l = Σ_{ℓ=1}^{L2} 2^{ℓ-1} H(I_i^l * W_ℓ^2),

where H(·) is the Heaviside step function, whose value is one for positive inputs and zero otherwise.
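The hashing formula can be sketched directly in numpy; `binary_hash` is an illustrative name, and the two tiny 2 x 2 maps stand in for the L2 second-stage filter responses of one first-stage image.

```python
import numpy as np

def binary_hash(stage2_outputs):
    """Heaviside-binarize the L2 filter responses and pack them into one
    integer-valued map: T = sum_l 2**(l-1) * H(O_l)."""
    T = np.zeros_like(stage2_outputs[0], dtype=np.int64)
    for l, O in enumerate(stage2_outputs, start=1):
        T += (2 ** (l - 1)) * (O > 0).astype(np.int64)
    return T   # pixel values lie in [0, 2**L2 - 1]

# two toy filter-response maps (L2 = 2), so T takes values in {0, 1, 2, 3}
maps = [np.array([[1.0, -2.0], [-0.5, 3.0]]),
        np.array([[-1.0, 2.0], [0.5, 3.0]])]
T = binary_hash(maps)
```

With L2 = 8 filters, as in the experiments below, each hashed pixel is an integer in [0, 255], so the subsequent histograms have 256 bins.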

Histogram operation
The core of PCANet is to use PCA to learn the multi-stage filters, so effective filter selection is essential to generating an expressive feature representation. Each hashed image is then divided into B blocks of a fixed size, the histogram of each block is computed and flattened into a vector, and all the vectors are concatenated to form the feature vector of each training image.
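The block-histogram step can be sketched as follows. For simplicity this sketch uses non-overlapping blocks, whereas the experiments below use an overlap rate of 0.5; `block_histograms` is an illustrative name.

```python
import numpy as np

def block_histograms(T, block, L2):
    """Split an integer-valued map T into non-overlapping block x block tiles,
    histogram each tile over the 2**L2 possible values, and concatenate."""
    bins = 2 ** L2
    feats = []
    for r in range(0, T.shape[0] - block + 1, block):
        for c in range(0, T.shape[1] - block + 1, block):
            tile = T[r:r + block, c:c + block]
            hist, _ = np.histogram(tile, bins=bins, range=(0, bins))
            feats.append(hist)
    return np.concatenate(feats)

# a 4 x 4 hashed map with L2 = 2 split into four 2 x 2 blocks -> 16-d feature
T = np.array([[0, 1, 2, 3], [1, 1, 2, 2], [3, 3, 0, 0], [2, 2, 1, 1]])
f = block_histograms(T, block=2, L2=2)
```

The histograms discard the exact positions of the hashed values inside each block, which gives the final feature a degree of translation invariance.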

Effects of number of patch
The experimental dataset is Indian Pines, with 7390 test samples and 1844 training samples; each image is 15 × 15 pixels, and the data are drawn with our sampling method. A large number of experiments were conducted to study the effect of different parameters on performance. We first verified the impact of the sliding-patch size; the recognition rates for the relevant parameter settings are listed in Table 1. According to the table, with the other parameters fixed, larger patch sizes yield relatively high recognition rates, but when the patch size is increased to nine, recognition performance drops slightly. Because local features are extracted through the sliding window, a window that is too small extracts incomplete local features and the PCA reconstruction may contain errors, so the recognition rate can fall. The corresponding line graph is shown in Fig. 4: the network has 2 layers, the number of filters is 8, the histogram block size is 7, and the block overlap rate is 0.5, while the patch size varies, giving the recognition rates of hyperspectral remote sensing images under different patch sizes.

Effects of the number of filters
In the PCA training process, each image in the first stage is convolved with each filter, and the results are the input images used to train the second stage; each image is then convolved with each filter of the second stage in turn. The experimental results are shown in Table 2: the setting with the best recognition rate also has a short training time. Because the filters are generated from the principal eigenvectors and each image must be convolved with every filter, the number of filters necessarily affects the experiment: the more filters there are, the larger the amount of data and the heavier the computation.

Effects of the number of PCA train layer
PCANet normally has two PCA processing stages, in which the image features are extracted. To verify the effect of the number of PCA layers on the experiment, we ran it with two-, three-, and four-layer PCA structures; the results are shown in Table 3. Recognition is best when the number of PCA layers is three, which also shows that more PCA training layers do not necessarily mean a higher recognition rate. The corresponding line graph is shown in Fig. 5: the network depth varies while the sampling patch size is 7, the number of filters is 8, the histogram block size is 7, and the block overlap rate is 0.5, giving the recognition rates of the hyperspectral remote sensing image under different network depths.

Classification of hyperspectral images by PCANet
To verify the classification of hyperspectral images by PCANet, we selected from different places a series of hyperspectral images containing between 3 and t (t ∈ Z, t > 2) kinds of ground objects (at least three classes are required, otherwise there is no need to classify); each ground object has a different spectral curve, yielding a large number of samples in total. 20% of the samples are used for training and 80% for testing.
Because these samples are influenced by many factors, such as illumination, we verified that PCANet can adapt to hyperspectral remote sensing samples by comparing the new experimental results with the original ones. The results are shown in Table 4. The recognition rate on Pavia University is 94.13%, which is 7.57% higher than on the Indian Pines data. The experimental data show that the recognition effect is good and, at the same time, that PCANet adapts well to different hyperspectral remote sensing datasets.

Conclusions
In this paper, a new hyperspectral image classification method is proposed. We compared the experimental data, verified several factors affecting recognition with PCANet as well as PCANet's adaptability, and demonstrated the reliability of hyperspectral image recognition by PCANet. In future work we will consider training on larger hyperspectral image datasets, with the aim of achieving better classification accuracy and performance.
On the other hand, further research could extract and classify features along the two dimensions of spectral features and spatial-spectral features. When conditions allow, larger hyperspectral image datasets could be used so that the features of hyperspectral image data are extracted better, yielding better classification results.