Polarimetric SAR image classification using 3D generative adversarial network

In this paper, a new architecture, the three-dimensional deep convolutional generative adversarial network (3D-DCGAN), is defined to alleviate the unstable training problem of GAN and to make full use of the information contained in polarimetric data. First, a data cube formed from nine components of the polarimetric coherency matrix is used directly as the input feature of the DCGAN. Then, 3D convolutional models are designed as the components of the generator and discriminator to construct the 3D-DCGAN, which exploits the effective feature extraction capability of the 3D convolutional neural network (CNN). Finally, the parameters of the network are fine-tuned to perform polarimetric SAR image classification. The experimental results show the feasibility and efficiency of the proposed method.


Introduction
Polarimetric SAR (PolSAR) data carries rich polarimetric, phase and spatial information, providing a reliable data source for precise land cover classification. However, labeling samples manually is difficult, so labeled samples for PolSAR images are scarce.
At present, many classification methods based on deep neural networks have been applied to PolSAR image classification, such as the Deep Belief Network (DBN) [3], Convolutional Neural Network (CNN) [4] and Generative Adversarial Network (GAN) [5]. As an unsupervised learning method, GAN is widely used because it can expand polarimetric data, learn the underlying distribution of real data, and generate fake data with the same distribution as the real data. However, the training process of GAN [6] is unstable and easily falls into a local optimum. Various variants of GAN have been proposed to solve this problem, such as Deep Convolutional GAN (DCGAN) [7], Conditional GAN (CGAN) [8] and Wasserstein GAN [9].
Combining the advantages of CNN and GAN, DCGAN has a high feature representation ability. The hidden layers of the discriminator model are all implemented as convolutional layers, and the hidden layers of the generator model are implemented as deconvolution layers. However, PolSAR data has three channels that contain amplitude, phase and spatial information, and these operations in 2D space tend to lose the spatial and phase information. In this paper, a new type of three-dimensional (3D) DCGAN is defined to alleviate the unstable training problem of GAN and to make full use of the information contained in polarimetric data. First, a data cube formed from nine components of the polarimetric coherency matrix is used directly as the input feature of the DCGAN. Then, 3D convolutional models are designed as the components of the generator and discriminator to construct the 3D-DCGAN, which combines 3D convolution, pooling and nonlinear operators, as well as backpropagation. Finally, the parameters of the network are fine-tuned to perform the classification of PolSAR images. The experimental results show the feasibility and efficiency of the proposed method.

Deep Convolutional GAN
DCGAN combines CNN and GAN. On the basis of the original GAN, the hidden layers of the discriminative model are all implemented as convolutional layers, and the hidden layers of the generative model are implemented as deconvolution layers [10].
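GAN is composed of a generator G and a discriminator D. Its objective function is [5]:
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$
where x is the real polarimetric data with distribution $p_{data}$, z is the random noise with distribution $p_z$, and G(z) is the sample generated by G, which should conform to the distribution of the real data as closely as possible.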
The structures of the generator and discriminator are shown in Fig. 1(a) and (b).
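As a numerical illustration of the GAN objective, the following minimal sketch estimates the standard minimax value function V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))] from batches of discriminator outputs. The function name `gan_value` and the guard constant `eps` are assumptions introduced here for illustration; they do not come from the paper.

```python
import math


def gan_value(d_real, d_fake, eps=1e-12):
    """Monte Carlo estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator outputs D(x) on real samples, each in (0, 1)
    d_fake: discriminator outputs D(G(z)) on generated samples, each in (0, 1)
    eps:    small constant guarding against log(0) (an assumption of this sketch)
    """
    real_term = sum(math.log(max(p, eps)) for p in d_real) / len(d_real)
    fake_term = sum(math.log(max(1.0 - p, eps)) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# A discriminator that cannot tell real from fake outputs 0.5 everywhere,
# which gives V = log(0.5) + log(0.5) = -2 log 2.
undecided = gan_value([0.5, 0.5], [0.5, 0.5])

# A confident, correct discriminator pushes V towards its upper bound of 0.
confident = gan_value([0.99, 0.98], [0.01, 0.02])
```

The discriminator is trained to increase this value and the generator to decrease it, so the undecided case corresponds to a generator whose samples the discriminator can no longer separate from real data.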

3D Convolutional GAN
A 3D convolutional model is employed instead of the 2D convolutional model in DCGAN. The input data of the proposed method are the elements of the 3×3 complex coherency matrix T, which form a feature matrix of size m×n×9 (m and n are the width and height of the image, respectively). The process of 3D convolution is shown in Fig. 2. The output value of the neuron at position (x, y, z) of the jth feature map in hidden layer i is as follows [11]:
$$v_{ij}^{xyz} = f\left(b_{ij} + \sum_{m} \sum_{h=0}^{H_i-1} \sum_{n=0}^{N_i-1} \sum_{d=0}^{D_i-1} w_{ijm}^{hnd}\, v_{(i-1)m}^{(x+h)(y+n)(z+d)}\right)$$
where $v_{ij}^{xyz}$ is the output at position (x, y, z) of the jth feature map in layer i, $H_i \times N_i \times D_i$ is the convolution kernel size of the ith layer, f(·) is the activation function, $b_{ij}$ is the bias of the jth feature map in the ith layer, m indexes the feature maps in the (i-1)th layer that are connected to the jth feature map, and $w_{ijm}^{hnd}$ is the weight at kernel position (h, n, d) connecting the jth feature map of the ith layer to the mth feature map of the (i-1)th layer.
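The 3D convolution described above can be sketched directly as nested loops: for one output feature map, the neuron at (x, y, z) adds the bias to the weighted sum over all connected input maps m and all kernel offsets (h, n, d), then applies the activation f. This is a plain-Python illustration, not the authors' implementation; the names `conv3d_unit` and `conv3d_map`, the tanh default and the no-padding, stride-1 setting are assumptions.

```python
import math


def conv3d_unit(prev_maps, weights, bias, x, y, z, activation=math.tanh):
    """Output of one neuron at (x, y, z) of one feature map j in layer i.

    prev_maps[m][x][y][z]: feature maps of layer i-1 connected to map j
    weights[m][h][n][d]:   kernel weights linking input map m to map j
    bias:                  bias of feature map j
    activation:            f(.), tanh here purely as an assumption
    """
    total = bias
    for m, kernel in enumerate(weights):
        for h, plane in enumerate(kernel):
            for n, row in enumerate(plane):
                for d, w in enumerate(row):
                    total += w * prev_maps[m][x + h][y + n][z + d]
    return activation(total)


def conv3d_map(prev_maps, weights, bias, activation=math.tanh):
    """Slide the kernel over every valid position (no padding, stride 1)."""
    H, N, D = len(weights[0]), len(weights[0][0]), len(weights[0][0][0])
    X, Y, Z = len(prev_maps[0]), len(prev_maps[0][0]), len(prev_maps[0][0][0])
    return [[[conv3d_unit(prev_maps, weights, bias, x, y, z, activation)
              for z in range(Z - D + 1)]
             for y in range(Y - N + 1)]
            for x in range(X - H + 1)]


# One 3x3x3 input map of ones and one 2x2x2 kernel of ones, identity activation:
# every output neuron sums eight ones.
ones = [[[[1.0] * 3 for _ in range(3)] for _ in range(3)]]
kernel = [[[[1.0] * 2 for _ in range(2)] for _ in range(2)]]
out = conv3d_map(ones, kernel, bias=0.0, activation=lambda s: s)
```

With a 3×3×3 input and a 2×2×2 kernel the output is 2×2×2, which shows how the kernel also moves along the third (channel) axis instead of collapsing it as a 2D convolution would.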
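The 3D max pooling is computed as follows:
$$v_{x,y,z} = \max_{0 \le a < M,\; 0 \le b < N,\; 0 \le c < K} u_{x \cdot m + a,\; y \cdot n + b,\; z \cdot p + c}$$
where u is the 3D input of the pooling layer, v is the output of pooling, m, n and p are the sampling steps along the three dimensions, M×N×K is the size of the 3D pooling region, and a, b and c index positions within that region.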
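A minimal sketch of 3D max pooling follows, assuming the usual form in which each output element is the maximum of the input over an M×N×K region whose origin is shifted by the sampling steps (m, n, p). The function name `max_pool3d` and the choice to ignore incomplete regions at the borders are assumptions made for this illustration.

```python
def max_pool3d(u, window, steps):
    """3D max pooling over a nested-list volume u[x][y][z].

    window: (M, N, K), size of the pooling region
    steps:  (m, n, p), sampling steps along the three axes
    Regions that would extend past the border are skipped (an assumption).
    """
    M, N, K = window
    m, n, p = steps
    X, Y, Z = len(u), len(u[0]), len(u[0][0])
    return [[[max(u[x * m + a][y * n + b][z * p + c]
                  for a in range(M) for b in range(N) for c in range(K))
              for z in range((Z - K) // p + 1)]
             for y in range((Y - N) // n + 1)]
            for x in range((X - M) // m + 1)]


# A 4x4x2 volume with distinct, increasing values u[x][y][z] = 8x + 2y + z.
volume = [[[8 * x + 2 * y + z for z in range(2)] for y in range(4)] for x in range(4)]

# Non-overlapping 2x2x2 regions: each output is the largest value of its block.
pooled = max_pool3d(volume, window=(2, 2, 2), steps=(2, 2, 2))
# pooled == [[[11], [15]], [[27], [31]]]
```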

Proposed Method
The details of the proposed algorithm for PolSAR data classification are as follows:
(1) The input data are the elements of the 3×3 complex coherency matrix T, including T11, T22, T33 and the real and imaginary parts of T12, T13 and T23.
(2) A sliding window with step size 1 is used to take a 32×32 image block around each pixel to represent that pixel, forming N image blocks of size 32×32×9.
(3) The unlabeled training data set, the labeled training data set and the test data set are constructed.
(4) The 3D-DCGAN model is trained with the unlabeled training data set. The feature maps of each layer are specified, the dimensions of each deconvolution layer are determined, and the weights and biases are randomly initialized.
(5) After the data are input, each training step updates the parameters of the discriminative model.
(6) A softmax classifier is trained using the labeled training data set as the input of the discriminative classification model and the category of each labeled sample as its output.
(7) The parameters of the whole discriminator are fine-tuned with the training data set to form the trained discriminator.
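Steps (1) and (2) can be sketched as follows: each pixel's 3×3 coherency matrix is reduced to nine real features, and a window sliding with step 1 collects one block per pixel. The function names, the ordering of the nine features and the border rule (indices clamped to the image, i.e. edge replication) are assumptions of this sketch; the text does not specify how blocks at the image border are formed.

```python
def coherency_features(T):
    """Nine real features of one pixel from its 3x3 complex coherency matrix T."""
    return [
        T[0][0].real, T[1][1].real, T[2][2].real,  # T11, T22, T33 (real diagonal)
        T[0][1].real, T[0][1].imag,                # Re(T12), Im(T12)
        T[0][2].real, T[0][2].imag,                # Re(T13), Im(T13)
        T[1][2].real, T[1][2].imag,                # Re(T23), Im(T23)
    ]


def extract_patches(cube, size=32):
    """One size x size x 9 block per pixel, sliding with step 1.

    cube[r][c] is the 9-element feature vector of pixel (r, c).
    Indices outside the image are clamped to the nearest edge pixel,
    which is an assumed border rule, not one stated in the text.
    """
    rows, cols = len(cube), len(cube[0])
    half = size // 2

    def clamp(v, limit):
        return min(max(v, 0), limit - 1)

    patches = []
    for r in range(rows):
        for c in range(cols):
            patch = [[cube[clamp(r - half + dr, rows)][clamp(c - half + dc, cols)]
                      for dc in range(size)]
                     for dr in range(size)]
            patches.append(patch)
    return patches


# Toy example: a 4x5 image whose pixels share one Hermitian coherency matrix,
# with a 4x4 window instead of 32x32 to keep the example small.
T = [[2 + 0j, 1 + 1j, 0.5 - 0.5j],
     [1 - 1j, 3 + 0j, 0 + 2j],
     [0.5 + 0.5j, 0 - 2j, 1 + 0j]]
cube = [[coherency_features(T) for _ in range(5)] for _ in range(4)]
patches = extract_patches(cube, size=4)  # 20 patches, each 4 x 4 x 9
```

Because the window moves one pixel at a time, the number of blocks N equals the number of pixels, and neighbouring blocks overlap heavily; a practical implementation would likely extract them lazily rather than store them all.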

Experiments
The polarimetric data used in this experiment were acquired by AIRSAR over Flevoland, with a size of 750×1024. The Pauli RGB image and the ground truth map are shown in Fig. 4(a) and (b). The area contains eleven land cover classes: stem beans, rapeseed, bare soil, potatoes, beet, wheat, peas, lucerne, grass, forest and water. The 2D-DCGAN and CNN methods are used for comparison. Classification results are evaluated by overall accuracy (OA) and the Kappa coefficient.
Fig. 4(c)-(e) shows that all three methods achieve a satisfactory partition of rapeseed and forest. However, CNN cannot reliably classify grass, wheat, stem beans and bare soil, which directly leads to its lower OA and Kappa coefficient. 2D-DCGAN and the proposed 3D-DCGAN improve the classification of wheat and beet. The proposed method, which adopts the DCGAN model, is superior to CNN and achieves notable improvement in region connectivity, uniformity and robustness. Compared with 2D-DCGAN, our method obtains a higher OA by adopting a 3D convolution model instead of a 2D convolution model, which makes full use of the spatial information in PolSAR data. Table 1 shows that the OA of the proposed method reaches 98.32%, an improvement of 3.52% and 7.42% over 2D-DCGAN and CNN, respectively, while the Kappa coefficient reaches 94.90%.
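For reference, the two evaluation measures can be computed from a confusion matrix as in the sketch below. It uses the standard definitions (OA as the fraction of correctly classified samples, Kappa as Cohen's kappa); the function name and the 2×2 toy matrix are illustrative assumptions and are unrelated to the figures reported in Table 1.

```python
def overall_accuracy_and_kappa(confusion):
    """OA and Cohen's kappa from a square confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal (this is the OA).
    observed = sum(confusion[k][k] for k in range(len(confusion))) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(col) for col in zip(*confusion)]
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Toy two-class example: 85 of 100 samples are correct, chance agreement is 0.5,
# so OA = 0.85 and kappa = (0.85 - 0.5) / (1 - 0.5) = 0.7.
oa, kappa = overall_accuracy_and_kappa([[45, 5], [10, 40]])
```

Kappa discounts the agreement expected by chance, which is why it is lower than OA for the same classification map.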
In this paper, a new 3D-DCGAN architecture is proposed. We adopt a 3D convolution model because the feature extraction capability of a 3D convolutional network is stronger than that of a 2D convolutional network. Meanwhile, the 3D convolution model follows the physical scattering mechanism and preserves the polarimetric and spatial information contained in polarimetric data, which also alleviates the unstable training problem of GAN. The comparative experimental results show the feasibility and efficiency of the proposed method.