Sparsity-Constrained NMF Algorithm Based on Evolution Strategy for Hyperspectral Unmixing

Abstract. As a powerful and explainable blind separation tool, non-negative matrix factorization (NMF) is attracting increasing attention in hyperspectral unmixing (HU). By effectively exploiting the sparsity prior of the data, sparsity-constrained NMF has become a representative approach for improving unmixing precision. However, its optimization technique, based on simple multiplicative update rules (MURs), makes the unmixing results prone to local minima and lacking in robustness. To address these problems, this paper proposes a new hybrid algorithm for sparsity-constrained NMF that integrates evolutionary computation with the MURs. To find a superior solution in each iteration, the proposed algorithm combines the MURs based on alternating optimization, a coefficient-matrix selection strategy driven by a sparsity measure, and global optimization of the basis matrix via the differential evolution algorithm. The effectiveness of the proposed method is demonstrated by experiments on real data and comparison with representative algorithms.


Introduction
With the development of satellite, computer, and sensor technology, hyperspectral remote sensing has made great progress and plays an increasingly important role in agriculture, meteorology, geology, the military, and other fields. The hyperspectral images it produces carry rich spectral and spatial information, but owing to the complexity of the Earth's surface and the limited spatial resolution of the hyperspectral imager, a single pixel in a real hyperspectral image is often a mixture of the spectral signatures of multiple substances, i.e., a mixed pixel (1). Decomposing such a pixel into its constituent substances (endmembers) and their proportions within the pixel (abundances) is called hyperspectral mixed-pixel decomposition, referred to as HU. Mixing models can be divided into the linear mixing model (LMM) and the nonlinear mixing model (NLMM) (10). In the LMM, each pixel is a linear combination of the endmember spectra weighted by their abundances. Compared with the NLMM, its physical meaning is easy to understand, so it has become the most widely used model.
Existing unmixing methods can, in general, be grouped into two categories. The first category is based on geometry, e.g., the pixel purity index (PPI) (3), vertex component analysis (VCA) (4), and the simplex growing algorithm (SGA) (5). These algorithms assume that each endmember appears as a pure pixel at least once; in many hyperspectral data sets with low spatial resolution, this assumption may not hold. The second category is based on statistics and treats hyperspectral unmixing as a blind source separation (BSS) problem, mainly including independent component analysis (ICA) (6) and NMF. Because the non-negativity and abundance sum-to-one constraint (ASC) of hyperspectral data violate the source-independence assumption of ICA, the applicability of ICA is limited. NMF, by contrast, does not require pure pixels in the image; its parts-based representation matches intuitive understanding, and it can estimate the endmembers and abundances simultaneously, making it very suitable for hyperspectral unmixing. Since basic NMF does not fully exploit the sparsity prior of the data, NMF variants that add L0, L1, or L1/2 regularization terms as sparsity constraints on the abundance matrix have been proposed to improve unmixing precision; among these, the L1/2 sparsity-constrained NMF is a representative method.
However, optimization based on the simple multiplicative update rules lacks robustness in practical applications and easily falls into local minima. To solve these problems, this paper proposes a new sparse non-negative matrix factorization optimization algorithm combining an evolutionary strategy with multiplicative updates. The first section of this paper introduces the background of hyperspectral unmixing; the second section reviews basic NMF, L1/2-NMF, and the corresponding initialization; the third section describes the evolutionary strategy and the local optimization process in detail; the fourth section analyzes and compares the experimental results; the fifth section concludes.

Preliminary
The hyperspectral image data form a data cube. Let A and S denote the endmember matrix and the abundance matrix, respectively. The LMM can then be written as

V = AS + E,    (1)

where V ∈ R^{M×N} represents a hyperspectral image with M bands and N pixels, A ∈ R^{M×U} contains the spectra of U endmembers, S ∈ R^{U×N} holds their abundances, and E represents possible errors and noise. Since A and S are unknown during the decomposition, hyperspectral unmixing is a blind decomposition problem.
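As a minimal sketch of the LMM, the following NumPy snippet builds a synthetic observation V = AS + E; the dimensions and noise level are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Illustrative sizes: M bands, N pixels, U endmembers (assumed, not from the paper)
M, N, U = 5, 8, 3
rng = np.random.default_rng(0)

A = rng.random((M, U))                   # endmember matrix, one spectrum per column
S = rng.random((U, N))
S /= S.sum(axis=0)                       # enforce the abundance sum-to-one constraint
E = 0.01 * rng.standard_normal((M, N))   # small error/noise term

V = A @ S + E                            # observed data, flattened to an M x N matrix
```

Unmixing then means recovering A and S from V alone, which is why the problem is blind.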

NMF
As a statistics-based blind source separation method, NMF has been widely used in signal processing, pattern recognition, computer vision, and other fields. Specifically, NMF decomposes a given non-negative matrix X ∈ R^{P×Q} into two non-negative matrices W ∈ R^{P×U} and H ∈ R^{U×Q} with U < min(P, Q), so as to minimize the reconstruction error between X and WH:

min_{W,H} ||X − WH||_F^2,  s.t.  W ≥ 0, H ≥ 0,    (2)

where || · ||_F denotes the Frobenius norm. When the error term in the LMM is ignored, V = AS is solved as such a minimization by the NMF method. NMF is thus well suited to linear hyperspectral unmixing and can effectively reduce the high spatial and spectral dimensionality of hyperspectral data.
NMF has many solvers, the most famous being the MURs proposed by Lee and Seung (7):

W ← W .* (X Hᵀ) ./ (W H Hᵀ),
H ← H .* (Wᵀ X) ./ (Wᵀ W H),

where (·)ᵀ denotes the matrix transpose, and ".*" and "./" denote element-wise multiplication and division, respectively.
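The Lee–Seung updates above can be sketched directly in NumPy; the iteration count, seed, and the small `eps` used to avoid division by zero are implementation assumptions.

```python
import numpy as np

def nmf_mur(X, U, n_iter=200, eps=1e-9, seed=0):
    """Basic NMF via Lee-Seung multiplicative update rules (MURs).
    Factors a non-negative X (P x Q) as W (P x U) @ H (U x Q)."""
    rng = np.random.default_rng(seed)
    P, Q = X.shape
    W = rng.random((P, U)) + eps
    H = rng.random((U, Q)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # H <- H .* (W'X) ./ (W'WH)
        W *= (X @ H.T) / (W @ H @ H.T + eps)   # W <- W .* (XH') ./ (WHH')
    return W, H
```

Because each update multiplies by a non-negative ratio, W and H stay non-negative throughout, which is the key property of the MURs.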

L1/2-NMF
Since the basic NMF method does not fully exploit the sparsity prior of the data, a sparsity-inducing regularization term on the abundance matrix is often added to the objective function:

min_{W,H} ||X − WH||_F^2 + λ ||H||_{1/2},  s.t.  W ≥ 0, H ≥ 0,

where ||H||_{1/2} = Σ_{u,n} (h_{u,n})^{1/2} is the sparsity constraint on the abundance matrix and λ ∈ R^+ is a regularization parameter. For the Lq regularizer with 0 < q < 1, the sparsity of H decreases as q increases; for q ≥ 1, the sparsity of H is essentially insensitive to q (8). Therefore q = 1/2 is chosen as the exponent of the regularization constraint. The iteration rule for H becomes

H ← H .* (Wᵀ X) ./ (Wᵀ W H + (λ/2) H^{−1/2}).    (7)

The stopping criterion is usually a maximum number of iterations or a tolerance threshold on the change of the objective function; this paper uses a maximum number of iterations.

Abundance Sum-to-One Constraint
A pixel may be composed of several endmembers, but the abundances of those endmembers sum to one; this is the Abundance Sum-to-One Constraint (ASC). Since under the LMM the abundance matrix should always satisfy the ASC, we follow the method in (9) and merge the ASC into NMF-based unmixing. The hyperspectral image matrix X and the endmember matrix W are replaced by the augmented matrices

X_f = [ X ; δ 1_Nᵀ ],  W_f = [ W ; δ 1_Uᵀ ],

where δ controls the influence of the ASC in the loss function and 1_N and 1_U are all-ones column vectors (1, 1, …, 1)ᵀ. In each iteration of the algorithm, H is updated with X_f and W_f.
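The augmentation can be written in a few lines; the default δ here is an illustrative value, not one reported in the paper.

```python
import numpy as np

def augment_for_asc(X, W, delta=15.0):
    """Append a row of delta's to X and to W, so that fitting X_f ~ W_f @ H
    pushes each column of H toward sum-to-one (the ASC). delta is an
    assumed tuning value; larger delta enforces the constraint more strongly."""
    N = X.shape[1]
    U = W.shape[1]
    X_f = np.vstack([X, delta * np.ones((1, N))])
    W_f = np.vstack([W, delta * np.ones((1, U))])
    return X_f, W_f
```

Intuitively, the appended row adds the residual δ·(1 − Σ_u h_{u,n}) for each pixel n to the reconstruction error, so minimizing it softly enforces the ASC.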

Proposed method
To improve the unmixing accuracy and increase the robustness of the algorithm, this paper proposes a new sparse non-negative matrix factorization optimization algorithm based on an evolutionary strategy and multiplicative updates (EMNMF), which aims to find a globally good solution and improve efficiency. To make effective use of the given hyperspectral image data, multiple pairs of initial candidates P_m^0 = (W_m^0, H_m^0) are set at initialization, and the initial candidate set is obtained by clustering.
(W_m^0, H_m^0) serves as the seed matrix pair, where m indexes the matrix pairs in the candidate set. At the t-th iteration, the m-th matrix pair in the candidate set is denoted P_m^t = (W_m^t, H_m^t).

K-Means
The K-means clustering problem is NP-hard, yet K-means remains one of the most widely used clustering algorithms thanks to its simplicity, efficiency, and long record of successful applications.
The K-means algorithm iteratively minimizes J(C), the sum of squared distances from all samples in each cluster to their cluster center. The algorithm has four steps: 1) select K objects in the data space as the initial cluster centers, each representing one category; 2) assign each data object to the category of its most similar cluster center according to the Euclidean (nearest-distance) criterion; 3) recompute each cluster center as the mean of all objects in that category, and compute J(C), the sum of squared distances of all samples to their cluster centers; 4) if the cluster centers or the value of J(C) have changed, return to step 2; otherwise the clustering ends. After the endmember matrix is generated, the corresponding abundance matrix is obtained by least squares:

H = (WᵀW)^{−1} Wᵀ X.
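The initialization above can be sketched as a tiny K-means over the pixel spectra followed by a least-squares fit; clipping H to stay non-negative is a simplification assumed here in place of a properly constrained least-squares solver.

```python
import numpy as np

def kmeans_endmembers(X, U, n_iter=20, seed=0):
    """Tiny K-means over the pixel spectra (columns of X); the U cluster
    centers serve as the initial endmember matrix W."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[1], size=U, replace=False)
    W = X[:, idx].copy()                                        # centers as columns
    for _ in range(n_iter):
        d = ((X[:, None, :] - W[:, :, None]) ** 2).sum(axis=0)  # U x N distances
        labels = d.argmin(axis=0)                               # nearest center per pixel
        for u in range(U):
            if np.any(labels == u):
                W[:, u] = X[:, labels == u].mean(axis=1)        # recompute center
    return W

def ls_abundances(X, W):
    """Least-squares abundances H = (W'W)^-1 W'X, clipped to remain
    non-negative (a simple stand-in for constrained least squares)."""
    H, *_ = np.linalg.lstsq(W, X, rcond=None)
    return np.clip(H, 1e-9, None)
```

Seeding each candidate pair this way gives the hybrid algorithm data-driven starting points rather than purely random ones.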

Multiplication rule
The multiplicative step applies the classic MURs to iteratively decrease the objective function and minimize the reconstruction error. The endmember matrix and abundance matrix of each candidate pair P_m^t = (W_m^t, H_m^t) are substituted into the L1/2-NMF update rules for iteration.

Sparse selection
Since the differential evolution algorithm improves the endmember matrix and better candidates are then selected, the abundance matrix must be improved as well. Because the objective function is the L1/2 sparsity-constrained NMF, the candidate with the sparsest abundance matrix is selected; the sparseness of H[u], the abundance row of the u-th endmember, is evaluated with a sparsity measure (10).
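As an illustration of such a sparsity score, the snippet below uses Hoyer's sparseness measure, a standard choice for NMF work; it is offered as a stand-in and may differ in detail from the measure used in the paper.

```python
import numpy as np

def hoyer_sparseness(x, eps=1e-12):
    """Hoyer's sparseness measure: returns 1 for a single-spike vector and
    0 for a uniform vector. A stand-in for the paper's sparsity score."""
    x = np.abs(np.ravel(x))
    n = x.size
    l1 = x.sum()
    l2 = np.sqrt((x ** 2).sum())
    return (np.sqrt(n) - l1 / (l2 + eps)) / (np.sqrt(n) - 1)
```

Scoring each row H[u] this way and averaging gives a single number per candidate, so the sparsest abundance matrix can be selected by a simple argmax.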

Differential Evolution (DE)
To prevent the algorithm from falling into local minima, this paper introduces a differential evolution strategy to strengthen its global search ability. DE is a heuristic, population-based stochastic search algorithm driven by the differences between individuals, and it is robust for the optimization of real-valued parameters.
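A minimal DE/rand/1/bin sketch is given below; applied to a flattened endmember matrix, it performs the global search described above. The population size, F, CR, and bounds are conventional defaults, not values from the paper.

```python
import numpy as np

def de_minimize(f, dim, bounds=(0.0, 1.0), pop=20, F=0.5, CR=0.9,
                n_gen=100, seed=0):
    """Minimal DE/rand/1/bin for a real-valued objective f over [lo, hi]^dim.
    All hyperparameters are conventional assumptions."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    P = rng.uniform(lo, hi, size=(pop, dim))          # initial population
    fit = np.array([f(p) for p in P])
    for _ in range(n_gen):
        for i in range(pop):
            others = [j for j in range(pop) if j != i]
            a, b, c = P[rng.choice(others, size=3, replace=False)]
            mutant = np.clip(a + F * (b - c), lo, hi)  # differential mutation
            cross = rng.random(dim) < CR               # binomial crossover mask
            cross[rng.integers(dim)] = True            # force at least one gene
            trial = np.where(cross, mutant, P[i])
            ft = f(trial)
            if ft <= fit[i]:                           # greedy one-to-one selection
                P[i], fit[i] = trial, ft
    best = fit.argmin()
    return P[best], fit[best]
```

In the hybrid algorithm, f would be the (regularized) reconstruction error with the abundances held fixed, so DE refines the endmember matrix globally while the MURs refine both factors locally.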

Experiment analysis
To evaluate the performance of the proposed algorithm, this section compares EMNMF with several representative algorithms on real hyperspectral images, using the Spectral Angle Distance (SAD), which is widely adopted in unmixing research. SAD indicates the degree of difference between spectra: a small SAD value means a small spectral difference. It is defined as

SAD = arccos( (aᵀ â) / (||a||₂ ||â||₂) ),

where â and a denote the estimated endmember and the reference endmember, respectively, and || · ||₂ denotes the 2-norm of a vector. Under the SAD indicator, EMNMF performs best among the compared algorithms.
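The SAD formula translates directly into code; the `eps` guard and the clipping of the cosine into [−1, 1] are small numerical safeguards added here.

```python
import numpy as np

def sad(a, a_hat, eps=1e-12):
    """Spectral Angle Distance (in radians) between a reference endmember a
    and an estimate a_hat; smaller means more spectrally similar."""
    cos = np.dot(a, a_hat) / (np.linalg.norm(a) * np.linalg.norm(a_hat) + eps)
    return np.arccos(np.clip(cos, -1.0, 1.0))
```

Because SAD depends only on the angle between vectors, it is invariant to the scaling ambiguity inherent in NMF factorizations, which is why it is preferred for comparing endmembers.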
This paper also tests the performance of the algorithm in a real hyperspectral scenario using a real urban hyperspectral image. The figure shows the abundance maps of the three materials unmixed by the NMF, L1/2-NMF, and EMNMF algorithms; by comparison, EMNMF improves on the first two algorithms.

Summary
The proposed algorithm is effective on real data and reaches the desired values on the key experimental indicators: it improves the accuracy and speed of hyperspectral unmixing, finds better approximate solutions, and has stronger global optimality. Its ideas can also be transferred to other research areas. The algorithm still leaves room for improvement, and further research is needed: the parameter settings deserve detailed study; only some real data sets were tested, and more should be used to verify the robustness of the method; and the algorithm was not evaluated on simulated data, so future work can add simulation experiments to assess its performance in more controlled environments.