Multivariate Data Retrieval Modified by Random Noise using Lattice Autoassociative Memories with Eroded or Dilated Input Residuals

Lattice associative memories were proposed as an alternative approach to working with a set of associated vector pairs, in which the storage and retrieval stages are based on the theory of algebraic lattices. Several techniques have been established to deal with the problem of binary or real-valued vector recall from corrupted inputs. This paper presents a thresholding technique, coupled with a statistical correlation pattern index search, to enhance the recall performance of lattice auto-associative memories for multivariate data inputs degraded by random noise. By thresholding a given noisy input, a lower bound is generated to produce an eroded noisy version used to boost the inherent retrieval capability of the min lattice auto-associative memory. Similarly, an upper bound is generated to obtain a dilated noisy version used to enhance the response of the max lattice auto-associative memory. A self-contained theoretical foundation is provided, including a visual example with a multivariate data set composed of grayscale images that shows the increased retrieval capability of this type of associative memory.


Introduction
Morphological associative memories (MAMs) were introduced more than two decades ago as a new paradigm for the storage and recall of pattern associations [3]. MAMs are feedforward, fully connected neural networks in which the interconnection weights between input and output neurons follow a properly defined Hebbian-type learning rule. In most correlation-type associative memory models, the storage and recall stages use conventional algebra, whereas computation in MAMs is based on the mathematical theory of minimax algebra [4][5][6]; since the standard minimax matrix algebra is a specific instance of a lattice algebraic structure, MAMs are also named lattice associative memories (LAMs). Once a set of exemplar pattern pairs is imprinted in the neural network, it is anticipated that such a memory will manifest a certain capability to recall the correct association when presented with an exemplar pattern (perfect input) or with a distorted version (non-perfect input). It is important to note that both input cases pose a difficult robustness problem for any associative memory model. For perfect input, it was shown that the canonical auto-associative morphological memories (AMMs) have unlimited storage capacity, give perfect recall for all exemplar patterns, and are robust for exclusively erosive or dilative noise; based on stronger assumptions, similar results were corroborated for hetero-associative morphological memories (HMMs). To cope with inputs corrupted by mixed random noise, i.e., by a combination of both erosive and dilative noise, the kernel method that binds the max and min AMMs in a two-stage memory scheme was designed for binary [3,7,8] and real-valued patterns [9]; in addition, distinct computational perspectives or model extensions were developed to enlarge their recall capability or diversify their applicability in pattern recognition problems.
In addition to the kernel method, an algorithm based on induced morphological strong independence for real-valued exemplar patterns was proposed in [10] to deal with inputs corrupted by mixed random noise. Parallel to these technical developments, a deeper understanding has been achieved, affording useful results on the fixed point set of LAAMs as well as a complete mathematical characterization of their response to arbitrary inputs [6,11]. Within this frame of reference, numerical examples and a practical discussion can be found in [6], and a comparison of AMMs with other enhanced associative memory neural networks for grayscale pattern retrieval appears in [11], though only for erosive or dilative random noise.
We point out the relevant fact that the strengths and weaknesses of the min and max auto-associative memories have guided the different lines of previous work on MAMs, as briefly described above. The recall failure of the min AMM for inputs degraded by dilative noise, the similarly poor performance of the max AMM for inputs corrupted with erosive noise, and consequently their inapplicability to mixed random noise, have been exposed repeatedly since their introduction [3,7] and revisited across the most recent theoretical results discovered about them [6,11]. By capitalizing on the robustness to erosive or dilative noise, respectively, of the min- and max-LAAMs, a practical solution to pattern recall from inputs corrupted by mixed random noise using only one of the canonical LAAMs was initiated by masking the noise contained in the corrupted input pattern [12][13][14]. Based on selective pattern indexing, further improvements of the noise masking strategy and its extension to fuzzy-type associative morphological memories have recently been reported in [15,16] and [17]. To regain the LAAMs' inherent computational functionality in the presence of different amounts of impulsive random noise, in this paper we introduce a novel technique using thresholded noisy inputs followed by statistical correlation pattern indexing to generate an erosive or dilative noisy version of the input pattern, and we show the LAAMs' enhanced recall performance by means of an illustrative example.
Our work is organized as follows: Section 2 gives mathematical background on minimax algebra, covering the basic lattice matrix operations used to define and apply the canonical LAAMs, and includes a brief description of the main theoretical results regarding their recall performance; Section 3 explains the idea of noise thresholding and correlation pattern search, used to produce lower and upper noise bounds useful for pattern retrieval. In Section 4, we give an algorithm that employs the proposed technique to boost the recall capability of a single LAAM for real-valued input patterns contaminated with mixed random impulse noise. We close by presenting in Section 5 our conclusions and comments on the research presented here.

Basic Lattice Matrix Algebra
The numerical operations of taking the maximum or minimum of two numbers, usually denoted as functions max(x, y) and min(x, y), will be written as binary operators using the "join" and "meet" symbols employed in lattice theory, i.e., x ∨ y = max(x, y) and x ∧ y = min(x, y) [1,2]. It was shown in [5,6] that the algebraic systems (R_−∞, ∨, +) and (R_+∞, ∧, +′) are semirings for the corresponding max-∨ and min-∧ operations and their respective additions, + and +′, over the sets R_−∞ = R ∪ {−∞} and R_+∞ = R ∪ {+∞}. Since we will consider only finite values of x and y, the sum operations in the aforementioned semirings coincide, that is, x +′ y = x + y. In what follows, if n is any positive integer, the index set I_n is an abbreviation for the list of numbers {1, 2, . . . , n}.
We use lattice matrix operations that are defined component-wise using the underlying structure of R_−∞ or R_+∞ as semirings. For example, the maximum of two matrices X, Y of size m × n is defined as shown in (1) for all i ∈ I_m and j ∈ I_n:

(X ∨ Y)_ij = x_ij ∨ y_ij . (1)

Inequalities between matrices are also verified elementwise, e.g., X ≤ Y if and only if x_ij ≤ y_ij for all i, j. On the other hand, the conjugate matrix X* is defined as −X^t, where X^t denotes the usual matrix transpose. Let X be a matrix of size m × p and Y be a matrix of size p × n. The max-sum of X and Y, denoted by X ▽ Y, is the matrix operation defined by (2) for all i ∈ I_m and j ∈ I_n:

(X ▽ Y)_ij = ∨_{ℓ=1}^{p} (x_iℓ + y_ℓj) . (2)

In similar fashion, the min-sum of X and Y, denoted by X △ Y, is the matrix operation given in (3):

(X △ Y)_ij = ∧_{ℓ=1}^{p} (x_iℓ + y_ℓj) . (3)
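As an illustrative sketch (not part of the original formulation), the component-wise maximum, the max-sum, the min-sum, and the conjugate can be realized with NumPy as follows; the function names are ours:

```python
import numpy as np

def max_sum(X, Y):
    """Max-sum X \u25bd Y of an (m, p) and a (p, n) matrix:
    entry (i, j) is the maximum over l of x_il + y_lj, cf. eq. (2)."""
    # Broadcast X as (m, p, 1) against Y as (1, p, n), reduce over p with max.
    return np.max(X[:, :, None] + Y[None, :, :], axis=1)

def min_sum(X, Y):
    """Min-sum X \u25b3 Y: entry (i, j) is the minimum over l of x_il + y_lj, cf. eq. (3)."""
    return np.min(X[:, :, None] + Y[None, :, :], axis=1)

def conjugate(X):
    """Conjugate matrix X* = -X^t."""
    return -X.T
```

The component-wise maximum of (1) is simply `np.maximum(X, Y)`; note that the max-sum and min-sum replace the sum-of-products of ordinary matrix multiplication by a max (or min) of sums.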
The minimax outer sum of vectors x = (x_1, . . . , x_n)^t ∈ R^n and y = (y_1, . . . , y_m)^t ∈ R^m is given by the m × n matrix

y × x^t, with i j-entry y_i + x_j . (4)

It is important to observe that the minimax outer sum displayed in (4) is a particular case of the max-sum and min-sum matrix operations since, for vectors, y ▽ x^t = y × x^t = y △ x^t. If X = {x^1, . . . , x^k} ⊂ R^n represents a multivariate data set, then a linear minimax combination of vectors from X is any vector x ∈ R^n of the form

x = ∧_{i∈I} ∨_{ξ∈I_k} (α_i^ξ + x^ξ) , (5)

where I is a finite index set and α_i^ξ ∈ R for all i ∈ I and all ξ ∈ I_k. The set of all such minimax combinations is called the real linear minimax span of X and is denoted by LMS_R(X). Also, given a real n-dimensional self-transformation T : R^n → R^n, a vector x ∈ R^n is called a fixed point of T if and only if T(x) = x (see, e.g., [6]). The previous definitions and equations are the necessary tools to discuss the theoretical background of lattice associative memories as explained in the next subsection.

Lattice Associative Memories
Let (x^1, y^1), . . . , (x^k, y^k) be k vector pairs with x^ξ = (x_1^ξ, . . . , x_n^ξ)^t ∈ R^n and y^ξ = (y_1^ξ, . . . , y_m^ξ)^t ∈ R^m for ξ ∈ I_k. For a given set of pattern associations {(x^ξ, y^ξ) | ξ ∈ I_k} we define a pair of associated pattern matrices (X, Y), where X = (x^1, . . . , x^k) and Y = (y^1, . . . , y^k). Thus, X is of dimension n × k with i, jth entry x_i^j and Y is of dimension m × k with i, jth entry y_i^j. To store k vector pairs (x^1, y^1), . . . , (x^k, y^k) in an m × n lattice associative memory we encode associations as in linear or correlation memories, but instead of using the linear outer product, the minimax outer sum is used as follows [3]. The (X, Y) min lattice associative memory, represented by W_XY, and the (X, Y) max lattice associative memory, denoted by M_XY, both of size m × n, are given, respectively, by the expressions

W_XY = ∧_{ξ=1}^{k} [ y^ξ × (−x^ξ)^t ] , with i j-entry w_ij = ∧_{ξ=1}^{k} (y_i^ξ − x_j^ξ) , (6)

M_XY = ∨_{ξ=1}^{k} [ y^ξ × (−x^ξ)^t ] , with i j-entry m_ij = ∨_{ξ=1}^{k} (y_i^ξ − x_j^ξ) . (7)

The expressions to the left of (6) and (7) are in matrix form and the right expressions give the i jth entry of the corresponding memory. We remark that, based on (4), for each ξ, y^ξ × (−x^ξ)^t is a matrix A^ξ of size m × n that memorizes or stores the association pair (x^ξ, y^ξ). Therefore, the retrieval of pattern y^ξ from pattern x^ξ can be expressed using either the min-memory W_XY or the max-memory M_XY as computed with the corresponding equation in (8),

W_XY ▽ x^ξ = y^ξ or M_XY △ x^ξ = y^ξ . (8)

We point out the important fact that W_XY operates on x^ξ using the max-sum and, dually, that M_XY operates on x^ξ using the min-sum. On the other hand, if X ≠ Y, then we speak of a lattice hetero-associative memory (LHAM) and, if X = Y, we obtain a lattice auto-associative memory (LAAM). In this paper we restrict our discussion to the canonical LAAMs, W_XX and M_XX, whose size is n × n. Also, in the remaining sections, without loss of generality, we will consider pattern entries to be non-negative numbers, i.e., x^ξ ∈ R_0^+ meaning x_i^ξ ≥ 0 for all i, ξ.
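The memory construction in (6)–(7) and the retrieval step in (8) can be sketched in NumPy as follows; this is an illustrative sketch in which each exemplar is a matrix column and the function names are ours:

```python
import numpy as np

def lam_min(X, Y):
    """W_XY of eq. (6): w_ij = min over xi of (y_i^xi - x_j^xi).
    X is n x k and Y is m x k; each column is one exemplar."""
    return np.min(Y[:, None, :] - X[None, :, :], axis=2)

def lam_max(X, Y):
    """M_XY of eq. (7): m_ij = max over xi of (y_i^xi - x_j^xi)."""
    return np.max(Y[:, None, :] - X[None, :, :], axis=2)

def recall_with_min(W, x):
    """W \u25bd x of eq. (8): i-th entry is max over j of (w_ij + x_j)."""
    return np.max(W + x[None, :], axis=1)

def recall_with_max(M, x):
    """M \u25b3 x of eq. (8): i-th entry is min over j of (m_ij + x_j)."""
    return np.min(M + x[None, :], axis=1)
```

For the auto-associative case used in the rest of the paper, `lam_min(X, X)` and `lam_max(X, X)` yield W_XX and M_XX of size n × n.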

Main Theoretical Results
The theorems listed here give a quick overview of the theoretical foundations of LAAMs that are of practical significance in applications; detailed mathematical proofs can be found in [3,6] or [7]. In most cases, statements given in the theorems refer only to the min-memory W_XX, since analogous properties for the max-memory M_XX are readily obtained using the principle of duality in algebraic lattices. LAAMs have unlimited storage capacity, the pattern domain can be binary or real valued, they give perfect recall for perfect input, and computation is performed in one step, free of any convergence problems [3]. In mathematical form:

Theorem 1 W_XX ▽ X = X, for any matrix X of size n × k whose columns are the exemplar pattern vectors.

Notice that the max-sum of W_XX with a vector x is an example of a lattice matrix transform defined by T_W(x) = W_XX ▽ x; therefore, Theorem 1 means that any exemplar pattern x^ξ for ξ ∈ I_k is a fixed point of T_W (complete perfect recall of all exemplar data). In addition, the recalled pattern x′ obtained in the first step is a fixed point for a second application of W_XX (one-step convergence). The fixed point set of the min-memory W_XX is defined as

F(X) = { x ∈ R^n | W_XX ▽ x = x } . (9)

The next theorem gives a complete algebraic characterization of the recall response of the LAAMs in terms of linear minimax combinations.

Theorem 2
The fixed point set F(X) of W XX and M XX is the same and is equal to LMS R (X).
Consequently, from (5) and (9), the results settled in Theorems 1 and 2 mean that, for any input vector x ∈ R^n, the recalled pattern x′ belongs to F(X). Hence, if α_i^ξ ∈ R for all i ∈ I_n and all ξ ∈ I_k, the following equation is valid:

W_XX ▽ x = x′ = ∧_{i∈I_n} ∨_{ξ∈I_k} (α_i^ξ + x^ξ) . (10)

Observe that the input vector x appearing in (10) can be an exemplar pattern x^ξ, a noisy version of it, denoted by x̃^ξ, or even a non-corrupted non-exemplar pattern. In addition, the right-hand side of (10) also means that W_XX has many stable spurious states. Although equation (10) conveys a deep and elegant result, in general it does not provide a practical criterion to enhance the recall performance of the canonical LAAMs for inputs corrupted by mixed noise.
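The fixed-point behavior stated in Theorems 1 and 2 is easy to check numerically. The following sketch (our own, with a small hand-picked data set) verifies perfect recall of exemplars, the stability of one spurious state built as a linear minimax combination, and one-step convergence:

```python
import numpy as np

def w_xx(X):
    # Min auto-associative memory, eq. (6) with Y = X.
    return np.min(X[:, None, :] - X[None, :, :], axis=2)

def t_w(W, x):
    # Lattice transform T_W(x) = W \u25bd x.
    return np.max(W + x[None, :], axis=1)

X = np.array([[0., 4.],
              [2., 1.],
              [3., 5.]])          # two exemplars x^1, x^2 in R^3
W = w_xx(X)

# Theorem 1: every exemplar is a fixed point of T_W (perfect recall).
for xi in X.T:
    assert np.array_equal(t_w(W, xi), xi)

# Theorem 2: a linear minimax combination of exemplars, e.g.
# x = (1 + x^1) \u2227 (-2 + x^2), is also a fixed point (a stable spurious state).
x = np.minimum(1.0 + X[:, 0], -2.0 + X[:, 1])
assert np.array_equal(t_w(W, x), x)

# One-step convergence: the recall of an arbitrary input is itself a fixed point.
y = t_w(W, np.array([1., 0., 7.]))
assert np.array_equal(t_w(W, y), y)
```

The same checks pass for M_XX by duality (replace min by max throughout).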
Missing parts, occlusions, or corruption of exemplar patterns can be considered as "noise". In particular, if alterations in pattern entries follow a probability law, then we speak of random noise. In addition, to specify the recall capabilities of LAAMs when presented with non-perfect inputs, noise can be classified into three types with respect to numerical ordering. Thus, let I = {1, . . . , n}; then a distorted version x̃ of pattern x has undergone an erosive change whenever x̃ ≤ x or, equivalently, if ∀i ∈ I, x̃_i ≤ x_i. A dilative change occurs whenever x̃ ≥ x or, equivalently, if ∀i ∈ I, x̃_i ≥ x_i. Let L, G ⊂ I be two non-empty disjoint sets of indices. If ∀i ∈ L, x̃_i < x_i and ∀i ∈ G, x̃_i > x_i, then the distorted pattern x̃ is said to contain mixed noise (random or non-random). Due to the preceding inequalities, eroded and dilated versions of a modified vector correspond, respectively, to lower and upper bounds of x̃.
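The three-way classification above reduces to two pointwise order checks; a small helper (ours, for illustration only) makes this explicit:

```python
import numpy as np

def noise_type(x_noisy, x):
    """Classify a distortion with respect to the component-wise order:
    erosive (x_noisy <= x), dilative (x_noisy >= x), mixed, or none."""
    goes_down = bool(np.any(x_noisy < x))   # some entry strictly decreased
    goes_up = bool(np.any(x_noisy > x))     # some entry strictly increased
    if goes_down and goes_up:
        return "mixed"
    if goes_down:
        return "erosive"
    if goes_up:
        return "dilative"
    return "none"
```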
Theorem 3 Let x̃^γ be a distorted version of pattern x^γ. Then W_XX ▽ x̃^γ = x^γ holds if and only if the component-wise conditions for erosive distortions given in [3,6] are satisfied for all i ∈ I_n; in particular, the noise must be erosive, x̃^γ ≤ x^γ, and for each i ∈ I_n at least one entry of x̃^γ must remain sufficiently uncorrupted. Similarly, M_XX △ x̃^γ = x^γ holds if and only if the dual conditions for dilative distortions are satisfied for all i ∈ I_n. The fundamental results stated in Theorem 3 and Corollary 4 give the conditions that guarantee perfect recall of a distorted exemplar corrupted by erosive or dilative noise when applying the min-memory W_XX or the max-memory M_XX. Additional theoretical discussions [7,11], useful commentaries and explanations using numerical examples [3,6], and a wide spectrum of computational experiments with different sets of binary or grayscale images [3,7,9,11] have demonstrated the surprising fact that W_XX and M_XX are quite sturdy in dealing, respectively, with erosive or dilative noise. However, both LAAMs fail for inputs corrupted with mixed noise, a limitation that has been emphasized several times in all enhanced models developed so far. Nevertheless, from a pragmatic point of view, a simple reasonable question comes to mind: if the canonical LAAMs have a firm mathematical foundation, give results of far-reaching applicability, and are quite robust to erosive or dilative distortions, is it possible to implement a mechanism that takes advantage of all their known properties and is therefore capable of dealing with mixed noise? The answer is affirmative, and we explain our proposal in detail in the following two sections.
To attain our objective we need to relax the concept of "perfect recall", based on an adequate error measure between a LAAM's recalled vector from a given noisy input and its corresponding associated exemplar. Here we use the normalized mean squared error (NMSE), denoted by ρ, to measure the proximity between two vectors [18]. In common practice, ρ returns values that agree more objectively with the perceived differences when comparing, for example, two one-dimensional signal graphs or two-dimensional grayscale images. Thus, the idea of "almost perfect recall" can be formalized in the following sense [12,13]: the min-W or max-M memories give almost perfect recall of an exemplar vector x^ξ from a noisy input version x̃^ξ for ξ ∈ I_k if and only if each LAAM satisfies the corresponding inequality given in (11),

ρ(W_XX ▽ x̃^ξ, x^ξ) ≤ ϵ and ρ(M_XX △ x̃^ξ, x^ξ) ≤ ϵ , (11)

where ϵ > 0 is a small number close to zero.
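The almost-perfect-recall test of (11) can be sketched as follows. Note that the exact normalization used for the NMSE in [18] is not reproduced here; we assume the common form ρ(u, v) = ||u − v||² / ||v||²:

```python
import numpy as np

def nmse(u, v):
    """Normalized mean squared error between a recalled vector u and its
    exemplar v; assumed form ||u - v||^2 / ||v||^2 (the reference [18]
    may use a slightly different normalization)."""
    return np.sum((u - v) ** 2) / np.sum(v ** 2)

def almost_perfect(u, v, eps=1e-3):
    """Almost perfect recall criterion of eq. (11): nmse(u, v) <= eps."""
    return nmse(u, v) <= eps
```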

Noisy Input Bounds for LAAMs
Based on the theoretical results given by Theorem 3 and Corollary 4 in Section 2 as well as the qualitative comments made in the last paragraph of the same section, the following two subsections present an alternative approach to the noise masking technique developed in earlier papers [12][13][14] and its improvements as described in [15][16][17].

Noise Lower-Upper Thresholding
Once a dataset of exemplar vectors is stored in a LAAM we can use either the min-memory W_XX or the max-memory M_XX as a data recognizer by association. However, before the recall stage, the data used as input can be altered due to different causes, such as missing, saturated, or changed values. Although data alteration may occur in a "structured" way because of periodic or deterministic variations, we will focus our attention on the more general case of additive random noise with varying degrees of impulsiveness. Thus, let x ∈ R_0^{+n} be a vector with entries x_i belonging to a bounded interval [a, b] of R_0^+ for all i ∈ I_n. Let ε ∈ (a, b); then a noisy version x̃ will have entries given by

x̃_i = x_i if i ∈ J , and x̃_i = x_i ± ε if i ∈ K , (12)

where J ∪ K = I_n, J ∩ K = ∅, and the constant ε > 0 is added or subtracted with equal probability, following a uniform probability density function on the unit interval (0, 1). The index subsets J and K correspond, respectively, to the unmodified and modified vector entries. Since n = |J| + |K|, the cardinalities of J and K determine the amount of noise present in x̃, and if X = (x^1, . . . , x^k) is the matrix of all exemplar vectors, then the data domain bounds, a = min(X) and b = max(X), are given by

a = ∧_{i,ξ} x_i^ξ and b = ∨_{i,ξ} x_i^ξ . (13)

For certain vector operations we will make use of the constant bound vectors, a and b, whose components are defined respectively by a_i = a and b_i = b for all i ∈ I_n. Under the assumption that the underlying dataset domain specified by [a, b] cannot be extended in a meaningful manner, the modified vector entries x̃_i given in (12) are clipped to [a, b]. Algorithm I lists the steps that generate additive impulsive random noise with variable offset (ε parameter) and variable amount of noise (p% parameter), where ε ∈ (a, b) and p ∈ (0, 100). Note that the arrow '←' designates an assignment operation.

Algorithm I-Impulsive Noise Generation (ING)
[Select an exemplar vector to simulate a noisy version] input exemplar x^ξ ∈ X where ξ ∈ I_k, dimension n, noise level p, and domain bounds a, b.
[Compute fraction q of unmodified vector entries in x^ξ] q ← 1 − p/100
[Alter vector entries by ±ε with equal probability] for i ∈ I_n : if rand(0, 1) > q then x̃_i ← x_i ± ε else x̃_i ← x_i
[Clip modified entries to the data domain] for i ∈ I_n : x̃_i ← (x̃_i ∨ a) ∧ b
output noisy vector x̃.

Here, a conditional assignment keeps an expression (non-clipped value) upon verification of a logical condition. Assuming that x̃ is a noisy vector obtained with Algorithm I, the proposed thresholding functions (14) and (15), defined component-wise, produce an eroded version ẽ or a dilated version õ of x̃:

ẽ_i = x̃_i if x̃_i ≤ τ_e , otherwise ẽ_i = a , (14)

õ_i = x̃_i if x̃_i ≥ τ_d , otherwise õ_i = b , (15)
where τ_e = ε − 1, τ_d = b − ε, and ε = (a + b)/2. In addition, the thresholds τ_e and τ_d should satisfy the complementary inequalities with respect to the data domain [a, b], i.e., a < τ_e ≤ ε − 1 and ε − 1 ≤ τ_d < b. A functional representation of (14) and (15) is written T_e(x̃, τ_e, a) and T_d(x̃, τ_d, b), where the threshold and domain bound parameters appear explicitly. By construction, it is not difficult to see that ẽ ≤ x̃ and õ ≥ x̃.
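Algorithm I and the thresholding functions T_e and T_d can be sketched as below. The sampling of the index set K and the random sign are implementation details we have chosen for illustration (the paper only requires them to be equiprobable):

```python
import numpy as np

rng = np.random.default_rng(7)

def ing(x, p, eps, a, b):
    """Algorithm I (ING): corrupt p% of the entries of x by +/-eps with
    equal probability, clipping results to the data domain [a, b]."""
    n = x.size
    m = int(round(n * p / 100.0))                 # |K|, number of modified entries
    idx = rng.choice(n, size=m, replace=False)    # index set K
    sign = rng.choice([-1.0, 1.0], size=m)        # +eps or -eps, probability 1/2
    x_noisy = x.astype(float).copy()
    x_noisy[idx] = np.clip(x_noisy[idx] + sign * eps, a, b)
    return x_noisy

def t_e(x_noisy, tau_e, a):
    """Eroded thresholding, eq. (14): entries above tau_e are pushed down to a."""
    return np.where(x_noisy <= tau_e, x_noisy, a)

def t_d(x_noisy, tau_d, b):
    """Dilated thresholding, eq. (15): entries below tau_d are pushed up to b."""
    return np.where(x_noisy >= tau_d, x_noisy, b)
```

By construction the outputs satisfy `t_e(...) <= x_noisy` and `t_d(...) >= x_noisy` component-wise, matching the bounds ẽ ≤ x̃ and õ ≥ x̃ stated above.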

Eroded and Dilated Input Residuals
Before a meaningful association using a LAAM can be established from a noisy exemplar input, a search must be performed to find the stored pattern that most resembles the given input, since for any significant amount of noise no prediction can be made in advance of which exemplar was altered. Thus, if x̃ is the given corrupted input, a similarity measure must be used to find ξ ∈ I_k such that x̃ ≈ x^ξ. Since the model we consider for impulsive noise depends on signed equiprobable values of ε, the index search uses the statistical linear correlation coefficient (Pearson's), calculated between the vectors x̃ and x^ξ, written cc(x̃, x^ξ). Therefore, an index γ ∈ I_k is selected such that

cc(x̃, x^γ) = ∨_{ξ∈I_k} cc(x̃, x^ξ) . (16)

Once the index value γ is found, the eroded and dilated residuals of the noisy exemplar, used as candidate inputs to the corresponding LAAM, are computed as

x̃_e = ẽ ∧ x^γ and x̃_d = õ ∨ x^γ . (17)

The vector operations in (17) are devised in order to guarantee that x̃_e ≈ x^γ and x̃_d ≈ x^γ. Also, using (13) to (16), it follows that x̃_e ≤ x^γ and x̃_d ≥ x^γ. Consequently, in the recall stage, x̃_e can be fed into the min-memory W_XX or, dually, x̃_d can be used as an adequate input for the max-memory M_XX. The corresponding output vectors are then given by y_e = W_XX ▽ x̃_e or y_d = M_XX △ x̃_d, and because of the robustness of each LAAM it is no surprise to expect that y_e ≈ x^γ ≈ y_d with low error (NMSE), as shown in Section 4. Another useful similarity measure is the cosine of the angle between vectors; however, if both vectors are displaced with respect to their means, the centered cosine, as it is known, turns out to be equivalent to the linear correlation coefficient.
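The index search of (16) and the residuation of (17) can be sketched as follows. The exact vector operation in (17) is taken here as the meet/join of the thresholded input with the candidate exemplar, which is one reading consistent with the inequalities x̃_e ≤ x^γ ≤ x̃_d; treat it as an assumption rather than the definitive formula:

```python
import numpy as np

def best_index(x_noisy, X):
    """Eq. (16): pick gamma maximizing Pearson's correlation cc(x_noisy, x^xi)
    over the exemplar columns of X."""
    cc = [np.corrcoef(x_noisy, X[:, j])[0, 1] for j in range(X.shape[1])]
    return int(np.argmax(cc))

def residuals(e, o, x_gamma):
    """Residuation as in (17) (assumed form): the eroded residual is the
    component-wise meet of the eroded thresholding e with the candidate
    exemplar x^gamma, and the dilated residual is the join of o with it.
    This guarantees x_e <= x_gamma <= x_d component-wise."""
    return np.minimum(e, x_gamma), np.maximum(o, x_gamma)
```

The ordering guarantee is what matters for the recall stage: x̃_e only carries erosive deviations from x^γ, so it is a suitable input for W_XX, and dually for x̃_d and M_XX.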

LAAMs Enhanced Retrieval Capability
Integrating the "almost perfect recall" criterion with noise thresholding and the generation of eroded or dilated residuals, a simple procedure applying equations (11) through (17) together with Algorithm I can be used to test the recall performance of either the min-W_XX or max-M_XX memory over a run of trials for different amounts of impulsive random noise with varying offset values. Hence, let m > 0 represent the number of trials realized during the recall stage, let h be the percentage increment for the amount of noise corrupting a vector, and let ρ̄ stand for the average NMSE computed over a run of trials. The mathematical pseudo-code shown in Algorithm II lists the steps required to test either LAAM by carrying out the following computational subtasks: 1) input noise generation, 2) correlation coefficient based index search, 3) lower or upper thresholding of a noisy input, 4) erosion or dilation residuation, 5) max-sum of W_XX with the eroded residual (resp., min-sum of M_XX with the dilated residual), and 6) average NMSE determination over a trial run for any amount of noise determined by h. In the code phrase 'expr1 | expr2', the symbol '|' means 'xor' (exclusive or).

Algorithm II-LAAMs Enhanced Recall (LAAM-ER)
[Test recall improvement over m trials of noisy inputs] input matrix X of exemplars, test vector index γ ∈ I_k, memory W_XX | M_XX, dimension n, noise offset ε, domain bounds a, b, number of trials m, noise increment h.
output results table T.

A dataset of 30 images will serve our aim to illustrate the retrieval performance of LAAMs. More specifically, each element in our dataset is a grayscale digital image of size 64 × 64 pixels whose values are integers belonging to the interval [0, 255]. Following a row-order scan of the underlying image data matrix, an equivalent column vector is obtained for each image. Thus, the exemplar image dataset can be represented by X = (x^1 . . . x^k), where k = 30 and the vector dimensionality n equals 4096. On the other hand, the min-W_XX and max-M_XX lattice associative memories that store all exemplar images are matrices of size 4096 × 4096. The computer tests consider three runs where the impulse noise offset value ε was set, correspondingly, to 128, 96, and 64. Each run goes through m = 50 trials for variable amounts of noise increasing in h = 5% steps. Hence, after applying Algorithm II to any exemplar vector in X, the output table T has 20 rows and 3 columns. A particular row gives, in column order, the % noise level, the average NMSE value, and the inequality check count over all trials.
The information depicted in Figs. 2, 3, and 4 has the following structure. The input image with a certain amount of impulsive noise appears on top (first row). The eroded thresholded input, the eroded residual, and the recalled image using the min-W_XX lattice memory are displayed, respectively, to the left, at the center, and to the right of the second row. Similarly, the dilated thresholded input, the dilated residual, and the recalled image using the max-M_XX lattice memory are shown to the left, center, and right of the third row. Note that, in this example, ẽ = T_e(x̃^2, 127, 0) and õ = T_d(x̃^2, 127, 255). Numerical results relative to Figs. 2, 3, and 4 are listed in Table 1.
As expected from the comments given in the last paragraph of Section 3, after (17), the hit counts of the vector inequalities ẽ ≤ x^γ and õ ≥ x^γ, as well as of the vector inequalities x̃_e ≤ x^γ and x̃_d ≥ x^γ, all sum up to m (the number of tested trials); i.e., the inequalities are satisfied no matter the amount of noise the input undergoes. Observe that the ε values equal to 64, 96, and 128 may be interpreted subjectively as "mild", "moderate", and "strong" variations of impulsiveness. In Table 1, the numerical results in lines 5 and 6 for data retrieval altered by moderate impulsive noise, as well as the values in lines 8 and 9 for recalling data modified by mild impulse noise, show the level of robustness achieved by the proposed mechanism using a single LAAM.

Figure 2. Top, noisy input x̃^2 (ε = 128, p = 25%); 2nd row (left to right), eroded thresholding ẽ, eroded residual x̃_e, and min-memory recalled output y_e; 3rd row, dilated thresholding õ, dilated residual x̃_d, and max-memory recalled output y_d.
The graphs in Figs. 5 and 6, relative to vector x^21, are an example showing an almost linear relationship between the amount of impulsive noise and the average relative error (NMSE) in recalling the corresponding vectors, respectively, with either the min-memory W_XX or the max-memory M_XX. More specifically, the recall capability displayed by the performance lines corresponding to ε = 96 and ε = 64, for any p ∈ [0, 100], verifies the partial results given in Table 1, showing again the low-order error attained during the retrieval stage by either LAAM enhanced with the proposed thresholding and residuation technique. Note that ρ̄ < 10⁻³ for W_XX whereas ρ̄ < 5 × 10⁻³ for M_XX. Also, if ε = 128, ρ̄ < 7.5 × 10⁻³ for W_XX whereas ρ̄ < 2.5 × 10⁻² for M_XX. Thus, in this case, data retrieval from noisy inputs using the min-memory W_XX is slightly better than using the max-memory M_XX. A similar recall performance for vector x^29 is depicted in Fig. 7, except that, for ε = 96, better retrieval is obtained with M_XX. Specific instances of strongly impulsive noisy versions of exemplar vectors x^21 and x^29, together with the outputs recalled with both LAAMs, are shown in Figs. 8 and 9. Also, numerical results relative to these figures are listed in Table 2. We remark that, for any exemplar in X tested with Algorithm II, the retrieval performance of either the min-LAAM or the max-LAAM in the presence of noisy inputs is quite similar to that of the example images used in our exposition.

Conclusions
Noise masking [12,13,14] was originally proposed to endow a single LAAM with recall capability for multivariate data degraded by random noise. Furthermore, the noise masking strategy has recently been used to good advantage to increase retrieval performance in max-plus projection auto-associative morphological memories and their fuzzy analogs [15,16,17]. In this paper, we have introduced a different computational method, based on eroded-dilated noise thresholds and eroded-dilated residuals, to increase the recall capability of a single LAAM in the presence of random impulsive noise. This technique performs better than similar approaches discussed elsewhere [9,10,19,20], which in turn outperform other well-known associative memory models such as the Hopfield continuous recurrent neural network or the Anderson-Kohonen linear correlation model, to name a few. Algorithm II involves simple arithmetical operations and works fast on vectors in high-dimensional spaces. As its name implies, impulsive noise alters a vector's data content the most and serves as an extreme model for data variation. Nonetheless, in many application scenarios, less aggressive noise probability densities, such as uniform or Gaussian, are more adequate for modeling data changes. Although not developed here, we point out that the proposed technique for LAAM pattern recall works the same if other noise models based on symmetric probability densities are used. Continuation of this research will focus on tests with different multivariate data sets, the use of asymmetric probability noise models, and the application of other vector metrics to further verify the robustness of our proposed model.