On the use of the Mahalanobis squared-distance to filter out environmental effects in structural health monitoring

This paper discusses the possibility of using the Mahalanobis squared-distance to perform robust novelty detection in the presence of significant variability in a multivariate feature vector. The application of interest is vibration-based structural health monitoring, with a focus on data-based damage detection. For this application, the Mahalanobis distance can be used to detect novelty using a multivariate feature vector extracted at regular intervals from vibration measurements on a structure during its lifetime. One of the major problems is that changing environmental conditions induce large variability in the feature vector under normal conditions, which usually masks the smaller variations due to damage. In this paper, it is shown that including the variability due to the environment in the training data used to define the Mahalanobis distance results in very efficient filtering of the environmental effects while preserving the sensitivity to structural changes.


Introduction
Vibration-based Structural Health Monitoring (SHM) techniques have been around for many years and are still an active topic of research today. Despite this, very few industrial applications exist. Two major trends coexist in the field: model-based and data-based techniques. Model-based techniques are often sophisticated and require a high degree of engineering knowledge as well as more hardware and software resources. They have, however, more potential to cover all levels of SHM, from damage detection to damage prognosis. On the other hand, data-based techniques are appealing because they are very simple and require less engineering knowledge as well as limited hardware and software. From that point of view, they are ideal candidates for industrial applications. These methods are, however, generally limited to the lowest levels of SHM: damage detection and, in some cases, damage localisation.
Data-based damage detection techniques often consist of detecting a deviation from the normal condition using the data measured on the structure or system to be monitored. This paper focuses on the use of vibration data collected at regular time intervals. A further step consists of feature extraction, i.e. transforming these time series into meaningful quantities, usually called features (the most common being the mode shapes and eigenfrequencies). The stochastic nature of the excitation and the unavoidable noise added to the measured data result in features which are themselves stochastic. This fact being recognised, it is natural to turn to statistical methods to monitor them and detect any significant deviation from the normal condition. The three basic elements of data-based damage detection are therefore (i) a permanent sensor network, (ii) an automated procedure for real-time feature extraction, and (iii) a robust novelty detector. The first element has received much attention in the last decade, and the enormous advances in sensors and instrumentation make it possible to deploy very large sensor networks on structures and gather the measured data in central recording units at high sampling rates. The second element is still a challenge today and, for the most widely used features (eigenfrequencies and mode shapes), is an active topic of research [1]. An alternative is to look at other features which can easily be extracted from the time-domain data. Several efforts have been made in that direction, such as the use of Hankel matrices based on covariances [2], or peak indicators in the frequency output of modal filters [1]. For the third element, different approaches have been borrowed from statistics, such as the use of control charts [3], outlier analysis using the Mahalanobis squared-distance [4] (which is similar to Hotelling $T^2$ control charts for individual measurements), or hypothesis testing [2].
This paper deals with the third element of the data-based damage detection system and focuses on the use of the Mahalanobis squared-distance. An interpretation of this distance is given by performing an eigenvalue decomposition of the covariance matrix used to compute it. This leads to a decomposition into a sum of independent components corresponding to different directions in the feature vector space. The eigenvalues corresponding to these directions are used to split them into two sets. The set corresponding to the largest eigenvalues is shown to have very little influence on the Mahalanobis squared-distance, while the other set has a major contribution. An interesting application is the possibility to filter out variability due to environmental or operational conditions by including this variability in the samples used for the computation of the covariance matrix. By doing so, the subspace containing the variability corresponds to the directions in the first set, with the largest eigenvalues, and the Mahalanobis distance is almost insensitive to it. This idea is illustrated on the example of a wooden bridge in which the modal data varies significantly due to environmental conditions. A structural change is induced in the form of an added mass, and the results show that by including the environmental variability in the computation of the covariance matrix, the Mahalanobis distance filters it out effectively while keeping a high sensitivity to the damage.

The Mahalanobis squared-distance
Consider a set of $N$ feature vectors $y_i$ ($i = 1 \ldots N$) of dimension $n$, representing $N$ samples of the "healthy state" of a structure, from which the mean vector $\bar{y}$ of size $n \times 1$ and the covariance matrix $[C]$ of size $n \times n$ can be estimated.
The multivariate feature vectors correspond to the features extracted from the vibration measurements, such as a set of eigenfrequencies, mode shapes, FRFs or transmissibility functions at given frequencies, etc.
The principle of outlier analysis [4] is, for each sample $y_\zeta$ of the multivariate feature vector, to compute the Mahalanobis squared-distance given by:

$$D_\zeta = \left( y_\zeta - \bar{y} \right)^T [C]^{-1} \left( y_\zeta - \bar{y} \right)$$

By computing $D_\zeta$ for all the data in the training set used to compute $[C]$, it is possible to set a threshold. If a new sample $y_\zeta$ of the feature vector results in a value of $D_\zeta$ above this threshold, it is considered an outlier.
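As an illustrative sketch (not the authors' implementation), the outlier-analysis procedure can be written in a few lines of NumPy; the function name, the synthetic training data and the percentile-based threshold rule are all assumptions made here for demonstration:

```python
import numpy as np

def mahalanobis_sq(Y_train, y_new):
    """Mahalanobis squared-distance of y_new from the training set Y_train.

    Y_train: (N, n) array of N healthy-state feature vectors.
    y_new:   (n,) feature vector to score.
    """
    mean = Y_train.mean(axis=0)
    C = np.cov(Y_train, rowvar=False)        # n x n sample covariance matrix
    d = y_new - mean
    return float(d @ np.linalg.solve(C, d))  # d^T [C]^{-1} d, without forming the inverse

# Threshold set from the scores of the training data themselves
rng = np.random.default_rng(0)
Y = rng.normal(size=(500, 4))                # stand-in for 500 healthy feature vectors
scores = np.array([mahalanobis_sq(Y, y) for y in Y])
threshold = np.percentile(scores, 99)        # e.g. flag the top 1 % of scores as outliers
```

A new sample is then declared an outlier whenever `mahalanobis_sq(Y, y_new) > threshold`.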

Spectral decomposition
In most cases, the features in the data vector are not independent, so that the covariance matrix is not diagonal. It is, however, possible to transform the feature vector in order to diagonalise the covariance matrix. This is done by computing the eigenvectors $\{U_i\}$ and eigenvalues $\sigma_i^2$ of $[C]$:

$$[C]\{U_i\} = \sigma_i^2 \{U_i\}$$

The orthogonality properties are given by:

$$[U]^T [U] = [Id], \qquad [U]^T [C] [U] = [S]$$

where $[U]$ is the matrix whose columns contain all the eigenvectors, $[S]$ is a diagonal matrix containing the eigenvalues $\sigma_i^2$ in descending order on the diagonal, and $[Id]$ is the identity matrix. The spectral decomposition of the covariance matrix is given by:

$$[C] = [U][S][U]^T = \sum_{i=1}^{n} \sigma_i^2 \{U_i\}\{U_i\}^T$$

and the spectral decomposition of the inverse of the covariance matrix by:

$$[C]^{-1} = [U][S]^{-1}[U]^T = \sum_{i=1}^{n} \frac{1}{\sigma_i^2} \{U_i\}\{U_i\}^T$$

Assume now the following transformation:

$$\eta = [U]^T y$$

The mean and covariance matrix estimated from the $N$ transformed samples $\eta_i$ ($i = 1 \ldots N$) are given by:

$$\bar{\eta} = [U]^T \bar{y}, \qquad [C_\eta] = [U]^T [C] [U] = [S]$$

Using the orthogonality condition, one sees directly that the covariance matrix of $\eta$ is diagonal and that the standard deviation of each component $\eta_i$ is given by $\sigma_i$. Using the inverse transformation $y = [U]\eta$, the Mahalanobis squared-distance reduces to:

$$D_\zeta = \sum_{i=1}^{n} \frac{\left( \eta_{\zeta i} - \bar{\eta}_i \right)^2}{\sigma_i^2}$$

This shows that the Mahalanobis distance can be decomposed into a sum of independent contributions from each component of the transformed variables $\eta_{\zeta i} = \{U_i\}^T y_\zeta$. The contributions are weighted by the inverses of the associated eigenvalues $\sigma_i^2$, which can be interpreted as the variances of the new, transformed variables. If the variance is large, the contribution to the distance is small.
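The equivalence between the direct quadratic form and the sum over transformed components can be checked numerically. The sketch below (illustrative, with synthetic correlated data) uses NumPy's `eigh`, which is appropriate for the symmetric covariance matrix; note that `eigh` returns the eigenvalues in ascending rather than descending order, which does not affect the sum:

```python
import numpy as np

rng = np.random.default_rng(1)
Y = rng.normal(size=(300, 5)) @ rng.normal(size=(5, 5))  # correlated features
mean = Y.mean(axis=0)
C = np.cov(Y, rowvar=False)

# Eigendecomposition of the symmetric covariance: [C] = [U][S][U]^T
sig2, U = np.linalg.eigh(C)                  # eigenvalues sigma_i^2 (ascending)

y = Y[0]
# Direct Mahalanobis squared-distance
D_direct = (y - mean) @ np.linalg.inv(C) @ (y - mean)
# Sum of independent contributions eta_i = U_i^T y, each weighted by 1/sigma_i^2
eta = U.T @ (y - mean)
D_spectral = np.sum(eta**2 / sig2)
```

Both quantities agree to machine precision, confirming the decomposition above.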

Filtering of the environmental effects
In many cases, when the number of features is large enough, the total variability in the feature vector extracted from the healthy condition can be explained by a smaller number of transformed features, usually called the principal components. Strictly speaking, this occurs when some of the eigenvalues of $[C]$ are equal to zero. The associated eigenvectors form the null-space (or kernel) of the training data. In practice, due to noise and numerical precision issues, the eigenvalues are not strictly equal to zero, but a significant drop in the eigenvalues can be observed and is used to define the number of principal components which account for most of the variability. An effective null-space is defined by putting a threshold on the eigenvalues, assuming that the eigenvalues below this threshold are non-zero only due to the noise in the training data. In the following, 'null-space' or 'kernel' refers to this effective null-space or kernel. A practical way to determine the number $p$ of vectors in the principal subspace is to define the following indicator:

$$I(p) = \frac{\sum_{i=1}^{p} \sigma_i^2}{\sum_{i=1}^{n} \sigma_i^2} \times 100$$

and to determine $p$ as the lowest integer such that $I(p) > e$, where $e$ is a threshold value (e.g. 99.9 %). The meaning of this threshold is as follows: $p$ principal components are needed in order to explain $e\%$ of the variance in the observed data. Assume that these $p$ principal components have been identified. For a new sample of the feature vector $y_\zeta$, the Mahalanobis distance can be decomposed into two parts:

$$D_\zeta = D_{1\zeta} + D_{2\zeta} = \sum_{i=1}^{p} \frac{\left( \eta_{\zeta i} - \bar{\eta}_i \right)^2}{\sigma_i^2} + \sum_{i=p+1}^{n} \frac{\left( \eta_{\zeta i} - \bar{\eta}_i \right)^2}{\sigma_i^2}$$

where $D_{1\zeta}$ is the Mahalanobis distance of $y_\zeta$ projected on the principal components, and $D_{2\zeta}$ is the Mahalanobis distance of $y_\zeta$ projected on the null-space of the principal components.
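The selection of $p$ from the cumulative-variance indicator and the resulting split $D_\zeta = D_{1\zeta} + D_{2\zeta}$ can be sketched as follows; the synthetic low-rank-plus-noise data is an assumption made here to mimic a feature vector dominated by a few directions:

```python
import numpy as np

rng = np.random.default_rng(2)
# Training data with 2 dominant directions in a 6-D feature space plus weak noise
latent = rng.normal(size=(400, 2)) * np.array([10.0, 5.0])
Y = latent @ rng.normal(size=(2, 6)) + 0.01 * rng.normal(size=(400, 6))

C = np.cov(Y, rowvar=False)
sig2, U = np.linalg.eigh(C)
sig2, U = sig2[::-1], U[:, ::-1]             # reorder eigenvalues in descending order

# Lowest p such that the indicator I(p) exceeds the threshold e (here 99.9 %)
I = np.cumsum(sig2) / np.sum(sig2) * 100.0
p = int(np.argmax(I > 99.9)) + 1

# Split the distance of one sample into principal-subspace and null-space parts
eta = U.T @ (Y[0] - Y.mean(axis=0))
D1 = np.sum(eta[:p] ** 2 / sig2[:p])         # projection on the principal components
D2 = np.sum(eta[p:] ** 2 / sig2[p:])         # projection on the (effective) null-space
```

With this construction the drop in the eigenvalue spectrum occurs after the two dominant directions, so the indicator selects a small $p$ and the two partial distances sum to the full $D_\zeta$.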
Assume now that very large variability exists in the feature vector extracted from the healthy condition due to environmental effects. If this variability is more important than other sources such as noise, it will belong to the set of the first $p$ principal components. Because the Mahalanobis distance scales each independent component by the inverse of its variance, the distance will have a very low sensitivity to the environmental changes. By including feature vectors measured in all possible environmental conditions in the computation of the covariance matrix, the Mahalanobis distance is therefore made insensitive to the environmental conditions. This idea is demonstrated in the next section on a laboratory experiment.
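This filtering effect can be demonstrated on a small synthetic model (entirely illustrative, not the bridge data of the next section): healthy features drift strongly along one "environmental" direction, and a small "damage" shift is applied along a different axis. Because the training set spans the environmental drift, its direction receives a large variance and is down-weighted, while the small shift falls in low-variance directions and stands out:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
env_dir = np.ones(n) / np.sqrt(n)            # direction of the environmental drift
noise = 0.01                                 # small measurement noise level

def sample(m, shift=0.0):
    """m feature vectors: large environmental variability along env_dir,
    small measurement noise, and an optional small 'damage' shift."""
    env = rng.normal(size=(m, 1)) * env_dir  # environmental effect, std ~ 1
    dmg = np.zeros(n)
    dmg[0] = shift                           # structural change along another axis
    return env + noise * rng.normal(size=(m, n)) + dmg

def msd(Y_train, Y_test):
    """Mahalanobis squared-distance of each row of Y_test from Y_train."""
    mean = Y_train.mean(axis=0)
    Cinv = np.linalg.inv(np.cov(Y_train, rowvar=False))
    d = Y_test - mean
    return np.einsum('ij,jk,ik->i', d, Cinv, d)

# Training set spanning the full environmental variability
Y_train = sample(1000)
D_env = msd(Y_train, sample(200))              # new healthy samples, new environments
D_dmg = msd(Y_train, sample(200, shift=0.05))  # small structural change
```

Even though the environmental drift (standard deviation ~1) is far larger than the damage shift (0.05), the damaged samples score markedly higher than the healthy ones.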

Application: wooden bridge
Consider the wooden bridge shown in Figure 1, equipped with a monitoring system, previously developed and investigated in [5]. The total mass of the bridge is 36 kg. A random excitation was applied using an electrodynamic shaker, and output-only acceleration measurements were collected at 15 different locations. Mode shapes and eigenfrequencies were extracted from the measurements using output-only stochastic subspace identification. The monitoring was performed over several days, during which the modal properties varied significantly due to temperature and humidity changes. Out of the 16 modes identified, only modes 6-8, 10 and 12-16 are used in this study. The first 5 modes correspond to rigid body modes, while modes 9 and 11 were not found consistently in all datasets, which is the reason for discarding them. In total, 1880 measurements were performed on the undamaged structure under changing environmental conditions. Samples 1881 to 1985 correspond to the incremental addition of a local mass of 23.5 g, 47 g, 70.5 g, 123.2 g and 193.7 g starting at samples 1881, 1901, 1924, 1946 and 1966 respectively, which is considered as the damaged condition here. Finally, the local mass was removed (samples 1986-2008) to return to the undamaged condition. Figure 2 shows the variation of natural frequency 6 with respect to the sample number. It is clear that the environmental conditions are responsible for frequency changes an order of magnitude larger than those due to the damage (simulated here with an added mass).
The feature vector is made of the 9 natural frequencies and the 9 mode shapes measured at 15 locations. The mode shapes are complex and normalised with respect to their first component, so that 14 × 2 values are used in the feature vector for each mode shape, resulting in a feature vector $y$ of dimension 261. Each component of the feature vector is then normalised with respect to the mean and standard deviation computed on the undamaged samples (1-1880). Figure 3 shows the evolution of the Mahalanobis distance computed for each of the 2008 samples when considering the first 300, the first 1000 and all 1880 healthy samples for the computation of the covariance matrix. It is clear from the results that the Mahalanobis distance acts as a filter for the environmental variations which have been included in the computation of the covariance matrix. When using only the first 300 or 1000 samples, the value of the Mahalanobis distance for some of the healthy samples (not used for the computation of the covariance matrix) is of the same order of magnitude as for the samples with the structural change; in this case, the novelty detection is hindered by the environmental changes. When including all the variability from the environment in the computation of the covariance matrix, robust novelty detection is achieved: the Mahalanobis distance is insensitive to environmental changes and very sensitive to structural changes.

Fig. 3. Evolution of the Mahalanobis squared-distance with respect to sample number when the covariance matrix is computed with 300, 1000 and 1880 (all) healthy samples.
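The assembly of the 261-dimensional feature vector described above can be sketched as follows; the function and the synthetic modal data are illustrative stand-ins, not the authors' code:

```python
import numpy as np

def feature_vector(freqs, shapes):
    """Stack 9 natural frequencies and 9 complex mode shapes at 15 locations.
    Each shape is normalised to its first component, so the 14 remaining
    complex values contribute 14 x 2 real features per mode."""
    parts = [np.asarray(freqs, dtype=float)]
    for phi in shapes:                       # phi: complex array of length 15
        phi = np.asarray(phi) / phi[0]       # normalisation w.r.t. first component
        parts.append(phi[1:].real)
        parts.append(phi[1:].imag)
    return np.concatenate(parts)

# Synthetic stand-ins for the identified modal data
freqs = np.linspace(10.0, 100.0, 9)
shapes = [np.exp(1j * np.linspace(0.0, 1.0, 15)) for _ in range(9)]
y = feature_vector(freqs, shapes)            # dimension 9 + 9 * 14 * 2 = 261
```

Each component of such vectors would then be standardised using the mean and standard deviation of the undamaged samples before computing the Mahalanobis distance.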

Conclusion
The Mahalanobis squared-distance is often used to perform novelty detection. By performing an eigenvalue decomposition of the covariance matrix used to compute that distance, we have shown that the Mahalanobis distance can be written as the sum of independent terms which result from a transformation from the feature vector space to a space of independent variables. In general, especially when the size of the feature vector is large, there are dominant eigenvalues and eigenvectors associated with the covariance matrix, so that a set of principal components can be defined. Because the associated eigenvalues are high, their contribution to the Mahalanobis distance is low, while the contribution of the other components is high due to the low values of the associated eigenvalues. This analysis shows that the Mahalanobis distance naturally filters out the dominant variability in the training data. This property can be used to remove the effect of the environment in damage detection. The concept has been demonstrated on real data from a wooden bridge, for which the feature vector consists of eigenfrequencies and mode shapes collected under changing environmental conditions, as well as under damaged conditions simulated with an added mass. The results confirm the ability to filter out environmental effects while keeping a high sensitivity to structural changes.