## Abstract

Signal processing is one of the most important elements of structural health monitoring. This paper documents applications of time-variant analysis for damage detection. Two main approaches, the time–frequency and the time–scale analyses are discussed. The discussion is illustrated by application examples relevant to damage detection.

## 1. Introduction

Structural health monitoring has grown over many years and is now a well-established and distinct area of theoretical and applied research on structural damage and its monitoring. It has become an essential part of maintenance and asset management activities in many areas of engineering. Once acceptable boundaries between safety and damage are established, there are two major tasks associated with these activities. Firstly, relevant parameters used for monitoring damage (detection and location) and subsequent damage accumulation need to be obtained. This task, often called diagnosis, is mostly related to analysis, identification, modelling and signal processing. Secondly, the remaining service life of structures needs to be assessed and possible breakdowns are predicted. This task, often called prognosis, falls into the fields of fatigue analysis, fracture mechanics, design assessment/reliability and statistical analyses.

Despite the fact that damage detection is often an inverse problem, and no universally applicable solution exists to all damage problems, the acquisition of knowledge with respect to diagnosis is enormous. Various damage detection techniques, approaches and methodologies have been developed over the past 30 years. Many recent studies in this area are related to new developments in advanced signal processing, as reported in Worden & Staszewski (2003) and Staszewski & Worden (2003). To date, most damage detection methods use Fourier analysis as the primary signal-processing tool. From the resulting spectra, modal properties or damage-sensitive features are extracted to detect change in the signal properties. However, often the most important aspect is in how these characteristics change in time. Examination of the time-invariant modal properties of a structure neither can typically provide information about whether a system is nonlinear, nor can it identify a sudden change in system properties. The non-stationary nature of measured signals produced by mechanical faults or structural damage suggests that time-variant procedures can be used to detect such faults as an alternative approach to the classical, Fourier-based methods. Recent years have shown an enormous progress in this area. All time-variant methods can be classified into three major groups: time-dependent models, time–frequency and time–scale analyses (Staszewski 1994).

Time-variant modelling can offer a powerful solution to many non-stationary problems, especially when combined with different filtering techniques. Many algorithms have been proposed to obtain time-dependent model coefficients which include adaptive algorithms and Kalman filter methods. Other solutions to the problem of non-stationary signals are given by the concept of time–frequency adaptive methods. A non-stationary signal can be considered as the sum of consecutive quasi-stationary segments for which stationary, classical Fourier-based methods can be applied. The majority of these methods are based on the short-time Fourier transform (FT), which uses a window function to analyse the signal locally. Another solution of the non-stationary problem is offered by a time–frequency distribution, which is a function giving a distribution of the total energy of the signal at particular time and frequency points. The Wigner–Ville (Wigner 1932; Ville 1948) distribution was the first one introduced and has been the most widely studied. The main use of a time–frequency approach for studying non-stationary signals is to analyse time variations of the spectral quantities. A different approach evolves if non-stationary signals are considered as a superposition of a number of components which are more or less localized. This can be done using signal decomposition based on *a priori* chosen functions. Thus, instead of a time–frequency representation, one deals with a time–scale representation. The first work in this area goes back to Gabor (1946) and Hoelström (1966) where shifted Gaussian discrete and continuous functions in the time and frequency domains were studied. If the time/frequency shift is replaced by a dilation or compression of scale, the time–scale decomposition leads directly to the wavelet transform. Although wavelets offer time–frequency analysis, the wavelet transform as a signal decomposition cannot be directly compared with any time–frequency representation, as explained in Mayer (1993). In this context, the continuous wavelet transform will always be considered as a time–scale and the Wigner–Ville distribution (WVD) as a time–frequency method in the mathematical community. This is why both approaches are discussed separately in the present paper.

In the context of knowledge- and signal-based damage detection, there are four major areas of interest where time-variant approaches are relevant. These are: signal pre-processing associated with the experimental data, system identification, feature extraction and analysis of nonlinearities.

The modest aim of the paper is to illustrate these areas. A comprehensive review is not intended in any way. The intention of the authors is to give a flavour of what wavelets can do for damage detection. The examples presented will range from literature material to recent results and applications. The choice of the approaches and examples is dictated by the personal preference of the authors. The scope of the paper does not allow for detailed analysis of illustrated examples. The reader is referred to appropriate literature for further reading.

The structure of the paper is as follows. Sections 2 and 3 illustrate examples of time–frequency and time–scale methods used for damage detection. The focus is on visual characteristics and combined time–frequency features. Section 4 shows how time–frequency characteristics can be used for system identification. This includes cross-sections, ridges and reassignment algorithms. The wavelet-based decomposition of signals and its application for denoising and feature extraction is given in §5. Section 6 demonstrates examples how wavelets can be used to analyse nonlinearities arising from damage and fault phenomena. This includes examples of damping estimation, some preliminary results related to input–output analysis and detection of singularities. Finally, the paper is concluded in §7.

## 2. Time–frequency analysis

The evolution of a signal *x*(*t*) in time can be described using time-domain analysis. This approach leads to various characteristics, such as time of duration, maximum amplitude or statistical parameters. Different properties and parameters are obtained when frequency analysis is performed. The FT *X*(*f*) of the signal not only gives information about the spectral composition but also on physical properties which depend on time. When the spectral content of the signal is changing in time neither the time nor the frequency domain is sufficient to accurately describe the signal properties. Clearly, the energy density |*X*(*f*)|^{2} becomes a function of time and a new signal representation *F*(*t*, *f*) is needed, being a function of both time and frequency.

An infinite number of distributions can be derived from the general class of time–frequency distributions given by Cohen (1966)(2.1)where *ϕ*(*θ*, *τ*) is an arbitrary function called the kernel. Through different choices of this kernel function, different time–frequency distributions may be found. The kernel function corresponds to the type of window used in window-based distributions and determines certain properties, particularly how the energy will be distributed in the time and frequency domains. One of the major differences between each time–frequency method is the way that it handles the problem of uncertainty. The uncertainty principle (Cohen 1995) states that one cannot simultaneously have good frequency resolution and good time resolution. Trying to obtain the highest levels of resolution without inducing spurious artefacts in the transformation is the goal of each method. In this section, three different methods for extracting a time–frequency map of a signal are reviewed: the short-time Fourier transform (STFT), the WVD and the Choi–Williams distribution.

To illustrate each method, the response of a 3 degree of freedom (d.f.) spring–mass system will be analysed (figure 1). The equations of motion for this system are(2.2)A form of damaged is induced into the system by allowing the stiffness of the second spring to decrease linearly in time once 3 s had past. This decrease in stiffness could represent the propagation of a crack through a plate. The parameters are defined as follows: *m*_{1}=*m*_{2}=*m*_{3}=1 kg, *k*_{1}=5400 N m^{−1}, *k*_{2}=2260−(300*t*, *t*>3) N m^{−1} and *k*_{3}=1000 N m^{−1}. The system is excited at the third d.f. using a random input. The system response at d.f. two, shown in figure 2, is calculated using the explicit fourth- and fifth-order Runge–Kutta formulae, and will be the signal analysed in the following subsections.

### (a) Short-time Fourier transform

The original approach for performing time–frequency analysis relies on the well-known FT. By applying the FT to sections of a signal, an assessment of the time-varying nature of the frequency spectrum can be found. Termed the windowed FT, it is performed by translating a window of a given length, *h*(*t*), along the signal of interest, *x*(*t*)(2.3)

The FT is then taken within the window, allowing for a measurement of the frequency content for that segment of the signal(2.4)

The translating window provides the means for breaking up the signal into sections as well as enforcing periodic assumptions. Common forms of this window include the Hamming, Hanning or Gaussian functions. The window function *w*(*t*) can be chosen in such a way that its FT, i.e. *W*(*f*), is also a window function. Thus, the windowed FT given by equation (2.4) is called a STFT. The width of the window is kept constant throughout the analysis and specifies the resolutions of time and frequency information that will be obtained. If a wide window is used, a finer resolution of frequency information is found, but with worse time resolution. Unfortunately, both good time and frequency resolution cannot be achieved. Therefore, the STFT tends to provide more accurate frequency information for signals that change very slowly in time.

If a Gaussian function is chosen as a window(2.5)where *α*>0, the optimal solution is obtained (i.e. the Gaussian function leads to optimal time and frequency resolutions) and the windowed FT is called the Gabor transform. One way of improving the time resolution is to use windows that overlap with one another. This will provide more time points in the time–frequency map and create a more smooth appearance, but cannot increase the resolution of any given point.

The function extracted using the STFT can be displayed in a two-dimensional time–frequency map, where at each time point a frequency spectrum is provided that is based on the averaged characteristics of the signal for the period of time within the window. This map is called a spectrogram and is computed by taking the square of the modulus of *F*_{t}(*ω*) for each window. The spectrogram for the example 3-d.f. problem discussed earlier is shown in figure 3. The STFT appears to do very well at picking out the three modal frequencies of the system and showing their variation in time past the 3 s threshold. Aspects of this plot to examine are the width of the frequency bands identifying the modes, the sharpness of the transition in time between constant frequency values and linearly varying ones, and the accuracy of the frequency values as compared with the instantaneous measurements shown in the plot. There is some uncertainty in the modal frequencies in this plot caused by the large width of the frequency bands. The time variation of the frequencies, on the other hand, is fairly accurate because the values are changing at a very slow rate.

### (b) Wigner–Ville distribution

The time–frequency resolution of the STFT is limited by the size of window that is used in the analysis. The WVD seeks to increase the limits of resolution by weighting the signal *f*(*t*) with time and frequency translations of itself, rather than a window function. The distribution is defined by taking the FT of the instantaneous auto-correlation function, as (Claasen & Mecklenbrauker 1980)(2.6)where *x*^{*} indicates the complex conjugate of *x*. The WVD is a measure of the overlap of the signal at a past time with the signal at a future time point. The drawback is that this creates interferences (cross terms) which show up as highly oscillating terms resulting from the superposition of separate spectral components of the signal, and can cause misinterpretation of the signal phenomena. These interferences can be removed by averaging, but will reduce the resolution.

Another drawback to the WVD is the decrease in frequency range over the STFT. When taking a FT, the resulting spectra have a periodicity equal to the rate at which the signal was sampled. The WVD distribution, however, has a periodicity equal to only half the sampling rate. To increase the frequency range of the representation, the corresponding analytic signal is first extracted using the Hilbert transform. By finding the WVD of the analytic signal, aliasing can be mitigated.

The WVD of the analytic signal corresponding to the response of the 3-d.f. system is shown in figure 4. There is a distinct improvement in the resolution of the three modal frequencies over the STFT. However, the cost is that cross terms have been induced into the time–frequency map. Two new frequencies are appearing, one at the average between the lower two modes and another at the average of the higher two modes. There is a third cross term at the average of the lowest and the highest modes, but it is difficult to see. Though the new frequencies appear to oscillate, their presence could cause difficulty in assessing what modes are present in the system from the time–frequency map. In addition, note the decreased visibility of the highest mode. The change in frequency of the modes is easily seen in this figure, though the exact point at which the modes transition from constant values is not as sharp.

### (c) Pseudo Wigner–Ville distribution

One of the ways to help alleviate the problems of cross terms in the WVD is to apply a window to the equation. The window function helps to smooth the distribution while emphasizing the signal around time *t*, provided that the window peaks at *τ*=0. This new distribution is called the pseudo Wigner–Ville distribution (PWVD) and is defined as(2.7)

The PSVD of the response of the 3-d.f. system is shown in figure 5. The presence of the cross terms does decrease compared with the WVD, but at the cost of a slight increase in the broadness of the modal frequency values.

### (d) Choi–Williams distribution

The window function used in the PSVD is just one choice for a kernel function in Cohen's general class of distributions. There are many other functions that can be used to try and smooth out the cross terms resulting from the WVD. If the kernel function in equation (2.1) is used as(2.8)the Choi–Williams distribution is obtained (Choi & Williams 1989). The value of *σ* in equation (2.8) specifies to what degree the cross terms will be attenuated. Increasing *σ* will not only increase the amount of attenuation, but also decrease the time–frequency resolution of the distribution.

The Choi–Williams distribution was used to examine the response of the 3-d.f. system. Figure 6 shows the Choi–Williams distribution of the response record. The attenuation of cross terms is better than seen with the PWVD, but there is a decrease in frequency resolution as compared with the WVD. The highest mode is also difficult to identify.

### (e) Summary

In this section, four different methods for computing the time–frequency distribution of a signal were investigated. It was seen that the most common approach, the STFT, provided a decent approximation to the frequencies present in the signal, and their changes. The WVD greatly improved the resolution to which the frequency values are known, but at the cost of the introduction of artefacts in the distribution. The pseudo Wigner–Ville and Choi–Williams distributions provided results somewhere in between. Both had less artefacts present than the Wigner–Ville transform, but with worse resolution. However, the resolution was not as poor as the STFT. The choice of which distribution to use is up to the individual, based on the trade-offs between artefacts and frequency resolution for his/her application of interest.

## 3. Time–scale analysis

The FT can be considered as a decomposition of a function into a linear combination of vectors given by Fourier coefficients. This decomposition does not give any local information about the function due to the infinite support of the trigonometric functions used in the analysis. One of the most recent, rapidly evolving methods which provides for locality is the wavelet transform. Wavelets have been gaining popularity since their first influx into mainstream mathematics, signal processing and engineering in the late 1980s.

Wavelets are a class of basis functions that have well-defined and specific properties that distinguish them from other basis functions. Signal decomposition based on wavelets is very similar to FTs, which use dilations of sinusoids as the bases. Since the sinusoids are not localized in time, the STFT must rely on a window function to obtain the desired time information from a signal. Wavelets, on the other hand, are localized both in time and frequency and can therefore be dilated and translated along a signal to extract both time and frequency information.

Wavelet analysis covers many diverse areas. Altogether the wavelet transform can be classified as continuous and discrete. In general, continuous wavelets are better for time–frequency analysis and discrete wavelets are more suitable for decomposition, compression and feature selection. However, the choice of wavelets is not always clear. The wavelets of Grossman–Morlet discussed in this section are the best known continuous wavelets. Daubechies wavelets used in §5 form probably the best known orthogonal wavelet basis. Biorthogonal wavelets (e.g. dual B-wavelets) are special cases of discrete wavelets. They consist of two wavelet sets: one for decomposition and another for reconstruction. Wavelets have even opened new ways of signal representation which are beyond the classical time–scale wavelet analysis. These approaches include: time–frequency wavelets (e.g. Malvar and harmonic wavelets), wavelet packets and matching pursuits, which have the ability to analyse a function in the combined time–frequency–scale domain. These developments are not discussed in this paper. The reader is referred to Mayer (1993), Chui (1992) and Mallat (1998) for more details related to wavelet analysis.

### (a) Continuous wavelet transform

The continuous wavelet transform, *W*_{ψ}(*u*, *s*), is obtained by convolving the signal *x*(*t*) with the translations *u* and dilations or scales *s* of a mother wavelet *Ψ*(3.1)where(3.2)

The frequency content of the wavelet transform is represented in terms of scales which are inversely related to frequencies. The squared amplitude of the continuous wavelet transform is therefore called the scalogram. Despite the time–scale nature of wavelets, the relationship between scales and frequencies can be found for most commonly used wavelet functions. This allows for time–frequency maps to be formed from scalograms.

Since the WT works in a similar manner to the STFT, by convolving the signal with a function that varies in both time and frequency, it suffers from similar limitations in the resolution of the time–frequency map. Both the transforms are confined by the uncertainty principle, which limits the area of a time–frequency atom in the overall time–frequency map. The biggest difference between the two transforms is that the atoms in the WT map are not a constant shape. In the lower frequencies, the atoms are wider, providing a better resolution in frequency and worse resolution in time, whereas in the upper frequencies the atoms are taller, providing better time resolution and worse frequency resolution. This variable resolution can be advantageous in the analysis of certain types of signals where low and high frequencies can bee seen at the same time in the combined time–frequency domain.

A Morlet wavelet (Grossman & Morlet 1985) defined as(3.3)was used to analyse the response signal from the 3-d.f. system. The Morlet wavelet is a complex-valued wavelet which provides information about both the magnitude and the phase of the signal. Complex-valued wavelets are more useful in the computation of modal frequencies. In contrast, real wavelets are often used to detect sharp signal transitions. Recent developments in this area include various modified wavelet functions which have tunable central frequencies and other useful time–frequency properties. This includes: shifted Morlet (Staszewski 1994), harmonic (Newland 1994), low oscillation (Addison *et al*. 2002) and other types of wavelet functions (e.g. Harrop *et al*. 2002) that allow sharp transitions to be detected. The wavelet transform of the 3-d.f. response signal in figure 7 demonstrates the variable nature of the wavelet time–frequency map. The lowest of the three modes has the best frequency resolution, but the worst time resolution.

## 4. Time–frequency features

Once a time–frequency map has been formed, a damage-sensitive feature must be chosen. These features and the corresponding statistical classification scheme must be developed on a case-by-case basis taking into account the specific properties of the structure being monitored and specific properties of the damage that is to be detected. The wide range of structures and damage types has led to a large field of research in extracting different damage-sensitive time–frequency features. This section shows examples of wavelet-based characteristics which can be used for damage detection.

### (a) Time–scale cross-sections

The square of the scalogram can be interpreted as an energy density distribution over the time–scale plane. Often time–frequency and time–scale characteristics show too much redundancy and do not exhibit significant features related to damage. One possible solution is to analyse specific areas of these characteristics. The wavelet power spectrum defined as(4.1)integrates the scalogram over time values to give a weighted average power spectrum, which has a different energy distribution if compared with the classical power spectrum.

Both spectra are constant time-bandwidth products but the time and the frequency windows involved in the analysis are different, i.e. both windows are almost the same at low frequencies, but significantly differ at high frequencies due to the nature of wavelets. This can be observed in figure 8, where the classical and the wavelet power spectra are presented for a ball-bearing vibration signal. Low-frequency components in the wavelet-based spectrum exhibit a damaged outer race in the bearing. The wavelet power spectrum is relatively smooth in higher frequencies when compared with the Fourier power spectrum. This is due to the fact that the wavelet frequency window analyses the signal over a much wider frequency range at high frequencies than the Fourier window.

The cross-sections of the scalograms for a given scale (frequency) value can be used to obtain damping estimates for linear systems as shown in Staszewski (1997), Ruzzene *et al*. (1997) and Lamarque *et al*. (2000) and shown in figure 9. Here, a 2-d.f. system with closely spaced modes (25 and 30 Hz) is analysed. The wavelet transform from the impulse response of the system is shown in figure 9. Damping ratios can be estimated from the slope of the straight line of wavelet modulus cross-section , for the given value of dilation (or frequency) *s*_{0}, as shown in figure 10 for the first (25 Hz) mode of vibration.

### (b) Wavelet reassignment

Just as the pseudo Wigner–Ville and Cohen distributions were proposed to decrease interference terms in the WVD, time–scale reassignment has been proposed to help improve the localization of the wavelet transform (Auger & Flandrin 1995; Chassande-Mottin 1998; Chassande-Mottin & Flandrin 2000). In fact, this method was first applied to the SFFT (Kodera *et al*. 1978).

Each of these methods seeks to improve the *readability* of the time–frequency map. In other words, the ability to understand precisely how the frequency spectrum is changing in time. In a method such as the wavelet transform or spectrogram, one is obtaining an estimation of the frequency content within a certain frequency bin size (frequency resolution) and within a certain time window (time resolution). The time–frequency atom is therefore displaying the averaged content of what is found in that time/frequency region. This averaged information is then assigned to the geometric centre of that atom. There is no indication of the importance of the individual components that went into the averaging procedure. The purpose of the reassignment technique is to compute which point contributed most for a given time–frequency atom. That point, which can be thought of as the centre of mass of the atom, is where the energy of the given atom is assigned. The result is the focusing of information to a more precise location of where it actually occurred.

The method works as follows. The energy distribution of the wavelet transform is defined as the square of the modulus, . However, like the other distribution presented earlier, it can also be derived through a smoothing of the WVD(4.2)

At each point where a scalogram value *S*(*u*, *s*) is computed, time–scale reassignment also computes the local centroid . The scalogram value is then moved from the point where it was computed to a new time–scale point deduced from this centroid. The resulting reassigned scalogram is defined as(4.3)with the centroid being given by(4.4)(4.5)where represents the time centroid and the frequency centroid. The frequency value can then be converted to a scale value using the following transform(4.6)where *ω*_{0} is the mean frequency of the wavelet energy density.

A more efficient algorithm is available for computing the centroid, involving the computation of two additional wavelet transforms (Chassande-Mottin & Flandrin 2000). Thus, compared with conventional scalograms based on a single continuous wavelet transform, the three continuous wavelet transforms required by reassigned scalograms significantly increase the computational cost of the method. However, fast algorithms are possible and can be used to help reduce the increased time costs. To show the effectiveness of this method, it has been applied to the continuous wavelet transform shown in figure 7. As shown in figure 11, reassignment results in a focusing of both the time and the frequency information in the time–frequency map, without any observable detriment to the representation.

### (c) Ridges and skeletons

Figure 7 shows that the energy of the analysed signal is mainly concentrated around three curves representing the modes of vibration. These curves exhibit the varying frequency of the analysed signal. The concept of ridges and skeletons of the wavelet transform, developed in Tchamitchian & Torresani (1992), is related to the problem of extracting the instantaneous amplitude and frequency content from a signal.

The asymptotic signal *x*(*t*), i.e. a signal that exhibits slowly varying amplitudes compared with phase variations, can be described using the analytic representation(4.7)where is the Hilbert transform of *x*(*t*) defined as(4.8)

If the wavelet *ψ*(*t*) is also asymptotic, the continuous wavelet transform of *x* can be represented using the scalar product(4.9)The above equation shows that the major contribution to the transform comes from the stationary points of the phase function(4.10)which are determined by the following relationship:(4.11)Functions *ϕ*_{x} and *ϕ*_{ψ} in equation (4.7) are the instantaneous phases of the signal and wavelet, respectively.These stationary points *t*_{s} are in fact functions *t*_{s}(*u*, *s*) of time and scale. For , the continuous wavelet transform *W*_{ψ}(*u*, *s*) can be expressed as (Tchamitchian & Torresani 1992)(4.12)where ‘^{*}’ denotes the complex conjugate and *C*(*u*, *s*) is a correction function. The important conclusion is that for a set of time–scale points (*u*, *s*) for which *t*_{s}(*u*, *s*)=*b*, the continuous wavelet transform *W*_{ψ}(*u*, *s*) behaves like the analytic signal *Z*_{x}(*t*). The curve *s*=*r*(*u*) of these points is called the ridge of the transform. Figure 12 gives an example of the ridge for the wavelet transform shown in figure 7. The values of the wavelet transform restricted to its ridge, i.e. for *s*=*r*(*u*), form the skeleton of the wavelet transform. From the ridges and skeletons, the signal can be reconstructed. One can also obtain information about the amplitude and frequency modulation of the signal, which is useful for the identification of non-stationary and nonlinear systems and damping estimation, as shown in Staszewski (1997, 1998) and illustrated in §6.

## 5. Discrete wavelet transform

The continuous wavelet transform transfers one-dimensional signals into two-dimensional time–scale maps. This approach is associated with the redundancy. The discrete wavelet transform limits the analysis to a subset of wavelet coefficients from the time–scale map in such a way that the original signal can still be recovered. Wavelet coefficients are widely used for damage detection. The major task is to relate features from selected coefficients to various types of damage. After a short introduction to the discrete wavelet transform, two approaches used in engineering applications are discussed.

### (a) Background

Discrete wavelets are often associated with dilation equations and scaling functions. The basic scaling function *ϕ*(*t*) can be defined as(5.1)where the values *c*_{n} are wavelet coefficients. These coefficients must satisfy certain conditions, which are discussed in Newland (1993). The scaling functions are then used to construct the corresponding wavelets *ψ*(*t*) from the following formula(5.2)

Any arbitrary signal *x*(*t*) can be represented as a weighted sum of wavelet functions following the expansion(5.3)

The discrete wavelet transform gives the span of the analysed signal at different resolutions according to wavelet scales. Wavelet functions can be constructed in a way such that they form a family of orthogonal bases. In contrast to the continuous wavelet transform, described in §3, orthogonal wavelets are concise and decompose functions without any redundancy. The major drawback of the analysis is the fact that the discrete wavelet transform based on orthogonal wavelets leads to translation variance which does not allow for localization of signal features.

### (b) Wavelet denoising

The presence of noise in the sensor data is a common problem in many applications. Analysing a signal with noise is not a trivial task and requires finding a basis which concentrates the signal energy over a few coefficients. By retaining only specific wavelet coefficients, undesired features can be removed from the signal. One particular application is the procedure of denoising.

Various procedures for the selection of wavelet coefficients are available. Often high-level wavelet coefficients are removed since they represent high frequencies attributed to the noise. Alternatively, a thresholding procedure can be applied within each level separately to track specific frequency bands. In Donoho & Johnstone (1994), a threshold at two standard deviations, i.e. , is used for denoising. The so-called optimal threshold value *c*_{t} introduced in Mallat (1998) is chosen as(5.4)

Often the attenuation of wavelet coefficients yields better denoising than coefficient selection. This requires the amplitude of all noisy coefficients that are above *c*_{t} to be decreased by *c*_{t}. The procedure, called a soft thresholding, uses the limit (Mallat 1998)(5.5)

The noise standard deviation can be estimated from a median measurement of the highest level wavelet coefficients as (Mallat 1998)(5.6)where *m*=log_{2}*N* and 0≤*n*<(*N*/2). The advantage of using the median rather than the mean is that it can isolate outliers of potentially high amplitudes.

Figure 13 shows an example of optical fibre data (Staszewski *et al*. 2000) denoised using the tenth-order Daubechies wavelet function (Daubechies 1992). Figure 14 shows the same sensor data with the noise removed using the thresholding procedure with the median filter.

### (c) Wavelet levels

The discrete wavelet transform is often performed using a unit time-interval, i.e. values of time for which 0≤*t*<1. This approach, called the circular wavelet transform, is similar to the fast Fourier transform (FFT) analysis. Wavelet functions are then wrapped around the unit interval. As a result, the wavelet decomposition of *x*(*t*) can be represented in this interval as (Newland 1993)(5.7)where the coefficients *a*_{0}, *a*_{1}, *a*_{2}, *a*_{3}, … give the amplitude of all contributing wrapped wavelets. The above equation shows that the analysed signal *x*(*t*) can be represented as a sum of so-called wavelet levels given by(5.8)

All levels are in fact reconstructed signals from the appropriate wavelet coefficients *a*_{k}. The levels are usually numbered from either −1 or 0. Each of the levels displays a different frequency band of the analysed signal and gives the contribution to the whole signal energy. Higher levels correspond to high frequencies whereas lower levels exhibit low frequencies of the signal. In contrast to the discrete wavelet transform, described in §5*a*, wavelet levels offer translation invariant analysis. This is the major advantage for damage feature analysis.

Figure 15 shows an example of wavelet levels for the ultrasonic Lamb wave data gathered from a composite plate. The original signal given at the top is accompanied by the eight highest wavelet levels. Level 4 clearly displays at 0.11 ms the reflection feature due to the delamination in the plate. This feature is not visible in the original signal.

## 6. Analysis of nonlinear systems

There are many cases where damage causes a structure to go from a system that can be accurately modelled as a linear system to a system which exhibits a nonlinear dynamic response. Nonlinear vibration from faulty machinery or damaged structures results from various phenomena. One of the most common examples of nonlinearities in system response is associated with the formation of fatigue cracks that open and close (crack breathing) during dynamic loading. These phenomena lead to parametric vibration and bilinear characteristics. Local tooth faults in gearboxes can lead to a varying nature of tooth stiffness. The loss of preload in bolted connections often results in a rattle exhibited by singularities in vibration signals. Since nonlinearities may result from damage in a system, the distinction between linear and nonlinear behaviour is an important problem in damage identification. Examples of these phenomena and solutions offered by wavelets are given in this section.

### (a) System identification

It is well known that many types of nonlinearities cause a varying nature for the restoring forces and natural frequencies of the system. This varying nature of vibration can be studied by means of wavelet analysis. A nonlinear single d.f. system can be represented by the general equation of the free decay as(6.1)where gives linear and nonlinear damping and restoring forces. The solution of the above system can be obtained in the form of slowly varying, time-dependent amplitude and phase functions (Kryloff & Bogoliuboff 1943)(6.2)Here, *A*(*t*) gives the exponentially decaying envelope which exhibits the damping of the system while describes the instantaneous frequency of the system. For example, it is well known that Coulomb friction leads to a linear decay envelope and constant instantaneous frequency. Natural frequency and damping coefficients are often represented as functions of the free vibration envelope. A graphical representation of vibration behaviour in the form of natural frequency versus free vibration envelope is called a backbone curve. Obviously, the backbone of a linear system does not depend on the envelope and is constant. A single d.f. nonlinear system with cubic stiffness, i.e. , shows the nonlinear dependency of the natural frequency on the free response envelope (Nayfeh 1978). Both the damping and backbone characteristics can be obtained from wavelet analysis, as shown in Staszewski (1997).

Following Staszewski (2000), a simple 2-d.f. system with cubic stiffness nonlinearity is analysed here. The differential equation which governs the system is given by(6.3)where *m*_{1}=*m*_{2}=1, *c*_{1}=0.02, *c*_{2}=0.01, *k*_{1}=2.0, *k*_{2}=1.2, *k*_{3}=70.0 and non-zero initial displacement condition *x*_{1}(0)=1.0. Figure 16 shows two vibration modes, with estimated frequencies of 0.22 and 0.33 Hz, in the wavelet transform and its ridge. The ridge detection procedure, employing the wavelet amplitude, was applied separately for each mode. Finally, figure 17 shows the backbone curve and exponentially decaying amplitude for the first mode of vibration; the frequency variation of the backbone curve clearly shows the cubic stiffness nonlinearity.

### (b) Input–output analysis

Input/output analysis has proven invaluable in approaching structural dynamic problems in many areas of mechanical engineering. This includes applications of cross-spectra, transfer functions and frequency response functions (FRFs). The non-stationary or nonlinear behaviour which may result from damage is either destructive on input/output functions or inappropriate to describe the input–output relationship.

The input–output relationship can be studied using the cross-wavelet analysis, as described in Kyprianou & Staszewski (1999) and Staszewski (2000). The cross-wavelet transform can be defined for the input *x*(*t*) and output *y*(*t*) of the system as(6.4)This transform enhances similarities between the wavelet transform of the input and output signals at different scales.

Following Staszewski (2000), a 3-d.f. lumped parameter system with a cubic stiffness nonlinearity, shown in figure 18, has been used to demonstrate the method. The values of parameters used in the simulations were: *m*_{1}=1 kg, *m*_{2}=2 kg, *m*_{3}=3 kg, *k*_{1}=2.0×10^{4} N m^{−1}, *k*_{2}=3.0×10^{4} N m^{−1}, *k*_{3}=1×10^{4} N m^{−1}, *k*_{4}=2×10^{4} N m^{−1} and *c*_{i}=0.001×*k*_{i} in N m^{−1} s^{−1} for *i*=1, 2, 3. Here, the only nonlinearity is the cubic stiffness between the ground and mass *m*_{1} which resulted in the additional force N. The mass *m*_{2} of the system was excited using a chirp signal *x*(*t*) which started at frequency 1 Hz and linearly increased to 300 Hz in 2.5 s. A fourth-order Runge–Kutta integration scheme was used to generate the responses from the system. The signal and wavelet sampling frequencies for this study were taken to be 2048 Hz each. The base angular frequency *ω*_{0} of the Morlet wavelet was chosen to be 1.75*π* rad s^{−1} to satisfy the wavelet admissibility condition. The cross-wavelet transform was calculated over 200 scales starting from a minimum scale value of 0.0025. The amplitude of the cross-wavelet transform between the output *y*_{2} and the chirp excitation *x* is shown in figure 19. The three-dimensional and contour plots of the cross-wavelet amplitude give the overall information about the analysed system in both the time and frequency domains; three modes of vibration can clearly be observed.

In addition to the cross-wavelet transform, the wavelet-based FRF can be also used for the input/output analysis of mechanical systems. The method is the natural extension of the classical FRF which is the ratio of the complex response spectrum to the input spectrum. The wavelet-based FRF can be defined as (Staszewski & Giacomin 1997)(6.5)Here, and are the wavelet transforms of the input and the output of the system, respectively, and ‘^{*}’ denotes the complex conjugate of the function. The wavelet-based FRF represents the ratio of output-to-input in the time–scale domain and thus fully characterizes time-variant physical systems. The assumption here is that all operations involving the impulse response function (IRF) and the FRF which hold for one dimension can be expanded to create the necessary framework for the two-dimensional time–frequency analysis.

### (c) Detection of singularities

Many types of faults and damages exhibit either an abrupt change, discontinuity or impulse in a signal, or the sudden shift of the signal's mean value to a different level. These features are often described as singularities and irregularities.

A singularity is defined in mathematics as a point at which the derivative vanishes, or is not unique (as in the case of a self-intersecting curve), whereas a regularity is a point at which a function is analytic, i.e. it is differentiable in its neighbourhood. Singularities and irregular features often carry important information about possible structural damage. However, in many cases, these signal characteristics are masked by the response of the system to higher amplitude ambient vibration sources.

Wavelet analysis can identify and extract features from a system's acceleration time-histories that indicate the presence of singularities and identify when they occurred. One of the most widely used features is based on the Lipschitz exponent (in mathematics also known as the Hölder exponent). The wavelet transform is used to obtain a time-based function by calculating the Lipschitz exponent at each time point in the signal. Since singularity points have no continuous derivatives, identification of these points is possible by finding times when the Lipschitz exponent suddenly drops to a value of 0 or below. These drops in the Lipschitz exponent are damage-sensitive features, as shown in Robertson *et al*. (2003). A statistical classifier can be used to quantify when changes in these features are significant. Recent studies show that multifractal spectra based on the Hölder exponent and their sensitivity to various conditions in rotating machinery can also be obtained (Peng *et al*. 2003).

The Lipschitz regularity is defined as follows. If a signal *x*(*t*) has a Lipschitz exponent *α* over [*a, b*], then there exists *A*>0 such that (Mallat 1998)(6.6)where |*WT*(*u*, *s*)| is the wavelet transform modulus of *x*(*t*). The exponent *α* can be calculated at a specific time point by finding the slope of the logarithm of the modulus at that time versus the logarithm of the scale vector *s*(6.7)

(6.8)

The Lipschitz exponent analysis is used to investigate the acceleration response of a structure subject to a harmonic base excitation. A physical representation of this structure is shown in figure 20. The non-symmetric bumpers cause the structure to exhibit a rattle during one portion of the harmonic excitation. Figure 21 shows the response of the structure at three excitation levels as measured by accelerometers mounted on the outer structure in the in-axis and off-axis directions. The rattle produced by these impacts is evident in the sensor measurements that are off-axis from the excitation. The short oscillations of increased magnitude in these measurements are indicative of the rattle. These same oscillations are not readily apparent in the in-axis data, particularly if one does not have the off-axis measurements for reference. The purpose of this application is to determine whether the proposed method for singularity detection can be used to identify the time when the rattle is occurring by looking only at the in-axis response measurements.

Extraction of the Lipschitz exponent was performed using the Morlet wavelet transform for all of the in-axis data, as shown in figure 22. The singularities associated with the rattle are clearly visible in this plot as sudden dips in the exponent value at each time they occur during the oscillatory cycles. To set a threshold for identifying discontinuities, the drop values from just the first 1800 data points of the low-level response signal were fit to an extreme value distribution (Worden 2002). These data points were considered to represent the normal, i.e. no damage, operating condition of the system. Any drop value exceeding a 99.7% confidence interval, based on the extreme value distribution, was labelled as a discontinuity. The identified discontinuities for the three acceleration responses are marked by circles in figure 22.

Lipschitz exponent analysis was successful for identifying discontinuities in this problem mainly due to the band-limited nature of the excitation. A low Lipschitz exponent indicates a region in a signal where there is energy spanning the entire frequency range. Events such as an impulse or rattle will induce a broadband frequency excitation to the structure. Therefore, if a broadband random excitation is used to excite the structure, impulse events are not as easily distinguishable using the Lipschitz exponent approach.

## 7. Conclusions

Time–scale and time–frequency methods have become important for many applications. The progress in this area is enormous; recent years have seen a number of different approaches, techniques and methodologies. A number of examples presented in this paper show that wavelet analysis forms a set of effective signal-processing techniques which not only complement, but also significantly improve damage detection results based on classical Fourier analysis. However, most of these applications are largely limited to academic research due to the complexity of the mathematical background and algorithms connected with the analysis. It appears that toolboxes of disparate techniques are required to facilitate further progress in structural health monitoring.

## Acknowledgments

Sincere thanks to Dr Gareth Pierce from Strathclyde University for permission to use Lamb wave responses.

## Footnotes

One contribution of 15 to a Theme Issue ‘Structural health monitoring’.

- © 2006 The Royal Society