## Abstract

An overview of the principles and techniques of time-series methods for fault detection, identification and estimation in vibrating structures is presented, and certain new methods are introduced. The methods are classified, and their features and operation are discussed. Their practicality and effectiveness are demonstrated through brief presentations of three case studies pertaining to fault detection, identification and estimation in an aircraft panel, a scale aircraft skeleton structure and a simple nonlinear simulated structure.

## 1. Introduction

*Fault detection, identification* (*localization*) *and* (*magnitude*) *estimation* (collectively referred to as *fault diagnosis* or as *fault detection and identification*—*fdi*) in vibrating structures, such as aerospace and mechanical structures, marine structures, buildings, bridges and offshore platforms, are of paramount importance for reasons associated with proper operation, maintenance and safety. Furthermore, they become crucial for the assessment of ageing infrastructure.

For these reasons, a number of non-destructive fault detection techniques have been developed over the past several years (see Braun 1986; Doherty 1987; Doebling *et al*. 1996, 1998; Natke & Cempel 1997; Salawu 1997; Zou *et al*. 2000; Balageas 2002; Boller & Staszewski 2004; Staszewski *et al*. 2004). Most are ‘local’, requiring access to the vicinity of the suspected fault location; furthermore, they are typically time consuming and costly. They are usually based upon radiography, eddy current, acoustic, ultrasound, magnetic and thermal field principles.

In recent years, significant attention has been paid to fault detection via *vibration-based methods* (Doebling *et al*. 1996, 1998; Salawu 1997; Zou *et al*. 2000; Farrar *et al*. 2001). These appear as particularly promising and offer a number of potential advantages, such as no requirement for visual inspection, ‘automation’ capability, ‘global’ coverage (in the sense of covering large areas of the structure) and capability of working at a ‘system level’. Furthermore, they tend to be time effective and less expensive than most alternatives. The fundamental principle upon which vibration-based methods are founded is that small changes (faults) in a structure cause behavioural discrepancies in its vibration response. The goal thus is the reliable detection of such discrepancies in a structure's vibration response and their precise association with a specific cause (fault of specific type and magnitude).

*Time-series methods* form an important category within the broader vibration-based family of methods. The term *time-series* has originated in statistics (see Box *et al*. 1994) and refers to a time-ordered sequence of random (stochastic) scalar or vector observations (*random signal*(*s*)). Within the present context, these include the excitation and/or vibration response of a given structure. Time-series analysis uses *statistical* tools for developing (identifying and estimating) mathematical models describing one or more measured random signals and analysing their observed and future behaviour. By their own nature, time-series methods account for *uncertainty*, while fundamentally they are of the *inverse type*, in that the models developed are *data based* rather than physics based (although the former are obviously related to the latter and the underlying physics).

Time-series methods for fault detection and identification in vibrating structures offer a number of potential advantages over alternatives. These include the following.

No requirement for physical or finite element models.

No requirement for complete structural models; in fact, they may operate on partial models with a limited number of measurable excitation and/or response signals.

Inherent accounting for (measurement, environmental and so on) uncertainty through statistical tools.

Statistical decision making with specified performance characteristics.

On the other hand, as complete structural models are not employed, time-series methods may identify (locate) a fault only to the extent allowed by the type of model used. Of course, the methods may also be combined with complete structural models; this subject will not, however, be treated in the present paper.

The *goal* of this paper is a concise and tutorial overview of the principles and techniques of Gaussian time-series methods for fault detection, identification (localization) and estimation in vibrating structures. The paper does not attempt to provide an exhaustive survey of the literature, as such. Certain new results and methods are also briefly presented. The methods are classified, and their features and operation are discussed. Their practicality and effectiveness are demonstrated through brief presentations of three case studies pertaining to fault detection, identification and estimation in an aircraft panel, a scale aircraft skeleton structure and a simple nonlinear simulated structure.

The rest of the paper is organized as follows: the problem statement is presented in §2 and the structure of time-series methods in §3; an overview of time-series model representations is given in §4; non-parametric methods are presented in §5 and parametric methods in §6; three case studies are presented in §7; and concluding remarks are summarized in §8.

## 2. Problem statement

Let *Σ*_{o} designate the vibrating structure under study in its *nominal* (healthy) state. In addition, let *Σ*_{A}, …, *Σ*_{D} designate the structure under *fault types* (*fault modes*) A, …, D, respectively. Each fault type includes a continuum of faults (each one of distinct magnitude), characterized by common nature (for instance, damage of various possible magnitudes to a specific structural element). The structure under each such fault is designated as *Σ*_{X}^{k}, with the subscript X designating fault type and the superscript k fault magnitude. The symbol *Σ*_{X}^{k} is also used for designating the fault itself.

Over the course of its service life, the structure is assumed to be periodically inspected for faults (for instance, following a major loading or the completion of a certain operation cycle). It is at such times that fault diagnosis is sought. This is to say that the *off-line* problem (as opposed to the online problem in which a structure is continuously monitored in real time) is presently treated. During inspection, the structure is in a currently unknown (to be determined) state designated as *Σ*_{u}.

Suppose that during the inspection phase, behavioural data, i.e. force excitation *x*_{u}[*t*] and/or vibration response *y*_{u}[*t*] (*t*=1, 2, …, *N*) signals, are obtained from the structure in its current (unknown) state (note that *t* refers to discrete time, with the corresponding analogue time being (*t*−1)·*T*_{s} and *T*_{s} standing for the sampling period; the subscript ‘u’ designates the current/unknown state of the structure).

For convenience and simplicity of presentation, it is assumed that the excitation and response signals are *scalar* (the *univariate* case). The extension to the more general *multivariate* (vector signal) case requires the establishment of pertinent vector statistics and the use of corresponding models. Despite their phenomenal resemblance to their univariate counterparts, multivariate models generally have a much richer structure and give rise to parametrization, identifiability and other important questions (see Lütkepohl 1991), while also requiring multivariate statistical decision-making procedures. As a consequence, this extension is not necessarily a straightforward task (see Basseville & Nikiforov 1993; Gertler 1998).

Let the complete signal records obtained during inspection be designated as *x*_{u}^{N} and *y*_{u}^{N}, and let the excitation–response signals be collected into the vector **z**_{u}[*t*] ≜ [*x*_{u}[*t*] *y*_{u}[*t*]]^{T} (*t*=1, 2, …, *N*) (lower/upper case bold face symbols designate vector/matrix quantities, respectively; by convention, all vectors are column vectors), the complete record of which is designated as *Z*_{u}^{N}.

Note that all collected signals need to be suitably pre-processed (Doebling *et al*. 1998; Fassois 2001). This may include low-pass or band-pass *filtering* within the frequency range of interest, signal *subsampling* (in case the originally used sampling frequency is too high) as well as proper *scaling*. The latter is used not only for numerical reasons, but also for counteracting—to the extent possible—different operating (including excitation levels) and/or environmental conditions. In the common case of *linear* systems, scaling typically involves subtraction of each signal's sample (estimated) mean and normalization by its sample (estimated) standard deviation. In case of multiple excitations, care should be exercised in order to ensure minimal cross-correlation among them.
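The scaling step for the linear case may be sketched as follows; this is a minimal illustration assuming NumPy (the filtering and subsampling steps are omitted, and all names and values are illustrative rather than taken from the paper):

```python
import numpy as np

def scale_signal(x):
    """Subtract the sample mean and normalize by the sample standard
    deviation (the scaling described for the linear case)."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

# Two records of the same process acquired at different excitation levels
rng = np.random.default_rng(0)
y_low = 1.0 * rng.standard_normal(1000) + 0.5    # low excitation level
y_high = 5.0 * rng.standard_normal(1000) + 2.0   # high excitation level

# After scaling, both records are zero mean with unit standard deviation
ys_low, ys_high = scale_signal(y_low), scale_signal(y_high)
```

After this step, records obtained under different excitation levels become directly comparable.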

Given the data *Z*_{u}^{N} collected during the inspection phase, the characterization of the current state of the structure, which is the problem generally referred to as *fault detection and identification* (*fdi*), may be thought of as consisting of three conceptually distinct subproblems.

*Fault detection* is the binary decision-making subproblem pertaining to the presence or not of a fault in the current structure. The two available possibilities thus are either *Σ*_{u}=*Σ*_{o} (the current structure is healthy) or *Σ*_{u}≠*Σ*_{o} (the current structure is faulty).

*Fault identification* (also referred to as *fault localization* or *fault classification*) follows fault detection and is the multiple decision-making subproblem pertaining to the identification of the particular type of fault incurred. In our present notation, the available possibilities include fault type (mode) A (*Σ*_{A}) through fault type (mode) D (*Σ*_{D}).

*Fault estimation* is the subproblem concerning estimation of the exact fault magnitude (damage level).

## 3. The structure of time-series methods

### (a) The operational viewpoint

From an *operational viewpoint*, the tackling of the aforementioned subproblems via time-series methods requires (in addition to *Z*_{u}^{N}) the availability of similar data records from the nominal (healthy) and each considered faulty state of the structure. In other words, in present terms, **z**_{o}[*t*] (*t*=1, 2, …, *N*) corresponding to *Σ*_{o}, as well as **z**_{A}[*t*], …, **z**_{D}[*t*] (*t*=1, 2, …, *N*) corresponding to *Σ*_{A}, …, *Σ*_{D}, respectively (in reality, data records corresponding to various possible fault magnitudes within each fault type are required). Note that, for obvious reasons, this may not necessarily be possible on a real structure. In such a case, either data coming from similar (if available) structures under the given conditions, or more realistically, structural *models* (laboratory scale models or mathematical—like finite element—models) capable of providing reasonably accurate representations of the actual structural dynamics under various conditions may be used.

These datasets are all obtained and processed in an initial *baseline phase*. On the other hand, the current data acquisition, processing and decision making take place in a second operational phase that has already been referred to as the *inspection phase*.

### (b) The conceptual viewpoint

From a *conceptual viewpoint*, time-series methods include *analysis* and *statistical decision making*. (i) The analysis part includes signal or system characterization and parametric or non-parametric modelling. The aim is the extraction, from each dataset, of a characteristic quantity *Q* (a function of the corresponding data record), which is instrumental in the decision-making part. (ii) The statistical decision making is the part where decisions are made by ‘comparing’, via formal statistical hypothesis testing procedures, the current characteristic quantity *Q*_{u} to its counterparts *Q*_{o}, *Q*_{A}, …, *Q*_{D} pertaining to the various possible structural states (o, A, …, D, respectively).

*Fault detection* is then formulated as a binary composite hypothesis testing problem, which may be generally expressed as

*H*_{0}: *Q*_{u} ∼ *Q*_{o} (null hypothesis: healthy structure), *H*_{1}: *Q*_{u} ≁ *Q*_{o} (alternative hypothesis: faulty structure), (3.1)

with ∼ designating a proper relationship (such as equality, inequality and so on).

*Fault identification*, on the other hand, is formulated as a multiple hypothesis testing problem, which may be generally expressed as

*H*_{A}: *Q*_{u} ∼ *Q*_{A} (fault of type A), …, *H*_{D}: *Q*_{u} ∼ *Q*_{D} (fault of type D). (3.2)

Augmenting this formulation with the original null hypothesis *H*_{0} leads to the combined treatment of fault detection and identification.

*Fault estimation* is a generally more complicated issue. It requires proper formulation and the use of interval estimation techniques.

The design of a binary statistical hypothesis test (such as that of equation (3.1)) may be based upon the probabilities of *type I* and *type II* error occurrence. The first probability—designated as *α* and also referred to as the *type I risk*—is the probability of rejecting the null hypothesis *H*_{0} although it is true (*false alarm*). The second probability—designated as *β* and also referred to as the *type II risk*—is the probability of accepting the null hypothesis *H*_{0} although it is not true (*missed fault*). The designs presented in this paper are based upon a selected type I error occurrence probability (*α*) and use the probability density function of a relevant random quantity under the null (*H*_{0}) hypothesis of a healthy current structure. The computation of the type II risk depends upon a number of factors and is, in general, more complicated. In selecting *α*, it should be borne in mind that a decrease/increase in it results in a corresponding increase/decrease in *β*. The reader is referred to references such as Basseville & Nikiforov (1993, subsection 4.2) and Montgomery (1991, subsection 3.3) for details on statistical hypothesis testing.
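The α/β trade-off described above may be illustrated numerically; the following sketch, assuming SciPy and a hypothetical one-sided Gaussian test whose statistic has mean `mu1` under *H*_{1} (a made-up value), shows that lowering the type I risk raises the type II risk:

```python
from scipy import stats

# One-sided Gaussian test: reject H0 when the statistic exceeds z_{1-alpha}.
mu1 = 3.0                                   # assumed mean under H1 (illustrative)

alpha1 = 0.05
z1 = stats.norm.ppf(1 - alpha1)             # critical point (about 1.645)
beta1 = stats.norm.cdf(z1, loc=mu1)         # missed-fault probability

alpha2 = 0.01                               # decreasing the type I risk...
z2 = stats.norm.ppf(1 - alpha2)
beta2 = stats.norm.cdf(z2, loc=mu1)         # ...increases the type II risk
```

With these values, β grows from roughly 0.09 (α=0.05) to roughly 0.25 (α=0.01), in line with the stated inverse relationship.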

### (c) Types of time-series methods

The diagram of a ‘general’ time-series fdi method is depicted in figure 1. Note that the baseline phase is indicated by the dashed frame, while the inspection phase is in its exterior. In addition, note that although the characteristic quantity *Q* appears as a function of the response, it may or may not be a function of the excitation as well. Methods falling into the first category are referred to as *excitation*–*response* methods, while those falling into the second are referred to as *response-only* methods.

Depending upon the way the characteristic quantity *Q* is constructed, time-series methods may also be classified as parametric or non-parametric. *Parametric* are those methods in which the statistic is constructed through parametric time-series representations, such as the autoregressive moving average (ARMA) representation (see §4; also Fassois 2001). *Non-parametric* are those methods in which the statistic is constructed via non-parametric time-series representations, such as spectral models (see §4). A rough comparison of the main characteristics of non-parametric and parametric time-series methods is presented in table 1.

## 4. Overview of time-series representations

The objective of this section is to provide a concise overview of Gaussian time-series representations used in fault detection and identification. As already indicated, for the sake of simplicity, the *univariate* case (scalar excitation and response signals) is considered. Let *h*[*t*] designate the impulse response function describing the excitation–response causality relationship between an excitation and a response location on a *linear* time-invariant structure (figure 2). Let *n*[*t*] designate zero-mean stationary Gaussian corrupting noise that is of unknown autocovariance structure and mutually uncorrelated with the excitation *x*[*t*]. Then

*y*[*t*] = Σ_{τ=0}^{∞} *h*[*τ*]·*x*[*t*−*τ*] + *n*[*t*]. (4.1)

Using the backshift operator *B* (*B*^{i}·*x*[*t*] ≜ *x*[*t*−*i*]), the above may be rewritten as

*y*[*t*] = *H*(*B*)·*x*[*t*] + *n*[*t*], (4.2)

with *H*(*B*) designating the structure's discrete-time transfer function.

Assuming *x*[*t*] to be a random stationary excitation, *y*[*t*] will also be stationary in the steady state. In addition, *y*[*t*] will be Gaussian if *x*[*t*] and *n*[*t*] are jointly Gaussian. In such a case, each signal is fully characterized by its first two moments (mean and autocovariance). For *y*[*t*], these are

*μ*_{y} ≜ *E*{*y*[*t*]}, *γ*_{yy}[*τ*] ≜ *E*{(*y*[*t*]−*μ*_{y})·(*y*[*t*+*τ*]−*μ*_{y})}, *S*_{yy}(*ω*) ≜ Σ_{τ=−∞}^{∞} *γ*_{yy}[*τ*]·e^{−*jωτT*_{s}}, (4.3)

with *E*{.} designating statistical expectation; *j* is the imaginary unit; *τ* is the time lag; *ω* is the frequency in rad s^{−1}; and *T*_{s} is the sampling period. Note that *γ*_{yy}[0] is the variance of the response *y*[*t*], and *S*_{yy}(*ω*) its auto-spectral density defined as the Fourier transform of the autocovariance (Kay 1988, p. 3; Box *et al*. 1994, pp. 39–40). These quantities may be related to those of *x*[*t*] and *n*[*t*] as follows:

*S*_{xy}(*jω*) = *H*(*jω*)·*S*_{xx}(*ω*), *S*_{yy}(*ω*) = |*H*(*jω*)|^{2}·*S*_{xx}(*ω*) + *S*_{nn}(*ω*), (4.4)

where *H*(*jω*) is the structure's frequency response function (frf) obtained by setting *B* = e^{−*jωT*_{s}} in the second of equations (4.2); * designates complex conjugate; |.| is the complex magnitude; and *S*_{nn}(*ω*) is the noise auto-spectral density, while the (complex) cross-spectral density *S*_{xy}(*jω*) is defined as the Fourier transform of the cross-covariance function *γ*_{xy}[*τ*] ≜ *E*{(*x*[*t*]−*μ*_{x})·(*y*[*t*+*τ*]−*μ*_{y})}. Note that, by convention, complex functions are designated by incorporating the imaginary unit in their argument.
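The spectral input–output relationship *S*_{xy} = *H*·*S*_{xx} can be checked numerically; the sketch below, assuming SciPy and a hypothetical two-tap FIR structure *y*[*t*] = *x*[*t*] + 0.5·*x*[*t*−1], recovers the frf magnitude from Welch spectral estimates and compares it with the exact filter frf:

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(3)
fs = 100.0
b = [1.0, 0.5]                       # illustrative impulse response
x = rng.standard_normal(200_000)     # stationary white excitation
y = signal.lfilter(b, [1.0], x)      # noise-free response

# Welch estimates of the cross- and auto-spectral densities
f, Sxy = signal.csd(x, y, fs=fs, nperseg=256, noverlap=0)
_, Sxx = signal.welch(x, fs=fs, nperseg=256, noverlap=0)
H_est = np.abs(Sxy) / Sxx            # |H(jw)| from S_xy / S_xx

# Exact frf of the FIR filter at the same frequencies
_, H_true = signal.freqz(b, 1.0, worN=f, fs=fs)
err = np.abs(H_est - np.abs(H_true))
```

The estimated and exact frf magnitudes agree closely at interior frequencies, illustrating the relation in the noise-free case.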

### (a) Response-only representations

In this case, the properties of the excitation *x*[*t*] (and also of the noise *n*[*t*], if it is present) are assumed known. Typically, both signals are zero mean, serially and mutually uncorrelated. Without loss of generality, it is presently assumed that *x*[*t*]≡*w*[*t*] (zero-mean uncorrelated (white) sequence) and *n*[*t*]≡0.

*Non-parametric representations*. The response may be non-parametrically described via its mean *μ*_{y} and autocovariance *γ*_{yy}[*τ*] or the auto-spectral density *S*_{yy}(*ω*) = σ_{w}^{2}·|*H*(*jω*)|^{2}, with σ_{w}^{2} designating the excitation's variance (obtained from the rightmost of equations (4.4) with *S*_{xx}(*ω*) = σ_{w}^{2} and *S*_{nn}(*ω*) = 0). Time-frequency representations or polyspectra or wavelet analysis (Staszewski 1998; Peng & Chu 2004) may be used in the non-stationary or nonlinear cases.

*Parametric representations*. The parametrization of equation (4.1) leads to an ARMA representation of the form (Box *et al*. 1994, pp. 52–53)

*y*[*t*] + Σ_{i=1}^{na} *a*_{i}·*y*[*t*−*i*] = *w*[*t*] + Σ_{i=1}^{nc} *c*_{i}·*w*[*t*−*i*], i.e. *A*(*B*)·*y*[*t*] = *C*(*B*)·*w*[*t*], *w*[*t*] ~ iid *N*(0, σ_{w}^{2}), (4.5)

with *a*_{i}, *c*_{i}, *A*(*B*) and *C*(*B*) designating the AR and MA parameters and polynomials, respectively. iid stands for identically independently distributed, while *N*(., .) designates normal distribution with the indicated mean and variance. *n*_{a} and *n*_{c} are the model's AR and MA orders, respectively. The model parameter vector is ***θ*** ≜ [*a*_{1} … *a*_{na} *c*_{1} … *c*_{nc} σ_{w}^{2}]^{T}. It should be noted that the excitation *w*[*t*] may be shown (Box *et al*. 1994, p. 134; Ljung 1999, p. 70) to coincide with the model-based *one-step-ahead prediction error* and is also referred to as the *residual* or *innovations*. Other parametrizations are generally possible (especially in the multivariate case—see Söderström & Stoica 1989). In the scalar case, the model poles (eigenvalues) may be alternatively used (this is commonly done through the use of modal models; also see the remark in §6*a* and §6*b*(iv)).
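The ARMA representation and its one-step-ahead prediction errors may be illustrated as follows; this is a sketch assuming NumPy, with illustrative (not paper-derived) ARMA(2,1) parameter values, showing that inverting the true model recovers the innovations exactly:

```python
import numpy as np

rng = np.random.default_rng(1)
a1, a2 = -1.5, 0.7        # AR parameters (poles inside the unit circle)
c1 = 0.4                  # MA parameter
N = 500
w = rng.standard_normal(N)           # iid N(0, 1) innovations

# Simulate A(B) y[t] = C(B) w[t]:
# y[t] = -a1*y[t-1] - a2*y[t-2] + w[t] + c1*w[t-1]
y = np.zeros(N)
for t in range(N):
    y[t] = w[t]
    if t >= 1:
        y[t] += -a1 * y[t - 1] + c1 * w[t - 1]
    if t >= 2:
        y[t] += -a2 * y[t - 2]

# Invert the model to obtain the residuals (one-step-ahead prediction errors):
# e[t] = y[t] + a1*y[t-1] + a2*y[t-2] - c1*e[t-1]
e = np.zeros(N)
for t in range(N):
    e[t] = y[t]
    if t >= 1:
        e[t] += a1 * y[t - 1] - c1 * e[t - 1]
    if t >= 2:
        e[t] += a2 * y[t - 2]
```

With the true model parameters and zero initial conditions, the recovered residuals `e` coincide with the innovations `w`, as stated above.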

Alternatively, equation (4.1) may be set into state space form (Söderström & Stoica 1989, p. 157; Box *et al*. 1994, pp. 163–164)

**ψ**[*t*+1] = **A**·**ψ**[*t*] + **v**_{1}[*t*], *y*[*t*] = **C**·**ψ**[*t*] + *v*_{2}[*t*], (4.6)

with **ψ**[*t*] designating the model's state vector and **v**[*t*] ≜ [**v**_{1}^{T}[*t*] *v*_{2}[*t*]]^{T} a zero-mean uncorrelated vector sequence with covariance **Σ**_{v}. The model parameter vector may be selected as ***θ*** ≜ [vec(**A**)^{T} vec(**C**)^{T} vec(**Σ**_{v})^{T}]^{T}, with vec(.) designating the column vector produced by stacking the columns of the indicated matrix on top of each other (the previous comment on different parametrizations applies; also see §6*b*(iv) for a selection that is invariant under changes in the state space basis).

Time-varying versions of the above (characterized by time-dependent parameters) may be used in the non-stationary case (Petsounis & Fassois 2000; Poulimenos & Fassois 2006), while various nonlinear models, like nonlinear ARMA (NARMA) models, may be used in the nonlinear case (Leontaritis & Billings 1985; Sakellariou & Fassois 2002).

### (b) Excitation–response representations

In this case, *n*[*t*] (figure 2) is assumed to be stationary, zero mean but of unknown autocovariance structure, and mutually Gaussian and uncorrelated with the excitation *x*[*t*].

*Non-parametric representations*. A complete non-parametric representation includes the mean values *μ*_{x} and *μ*_{y} and the auto- and cross-covariances *γ*_{xx}[*τ*], *γ*_{yy}[*τ*] and *γ*_{xy}[*τ*] (equivalently the auto- and cross-spectral densities *S*_{xx}(*ω*), *S*_{yy}(*ω*) and *S*_{xy}(*jω*)). These quantities are interrelated through the relationships of equations (4.4). In addition, the (squared) coherence function is defined as (Bendat & Piersol 2000, p. 196)

*γ*^{2}(*ω*) ≜ |*S*_{xy}(*jω*)|^{2}/(*S*_{xx}(*ω*)·*S*_{yy}(*ω*)) = |*H*(*jω*)|^{2}·*S*_{xx}(*ω*)/(|*H*(*jω*)|^{2}·*S*_{xx}(*ω*) + *S*_{nn}(*ω*)), (4.7)

with *S*_{nn}(*ω*) designating the noise auto-spectral density. Time-frequency representations or polyspectra or wavelet analysis (Peng & Chu 2004) may be used in the non-stationary or nonlinear cases.

*Parametric representations*. The parametrization of equation (4.1) may lead to an AutoRegressive Moving Average with eXogenous excitation (ARMAX) representation of the form (Söderström & Stoica 1989, p. 149; Ljung 1999, p. 83; Fassois 2001)

*A*(*B*)·*y*[*t*] = *B*(*B*)·*x*[*t*] + *C*(*B*)·*w*[*t*], *w*[*t*] ~ iid *N*(0, σ_{w}^{2}), (4.8)

with parameter vector ***θ*** ≜ [*a*_{1} … *a*_{na} *b*_{0} … *b*_{nb} *c*_{1} … *c*_{nc} σ_{w}^{2}]^{T}. *A*(*B*), *B*(*B*) and *C*(*B*) are the AR, X and MA polynomials, respectively, and *na*, *nb* and *nc* the corresponding orders. Like before, *w*[*t*] is a zero-mean uncorrelated (white) signal, which coincides with the model-based *one-step-ahead prediction error* and is also referred to as the *residual* or *innovations*.

Alternatively, it may lead to the Box–Jenkins representation (Söderström & Stoica 1989, pp. 148–154; Ljung 1999, p. 87; Fassois 2001)

*y*[*t*] = [*B*(*B*)/*F*(*B*)]·*x*[*t*] + [*C*(*B*)/*D*(*B*)]·*w*[*t*], *w*[*t*] ~ iid *N*(0, σ_{w}^{2}), (4.9)

with parameter vector ***θ*** ≜ [*b*_{0} … *b*_{nb} *f*_{1} … *f*_{nf} *c*_{1} … *c*_{nc} *d*_{1} … *d*_{nd} σ_{w}^{2}]^{T}.

On the other hand, the output error (OE) representation models only the excitation–response dynamics (Ljung 1999, p. 85; Fassois 2001)

*y*[*t*] = [*B*(*B*)/*F*(*B*)]·*x*[*t*] + *e*[*t*], (4.10)

with *e*[*t*] designating the (not necessarily white) error signal and parameter vector ***θ*** ≜ [*b*_{0} … *b*_{nb} *f*_{1} … *f*_{nf}]^{T}.

A state space representation may also be used. This may be of the form of equation (4.6), but augmented by terms that include the excitation *x*[*t*] (see Ljung 1999, pp. 97–101 for additional versions)

**ψ**[*t*+1] = **A**·**ψ**[*t*] + **b**·*x*[*t*] + **v**_{1}[*t*], *y*[*t*] = **C**·**ψ**[*t*] + *d*·*x*[*t*] + *v*_{2}[*t*], (4.11)

with **ψ**[*t*] designating the model's state vector and **v**[*t*] ≜ [**v**_{1}^{T}[*t*] *v*_{2}[*t*]]^{T} a zero-mean uncorrelated vector sequence with covariance **Σ**_{v}. The model parameter vector may be selected as ***θ*** ≜ [vec(**A**)^{T} **b**^{T} vec(**C**)^{T} *d* vec(**Σ**_{v})^{T}]^{T} (the comment of the previous subsection on different parametrizations applies).

Time-varying versions of the above may be used in the non-stationary case (Poulimenos & Fassois 2004), while a variety of nonlinear models, like nonlinear ARMAX (NARMAX) models (Leontaritis & Billings 1985; Chen & Billings 1989; Sakellariou & Fassois 2002; also see the case study of §7*c*), neural network models (Masri *et al*. 2000) or frequency-domain ARX models (Adams 2002) may be used in the nonlinear case.

## 5. Non-parametric methods

Non-parametric methods are those in which the characteristic quantity *Q* is constructed based upon non-parametric time-series representations (see §4). The generic structure of these methods follows that of the general time-series fdi method (figure 1).

Three non-parametric methods are briefly presented in the following: a spectral density function-based method (Sakellariou *et al*. 2001), a frf-based method (Rizos *et al*. 2001) and a coherence measure-based method (Rizos *et al*. 2002). For additional non-parametric methods, including time-frequency- and wavelet analysis-based methods, the interested reader is referred to Farrar & Doebling (1997), Staszewski (1998), Worden *et al*. (2000), Worden & Manson (2003), Peng & Chu (2004) and references therein.

### (a) Spectral density function-based method

This method attempts fault detection, identification and magnitude estimation via characteristic changes in the auto-spectral density function (defined via the third of equations (4.3)) of the measured vibration *y*[*t*] when the excitation *x*[*t*] is not available (response-only case). The method's characteristic quantity thus is *Q*=*S*_{yy}(*ω*)=*S*(*ω*). Let the vibration response data record be segmented into *K* non-overlapping segments (each of length *L*). Then the Welch auto-spectral density estimator (sample spectrum) is defined as (Kay 1988, p. 76)

*Ŝ*(*ω*) ≜ (1/*K*)·Σ_{i=1}^{K} (1/*L*)·|Σ_{t=1}^{L} *a*[*t*]·*y*^{(i)}[*t*]·e^{−*jωtT*_{s}}|^{2}, (5.1)

with the superscript *i* designating the *i*th segment and *a*[*t*] a proper data window (note that estimators/estimates are designated by a hat). This estimator, multiplied by 2*K* and normalized by the true auto-spectral density *S*(*ω*), may be shown to follow a (central) chi-square distribution with 2*K* degrees of freedom (see appendix A for the distribution's definition), or explicitly (Kay 1988, p. 76; Bendat & Piersol 2000, p. 309)

2*K*·*Ŝ*(*ω*)/*S*(*ω*) ~ *χ*^{2}(2*K*). (5.2)

*Fault detection*. Fault detection is based upon confirmation of statistically significant deviations (from the nominal/healthy) in the current structure's auto-spectral density function at one or more frequencies through the hypothesis testing problem (for each *ω*)

*H*_{0}: *S*_{u}(*ω*) = *S*_{o}(*ω*) (healthy structure), *H*_{1}: *S*_{u}(*ω*) ≠ *S*_{o}(*ω*) (faulty structure). (5.3)

Towards this end, owing to equation (5.2), the following quantity follows (for each frequency *ω*) *F* distribution with (2*K*, 2*K*) degrees of freedom (ratio of normalized chi-square distributions; see appendix A):

*F* ≜ (*Ŝ*_{o}(*ω*)/*S*_{o}(*ω*))/(*Ŝ*_{u}(*ω*)/*S*_{u}(*ω*)) ~ *F*(2*K*, 2*K*),

with *Ŝ*_{o}(*ω*) and *Ŝ*_{u}(*ω*) designating the sample (that is estimated) auto-spectral density for the healthy and current structure, respectively (the first obtained in the baseline phase and the second in the inspection phase). Under the null (*H*_{0}) hypothesis (current structure is healthy), the true auto-spectral densities coincide (*S*_{o}(*ω*)=*S*_{u}(*ω*)) and the above quantity becomes

*F* = *Ŝ*_{o}(*ω*)/*Ŝ*_{u}(*ω*) ~ *F*(2*K*, 2*K*). (5.4)

Equality of the true auto-spectral densities *S*_{o}(*ω*) and *S*_{u}(*ω*) is then examined at the *α* risk level (type I error probability of *α*) via the statistical test

*f*_{α/2} ≤ *F* ≤ *f*_{1−(α/2)} ⟹ *H*_{0} is accepted (healthy structure), otherwise *H*_{1} is accepted (faulty structure), (5.5)

with *f*_{α/2}, *f*_{1−(α/2)} designating the *F* distribution's *α*/2 and 1−(*α*/2) critical points (*f*_{α} is defined such that Prob(*F* ≤ *f*_{α}) = *α*; figure 3).
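The test above may be sketched as follows, assuming SciPy; the signals, segment counts and the simulated 'fault' (a change in the response dynamics) are all illustrative:

```python
import numpy as np
from scipy import signal, stats

def spectral_f_test(y_o, y_u, K, L, fs, alpha=0.05):
    """Pointwise F(2K, 2K) test on Welch auto-spectral estimates from
    K non-overlapping segments of length L; True marks frequencies
    where the null (healthy) hypothesis is rejected."""
    _, S_o = signal.welch(y_o[:K * L], fs=fs, nperseg=L, noverlap=0)
    _, S_u = signal.welch(y_u[:K * L], fs=fs, nperseg=L, noverlap=0)
    F = S_o / S_u                                    # estimated spectral ratio
    f_lo = stats.f.ppf(alpha / 2, 2 * K, 2 * K)      # lower critical point
    f_hi = stats.f.ppf(1 - alpha / 2, 2 * K, 2 * K)  # upper critical point
    return (F < f_lo) | (F > f_hi)

rng = np.random.default_rng(4)
K, L, fs = 20, 256, 100.0
w = rng.standard_normal(3 * K * L)
healthy_1 = w[:K * L]                       # healthy baseline record
healthy_2 = w[K * L:2 * K * L]              # second healthy record
faulty = signal.lfilter([1.0], [1.0, -0.8], w[2 * K * L:])  # changed dynamics

false_alarms = spectral_f_test(healthy_1, healthy_2, K, L, fs)
detections = spectral_f_test(healthy_1, faulty, K, L, fs)
```

For two healthy records, only a small fraction of frequencies (of the order of α) is flagged, while the altered dynamics are flagged over a large part of the frequency band.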

*Fault identification*. Fault identification may, in principle, be achieved by performing hypothesis tests similar to the above separately for faults from each potential fault mode.

*Fault estimation*. Fault estimation may be achieved by possibly associating specific quantitative changes in the auto-spectral density (at one or more frequencies) with specific fault magnitudes.

Note that signal scaling is particularly important in order to properly account for different excitation levels, while the environmental conditions should be kept constant.

### (b) Frequency response function (frf)-based method

This method is similar to the previous one, except that it requires the availability of both the excitation and response signals and uses the frf magnitude as its characteristic quantity (*Q*=|*H*(*jω*)|) (see equations (4.4)). Yet, it may also be used when the excitation is unavailable but more than one response is available (Sakellariou & Fassois 2000).

The frf magnitude is estimated via Welch estimates (*K* non-overlapping data segments, each of length *L*; see previous subsection) of the auto- and cross-spectral density functions *S*_{xx}(*ω*) and *S*_{xy}(*jω*), respectively, using equations (4.4)

|*Ĥ*(*jω*)| = |*Ŝ*_{xy}(*jω*)|/*Ŝ*_{xx}(*ω*). (5.6)

This estimator may be shown to follow a distribution approximated as Gaussian (see Bendat & Piersol 2000, p. 338; alternatively see Koopmans 1974, p. 284 for a discussion that is based upon the *F* distribution)

|*Ĥ*(*jω*)| ~ *N*(|*H*(*jω*)|, *σ*_{H}^{2}(*ω*)), *σ*_{H}^{2}(*ω*) ≜ [1−*γ*^{2}(*ω*)]·|*H*(*jω*)|^{2}/(2*K*·*γ*^{2}(*ω*)), (5.7)

with the mean coinciding with the true frf magnitude and the indicated variance (*γ*^{2}(*ω*) designates the coherence function; see equation (4.7)).

*Fault detection*. Fault detection is based upon confirmation of statistically significant deviations (from the nominal/healthy) in the current structure's frf at one or more frequencies through the hypothesis testing problem (for each *ω*)

*H*_{0}: |*H*_{u}(*jω*)| = |*H*_{o}(*jω*)| (healthy structure), *H*_{1}: |*H*_{u}(*jω*)| ≠ |*H*_{o}(*jω*)| (faulty structure). (5.8)

Towards this end, the difference *δ*|*Ĥ*(*jω*)| ≜ |*Ĥ*_{o}(*jω*)| − |*Ĥ*_{u}(*jω*)| of the frf magnitude estimates in the nominal and current states is considered (|*Ĥ*_{o}(*jω*)| is obtained in the baseline phase and |*Ĥ*_{u}(*jω*)| in the inspection phase) which, due to equation (5.7) and mutual independence of |*Ĥ*_{o}(*jω*)|, |*Ĥ*_{u}(*jω*)| (owing to the fact that the estimates corresponding to the nominal and current states are based upon distinct data records), follows Gaussian distribution with mean *δ*|*H*(*jω*)| (the true difference) and variance *σ*_{o}^{2}(*ω*)+*σ*_{u}^{2}(*ω*), i.e.

*δ*|*Ĥ*(*jω*)| ~ *N*(*δ*|*H*(*jω*)|, *σ*_{o}^{2}(*ω*)+*σ*_{u}^{2}(*ω*)). (5.9)

Under the null (*H*_{0}) hypothesis (current structure is healthy), the true frf magnitudes coincide (|*H*_{o}(*jω*)| = |*H*_{u}(*jω*)|) and so do the respective estimator variances (*σ*_{o}^{2}(*ω*) = *σ*_{u}^{2}(*ω*)). Hence, *δ*|*Ĥ*(*jω*)| follows the Gaussian distribution

*δ*|*Ĥ*(*jω*)| ~ *N*(0, 2*σ*_{o}^{2}(*ω*)). (5.10)

The variance *σ*_{o}^{2}(*ω*) is generally unknown, but may be estimated in the baseline phase through equation (5.7). Treating this estimate as a fixed quantity, i.e. a quantity characterized by negligible variability (which is reasonable for estimation based upon a large number of samples), the equality of |*H*_{o}(*jω*)| and |*H*_{u}(*jω*)| is examined at the *α* (type I) risk level through the statistical test (figure 4)

|*δ*|*Ĥ*(*jω*)||/(*σ*_{o}(*ω*)·√2) ≤ *Z*_{1−(α/2)} ⟹ *H*_{0} is accepted (healthy structure), otherwise *H*_{1} is accepted (faulty structure), (5.11)

with *Z*_{1−(α/2)} designating the standard normal distribution's 1−(*α*/2) critical point (typical values are *α*=0.05, *Z*_{1−(α/2)}=1.96). Alternatively (treating the estimated variance as a random variable), the *t* distribution may be used (see appendix A; also §6*c*; note that the *t* distribution approaches normality for long data records).
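This frf-magnitude test may be sketched as follows, assuming SciPy; the simulated FIR 'structure', the noise level and the 'fault' (a gain change) are all illustrative, and the estimator variance is obtained from the baseline coherence and treated as fixed:

```python
import numpy as np
from scipy import signal, stats

def frf_magnitude_test(x_o, y_o, x_u, y_u, K, L, fs, alpha=0.05):
    """Pointwise Gaussian test on the difference of frf magnitude
    estimates; True marks frequencies where the null is rejected."""
    def frf_and_var(x, y):
        _, Sxy = signal.csd(x[:K*L], y[:K*L], fs=fs, nperseg=L, noverlap=0)
        _, Sxx = signal.welch(x[:K*L], fs=fs, nperseg=L, noverlap=0)
        _, coh = signal.coherence(x[:K*L], y[:K*L], fs=fs, nperseg=L, noverlap=0)
        H = np.abs(Sxy) / Sxx                         # frf magnitude estimate
        var = (1.0 - coh) * H**2 / (2 * K * coh)      # estimator variance
        return H, var
    H_o, var_o = frf_and_var(x_o, y_o)
    H_u, _ = frf_and_var(x_u, y_u)
    Z = (H_o - H_u) / np.sqrt(2.0 * var_o)            # standardized difference
    return np.abs(Z) > stats.norm.ppf(1 - alpha / 2)

def simulate(x, b, noise, rng):
    # FIR response plus measurement noise n[t]
    return signal.lfilter(b, [1.0], x) + noise * rng.standard_normal(len(x))

rng = np.random.default_rng(5)
K, L, fs = 40, 256, 100.0
N = K * L
x1, x2, x3 = (rng.standard_normal(N) for _ in range(3))
y1 = simulate(x1, [1.0, 0.5], 0.3, rng)      # healthy baseline
y2 = simulate(x2, [1.0, 0.5], 0.3, rng)      # healthy current
y3 = simulate(x3, [1.5, 0.75], 0.3, rng)     # 'fault': 50% gain change

false_alarms = frf_magnitude_test(x1, y1, x2, y2, K, L, fs)
detections = frf_magnitude_test(x1, y1, x3, y3, K, L, fs)
```

For two healthy records, only a small fraction of frequencies is flagged, while the gain change is detected at nearly all frequencies.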

*Fault identification*. Fault identification may, in principle, be achieved by performing hypothesis tests similar to the above separately for each potential fault from each fault mode.

*Fault estimation*. Fault estimation may be achieved by possibly associating specific quantitative changes in the frf magnitude (at one or more frequencies) with specific fault magnitudes.

### (c) Coherence measure-based method

This method is founded upon the heuristic premise that under constant experimental and environmental conditions, the overall coherence (coherence over the complete frequency range; see equation (4.7) for the definition of coherence) decreases with fault occurrence. This is due to the fact that nonlinear effects are introduced, or strengthened, with fault occurrence. Like the previous method, this also pertains to the excitation–response case.

The coherence estimator *γ̂*^{2}(*ω*) is obtained by replacing the auto- and cross-spectral densities in the first of equations (4.7) by their respective Welch estimates (obtained via *K* non-overlapping data segments). The mean and variance of this estimator may be shown to be (for large *K*; see Carter *et al*. 1983; Bendat & Piersol 2000, pp. 333–335)

*E*{*γ̂*^{2}(*ω*)} ≈ *γ*^{2}(*ω*) + [1−*γ*^{2}(*ω*)]^{2}/*K*, var{*γ̂*^{2}(*ω*)} ≈ 2·*γ*^{2}(*ω*)·[1−*γ*^{2}(*ω*)]^{2}/*K*, (5.12)

with *γ*^{2}(*ω*) designating the true coherence value.

Next, consider a frequency discretization *ω*_{i} (*i*=1, 2, …, *n*) (frequency resolution *δω*) and define the *coherence measure*

*Γ* ≜ Σ_{i=1}^{n} *γ*^{2}(*ω*_{i}), (5.13)

which consists of the sum of individual coherence values and constitutes the method's characteristic quantity (*Q*). For a large number of discrete frequencies *n* (*n*→∞), the coherence measure estimator *Γ̂* approximately follows, thanks to the central limit theorem (see appendix A), Gaussian distribution, i.e.

*Γ̂* ~ *N*(*Γ*, *σ*_{Γ}^{2}), *σ*_{Γ}^{2} ≜ Σ_{i=1}^{n} var{*γ̂*^{2}(*ω*_{i})}, (5.14)

with *Γ* designating the measure's true value. Note that in obtaining this expression, the bias terms (second term on the right-hand side of the leftmost of equations (5.12)) are neglected (which is justified for either large *K* or for true coherence being close to unity), while the variance of *Γ̂* is equal to the sum of the individual variances due to the fact that the coherence estimates *γ̂*^{2}(*ω*_{i}) (*i*=1, 2, …, *n*) are mutually independent random variables (Brillinger 1981, p. 204).

*Fault detection*. Fault detection is based upon confirmation of a statistically significant reduction in the coherence measure *Γ*_{u} of the current structure (compared to *Γ*_{o} of the healthy structure) through the statistical hypothesis testing problem

*H*_{0}: *δΓ* ≤ 0 (healthy structure), *H*_{1}: *δΓ* > 0 (faulty structure), with *δΓ* ≜ *Γ*_{o} − *Γ*_{u}. (5.15)

Towards this end, the difference of the coherence measure estimates in the nominal and current states of the structure is considered

*δΓ̂* ≜ *Γ̂*_{o} − *Γ̂*_{u} ~ *N*(*δΓ*, *σ*_{Γo}^{2} + *σ*_{Γu}^{2}), (5.16)

which, due to equation (5.14), follows normal distribution with the indicated mean (true *Γ* difference) and variance. Note that *Γ̂*_{o} is obtained in the baseline phase and *Γ̂*_{u} in the inspection phase, and the two are mutually independent.

Under the null (*H*_{0}) hypothesis (current structure is healthy), *Γ*_{u} = *Γ*_{o} and *σ*_{Γu}^{2} = *σ*_{Γo}^{2}, thus

*δΓ̂* ~ *N*(0, 2·*σ*_{Γo}^{2}). (5.17)

The variance *σ*_{Γo}^{2} is unknown, but may be estimated in the baseline phase based upon equations (5.12) and (5.14) by replacing the true coherence by its estimate. The presence or not of a statistically significant reduction in the coherence measure *Γ*_{u} of the current structure (compared to *Γ*_{o} of the healthy) may then be examined through a proper statistical test. Since the null hypothesis is composite, the establishment of such a test at any selected (type I) risk level *α* requires knowledge of the unavailable mean *δΓ*≤0. To overcome this difficulty, a conservative procedure is adopted by allowing for a risk level *α* in the worst possible (under *H*_{0}) case of *δΓ*=0. By treating the estimated variance as a fixed quantity (quantity with negligible variability), this leads to the following test, characterized by a maximum type I (false alarm) risk of *α*:

*δΓ̂*/(*σ̂*_{Γo}·√2) ≤ *Z*_{1−α} ⟹ *H*_{0} is accepted (healthy structure), otherwise *H*_{1} is accepted (faulty structure), (5.18)

with *Z*_{1−α} designating the standard normal distribution's 1−*α* critical point (typical values are *α*=0.05, *Z*_{1−α}=1.645).
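The coherence-measure test may be sketched as follows, assuming SciPy; the healthy structure is simulated as a linear system plus noise and the 'fault' as an added quadratic nonlinearity that lowers the coherence (all systems and values are illustrative):

```python
import numpy as np
from scipy import signal, stats

def coherence_measure(x, y, K, L, fs):
    """Coherence measure (sum of coherence values) and its estimated
    variance from Welch coherence estimates over K segments."""
    _, coh = signal.coherence(x[:K*L], y[:K*L], fs=fs, nperseg=L, noverlap=0)
    G = coh.sum()
    var_G = (2.0 * coh * (1.0 - coh)**2 / K).sum()
    return G, var_G

def coherence_measure_test(x_o, y_o, x_u, y_u, K, L, fs, alpha=0.05):
    """One-sided Gaussian test for a reduction in the coherence measure;
    returns True when a fault is declared."""
    G_o, var_o = coherence_measure(x_o, y_o, K, L, fs)
    G_u, _ = coherence_measure(x_u, y_u, K, L, fs)
    Z = (G_o - G_u) / np.sqrt(2.0 * var_o)
    return bool(Z > stats.norm.ppf(1 - alpha))

rng = np.random.default_rng(6)
K, L, fs = 40, 256, 100.0
N = K * L
x_o, x_u = rng.standard_normal(N), rng.standard_normal(N)
lin_o = signal.lfilter([1.0, 0.5], [1.0], x_o)
lin_u = signal.lfilter([1.0, 0.5], [1.0], x_u)
y_o = lin_o + 0.3 * rng.standard_normal(N)                    # healthy
y_u = lin_u + 0.5 * lin_u**2 + 0.3 * rng.standard_normal(N)   # nonlinear 'fault'

detected = coherence_measure_test(x_o, y_o, x_u, y_u, K, L, fs)
```

The quadratic term behaves as additional uncorrelated 'noise', lowering the coherence across the band, so the reduction in the coherence measure is readily detected.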

*Fault identification*. The method is not appropriate for fault identification because different types of faults, or faults of the same type but of different magnitudes, may cause the same reduction in the coherence measure.

*Fault estimation*. Assuming that only one type of fault is possible, fault estimation may be achieved by possibly associating specific quantitative reductions in the coherence measure with specific fault magnitudes.

## 6. Parametric methods

Parametric methods are those in which the characteristic quantity *Q* is constructed based upon parametric time-series representations (see §4; also Fassois 2001). They are applicable to both the response-only and excitation–response cases, as each situation may be dealt with through the use of proper representations (also see Basseville & Nikiforov 1993; Natke & Cempel 1997; Gertler 1998). Parametric methods may be classified into the following categories.

- *Model parameter-based methods*. These attempt fault detection and identification using a characteristic quantity *Q* which is a function of the parameter vector **θ**, *Q*=*f*(**θ**), of a parametric time-series representation (model). In these methods, the model has to be re-estimated during the inspection phase based upon the current signal(s).
- *Residual-based methods*. These attempt fault detection and identification using characteristic quantities, *Q*, that are functions of model residuals generated by driving the current signal(s) through pre-determined representations (models) corresponding to the various states of the structure (healthy structure and structure under specific magnitude faults of types A, …, D). In this case, *Q*=*f*(*e*_{Xu}[*t*]), with *e*_{Xu}[*t*] designating the residual generated by driving **z**_{u}[*t*] through the model corresponding to the *X* structural state (*X*=o, A, …, D). No model re-estimation is needed in the inspection phase.
- *The functional model-based method*. Conceptually, this may be thought of as a sort of combined method that uses characteristic quantities constructed based upon a model parameter *k* representing the fault magnitude on one hand, and model residuals *e*_{Xu}[*t*] on the other. A prime advantage of the method is that it allows for effective identification of the fault type in the important case that the continuum of possible fault magnitudes is considered. In addition, it allows for accurate fault magnitude estimation. It is noted that model estimation is required in the inspection phase.

The three families of parametric methods are briefly presented in the following.

### (a) Model parameter-based methods

This category of methods attempts fault detection and identification via changes in the parameter vector **θ** of a suitable parametric representation (see Isermann 1993; Natke & Cempel 1997, which is more focused on structures; Gertler 1998). In the typical case, *Q*=**θ** is the methods' characteristic quantity (sometimes a transformed version of the parameter vector may be used).

Let a proper (say maximum likelihood; see Söderström & Stoica 1989, pp. 198–199; Ljung 1999, pp. 212–213; Fassois 2001) estimator of the parameter vector **θ**, based upon excitation and/or response measurements, be available. The particular model structure employed (see §4) depends upon the specific problem at hand. For sufficiently long signals (*N* large), the estimator is (under mild assumptions) Gaussian distributed with mean equal to its true value **θ** and covariance **P**_{θ} (6.1) (see Söderström & Stoica 1989, pp. 205–207; Ljung 1999, p. 303 on issues relating to the numerical computation/estimation of **P**_{θ}).

*Fault detection*. Fault detection is based upon testing for statistically significant changes in the parameter vector **θ** between the nominal and current structures through the hypothesis testing problem (6.2).

Towards this end, observe that the difference of the parameter vector estimators corresponding to the nominal and current states of the structure (the first obtained in the baseline phase and the second in the inspection phase) follows Gaussian distribution (6.3), with **P**_{o} and **P**_{u} designating the estimator covariance matrices for the nominal and current structure, respectively.

Under the null (*H*_{0}) hypothesis, **θ**_{u}=**θ**_{o}, and the quantity, below, follows chi-square distribution with *d* (parameter vector dimensionality) degrees of freedom (as it may be shown to be the sum of squares of independent standardized Gaussian variables; see appendix A; also Ljung 1999, p. 558) (6.4), with *δ***P**=2**P**_{o}. Since the covariance **P**_{o} corresponding to the healthy structure is unavailable, its sample (estimated) version is used in practice. Treating this sample covariance as a deterministic quantity, i.e. a quantity characterized by negligible variability (which is reasonable for large *N*), leads to the following test constructed at the *α* (type I) risk level (figure 5; also see Ljung 1999, p. 559 for an alternative approach based upon the *F* distribution) (6.5), with the chi-square distribution's 1−*α* critical point (with *d* degrees of freedom) serving as the threshold.
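The chi-square test on the parameter vector difference (equations (6.3)–(6.5)) may be sketched as follows; the function names are illustrative, and the estimated covariance **P**_{o} is treated as fixed, as in the text.

```python
import numpy as np
from scipy import stats

def parameter_change_test(theta_o, theta_u, P_o, alpha=0.05):
    """Chi-square test for a statistically significant change in the
    parameter vector (cf. equations (6.4)-(6.5)); P_o is the (estimated)
    covariance of the healthy-structure parameter estimator."""
    d_theta = np.asarray(theta_u) - np.asarray(theta_o)
    dP = 2.0 * np.asarray(P_o)     # covariance of the difference under H0
    chi2 = d_theta @ np.linalg.solve(dP, d_theta)
    d = len(d_theta)               # parameter vector dimensionality
    return chi2 > stats.chi2.ppf(1.0 - alpha, df=d)  # True -> fault
```

A small perturbation of the parameter vector (relative to its estimator covariance) is accepted as healthy, while a large one is flagged.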

*Fault identification*. Fault identification could, in principle, be based upon the multiple hypotheses testing problem of equations (3.2), comparing the current parameter vector to those corresponding to the different fault types, say **θ**_{A}, …, **θ**_{D}. Nevertheless, such an approach will work only for faults of specific magnitudes, but will generally fail to account for the continuum of fault magnitudes possible within each fault type.

A *geometric approach*, aiming at circumventing this difficulty, has been introduced by Sadeghi & Fassois (1997, 1998). This uses point and covariance estimates of **θ** collected into a single vector. For each fault type (the continuum of faults of all possible magnitudes), a suitable geometrical representation (typically linear, in the form of a hyperplane) is constructed within the (*ρ*-dimensional) space (figure 6) (6.6), with the indicated coefficients defining the *i*th hyperplane. These hyperplanes are constructed during the baseline phase using linear regression and vibration datasets obtained from the structure under faults of various magnitudes from each fault type.

In the inspection phase, following fault detection, fault identification is based upon computation of the distance of the estimated current point in the *ρ*-dimensional space from each hyperplane (figure 6). The current fault mode is identified as that associated with the hyperplane to which the distance is minimal. This involves the solution of a constrained minimization problem (the distance may be of the Euclidean or other proper type).

*Fault estimation*. Within the context of the geometric approach, fault estimation may be achieved via computation of the distance of the current parameter estimate from the point corresponding to the healthy structure (figure 6; distances must have been suitably graded in the baseline phase; also see Sakellariou & Fassois 2000).

Obviously, the model parameter-based methods may also be used with alternative representations, such as *modal models*. In such a case, fault detection and identification are attempted based upon changes incurred in the system's modal parameters (Hearn & Testa 1991; Doebling *et al*. 1996; Farrar & Doebling 1997; Salawu 1997; Rizos *et al*. 2001; also see §6*b*(iv)).

### (b) Residual-based methods

These attempt fault detection, identification and estimation using characteristic quantities that are functions of residual sequences obtained by driving the current signal(s) through suitable pre-determined (estimated in the baseline phase) models _{o}, _{A}, …, _{D}, each one corresponding to a particular state of the structure (healthy structure and structure under specific magnitude faults of types A, …, D; figure 7). The methods have a relatively long history—mainly within the general context of engineering systems (see Mehra & Peschon 1971; Willsky 1976; Basseville 1988; Frank 1990; Basseville & Nikiforov 1993; Natke & Cempel 1997 is more focused on structures; Gertler 1998).

Let _{X} designate the model representing the structure in its *X* state (*X*=o or *X*=A, …, D under specific fault magnitudes). In addition, let the residual series generated by driving **z**_{u}[*t*] through each one of the above models be designated as *e*_{ou}[*t*], *e*_{Au}[*t*], …, *e*_{Du}[*t*] and be characterized by their respective variances (the first subscript designating the model and the second the excitation and/or response signal(s) used).

Fault detection, identification and estimation may then be based upon the fact that under the hypothesis *H*_{X} (i.e. the structure is in its *X* state, for *X*=o or *X*=A, …, D under specific fault magnitudes), the residual series generated by driving the current signal(s) through the model _{X} possesses the property (6.7), where iid designates an independent, identically distributed sequence of random variables. It is tacitly assumed that (6.8) holds, implying that the excitation and environmental conditions are the same in the baseline and inspection phases.

A first method within this category is based upon examination of the residual series obtained by driving the current signal(s) through the aforementioned bank of models (estimated in the baseline phase). The model matching the current state of the structure should generate a residual sequence characterized by minimal variance. A second method is based upon the likelihood function evaluated for the current signal(s) under each one of the considered hypotheses (*H*_{o}, *H*_{A}, …, *H*_{D}). The hypothesis corresponding to the largest likelihood is selected as corresponding to the current state. A third method is based upon examination of the residual series obtained by driving the current signal(s) through the aforementioned bank of models (just as in the first approach above). The model matching the current state of the structure should generate a white (uncorrelated) residual sequence. These methods use classical tests on the residuals and offer simplicity and no need for model estimation in the inspection phase. Yet, they are subject to performance limitations, as certain subtle faults may go undetected (see Willsky 1976; Basseville & Benveniste 1983; Basseville & Nikiforov 1993). A fourth method is based upon the examination of residuals associated with subspace identification.

Concise descriptions of the four methods follow.

#### (i) Method A: using the residual variance

In this method, the characteristic quantity is the residual variance (also see Sohn & Farrar 2001; Sohn *et al*. 2001 for a related scheme).

*Fault detection*. According to the preceding discussion, fault detection is based upon the fact that the residual series *e*_{ou}[*t*], obtained by driving the current signal(s) through the model _{o} corresponding to the nominal (healthy) structure, should be characterized by a variance that becomes minimal (specifically equal to the healthy innovations variance; see equation (6.8)) if and only if the current structure is healthy (_{u}=_{o}).

The hypothesis testing problem of equation (3.1) may then be expressed as (6.9).

Under the null (*H*_{0}) hypothesis, the residuals *e*_{ou}[*t*] are (just like the residuals *e*_{oo}[*t*]) independent, identically distributed (iid) zero-mean Gaussian with common variance (equations (6.7) and (6.8)). Hence, the normalized residual sums of squares for the current and healthy cases follow (central) chi-square distributions with *N*_{u}−1 and *N*_{o}−*d*−1 degrees of freedom, respectively (sums of squares of independent standardized Gaussian random variables; see appendix A). Note that *N*_{o} and *N*_{u} designate the number of samples used in estimating the residual variance in the healthy and current cases, respectively (typically *N*_{o}=*N*_{u}=*N* ), whereas *d* designates the dimensionality of the model parameter vector. Consequently, the following statistic follows *F* distribution with *N*_{u}−1 and *N*_{o}−*d*−1 degrees of freedom (ratio of two independent and normalized *Χ*^{2} random variables; see appendix A) (6.10).

The following test is then constructed at the *α* (type I) risk level (6.11), with *f*_{1−α}(*N*_{u}−1, *N*_{o}−*d*−1) designating the *F* distribution's 1−*α* critical point.
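The *F* test of equations (6.10)–(6.11) may be sketched as follows. The exact normalization of the statistic in equation (6.10) is not reproduced in the text, so the ratio of the current residual variance to the baseline innovations variance is used here as an assumed simplification; names and arguments are illustrative.

```python
import numpy as np
from scipy import stats

def residual_variance_test(e_ou, sigma2_oo, N_o, d, alpha=0.05):
    """F test comparing the current residual variance with the baseline
    innovations variance (a sketch of equations (6.10)-(6.11)).
    e_ou: residuals from driving the current signals through the healthy
    model; sigma2_oo: baseline innovations variance (N_o samples);
    d: model parameter vector dimensionality."""
    e = np.asarray(e_ou, dtype=float)
    N_u = len(e)
    s2_u = np.sum(e ** 2) / N_u          # current residual variance estimate
    F = s2_u / sigma2_oo                 # assumed form of the F statistic
    return F > stats.f.ppf(1.0 - alpha, N_u - 1, N_o - d - 1)  # True -> fault
```

An inflated current residual variance (relative to the baseline) then triggers detection, while a variance at or below the baseline level does not.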

*Fault identification*. Fault identification may be similarly achieved through pairwise tests of the form (6.12). An alternative possibility may be based upon obtaining the residual series *e*_{Au}[*t*], …, *e*_{Du}[*t*], estimating their variances, and declaring as current the fault corresponding to minimal variance. Note that by also including *e*_{ou}[*t*], fault detection may be treated as well. The advantage of this approach is that only the current signal(s) are used in the inspection phase.

As with the model parameter-based methods, however, these approaches may work only for faults of specific magnitudes, but not for the continuum of magnitudes possible within each fault mode.

*Fault estimation*. Fault estimation may be achieved in the limited case in which only one type of fault is possible. In that case, specific values of the residual variance may be potentially associated with specific fault magnitudes.

#### (ii) Method B: using the likelihood function

*Fault detection*. In this method, fault detection is based upon the likelihood function under the null (*H*_{o}) hypothesis of a healthy structure (see Gertler 1998, pp. 119–120). The hypothesis testing problem considered is (6.13), with **θ**_{o} and **θ**_{u} designating the healthy and current structure's parameter vectors, respectively. Assuming independence of the residual sequence, the Gaussian likelihood function for the data, given **θ**, is (Box *et al*. 1994, p. 226) (6.14), with *e*[*t*, **θ**] designating the model's residual (one-step-ahead prediction error), characterized by zero mean and variance *σ*^{2}, and *f*_{w}(.) its probability density function.

Under the null (*H*_{0}) hypothesis, the residual series *e*_{ou}[*t*] generated by driving the current signal(s) through the nominal model _{o} is (just like *e*_{oo}[*t*]) independent, identically distributed (iid) Gaussian with zero mean and the baseline variance (equations (6.7) and (6.8)). Decision making may then be based upon the likelihood function under *H*_{0} (**θ**=**θ**_{o}) evaluated for the current data, by requiring it to be larger than or equal to a threshold *l* (which is to be selected) in order for the null (*H*_{0}) hypothesis to be accepted, i.e. (6.15).

The evaluation of the likelihood requires knowledge of the true innovations variance. If this quantity is known, or may be estimated with very good accuracy in the baseline phase (which is reasonable for estimation based upon a large number of samples *N* ), so that it may be treated as a fixed quantity (negligible variability), the above decision-making rule may (following some algebra) be re-expressed as (6.16), where *l*^{★} designates the resulting (to be selected) threshold.

The statistic _{N} follows (under the *H*_{0} hypothesis) *Χ*^{2} distribution with *N* degrees of freedom (sum of squares of mutually independent standardized Gaussian variables; see appendix A), and this leads to the following test at the *α* risk level (figure 5) (6.17), with the chi-square distribution's 1−*α* critical point serving as the threshold.

Note that under the assumption of a fixed innovations variance, this method leads to a test statistic which is similar to that of equation (6.10) (method A). The difference is that in method A, the variability of the variance estimate is accounted for, thus leading to an *F* distribution for the pertinent statistic. Of course, the two tests coincide for *N*_{o}→∞ in equation (6.10), as the *F* distribution then approaches the chi-square distribution (see appendix A).
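The chi-square test of equations (6.16)–(6.17) may be sketched as follows, treating the baseline innovations variance as known (fixed), as the text assumes; the function name is illustrative.

```python
import numpy as np
from scipy import stats

def likelihood_detection_test(e_ou, sigma2_oo, alpha=0.05):
    """Chi-square test on the normalized residual sum of squares (a sketch
    of equations (6.16)-(6.17)); sigma2_oo is the (fixed) baseline
    innovations variance, e_ou the current residual series."""
    e = np.asarray(e_ou, dtype=float)
    N = len(e)
    stat = np.sum(e ** 2) / sigma2_oo      # statistic of equation (6.16)
    return stat > stats.chi2.ppf(1.0 - alpha, df=N)  # True -> fault
```

When the current residuals conform to the baseline variance, the statistic stays near its expected value *N* and no fault is declared; inflated residuals push it past the 1−*α* critical point.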

*Fault identification*. Fault identification may be achieved by computing the likelihood function for the current signal(s) for the various values of **θ** (**θ**_{A}, …, **θ**_{D}) and selecting the hypothesis which corresponds to the maximum value of the likelihood, i.e. (6.18), with *X*=A, …, D (faults of specific magnitudes). Obviously, by also including the nominal model, the scheme may be used for combined fault detection and identification.

*Fault estimation*. Fault estimation may be achieved in the limited case in which only one type of fault is possible. In that case, specific values of the likelihood may possibly be associated with specific fault magnitudes.

#### (iii) Method C: using the residual uncorrelatedness

In this method, the characteristic quantity is a function of the residual series autocovariance sequence.

*Fault detection*. Fault detection may be based upon assessment of the uncorrelatedness (whiteness) of the residual series obtained by driving the current signal(s) through the nominal (healthy) model _{o}. In this case, the hypothesis testing problem of equation (3.1) may be expressed as (6.19), with *ρ*[*i*] designating the normalized autocovariance (correlation coefficient; Box *et al*. 1994, p. 26) of the *e*_{ou}[*t*] residual sequence (see equation (4.3) for the definition of *γ*[*i*]). The method's characteristic quantity thus is the sequence of *ρ*[*i*] values (*i*=1, …, *r*; *r* being a design variable).

Under the null (*H*_{0}) hypothesis, *e*_{ou}[*t*] is (just like *e*_{oo}[*t*]) independent, identically distributed (iid) Gaussian with zero mean, and the statistic below follows *Χ*^{2} distribution with *r*−*d* (*r*−*d*−1 in case the mean is also estimated) degrees of freedom (*d*=dim(**θ**); Box *et al*. 1994, p. 314) (6.20), with the sample (estimated) *ρ*[*i*] used and *N* designating the number of signal samples (estimation based upon the sample autocovariance, computed about the residual sample mean). The hypothesis testing problem of equation (6.19) then leads to the following test at the *α* (type I) risk level (figure 5) (6.21), with the chi-square distribution's 1−*α* critical point serving as the threshold.
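The whiteness test of equations (6.20)–(6.21) may be sketched as a portmanteau (Box–Pierce type) statistic on the residual correlation coefficients; the exact statistic of equation (6.20) is not reproduced in the text, so the classical *N*·Σ*ρ*²[*i*] form is assumed here.

```python
import numpy as np
from scipy import stats

def residual_whiteness_test(e_ou, r, d, alpha=0.05):
    """Portmanteau test of residual uncorrelatedness (a sketch of
    equations (6.20)-(6.21)): N * sum of squared correlation coefficients
    at lags 1..r, compared against the chi-square (r - d) critical point."""
    e = np.asarray(e_ou, dtype=float)
    e = e - e.mean()                       # centre about the sample mean
    N = len(e)
    g0 = np.sum(e * e) / N                 # lag-0 sample autocovariance
    rho = np.array([np.sum(e[:-i] * e[i:]) / (N * g0) for i in range(1, r + 1)])
    stat = N * np.sum(rho ** 2)            # assumed form of the statistic
    return stat > stats.chi2.ppf(1.0 - alpha, df=r - d)  # True -> correlated
```

A strongly autocorrelated residual series (indicating model/structure mismatch) is flagged, as its low-lag correlation coefficients dominate the statistic.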

*Fault identification*. Fault identification may be achieved by similarly examining which one of the *e*_{Xu}[*t*] (*X*=A, B, …, D) residual series is uncorrelated. As with the previous methods, only faults of specific magnitudes (not the continuum of fault magnitudes) may be considered.

*Fault estimation*. The approach, as such, is not suitable for fault estimation.

#### (iv) Method D: using residuals associated with subspace identification

This method is motivated by stochastic subspace identification (the response-only case is presently treated; see Basseville *et al*. 2000 for details). It attempts detection of changes in the *modal space* by considering the parameter vector (6.22), where **λ** designates the vector containing the eigenvalues of a discrete-time system representation (the eigenvalues of the system matrix **A** in the state space representation of equation (4.6)), and **Φ** designates the matrix whose columns are the vectors **ϕ**_{λ} (with **φ**_{λ} being the corresponding eigenvectors and **C** the output matrix in equation (4.6)).

*Fault detection*. Fault detection is based upon forming (for a given **θ**) the system's (*p*+1)th order (*p* large enough) observability matrix (6.23), with **Λ**≜diag(**λ**) (diagonal matrix with the system eigenvalues), along with the system's block Hankel matrix (for the case of equation (4.6)) (6.24), with *q*≥*p*+1 and *γ*_{yy}(*τ*) designating the response's theoretical autocovariance (see equation (4.3)).

The main idea for fault detection lies in the fact that, under the hypothesis that the observability matrix and the block Hankel matrix correspond to the same system (common **θ**), _{p+1}(**θ**) and _{p+1,q} have the same left kernel space. Whether this is the case or not may be checked through the following procedure.

Use an available parameter vector **θ** in order to form the observability matrix _{p+1}(**θ**). Then pick an orthonormal basis of the left kernel space of the matrix **W**_{1}_{p+1}(**θ**) (**W**_{1} is a selectable invertible weighting matrix), in terms of the columns of a matrix **S** of co-rank *n* (the system order), such that (6.25) and (6.26) hold (with *r* designating the output dimensionality). Note that the matrix **S** is not unique. It may, for instance, be obtained through the singular value decomposition of **W**_{1}_{p+1}(**θ**), and implicitly depends upon the parameter vector **θ**; hence, it may be designated as **S**(**θ**). The block Hankel matrix corresponding to the same system should then satisfy the property (6.27) (**W**_{2} being an additional selectable invertible weighting matrix).
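The construction of the left kernel basis **S** via singular value decomposition may be sketched as follows (with **W**_{1} taken as the identity for simplicity; the function name is illustrative).

```python
import numpy as np

def left_kernel_basis(O, n):
    """Orthonormal basis S of the left kernel space of the (weighted)
    observability matrix O, of co-rank n (cf. equations (6.25)-(6.26)):
    the columns of S satisfy S^T O = 0 and S^T S = I."""
    U, s, Vt = np.linalg.svd(O, full_matrices=True)
    # the left singular vectors beyond the first n span the left kernel
    S = U[:, n:]
    return S
```

For a full-column-rank observability matrix of a system of order *n*, the returned basis annihilates the matrix from the left and is orthonormal, as required.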

Now assume that the parameter vector **θ**_{o} corresponding to the healthy system is available from the baseline phase. For checking whether the current data actually correspond to the healthy system, the following residual is defined (compare with equation (6.27)) (6.28), where the sample block Hankel matrix is obtained from the current data. This residual, which should be zero for the theoretical block Hankel matrix under *H*_{0} (**θ**_{u}=**θ**_{o}) (see equation (6.27)), has zero mean under *H*_{0} (**θ**_{u}=**θ**_{o}) and non-zero mean under *H*_{1} (**θ**_{u}≠**θ**_{o}).

Testing whether the current system (with parameter vector **θ**_{u} and associated with the sample block Hankel matrix) coincides with the healthy system (with parameter vector **θ**_{o}) is based upon a 'statistical local approach', according to which the following 'close' hypotheses are considered (6.29), with *δ***θ** designating an unknown but fixed error vector. Let **M**(**θ**_{o}) be the Jacobian matrix designating the mean sensitivity of *ζ*_{N}, and let the corresponding residual covariance matrix be similarly defined, with the expectation taken when the actual system parameter is **θ**_{o}. Note that these matrices do not depend upon the sample size *N* and may be estimated from the healthy structure during the baseline phase.

Provided that the residual covariance matrix is positive definite, the residual *ζ*_{N}(**θ**_{o}) of equation (6.28) asymptotically follows Gaussian distribution, i.e. (6.30). This indicates that a deviation in the system parameters is reflected in a change in the mean of *ζ*_{N}.

Let consistent estimates of **M**(**θ**_{o}) (assumed to be full column rank) and of the residual covariance matrix be obtained in the baseline phase (details in Basseville *et al*. 2000). Under the null (*H*_{0}) hypothesis, the statistic _{N}, below, follows *Χ*^{2} distribution with rank(**M**) degrees of freedom, i.e. (6.31). Based upon this, a suitable statistical test may be constructed for deciding (at a certain risk level) whether the residual sequence has a mean that is significantly different from zero (for instance, as in §6*a*).

*Fault identification*. Fault identification could be similarly based upon consecutive tests of the above form, each corresponding to one of the parameter vectors **θ**_{A}, …, **θ**_{D}. Nevertheless, as with other methods, this may work only for faults of specific magnitudes, and not for the continuum of fault magnitudes possible within each fault type.

*Fault estimation*. The method, as such, is not immediately suitable for fault estimation.

### (c) The functional model-based method (fmbm)

The functional model-based method (fmbm) has recently been introduced (Sakellariou & Fassois 2002; Sakellariou *et al*. 2002*a*,*b*) in an attempt to (i) effectively tackle the problem of identifying faults of any type in the important case that the continuum of possible fault magnitudes is considered, (ii) accurately estimate fault magnitude, and (iii) provide a 'combined' solution to the subproblems of fault detection, identification and magnitude estimation. The method's schematic is given in figure 8. Its cornerstone is its unique ability to accurately represent a structure in a given *fault mode* (fault type) for the mode's *continuum* of fault magnitudes. This is achieved using a single representation (model) that is directly parametrized in terms of fault magnitude. This representation is referred to as a stochastic *functional model* (*FM*), and depending upon the problem treated, it may assume various forms (Sakellariou & Fassois 2002; Sakellariou *et al*. 2002*a*).

Letting *k* represent the fault magnitude (within a particular fault mode *X*), a simple linear form for the structural dynamics under fault mode *X* is the Functional AutoRegressive with eXogenous excitation (FARX) representation of equations (6.32) and (6.33). This model form may be viewed as the union of a continuum of conventional ARX models, each one corresponding to a particular fault magnitude *k*. *x*_{k}[*t*], *y*_{k}[*t*] and *w*_{k}[*t*] are the excitation, response and innovations (prediction error) signals, respectively, corresponding to the particular *k* (note that *k*=0 corresponds to the healthy structure).

It should be noted that the above model resembles the conventional ARX representation (equation (4.8) with *C*()≡1), but explicitly accounts for the continuum of fault magnitudes within the fault mode it represents. This is accomplished by allowing its parameters and innovations variance to be functions of the fault magnitude *k*. As indicated by equations (6.33), the model parameters are specifically assumed to belong to a *p*-dimensional functional subspace spanned by the mutually independent functions *G*_{1}(*k*), …, *G*_{p}(*k*) (the *functional basis*). The constants *a*_{ij} and *b*_{ij} designate the model's AR and X coefficients of projection, respectively. A suitable functional model, corresponding to each particular fault mode, is estimated in the baseline phase using data obtained from the structure under a sufficiently large number of fault magnitudes (details in Sakellariou *et al*. 2002*a*).
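The pooled estimation of the coefficients of projection may be sketched as a linear regression across experiments at different fault magnitudes: each lagged regressor is multiplied by every basis function evaluated at that experiment's *k* (equations (6.32)–(6.33)). The sketch below uses first-kind Chebyshev polynomials (the case study uses type II; the choice of basis, the function names, and the omission of *k*-scaling to [−1, 1] are all illustrative simplifications).

```python
import numpy as np

def chebyshev_basis(k, p):
    """First p Chebyshev polynomials (first kind, illustrative) at k."""
    k = np.asarray(k, dtype=float)
    G = [np.ones_like(k), k]
    for _ in range(2, p):
        G.append(2 * k * G[-1] - G[-2])
    return np.stack(G[:p])                 # shape (p, len(k))

def farx_regression_matrix(x, y, k, na, nb, p):
    """Regression matrix for one experiment at fault magnitude k: the FARX
    parameters a_i(k), b_i(k) are projections on G_1(k), ..., G_p(k),
    so each ARX lag regressor is expanded by the basis values."""
    G = chebyshev_basis(np.array([float(k)]), p)[:, 0]   # (p,)
    m = max(na, nb)
    rows = []
    for t in range(m, len(y)):
        lags = np.concatenate([-y[t - np.arange(1, na + 1)],
                               x[t - np.arange(0, nb + 1)]])
        rows.append(np.kron(lags, G))      # lag value times every G_j(k)
    return np.array(rows), y[m:]
```

Stacking the matrices from experiments at several magnitudes and solving a single least-squares problem recovers the coefficients of projection for the whole fault mode.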

*Fault detection*. Fault detection may be based upon the functional model (obtained in the baseline phase) corresponding to *any* specific fault mode (say *X* ). This model is now re-parametrized in terms of the currently unknown fault magnitude *k* and the innovations variance (the coefficients of projection being available from the baseline phase; *e*[*t*] represents the re-parametrized model's innovations, corresponding to *w*[*t*] in equation (6.32)) (6.34).

Estimates of *k* and of the innovations variance are obtained based upon the currently available signal(s) and the nonlinear least squares estimator (6.35), with RSS standing for residual sum of squares. Owing to the non-quadratic nature of the RSS criterion with respect to *k*, the estimator is nonlinear and is thus obtained using the Gauss–Newton and Levenberg–Marquardt schemes (Ljung 1999, pp. 326–329). Assuming that the structure is indeed under a fault belonging to the specific fault mode (or else healthy), the estimator may be shown to be asymptotically (*N*→∞) Gaussian distributed, with mean equal to the true *k* value and with variance corresponding to the Cramer–Rao lower bound (Sakellariou *et al*. 2002*a*). This variance is also estimated, with its estimate shown to be asymptotically (*N*→∞) uncorrelated with the estimate of *k*.
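The nonlinear RSS minimization of equation (6.35) may be sketched as follows; `scipy`'s trust-region least-squares routine is used here in place of the Gauss–Newton/Levenberg–Marquardt schemes named in the text, and all function names and arguments are illustrative.

```python
import numpy as np
from scipy.optimize import least_squares

def farx_residuals(k, x, y, a_proj, b_proj, basis):
    """Residuals of the re-parametrized FARX model (cf. equation (6.34))
    for a trial fault magnitude k; a_proj[i, j] and b_proj[i, j] are the
    baseline-phase coefficients of projection, basis(k) the values
    (G_1(k), ..., G_p(k))."""
    k = float(np.ravel(k)[0])
    G = basis(k)
    a = a_proj @ G                         # a_i(k), i = 1..na
    b = b_proj @ G                         # b_i(k), i = 0..nb
    na, nb = len(a), len(b) - 1
    m = max(na, nb)
    e = y[m:].copy()                       # e[t] = y[t] + sum a_i y[t-i] - sum b_i x[t-i]
    for i in range(1, na + 1):
        e += a[i - 1] * y[m - i:len(y) - i]
    for i in range(0, nb + 1):
        e -= b[i] * x[m - i:len(x) - i]
    return e

def estimate_fault_magnitude(x, y, a_proj, b_proj, basis, k0=0.0):
    """Nonlinear least-squares (RSS) estimate of k (cf. equation (6.35))."""
    sol = least_squares(farx_residuals, x0=k0, args=(x, y, a_proj, b_proj, basis))
    return sol.x[0], sol.fun               # point estimate and residual series
```

With a well-conditioned functional model, the RSS criterion is minimized at the actual fault magnitude, mirroring the cost-function minima shown in the case study of §7*b*.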

Since the healthy structure corresponds to *k*=0 (zero fault magnitude), fault detection may be based upon the hypothesis testing problem (6.36).

Under the hypothesis that the structure is indeed under a fault belonging to the specific fault mode (or else healthy), the random variable *t*, below, follows a *t* distribution with *N*−2 degrees of freedom (as it may be shown to be the ratio of a standardized normal random variable over the square root of an independent and normalized chi-square random variable with *N*−2 degrees of freedom; see appendix A), i.e. (6.37), which leads to the following test at the *α* (type I) risk level (figure 9) (6.38), with *t*_{α} and *t*_{1−(α/2)} designating the *t* distribution's (with the indicated degrees of freedom) *α* and 1−(*α*/2) critical points, respectively.
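The *t* test of equations (6.37)–(6.38) and the interval estimate of equation (6.39) may be sketched as follows (a two-sided version is used here for simplicity; the names are illustrative).

```python
import numpy as np
from scipy import stats

def fault_detection_t_test(k_hat, var_k_hat, N, alpha=0.05):
    """Two-sided t test of H0: k = 0 with N - 2 degrees of freedom
    (a sketch of equations (6.37)-(6.38))."""
    t = k_hat / np.sqrt(var_k_hat)
    return abs(t) > stats.t.ppf(1.0 - alpha / 2.0, df=N - 2)  # True -> fault

def fault_magnitude_interval(k_hat, var_k_hat, N, alpha=0.05):
    """Interval estimate of k at the alpha risk level (cf. equation (6.39))."""
    h = stats.t.ppf(1.0 - alpha / 2.0, df=N - 2) * np.sqrt(var_k_hat)
    return k_hat - h, k_hat + h
```

An estimated magnitude several standard deviations away from zero is thus declared a fault, and the same quantities yield the confidence bounds plotted in the case study of §7*b*.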

*Fault identification*. Once a fault is detected, fault identification is achieved through successive estimation (using the current data ) and validation of the re-parametrized FARX models (*X*=A, …, D) (equation (6.34)) corresponding to the various fault modes. The procedure stops as soon as a particular model is successfully validated; the corresponding fault mode then being identified as current.

Model validation may be based upon statistical tests examining the hypothesis of excitation and residual sequence uncrosscorrelatedness and/or residual uncorrelatedness (see §6*b*(iii)).

*Fault estimation*. Once the current fault mode has been identified, an interval estimate of the fault magnitude is constructed based upon the point and variance estimates obtained from the corresponding re-parametrized FARX model. Thus, using equation (6.37), the interval estimate of *k* at the *α* risk level is given by (6.39).

## 7. Case studies

### (a) Case study A: fault detection in an aircraft stiffened panel

In this case study, fault detection for the skin of an aircraft stiffened panel is considered via the coherence measure-based method (§5*c*). For a detailed presentation, the reader is referred to Rizos *et al*. (2002).

*The panel and the experimental set-up*. The structure used in the study (figure 10*a*) is a lightweight (887 g) trapezoidal 2024-T3-bare aluminium stiffened (via ribs) panel of dimensions 450×320×320 mm (large base–small base–height) obtained from a Vought A-7 Corsair aircraft. The fault considered is a 25 mm long sawcut (figure 10*b*) on the skin (skin thickness∼2 mm).

The panel is suspended through strings and tested under free–free boundary conditions (figure 10*c*). The excitation is a horizontal random Gaussian force applied at point F (figure 10*a*) through an electromechanical shaker equipped with a stinger. The exerted force is measured via an impedance head, and the resulting acceleration (at points A, B and C; figure 10*a*) is measured via miniature accelerometers (sampling frequency *f*_{s}=6 kHz). A series of experiments are performed for both the healthy and faulty states of the panel. The coherence measure is estimated within the [0.38, 2.00] kHz frequency range using *N*=492×10^{3} samples per signal (frequency resolution 0.3662 Hz, number of non-overlapped segments *K*=30, Hanning windowing).

*Baseline phase*. Interval estimates of the coherence measure corresponding to the three (points A, B and C) excitation–response pairs are obtained from experimental vibration test data with the healthy panel.

*Inspection phase*. Interval estimates of the coherence measure (three excitation–response pairs) are re-obtained for the panel in its current (faulty) state. Figure 11*a* depicts the coherence measure interval estimates (at the *α*=0.05 risk level) for each panel state (healthy–faulty) and for each one of the vibration response measurement positions (points A, B and C). As expected, the coherence measure decreases with fault occurrence.

The test statistic (left-hand side of equation (5.18)) is depicted in figure 11*b* and confirms (for positions A and C) the statistically significant reduction in the coherence measure, and hence fault presence, by being greater than the critical point corresponding to the *α*=0.05 risk level.

### (b) Case study B: fault detection and estimation in an aircraft skeleton structure

In this case study, fault detection and estimation in a scale aircraft skeleton structure are considered via the functional model-based method. For a detailed presentation, the reader is referred to Sakellariou *et al*. (2002*a*).

*The structure and the experimental set-up*. The scale aircraft skeleton structure used (figure 12) has been constructed at the University of Patras based upon the Garteur SM-AG19 specifications (see Balmes & Wright 1997). It consists of six solid beams representing the fuselage, the wings, the horizontal and vertical stabilizers, and the right and left wing tips. All parts of the skeleton are constructed from standard aluminium and are jointed together via two steel plates and screws (total skeleton mass∼50 kg).

The faults considered correspond to the placement of a variable number of small masses (simulating local elasticity reductions), of approximately 6.5 g each, at point B of the structure (figure 12). Each such fault is designated as *F *^{k}, with *k* representing the fault magnitude (grams of added mass). The corresponding state of the structure is designated as ^{k}. *F *^{0} designates a zero magnitude fault, thus referring to the healthy state of the structure (_{o}).

Fault detection and estimation are based upon vibration testing of the structure under free–free boundary conditions (figure 12). Excitation (in the form of a random Gaussian force) is applied vertically, at the right wing tip (point A), through an electromechanical shaker equipped with a stinger. The exerted force, along with the resulting vertical vibration accelerations, are measured via an impedance head and lightweight accelerometers, respectively (sampling frequency *f*_{s}=128 Hz, signal length *N*=1000 samples).

*Baseline phase*. AutoRegressive with eXogenous excitation (ARX) modelling of the healthy structure based upon 4 s (*N*=1000 samples) long excitation and single response signals leads to an ARX(18,18) representation. Fault mode modelling (for the single fault mode characterized by mass placement at point B) is based upon signals obtained from a series of 21 experiments, one corresponding to the healthy structure (*k*=0 g of added mass) and the rest corresponding to various fault magnitudes (faults *F*^{k} with *k*∈[6.5, 130] g; increment *δk*≈6.5 g). The functional ARX (FARX) modelling procedure leads to a FARX(18,18) model (equations (6.32) and (6.33)) with functional basis consisting of the first *p*=7 Chebyshev type II polynomials (Abramowitz & Stegun 1970).
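The core of FARX modelling — expressing each model parameter as a smooth function of the fault magnitude *k* through a polynomial basis — can be sketched as follows. This is an illustrative fragment under simplifying assumptions (a single scalar parameter trajectory, *k* normalized to [−1, 1]); the function names are hypothetical.

```python
import numpy as np

def chebyshev_U(p, k):
    """First p Chebyshev polynomials of the second kind, U_0 .. U_{p-1},
    evaluated at k in [-1, 1] (recurrence U_n = 2k U_{n-1} - U_{n-2})."""
    k = np.atleast_1d(np.asarray(k, dtype=float))
    U = np.ones((p, k.size))
    if p > 1:
        U[1] = 2.0 * k
    for n in range(2, p):
        U[n] = 2.0 * k * U[n - 1] - U[n - 2]
    return U                                   # shape (p, len(k))

def fit_functional_parameter(k_grid, theta, p):
    """Least-squares projection of a parameter trajectory theta(k), sampled
    on k_grid, onto the functional basis -- the essence of a FARX model,
    where each ARX parameter is a smooth function of the fault magnitude."""
    G = chebyshev_U(p, k_grid)                 # basis matrix
    coeffs, *_ = np.linalg.lstsq(G.T, theta, rcond=None)
    return coeffs

def eval_functional_parameter(coeffs, k):
    """Reconstruct the parameter value at an arbitrary fault magnitude k."""
    return chebyshev_U(len(coeffs), k).T @ coeffs
```

Fitting a linear trajectory with *p*=2 recovers it exactly, which is the mechanism that later allows the model to interpolate between the 21 training magnitudes.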

*Inspection phase*. Two test cases, the first corresponding to the healthy structure (*F *^{0}) and the second to a fault characterized by added mass of 37.7 g (*F *^{37.7}), are considered. Fault detection and estimation results are, for each test case, pictorially presented in figure 13. A normalized version of the cost function of equation (6.35) is, for each case, shown as a function of the fault magnitude *k*. The corresponding zooms depict the true value of the added mass (dashed line), along with the corresponding interval estimate (shaded strip; the middle line indicates the point estimate and the left and right lines the lower and upper confidence bounds at the *α*=0.05 risk level).

*Test case I* (*healthy structure*). As would be rightly expected, no fault is detected in the first case, as the fault magnitude's interval estimate includes the nominal *k*=0 value (figure 13*a*). The proximity of the *k* point estimate to the true *k*=0 value is quite remarkable.

*Test case II* (*F*^{37.7} *fault; added mass of 37.7 g*). A fault is clearly detected in this case, as the fault magnitude's interval estimate does not include the *k*=0 value (figure 13*b*). The accuracy attained in estimating the fault magnitude is, again, excellent.
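The estimation step behind figure 13 — minimizing a residual cost over the fault magnitude and attaching an interval estimate — can be sketched generically. The sketch below assumes a user-supplied `residual_fn(k)` returning the model residual sequence for a trial magnitude; the curvature-based interval is a textbook asymptotic approximation, not the authors' exact construction from equation (6.35).

```python
import numpy as np
from scipy.stats import norm

def estimate_fault_magnitude(residual_fn, k_grid, alpha=0.05):
    """Grid search over the fault magnitude k: k_hat minimizes the residual
    sum of squares J(k), with an asymptotic interval estimate from the local
    curvature of J at the minimum (var(k_hat) ~ 2*sigma^2 / J'')."""
    J = np.array([np.sum(residual_fn(k) ** 2) for k in k_grid])
    i = int(np.argmin(J))
    k_hat = float(k_grid[i])
    N = residual_fn(k_hat).size
    sigma2 = J[i] / N                          # residual variance estimate
    h = k_grid[1] - k_grid[0]
    j = min(max(i, 1), len(J) - 2)             # keep the stencil interior
    curv = (J[j - 1] - 2.0 * J[j] + J[j + 1]) / h**2   # d2J/dk2 at minimum
    se = np.sqrt(2.0 * sigma2 / max(curv, 1e-12))
    z = norm.ppf(1.0 - alpha / 2.0)
    return k_hat, (k_hat - z * se, k_hat + z * se)
```

A fault is then declared whenever the interval excludes *k*=0, mirroring the decision logic of the two test cases above.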

### (c) Case study C: fault detection and identification in a two-DOF system with cubic stiffness

In this case study, fault detection, identification and estimation in a simple two degree-of-freedom nonlinear system characterized by cubic stiffness (spring *k*_{3}; figure 14) is considered through the functional model-based method. For a detailed presentation, the reader is referred to Sakellariou & Fassois (2002).

*The system and the faults*. System simulation is based upon discretization of the equations of motion with time-step *T*_{s}=6.666×10^{−4} s (sampling frequency *f*_{s}=1500 Hz). Fault detection and identification are based upon measurement of the force excitation *F* (subsequently designated as *x*) and the vibration displacement response *x*_{2} (subsequently designated as *y*).
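The discretized simulation of such a two-DOF cubic-stiffness system can be sketched as below. All parameter values (`m1`, `c1`, `k1`, `k3`) and the spring topology are hypothetical placeholders, not those of the paper; only the time-stepping scheme at *f*_{s}=1500 Hz is illustrated.

```python
import numpy as np

# Hypothetical parameters -- chosen only to give a stable, visibly
# nonlinear system at Ts = 6.666e-4 s; not the paper's values.
m1 = m2 = 1.0
c1 = c2 = 0.5
k1 = k2 = 1.0e4      # linear springs (faults would perturb k2)
k3 = 1.0e9           # cubic-stiffness coefficient of the nonlinear spring

def simulate(force, Ts=6.666e-4):
    """Integrate the two-DOF equations of motion with time step Ts
    (fs = 1500 Hz) via semi-implicit Euler.  The excitation acts on
    mass 1; the 'measured' response y is the displacement x2 of mass 2,
    with the cubic spring (assumed here to ground mass 2) making the
    response nonlinear."""
    x1 = v1 = x2 = v2 = 0.0
    y = np.empty(len(force))
    for t, F in enumerate(force):
        a1 = (F - c1 * v1 - k1 * x1 - k2 * (x1 - x2)) / m1
        a2 = (-c2 * v2 - k2 * (x2 - x1) - k3 * x2**3) / m2  # cubic term
        v1 += Ts * a1; x1 += Ts * v1
        v2 += Ts * a2; x2 += Ts * v2
        y[t] = x2
    return y
```

Because of the cubic term, doubling the excitation does not double the response — the superposition failure that NARX/FNARX models are designed to capture.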

The faults considered correspond to stiffness changes in *k*_{2} (*k*_{2} fault mode; figure 14). Each individual fault is represented as *F*_{k_{2}}^{k}, with the subscript *k*_{2} indicating the fault mode and the superscript *k* the exact fault magnitude.

*Baseline phase*. A nonlinear ARX (NARX) model of orders (4,4) is used for representing the healthy system (details in Sakellariou & Fassois 2002).

Fault mode modelling (for the *k*_{2} fault mode) is based upon a series of 17 experiments, one corresponding to the healthy system (*k*=0% variation in *k*_{2}) and the rest corresponding to various fault magnitudes (faults with *k*∈[−32, 32]%; increment *δk*=4% change in *k*_{2}). The signals used are, in all cases, *N*=2000 samples long. A functional NARX (FNARX) model with functional basis consisting of the first two (zeroth and first degree, thus *p*=2) Chebyshev type II polynomials (Abramowitz & Stegun 1970) is selected for representing the fault mode.

*Inspection phase*. Two test cases are considered via Monte Carlo experiments (10 runs per case). Monte Carlo fault detection and estimation results are pictorially presented in figure 15 and fault identification results in figure 16 (type I risk *α*=0.05).

*Test case I* (*healthy structure*). In this case, the fault magnitude interval estimate includes the *k*=0 value in each one of the 10 runs (figure 15*a*); thus *no fault* is (rightly) detected. The excellent accuracy achieved in estimating the correct *k* value in all 10 runs is remarkable.

*Test case II* (*fault*: 3% *reduction in* *k*_{2}). In this case, a small magnitude fault is injected. Yet, its detection is again remarkably accurate for all 10 runs (figure 15*b*). In addition, the statistic (equation (6.20)) corresponding to the *k*_{2} fault mode is, for all 10 runs, below the critical point (figure 16), thus correctly identifying (isolating) the current fault in the *k*_{2} stiffness.
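A generic version of such a residual-based identification test can be sketched as follows. It is a stand-in for the statistic of equation (6.20), assuming the residual variance `sigma2` under the tested fault mode is available from the baseline phase.

```python
import numpy as np
from scipy.stats import chi2

def fault_mode_accepted(residuals, sigma2, alpha=0.05):
    """Generic residual-based identification test: if the current fault
    truly belongs to the tested fault mode, the normalized residual sum
    of squares follows (approximately) a chi-square distribution with N
    degrees of freedom; the mode is accepted (isolated) while the
    statistic stays below the 1-alpha critical point."""
    N = len(residuals)
    stat = float(np.sum(residuals ** 2) / sigma2)
    return stat <= chi2.ppf(1.0 - alpha, N)
```

Residuals consistent with the hypothesized mode keep the statistic below the critical point; inflated residuals (a mismatched mode) push it above.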

## 8. Concluding remarks

In this paper, the principles and techniques of time-series methods for fault detection, identification and estimation in vibrating structures were presented, and certain new methods were introduced. As demonstrated, time-series methods offer inherent accounting of uncertainty, statistical decision making and the relaxation of the requirement for complete structural models, including physical or finite element models. The methods were classified as non-parametric or parametric, reflecting upon the way each method's characteristic quantity is constructed (see figure 17 for a pictorial classification of time-series methods). Non-parametric methods are generally simpler, but mainly focus on the fault detection subproblem. Parametric methods, on the other hand, are more elaborate, but offer the possibility of increased accuracy along with more effective tackling of the fault identification and estimation subproblems.

Parametric methods were further classified as: (i) model parameter based, in which the characteristic quantity is a function of the model parameter vector, (ii) residual based, in which the characteristic quantity is a function of the model residuals, and (iii) functional model based, in which one characteristic quantity is a model parameter and another is a function of the model residuals.

The practicality and effectiveness of the methods were demonstrated through brief presentations of three case studies pertaining to fault detection, identification and estimation in an aircraft panel, a scale aircraft skeleton structure and a simple nonlinear simulated structure.

It is evident that due to their stated advantages, time-series methods will attract increasing attention in the future. The fault detection subproblem has received most of the attention thus far. It is, however, clear that significant work has to be devoted to the fault identification and magnitude estimation subproblems as well. On the other hand, one of the strengths of time-series methods, which is their reliance upon relatively simple (typically partial) mathematical models identified from operating data records (as opposed to detailed physical or finite element models), may generally constitute a limiting factor with regard to the latter two subproblems. It seems that the limits of the methods in this respect have not been sufficiently explored.

An additional issue that merits attention is the need for behavioural datasets corresponding to various fault conditions. Obtaining such data may not always be possible (also see the discussion in §3*a*), and although the problem may be handled via data obtained from either laboratory scale models or mathematical (such as finite element) models, it seems to set a practical limitation in (at least) certain cases. The fact that this primarily concerns fault identification and estimation (but not necessarily fault detection) is certainly encouraging, but it appears practically important to explore potential approaches for circumventing it.

In addition, further development of non-parametric and parametric methods suitable for the multivariate case is expected to be sought, so that information from more measurement locations may be included in the decision making. Furthermore, methods suitable for non-stationary structures (structures with time-varying properties) as well as nonlinear structures are expected to be further developed. Other important issues that need to be addressed include effective fault detection and identification under varying operating and/or environmental conditions, a difficult but practically important problem. Also significant is the transition from the current, mainly Gaussian, to a broader non-Gaussian time-series framework that may be more appropriate for certain applications.

## Acknowledgments

The authors are indebted to three anonymous referees whose comments helped in improving the manuscript.

## Appendix A. Central limit theorem and statistical distributions associated with the normal

*The central limit theorem (CLT)* (Stuart & Ord 1987, p. 273; Nguyen & Rogers 1989, vol. I, p. 420; Montgomery 1991, p. 46). Let *Z*_{1}, *Z*_{2}, …, *Z*_{n} designate mutually independent random variables, each with mean *μ*_{k} and (finite) variance *σ*_{k}^{2}. Then, for *n*→∞, the distribution of the random variable *X*=∑_{k=1}^{n} *Z*_{k} approaches the Gaussian distribution with mean ∑_{k=1}^{n} *μ*_{k} and variance ∑_{k=1}^{n} *σ*_{k}^{2}.

*The chi-square distribution*. Let *Z*_{1}, *Z*_{2}, …, *Z*_{n} designate mutually independent, normally distributed, random variables, each with mean *μ*_{k} and standard deviation *σ*_{k}. Then the sum

(A1) *X*=∑_{k=1}^{n} ((*Z*_{k}−*μ*_{k})/*σ*_{k})^{2}

is said to follow a (central) chi-square distribution with *n* degrees of freedom (*X*∼*Χ*^{2}(*n*)). Its mean and variance are *E*(*X*)=*n* and var(*X*)=2*n*, respectively. Note that imposing *p* equality constraints among the random variables *Z*_{1}, *Z*_{2}, …, *Z*_{n} reduces the set's effective dimensionality, and thus the number of degrees of freedom, by *p* (Stuart & Ord 1987, pp. 506–507).

For *n*→∞, the *Χ*^{2}(*n*) distribution tends to normality (Stuart & Ord 1987, p. 523).

The sum *X*=∑_{k=1}^{n} (*Z*_{k}/*σ*_{k})^{2} is said to follow non-central chi-square distribution with *n* degrees of freedom and non-centrality parameter *λ*=∑_{k=1}^{n} (*μ*_{k}/*σ*_{k})^{2}. This distribution is designated as *Χ*^{2}(*n*; *λ*) (Nguyen & Rogers 1989, vol. II, p. 33).

Let **x**∈ℝ^{n} follow the *n*-variate normal distribution with zero mean and covariance **Σ** (**x**∼(**0**, **Σ**)). Then the quantity **x**^{T}**Σ**^{−1}**x** follows a (central) chi-square distribution with *n* degrees of freedom (Stuart & Ord 1987, pp. 486–487; Söderström & Stoica 1989, p. 557; Gertler 1998, p. 120).

*Student's t-distribution*. Let *Z* be the standard (zero mean and unit variance) normal variable. Let *X* follow a (central) chi-square distribution with *n* degrees of freedom and be independent of *Z*. Then the ratio

(A2) *T*=*Z*/√(*X*/*n*)

is said to follow a Student or *t* (central) distribution with *n* degrees of freedom (central, because it is based on a central chi-square distribution; Nguyen & Rogers 1989, vol. II, p. 34). Its mean and variance are *E*(*T*)=0 (*n*>1) and var(*T*)=*n*/(*n*−2) (*n*>2), respectively (Stuart & Ord 1987, p. 513).

The (central) *t* distribution approaches the standard normal distribution (0,1) as *n*→∞ (Stuart & Ord 1987, p. 523).

*Fisher's F-distribution*. Let *X*_{1} and *X*_{2} be mutually independent random variables following (central) chi-square distributions with *n*_{1} and *n*_{2} degrees of freedom, respectively. Then the ratio

(A3) *F*=(*X*_{1}/*n*_{1})/(*X*_{2}/*n*_{2})

is said to follow a (central) *F* distribution with *n*_{1}, *n*_{2} degrees of freedom (*F*∼*F*(*n*_{1}, *n*_{2})) (central, because it is based on central chi-square distributions; Nguyen & Rogers 1989, vol. II, p. 34). Its mean and variance are *E*(*F*)=*n*_{2}/(*n*_{2}−2) (*n*_{2}>2) and var(*F*)=2*n*_{2}^{2}(*n*_{1}+*n*_{2}−2)/(*n*_{1}(*n*_{2}−2)^{2}(*n*_{2}−4)) (*n*_{2}>4), respectively (Stuart & Ord 1987, p. 518).

Note that for the distribution's 1−*α* critical point, *F*_{1−α}(*n*_{1}, *n*_{2})=1/*F*_{α}(*n*_{2}, *n*_{1}).

The (central) *F* distribution approaches normality as *n*_{1}, *n*_{2}→∞. For *n*_{2}→∞, *n*_{1}*F* approaches a (central) chi-square distribution with *n*_{1} degrees of freedom (Stuart & Ord 1987, p. 523).
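The distributional facts collected in this appendix are easy to check numerically; the snippet below verifies several of them (the moments and the reciprocal critical-point relation of the *F* distribution) with `scipy.stats`.

```python
import numpy as np
from scipy import stats

n, n1, n2, alpha = 8, 5, 12, 0.05

# chi-square: E(X) = n, var(X) = 2n
assert np.isclose(stats.chi2.mean(n), n) and np.isclose(stats.chi2.var(n), 2 * n)

# Student's t: E(T) = 0 (n > 1), var(T) = n/(n - 2) (n > 2)
assert np.isclose(stats.t.mean(n), 0.0) and np.isclose(stats.t.var(n), n / (n - 2))

# F: E(F) = n2/(n2 - 2) (n2 > 2)
assert np.isclose(stats.f.mean(n1, n2), n2 / (n2 - 2))

# reciprocal critical-point identity: F_{1-a}(n1, n2) = 1 / F_a(n2, n1)
assert np.isclose(stats.f.ppf(1 - alpha, n1, n2), 1.0 / stats.f.ppf(alpha, n2, n1))

# T^2 with n dof follows F(1, n): matching two-sided/one-sided critical points
assert np.isclose(stats.t.ppf(1 - alpha / 2, n) ** 2, stats.f.ppf(1 - alpha, 1, n))
```

Such checks are useful when implementing the hypothesis tests of §§5 and 6, where the critical points of these distributions define the decision thresholds.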

## Footnotes

Copyright 2003–2006 by S. D. Fassois and J. S. Sakellariou. All rights reserved.

One contribution of 15 to a Theme Issue ‘Structural health monitoring’.

- © 2006 The Royal Society