## Abstract

Chemicals released in the air can be extremely dangerous for human beings and the environment. Hyperspectral images can be used to identify chemical plumes, however the task can be extremely challenging. Assuming we know *a priori* that some chemical plume, with a known frequency spectrum, has been photographed using a hyperspectral sensor, we can use standard techniques such as the so-called matched filter or adaptive cosine estimator, plus a properly chosen threshold value, to identify the position of the chemical plume. However, due to noise and inadequate sensing, the accurate identification of chemical pixels is not easy even in this apparently simple situation. In this paper, we present a post-processing tool that, in a completely adaptive and data-driven fashion, allows us to improve the performance of any classification methods in identifying the boundaries of a plume. This is done using the multidimensional iterative filtering (MIF) algorithm (Cicone *et al.* 2014 (http://arxiv.org/abs/1411.6051); Cicone & Zhou 2015 (http://arxiv.org/abs/1507.07173)), which is a non-stationary signal decomposition method like the pioneering empirical mode decomposition method (Huang *et al.* 1998 *Proc. R. Soc. Lond. A* 454, 903. (doi:10.1098/rspa.1998.0193)). Moreover, based on the MIF technique, we propose also a pre-processing method that allows us to decorrelate and mean-centre a hyperspectral dataset. The cosine similarity measure, which often fails in practice, appears to become a successful and outperforming classifier when equipped with such a pre-processing method. We show some examples of the proposed methods when applied to real-life problems.

## 1. Introduction

Chemical plumes resulting from natural or anthropogenic emissions in the atmosphere can be unexpected and toxic. The detection and classification of such plumes in an efficient way would reduce the risk of harmful exposures [1,2].

Recent advances in technology have provided hardware capability to easily measure the reflection energy from a large region and to generate hyperspectral images that can be used to spot and classify chemical plumes. A hyperspectral image is, in fact, a picture taken over a large number of frequencies by means of special sensors. The data produced are a hypercube whose layers are two-dimensional images each corresponding to a different frequency. Hence each pixel of the hypercube contains a spectral signature of the corresponding physical region [3]. Given a hyperspectral image of an area where a chemical plume might be present, the plume detection problem can be divided into three subproblems. Namely, observing the presence of a plume, recognizing the chemical or chemicals contained in it and classifying the pixels of the hyperspectral image.

Solving entirely and accurately the whole plume detection problem is not an easy task. For this reason, at the end of 2012, Dimitris Manolaski and his research group at the Massachusetts Institute of Technology Lincoln Laboratory, Cambridge, MA, USA, launched the Chemical-Detection Algorithm Challenge with the idea of spurring on the scientific community to solve a single subproblem. In particular, they were interested in the classification stage. They gave four test datasets, each of them containing a hypercube of an area where some known chemical was released in the atmosphere, the signature of the chemical was released as well as the ground truth classification for each pixel. Plus they provided two blind datasets where the plume position was unknown, but it was known that a specific chemical, with a given signature, was present. The goal of the challenge was to devise a classification technique able to outperform those available in the literature.

Motivated by this challenge, we developed, first of all, a post-processing technique, based on the multidimensional iterative filtering (MIF) method [4,5], which improves the classification accuracy of any given classifier. The MIF algorithm allows a non-stationary signal, associated with a linear or nonlinear system, to be decomposed into simpler components on which a time–frequency analysis can be performed. The development of methods for the decomposition of nonlinear signals is a new and fast growing research area in signal processing. The interest in these kinds of algorithms has been sparked by the publication in 1998 of the breakthrough paper by Huang *et al.* [6], in which the first method of its kind, the so-called empirical mode decomposition (EMD) method, was proposed. The advantage of using methods such as EMD or MIF versus other classical techniques available in the literature for the decomposition and time–frequency analysis of a signal, for example the windowed Fourier transform or the wavelet transform, is that these newly developed methods are designed to handle naturally non-stationary signals originated by nonlinear systems, like most real-life signals. Furthermore, these techniques are completely data driven, so there is no need to make any *a priori* assumptions on the data we want to decompose.

We note that there are many other EMD-like algorithms, such as the synchrosqueezed wavelet transform [7], the sparse time–frequency representation [8,9], the partial differential equation transform [10,11] and the ensemble EMD (EEMD) [12,13]. They are all aimed at handling non-stationary signals associated with nonlinear systems. We chose to use MIF because it is easy to implement, ready to be applied to two-dimensional signals, computationally fast and most importantly it has been proved to be convergent under mild conditions, at least when applied to one-dimensional signals.

The other main result of this paper is the development of a pre-processing algorithm, also based on MIF, which allows us to decorrelate and mean-centre a raw hypercube, enabling us to use the naive cosine similarity (COS) measure, also known as the spectral angle mapper, as a classification technique. In fact, thanks to such a pre-processing technique, the COS measure produces a classification that is always comparable and that sometimes outperforms those produced by other commonly used classifiers. The advantage of coupling the COS classifier with the proposed pre-processing method is that in this way there is no need to estimate the average signature and the covariance matrix of pixels not belonging to the plume, unlike any other commonly used classification method.

The rest of this paper is organized as follows: in §2, we formalize the problem and briefly review some of the most commonly used classification methods. In §3, we introduce the MIF algorithm as a decomposition algorithm for one and higher dimensional signals. In §4, we present the proposed pre- and post-processing methods based on the MIF algorithm. Their performance is demonstrated by means of examples in §5.

## 2. The plume detection problem and classification algorithms

A hyperspectral image is a three-dimensional array, whose entries are real non-negative values, produced using hyperspectral sensors. While a standard camera sensor captures only three frequencies, the so-called RGB channels, a hyperspectral sensor captures several frequency channels at once [14]. The output of such sensors can be represented as , where *d* is the number of frequency channels and *h*×*v* is the number of pixels in the image: each pixel, *p*_{ij}, corresponds to a *d*-dimensional vector representing its spectral signature. Assuming we are given a hyperspectral image of an area where some chemical, potentially toxic, has been released and whose signature is given, the problem under study is the classification of pixels in *F* as belonging or not to the chemical plume. In particular, we want to assign to each pixel a score value in [0,1] where 0 corresponds to a chemical-free pixel and 1 to a pixel definitely containing the chemical we are looking for.

Among the classification algorithms that can be used to tackle this problem the most commonly used are the matched filter (MF), the normalized matched filter (NMF), which is also known as the adaptive cosine or coherence estimator (ACE), and the plain COS measure also known as the spectral angle mapper [15,16].

Assuming we have a single chemical released in the atmosphere whose spectral signature is *s*_{c}, given a pixel *p* with signature *s* its classification score value *y*(*s*) can be computed in one of the following ways:
2.1
2.2
2.3where is the mean signature over all the pixels in *F* which are outside the plume, called, for simplicity, the background pixels, and is the covariance matrix computed by considering each pixel in the background as an observation and each frequency as a variable.

We observe that if we consider the whitening and mean-centring transformation 2.4then the ACE and COS classifiers produce the very same score values 2.5

Our goal in using all these classifiers is to properly distinguish between pixels containing the known chemical and the background pixels.

Given the scores *y*(*s*) of the signatures of the pixels of a hyperspectral image *F*, for each value of a threshold *τ*∈[0,1], we can classify the pixels into four groups. If the pixel does contain the chemical and it receives a score higher than the threshold *τ*, it is counted as a true positive, while if the assigned score is lower than *τ* it is counted as a false negative. The other cases are: if the pixel does not contain the chemical and it receives a score lower than *τ*, it is counted as a true negative; otherwise is a false positive.

To quantify the performance of a classifier, the receiver operating characteristic (ROC) curves are commonly used [17].

A ROC curve is plotted in a Cartesian plane whereby on the vertical axis we have the ratio of the true positive to the total number of pixels that do contain the chemical, called the true positive rate (2.6), and on the horizontal axis the ratio of the false positive to the total number of pixels that do not contain the given chemical, called the false positive rate (2.7). Both the true positive and false positive rates are computed for different values of the threshold *τ*,
2.6and
2.7

The ROC curve is a non-decreasing curve that goes from (0,0) to (1,1) as we vary the threshold *τ*. The classification produced using a random guess has a corresponding ROC curve which is the straight line connecting (0,0) and (1,1). The larger the area under the ROC curve (AUC), the better the performance of the classifier. For more details on ROC curves, we refer the reader to [17].

In the next section we introduce the MIF method which we use to devise the proposed pre- and post-processing techniques.

## 3. Multidimensional iterative filtering

Finding features, structures or quasi-periodicities in a non-stationary signal generated by a non-linear system can be quite challenging. However, many real-life problems require such signals to be handled and analysed. We can think, for instance, of financial or climatic studies where the identification of recurrent patterns in data collected over time would be extremely valuable. For this reason, in the last two decades, many tools have been devised to deal with such signals. The goal is to decompose a signal into a finite number of simpler components called intrinsic mode functions (IMFs) plus a trend. Intuitively speaking, an IMF is a function oscillating around zero. Huang *et al.* [6] defined an IMF as a function such that: in its entire domain, the number of extrema and the number of zero crossings must either be equal or differ at most by 1; at any point, the mean value of the upper envelope defined by connecting its local maxima with splines and the lower envelope obtained by connecting its local minima is zero. From these intuitive definitions, it follows that sinusoidal functions are proper IMFs, but IMFs are not limited to those functions. We can have changes in the amplitude and/or the frequency as well as discontinuities in an IMF or in its derivatives.

There have been many algorithms developed to decompose a signal into IMFs. The commonly used ones work either by iteration or by optimization. The pioneer work, called the EMD method and published by Huang *et al.* in 1998 [6], has an iterative structure, the same as the iterative filtering method developed by Zhou and co-workers [18]. Examples of methods which are based on optimization are the sparse time–frequency representation by Hou *et al.* [8,9] as well as the synchrosqueezed wavelet transforms proposed by Daubechies *et al.* [7]. For an overview of such techniques, we refer the reader to [4].

In this paper, we focus on the MIF algorithm [5,18], which is an extension to higher dimensions of the iterative filtering method that is known to be convergent under mild assumptions, at least for one-dimensional signals, and numerical experiments have confirmed its stability [4]. We point out here that the other known iterative method, the EMD algorithm, suffers from known instabilities and its convergence is still an open problem. While a variant of the EMD method, called the EEMD, allows the instability issues to be solved and has been extended to higher dimensions [13], its convergence remains an open problem since the EEMD is based on the repeated usage of the EMD algorithm.

In the MIF algorithm,^{1} each IMF is produced by convolving iteratively the signal with a low pass filter *w*(**t**), , such as a Fokker–Planck filter [4], which has the nice property of being compactly supported and smooth on its entire domain. In particular, given the signal *s*(**x**), , the MIF algorithm works as follows.

Regarding the selection of the filter support *Ω*, we want to base it adaptively on the signal. Following what is proposed in [18] for the one-dimensional case where the length of the support is computed as 4*N*/*K*, where *N* is the number of sample points in the one-dimensional signal and *K* is the number of its extrema, we compute for each dimension of the signal the average support length. Then we use this information either to find the radius of a spherical support or to identify the radii of an ellipsoidal .

For the stopping criterion, there are several possibilities. One way of doing this, as explained for the one-dimensional case in [4,6], is to consider the relative change 3.1and discontinue the inner loop as soon as this relative change is less than a given threshold. The value of s.d. is usually set around 0.001 in our experiments.

The algorithm stops as soon as the residual *r*(*t*) has a dimension which contains at most one extremum. So, in the end, the given signal is decomposed as , where *m* is the number of IMFs produced by the algorithm.

## 4. Pre- and post-processing algorithms based on multidimensional iterative filtering

In this section, we propose both a post- and a pre-processing technique based on the MIF algorithm.

The post-processing method aims to increase the performance of any given classifier. Assuming we are dealing with a chemical plume which is not punctiform, so that at least a few pixels can be classified as containing a certain chemical, the performance of a classifier is effected by noise or single sensor misfunctioning [16]. The idea, as pointed out by Wager & Walther in their talk on ’Smooth chemical vapor detection’ given during the 2012 DTRA Conference in Boulder, CO, USA, is that pixels classified as containing the chemical we want to detect have, by assumption, to be surrounded by other pixels containing that very same chemical. We can improve the classification through the reinforcement of the spatial correlation. To do so, using the MIF method, we decompose the non-stationary and bidimensional signal, produced by a classifier, into high- and low-frequency components in an adaptive and data-driven fashion. Then, we remove the high-frequency components, which do not contain information related to the plume, preserving instead the low-frequency components. In doing so, we reduce the rate of misclassified pixels, thus improving the performance of any possible classification algorithm, not only those reviewed in this work. The choice of the MIF method is very natural since, unlike any other technique available in the literature, this method is designed to handle naturally non-stationary signals; that is, it does not require any *a priori* assumptions about the signal itself and its convergence has been proved when applied to one-dimensional signals.
*Post-processing method (PostP)*. Given the classification matrix , produced by a given classifier, we decompose *C* using the MIF method into its first IMF *I*_{1} and a remainder/trend *R*. We then remove the first IMF from *C* producing the new classification . We represent this method using the operator , .

We observe that, since the convergence of this method applied to the decomposition of two-dimensional signals is still an open problem, we checked the meaningfulness of the outcome of such a post-processing using the one-dimensional convergent MIF method. We consider, in fact, the classification scores first along each row and then along each column of the classification matrix, and vice versa. The results of these two procedures prove to be close to that obtained by post-processing directly the two-dimensional classification using the two-dimensional MIF method.

The other method we present here is a pre-processing technique. The goal is to devise a completely data-driven decorrelation and mean-centring technique that can be applied to a hyperspectral dataset before classifying it with the plain COS measure. The idea behind this pre-processing method comes from the observations we made previously regarding the ACE and COS classifiers. In particular, we know that ACE requires *a priori* knowledge of the mean signature and covariance matrix of the background, and we observed that if we can somehow whiten and mean-centre the hyperspectral image pixel signatures then the COS and ACE classifiers produce the very same classification scores (2.5). To achive this goal, we propose the following procedure. First of all, we subtract from each pixel signature *s* the average signature *μ* of the entire hyperspectral dataset. This is done in order to remove possible artefacts introduced uniformly in all the pixel signatures by the sensor or the device used to capture the hyperspectral image. Then we decompose each pixel signature, using the MIF method, into IMFs and a trend. Hence we subtract such a trend from the original signature. The result is a hyperspectral dataset which is mean-centred and decorrelated in a local way and it is ready to be classified using the plain COS measure.
*Pre-processing method (PreP)*. Given a hypercube *F* and the signature of the (*i*,*j*) pixel, we first compute , the mean signature of all the pixels in *F*, and we subtract it from each pixel signature to produce . Then, using the MIF method, we decompose into its IMFs {*I*_{n}}_{n=1,…N}, for some , and a trend *r* so that . Finally, we remove from the signature the remainder/trend *r*. So each pixel signature becomes . We can summarize the entire procedure using the operator .

The proposed pre-processing technique can be applied to hyperspectral images. It allows, in particular, the hypercubes to be decorrelated and mean-centred. Since we simply decorrelate and we do not whiten the dataset, we do not expect the COS and ACE classifications to match in general. However, from the experiments we have run, we observe that COS equipped with such a pre-processing method always performs similarly to and sometimes even better than the ACE classifier. Therefore, from a COS classification point of view, removing the global and local trends, as is done in the proposed pre-processing method, proves to be an equivalent procedure to the mean-centring and whitening of the data, as is shown in the next section.

## 5. Examples

In this section, we show the performance of the proposed pre- and post-processing methods when applied to the plume detection problem. We consider the datasets provided for the 2012 DTRA Chemical Detection Challenge.^{2} They contain both real-world and synthetic hyperspectral images, and the signatures of the chemical contained in the plume are always included. In this work, we focus on the real-world datasets, since the analysis of the synthetic ones appear to be trivial in general. We test our methods first on a dataset where the ground truth is known, since the actual position of the chemical plume is given, and then on a blind one that has been provided for the contest. With the former we can evaluate the performance of each classifier using ROC curves.

We point out here that the ground truths provided in the datasets for the 2012 DTRA Chemical Detection Challenge assume a classification of the hyperspectral image pixels into three groups: ‘inside the plume’, ‘outside the plume’ and ‘close to the boundaries of the plume’, as shown, for instance, in figure 1*b*. As explained in §2, in this field of research, the goal is to devise a method that can be used to classify pixels using just two classes: pixels containing the known chemical and background pixels. The reason why the ground truth includes the additional class of pixels close to the boundaries of the plume is because classifying such pixels as either inside or outside the plume is a difficult problem. Therefore, as suggested by Manolakis and his research group for the 2012 DTRA Chemical Detection Challenge, we use the ground truth to check the performance of a classifier only on pixels classified as either ‘inside the plume’ or ‘outside the plume’. This means that the ROC curves presented in this section are built solely using the pixels that are not contained in the set ‘close to the boundaries of the plume’. It is clear that, if we use only ROC curves built in the aforementioned way, we can compare the performance of different classifiers on pixels outside and inside the plume, but we are unable to compare them on pixels close to the boundaries of the plume. For this reason, we show both ROC curves and the pixel classification values produced by each classifier in order to give readers a more comprehensive picture of the performance of each classifier.

In the blind case, instead, it is impossible to plot ROC curves since the ground truth is unknown. In this second case, we show the performance of a classifier by simply plotting the pixel classification values, also known as detection maps.

We plot the pixel classification values using the ‘flipud’ colourmap, option ‘hot’, in Matlab. In all the examples presented in the following, the colourmap is calibrated identically.

We would like to note that we design MIF and pre- and post-processing techniques for hyperspectral data classifications not to compete with the existing classification methods, but to use them as complementary strategies to further improve their performance. In this paper, we present the performances of the proposed pre- and post-processing techniques when applied to standard classifiers such as ACE, MF and COS. We point out that they can be easily adapted for other classifiers as well.

### (a) Case of known plume position: hypercube ‘Location 1 released r134a’

The first dataset we tested contains the hypercube plotted in figure 1*a* taken in an area where chemical r134 has been released. The signature of this chemical is provided in the dataset. The given ground truth is shown in figure 1*b*, where pixels inside the plume are coloured black, while those close to the boundaries of the plume are coloured orange. We observe from this figure that the boundary region, which we are excluding in the evaluation of the ROC curves, as explained previously, is rather wide. It became evident that using only ROC curves and AUC values provides only partial information on the true performance of each classifier.

We started by applying the ACE classifier to this hypercube. The outcomes are plotted in figure 2*a*, where we clearly see the contribution of the noise. The corresponding ROC curve is shown in figure 3*a*. Its AUC value is approximately 0.98919. If we apply the post-processing method PostP described in the previous section, we can clean the classification, preserving the shape of the plume as shown both in figure 2*b* and by the ROC curve plotted in figure 3*a*. Its AUC value is approximately 0.99508. We compared this result with the post-processing methods based on wavelet transform and on the multidimensional empirical EMD (MEEMD) method, which is a multidimensional version of the EEMD proposed by Huang and co-workers [13]. For the first one, we used the Matlab noise removal tool Stationary Wavelet Transform Denoising 2-D, where we chose the Daubechies db2 wavelet to get the cleaned classification shown in figure 4*a*, whose ROC curve is plotted in figure 3*a*. The AUC value in this case is approximately 0.99507. For the MEEMD method, we used the code provided in [13] and set the number of perturbations to 10. The post-processed classification is obtained by subtracting the first two IMFs from the original classification. The corresponding cleaned classification is depicted in figure 4*b*, its ROC curve is shown in figure 3*a* and the AUC value is exactly equal, up to machine precision, to that obtained using the PostP approach.

We point out that, since in all the examples under study the ROC curves tend to be close to each other and the differences are concentrated in the top left corner of the ROC curve plots, from now on we opt to plot ROC curves using a log scale along the horizontal axis (see figure 3*b*).

Another observation regards the shape of the boundaries produced using these post-processing techniques. As we have already mentioned, the pixels close to the boundaries of the plume, shown in orange in figure 1*b*, are completely excluded in the computation of the ROC curves. This is the reason why all the classifications end up having almost identical ROC curves and AUC values, even when they have quite different pixel classification values, as in the case of the wavelet transform post-processing versus the one obtained with the PostP and the MEEMD techniques.

To have a better understanding of the true performance of these different post-processing methods, we can compare directly the classifications depicted in figures 2*b* and 4*a*,*b* with the ground truth shown in figure 1*b*. From this comparison, it is evident that when using the wavelet transform, while the noise is removed in a reasonable way, the shape of the plume is partially lost due to excessive blunting. Since in this context it is very important to identify the boundaries of the plume with high accuracy, the wavelet transform proves not to be the best option as a post-processing technique. More in general, in all the examples we tested, standard techniques for noise removal, such as Fourier or wavelets transform, tend to have problems producing accurate boundaries. This is due to two reasons: these kinds of signals are in general non-stationary, and the standard transformations often use bases that are determined *a priori*. Both limitations are overcome by the use of the MIF method as a post-processing technique. In fact, the MIF technique can handle a non-stationary signal and does not require any *a priori* assumption or knowledge of it, as confirmed in this example by the shape of the plume and its boundary after MIF post-processing (figure 2*b*).

Regarding the results obtained via MEEMD we point out that, even if the results are comparable with that produced with the proposed method, this approach has disadvantages. First of all it is still an open problem to prove the convergence of this technique even in one dimension. In fact, in this case, the method reduces to the so-called EEMD technique, whose convergence is not known. Whereas we recall that, even if the convergence of the MIF method is still an open problem as well, its one-dimensional version, the iterative filtering technique, has been proved to be convergent under mild conditions [4]. Furthermore, the MEEMD has been found to be much slower than the proposed method. On an average personal computer, it takes around 4 s to produce the cleaned classification plotted in figure 2*b*, while the MEEMD runs for more than 190 s on the same computer to produce the result shown in figure 4*b*.

We observe that, with this dataset, the MF classifier gives results similar to ACE, so we avoid presenting them and move directly to the COS classifier.

The pixel classification produced by COS applied to the raw hypercube is shown in figure 5*a*, where we reversed the pixel classification values. We needed to do so in order to have meaningful results. In fact the corresponding ROC curve, plotted in dashed black in figure 6*a*, would be below the random guess curve without the reversion. If we apply the PostP method to the reversed COS classification, we can improve the classification performance. This is verified by the increase in the area under the ROC curve (plotted in solid black in figure 6*a*). However, even with post-processing, the COS classification of the raw hyperspectral image still needs to be reversed. To solve this issue, we devised and applied the pre-processing method PreP based on the MIF algorithm and described in the previous section. The COS classification of the pre-processed hypercube is shown in figure 5*b*. Its ROC curve, plotted in dashed blue in figure 6*a*, is now directly above the random guess curve without any need to reverse the classification. Nevertheless, its performances are worse than those produced by the reversed COS classification of the raw dataset—the dashed black curve is clearly above the solid blue one. However, if we apply the PostP method to the COS classification of the pre-processed hypercube, we get a better performance than any other COS classification with or without post-processing. In fact, the solid blue ROC curve is above all the others in figure 6*a*. We can also try to apply the wavelet transform as a denoising post-processing method. If we plot the corresponding ROC curve, its performances are slightly worse than those obtained using the proposed method—the solid magenta curve is slightly lower than the solid blue one in figure 6*a*. Furthermore, the boundaries end up being excessively blunted (figure 7*b*) for the reasons explained above, while the proposed post-processing technique allows us to produce boundaries that look more natural (figure 7*a*).

So, overall, the best performances are obtained from the combination of the pre- and post-processing techniques, both based on MIF, applied to the COS classifier. We point out that the range of values produced by the COS classifier applied to the raw dataset is in the narrow interval [0.8,0.84] after the reversion, while the one corresponding to the COS classification of the pre-processed dataset is in the wider interval [0,0.5] without any need for reversal. The performance of the COS classifier equipped with the pre-processing method becomes comparable to that of the ACE and MF algorithms.

A question that naturally arises at this point is why the pre-processing method works so well in enhancing the COS classification. The reason is evident from figure 8*a*,*b*. Removing the global and local trend from the signature of each pixel allows us to properly measure the angle between the pixels and the chemical signatures. Furthermore, we observe that all the datasets provided for the aforementioned challenge always have the property that the signature of the chemical seems to be subtracted from the signature of the pixels contained in the plume. It remains an open problem to understand the reason for such a behaviour of the signatures. From this last observation, it is evident that only after removing the trend of the pixel signatures does the direct comparison of the signatures made by the COS classifier produce a meaningful classification. We observe that the combination of pre- and post-processing of the COS classification works well in increasing the performance of this classifier for all the datasets provided for the 2012 DTRA Chemical Detection Challenge.

### (b) Blind case: hypercube ‘Location 2 released sf6 blind’

As another example we consider one of the blind datasets provided for the challenge. The hypercube image is shown in figure 9*a*. It is known that this image contains a plume of the sf6 chemical, whose signature is given.

If we apply the ACE classifier to the hypercube as it is, we obtain the classification depicted in figure 9*b*. We can apply the PostP post-processing method to reinforce the spatial correlations (figure 10*a*). The results produced by the Matlab noise removal tool Stationary Wavelet Transform Denoising 2-D, with the choice of the Daubechies db2 wavelet, are depicted in figure 10*b*. We obtained similar results from the post-processing of the MF classification (figure 11*a*). Using the proposed PostP technique (figure 11*b*) we were able to clean the classification in a more effective way than with a standard technique (figure 12*a*).

With the plain COS classifier, as for the previous dataset, we needed to reverse the classification in order to produce meaningful results (figure 12*b*). If we apply the proposed PreP pre-processing method to the hypercube, we can remarkably increase the performance of the classifier, as shown in figure 13*a*, so that there is no need to reverse the classification and its performance is better than that of the other classifier tested. In fact, the COS method equipped with the proposed pre-processing method is producing a more natural plume shape than the one produced by any other classification technique tested. So the pre-processing technique allows us to transform a known failing classification method into what appears to be an extremely successful classification algorithm. We observe that the numerical range of the COS classification of the raw hypercube spans approximately the interval 0.92 and 0.94, while that of the pre-processed hypercube is between 0 and 0.5.

Also in the case of the COS classifier, the application of the PostP post-processing technique helps in cleaning and improving the classification (figure 13*b*).

## 6. Conclusion

Inspired by the DTRA 2012 Chemical Detection Algorithm Challenge posted by Dimitris Manolakis and his research group at MIT Lincoln Laboratory, in this paper we tackle the problem of the accurate detection of the boundaries of a chemical plume hidden in a hyperspectral image by means of classification algorithms. In particular, we devise post- and pre-processing algorithms aimed at improving the performance of known classifiers.

The post-processing technique we present in this paper is based on the MIF method [5], which is an algorithm for the decomposition of non-stationary signals. This post-processing technique is a uniform procedure which allows the performance of any classifier used in the chemical plume detection problem to be improved by means of an adaptive and data-driven cleaning and smoothing of its classification. The choice of the MIF method appears to be ideal, since it is an algorithm developed to handle naturally non-stationary signals, which is the case for almost any real-life signal. Furthermore, the MIF algorithm does not require any *a priori* assumptions to be made on the kind of data we are analysing, which is in contrast to other techniques such as the wavelet transform and similar methods. We tested this approach on the datasets provided for the aforementioned challenge, and we show some of these results in the Examples section.

Secondly, we propose a pre-processing technique which, independently from the dataset under study, allows us to decorrelate and mean-centre the data. The goal of this procedure is to make the score values produced by the COS classifier equal to those produced by the ACE classifier, as shown in (2.5). This is made possible by using the MIF method, which allows a signal, in particular a non-stationary one, to be decomposed in an adaptive and data-driven way. From the tests run, and partially shown in the Examples section, we note that the chemical plume boundaries detected using the COS classifier equipped with the PreP pre-processing method appear to be meaningful. These results are confirmed by the ROC curves and the shape of the detected plumes. In some cases, the detected boundaries prove to be more meaningful and reasonable than those produced using other standard classification methods. In conclusion, given the COS measure, which is known to be a failing classification technique, the proposed pre-processing technique allows it to be transformed into a successful classifier which can even outperform in some cases other commonly used classifiers.

A possible direction for future research is in the analysis of plumes containing mixtures of two or more known chemicals. Traditionally such mixtures have been assumed to be linear; however, it is known that such an assumption is far from reality. New ways for tackling this problem have to be devised. We point out also that, from an algorithmic point of view, the convergence of the MIF method has been proved only for the one-dimensional case [4]. For higher dimensions, its convergence has yet to be studied.

## Data accessibility

The algorithms proposed in this paper can be found at: www.cicone.com.

## Authors' contributions

All three authors contributed to the development of the algorithms as well as to the analysis and interpretation of the data. A.C. and H.Z. worked on drafting the article and revising it critically. All three authors gave final approval for publication.

## Competing interests

The authors declare that they have no competing interests.

## Funding

This work was supported by NSF Faculty Early Career Development (CAREER) Award DMS–0645266, DMS–1042998, DMS–1419027, and ONR Award no. N000141310408. A.C. acknowledges support from the National Group for Scientific Computation (GNCS—INdAM) ‘Progetto giovani ricercatori 2014’, and from the Istituto Nazionale di Alta Matematica (INdAM) ‘INdAM Fellowships in Mathematics and/or Applications co-funded by Marie Curie Actions’.

## Acknowledgements

The authors thank the anonymous referees for all the constructive and valuable comments that have contributed to the improvement of this paper.

## Footnotes

One contribution of 13 to a theme issue ‘Adaptive data analysis: theory and applications’.

↵1 The MIF code can be found at www.cicone.com

↵2 The datasets can be obtained from the National Science Foundation (NSF). However, they are provided only to researchers and teams supported by the NSF itself.

- Accepted November 26, 2015.

- © 2016 The Author(s)

Published by the Royal Society. All rights reserved.