## Abstract

Spark-ignited internal combustion engines have evolved considerably in recent years in response to increasingly stringent regulations for emissions and fuel economy. One new advanced engine strategy utilizes high levels of exhaust gas recirculation (EGR) to reduce combustion temperatures, thereby increasing thermodynamic efficiency and reducing nitrogen oxide emissions. While this strategy can be highly effective, it also poses major control and design challenges due to the large combustion oscillations that develop at sufficiently high EGR levels. Previous research has documented that combustion instabilities can propagate between successive engine cycles in individual cylinders via self-generated feedback of reactive species and thermal energy in the retained residual exhaust gases. In this work, we use symbolic analysis to characterize multi-cylinder combustion oscillations in an experimental engine operating with external EGR. At low levels of EGR, intra-cylinder oscillations are clearly visible and appear to be associated with brief, intermittent coupling among cylinders. As EGR is increased further, a point is reached where all four cylinders lock almost completely in phase and alternate simultaneously between two distinct bi-stable combustion states. From a practical perspective, it is important to understand the causes of this phenomenon and develop diagnostics that might be applied to ameliorate its effects. We demonstrate here that two approaches for symbolizing the engine combustion measurements can provide useful probes for characterizing these instabilities.

## 1. Introduction and background

It is now well understood that dilution of the in-cylinder fuel and air charge with non-reactive gases in spark-ignited (SI) internal combustion engines can increase thermodynamic efficiency and reduce nitrogen oxide emissions. However, as engine designers strive to take advantage of this effect through increasingly higher levels of charge dilution, the onset of dynamical combustion instabilities can offset potential gains. One leading method of charge dilution is exhaust gas recirculation (EGR). The upper dilution limit for EGR is the point at which combustion instabilities develop due to erratic flame propagation associated with decreased reactant concentrations and lower temperature. As the high EGR limit is approached, the combustion process becomes erratic and cycle-to-cycle combustion variations (that is, variations in combustion rate and intensity between successive engine cycles) become large. These variations can cause rough engine operation and higher emissions, so engines are routinely operated at dilution levels safely away from the high EGR limit. As a consequence, the full potential benefits of high EGR are not realized.

Cyclic combustion variability associated with dilution with excess air (also known as lean fuelling) has been studied for many decades [1–4]. The most commonly observed instabilities with lean fuelling arise as the result of internal feedback processes within each cylinder. That is, some of the exhaust gases left over from each combustion event are not expelled from the cylinder where they are generated due to inefficiencies in the exhaust process. These residual gases mix with incoming fuel and air at the beginning of the next cycle, causing a slight change in the initial conditions for that cycle. As the level of excess air approaches the extinction limit for combustion, these small changes in the initial charge condition become amplified into large swings in combustion intensity due to the inherent sensitivity to initial conditions. Emergent low-dimensional deterministic features are stimulated by this feedback and are readily visible in the cycle-to-cycle statistics. These features are visible even though there are many other high-dimensional (i.e. stochastic) processes associated with the in-cylinder physics, such as fluid mixing, valve flutter and spark dynamics also perturbing the dynamics. Symbolization of the cycle-to-cycle combustion intensity provides a useful tool for observing these emergent features amid the stochastic background [3,5]. The presence of deterministic features is important, because it can provide some degree of predictability and, thus, the possibility of dynamic cycle-by-cycle controls capable of suppressing combustion oscillations and extension of the practical lean operating limit. This type of control has been demonstrated experimentally on a vehicle [6].

For the high-EGR engines of interest here, exhaust gases are intentionally recirculated through an external duct back to the intake manifold after being expelled from the cylinders. Thus, instead of affecting only the cylinder in which they were generated, the recirculated gases are mixed and distributed via the intake manifold to all the cylinders. As in lean fuelling, a critical limit (tipping point) is reached when the non-reactive exhaust gases dilute the air–fuel mixture to a sufficiently high level to extinguish the combustion reactions. However, as external EGR is accomplished via a recirculation duct, there is an additional time delay between the cycle producing the exhaust and the cycles receiving the feedback. Thus, external EGR involves both additional time lags and cross-cylinder communication that can considerably expand the complexity of the global dynamic behaviour.

In this work, we consider the behaviour of a multi-cylinder experimental engine with external EGR in which cylinder-to-cylinder interactions combine with intra-cylinder instabilities to produce a sequence of complex combustion oscillations depending on the amount of exhaust recirculated. In the high EGR limit, the combustion oscillations in all the cylinders become almost completely phase-locked, producing large simultaneous swings between drastically different combustion states that last for tens of engine cycles each. Our main goal is to illustrate how some of the major features of these complex dynamics can be characterized and understood using symbolic analysis. As explained below, we have found two types of symbolic analysis that are particularly useful for probing these dynamics.

## 2. Experimental set-up

The experimental engine used to generate the data analysed here is part of a suite of advanced combustion research engines in the Fuels, Engines, and Emissions Research Center at Oak Ridge National Laboratory in the USA. This specific engine is a 2.0 l, 4-cylinder SI, gasoline-direct-injection engine manufactured by General Motors. It has been modified by Bosch with a higher compression ratio for flex-fuel optimization as well as an external, cooled, low-pressure EGR loop. The EGR level is controlled with a valve in the EGR loop between the exhaust and intake manifolds. Key engine specifications are summarized in table 1, and a schematic of the engine and EGR loop is shown in figure 1. Additional details of the engine set-up can be found in [7].

Three different EGR loop lengths have been studied on this engine to date: 0.91 m, 1.28 m, and 1.83 m. For the present discussion, we focus only on combustion measurements for the shortest loop length (0.91 m). For each experimental condition, spark timing and fuelling rate were manually set with a specialized external control system. To ensure that the observed dynamics were due to the engine itself rather than the controller, all other parameters (cam phasing, injection timing, etc.) were maintained constant at the default values for 0% EGR. Engine speed was held constant at 2000 r.p.m. using a motoring dynamometer. Stoichiometric air-to-fuel ratio was maintained for all EGR conditions.

For each operating condition, in-cylinder pressure data were collected from all four cylinders at 0.2° CA (crank-angle degrees) intervals of crankshaft position for 10 000 consecutive cycles during steady-state operation. In addition to recording the detailed pressure profiles, we calculated several cycle-integrated measures of combustion intensity in each cylinder for each cycle. These included apparent heat release (HR) and indicated mean effective pressure (IMEP). Both of these quantities are standard measures of engine combustion that are extracted from in-cylinder pressure measurements using widely accepted algorithms [8]. HR and IMEP are both cycle-resolved indicators of combustion quality, which can be treated as discrete time series to observe variations in the combustion dynamics. Physically, HR is an estimate of the net amount of fuel energy released by combustion, while IMEP is a measure of the motive energy supplied to the piston. We focused on the latter quantity for this discussion because of its wider use and potential relevance to on-board diagnostics.

## 3. General experimental trends

Figure 2 illustrates the overall trends observed in the IMEP time series for cylinder 1 as EGR was increased. Similar overall behaviour was exhibited by the other cylinders as well. At the lowest EGR levels, combustion variations from cycle to cycle were small and did not exhibit any obvious repeating pattern. As EGR was increased, more significant variations became apparent. At sufficiently high EGR levels, large combustion oscillations developed. The strongest of these oscillations involved swings between multiple cycles of intense combustion followed by multiple cycles of misfire.

Figure 3 illustrates the above trend with EGR for all four cylinders in terms of the coefficient of variation (COV) in IMEP, which is a widely used metric for combustion stability and is defined as the standard deviation divided by the mean value. As can be seen here, the level of instability grows very rapidly once EGR levels are increased above 21%.
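As a minimal sketch of the COV in IMEP defined above, the following Python function computes the standard deviation divided by the mean for one cylinder's cycle-resolved series. Expressing the result in percent and using the sample standard deviation (`ddof=1`) are assumptions made for illustration, and the IMEP values are hypothetical rather than measurements from this engine.

```python
import numpy as np

def cov_percent(imep):
    """Coefficient of variation of a cycle-resolved IMEP series, in percent.

    Assumption: sample standard deviation (ddof=1) and a percent scale.
    """
    imep = np.asarray(imep, dtype=float)
    return 100.0 * imep.std(ddof=1) / imep.mean()

# Hypothetical IMEP values (bar), not data from this study.
stable = np.array([3.00, 3.10, 2.90, 3.00, 3.05, 2.95])
unstable = np.array([3.00, 0.20, 3.40, 0.10, 3.20, 0.30])

print(f"stable COV:   {cov_percent(stable):.1f}%")
print(f"unstable COV: {cov_percent(unstable):.1f}%")
```

A series that alternates between strong burns and near-misfires gives a far larger COV than one with small cycle-to-cycle scatter, which is the behaviour the metric is intended to flag.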

Another important aspect of the observed combustion oscillations is the onset of synergistic effects among the individual cylinders. Ultimately, these synergistic effects result in an almost completely phase-locked synchronization among all four cylinders, so that their combustion oscillations track almost one-to-one. The evolution of these synergistic effects is depicted for cylinders 1 and 2 in figure 4. Here, it can be seen that at very low EGR levels, the cylinders do not appear to track very closely in their combustion fluctuations. At intermediate EGR levels, though, they occasionally track together briefly and then diverge. At the highest EGR level, these two cylinders closely match each other, only occasionally diverging (typically at major transition points), and then re-synchronizing. Similar phase locking occurred among all four cylinders at the highest EGR levels, as depicted in figure 5.

High-speed measurements of fuel concentration in the EGR loop reveal that the bi-modal combustion described above is associated with unburned fuel cycles in the EGR loop [9]. Based on this, we hypothesize that the large oscillations are promoted by the presence or absence of unburned fuel in the EGR loop, which is transported back to the intake and then either stimulates or suppresses combustion in each cylinder depending on how close the charge mixture dilution is to the extinction limit. One important remaining question is how this process is linked to cylinder synchronization. That is, we do not yet understand whether synchronization is driven primarily by instabilities that originate in individual cylinders and then stimulate larger scale events, or whether it occurs primarily because the EGR loop dynamics become directed downward into each cylinder. We hope that symbolic analysis can help resolve this question.

Symbolic analysis has previously been used to characterize discrete time series obtained from internal combustion engine combustion processes [3,5,10]. Symbolic analysis was chosen because of its ability to select specific dynamical patterns in time series with significant levels of noise or high-dimensional components [11]. While other techniques such as conditional entropy [12] or predictability [13] might be suited for characterizing the progression of complexity in the engine dynamics, we employ symbolic analysis partly for targeting specific patterns and partly for its amenability to low-overhead real-time on-board diagnostics and control systems.

## 4. Symbolic analysis methods

We converted the discrete experimental IMEP time series described above to symbolic series (SS) as illustrated schematically in figure 6, using two different approaches for establishing the partitions defining the symbolic values. In the first approach, we partitioned each time series into a specified number of equi-probable bins, determined from the distribution of observed IMEP values at each condition. That is, we divided the observed IMEP values for each cylinder at each condition so that a random selection of any value from that time series would be equally likely to fall within any of the bins. This approach was most useful for detecting non-random (i.e. temporally related) variations in individual cylinders [5]. The second partitioning approach we utilized was based on dividing the range of IMEP values in each time series into a fixed number of equal size bins (equi-spaced bins). That is, the bin sizes were based on equal size increments in IMEP rather than equal probability. We found this method for partitioning to be useful for observing synchronization among the cylinders.
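The two partitioning rules described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the processing code used for this study: the function names are hypothetical, quantile edges stand in for the equi-probable partition, and the synthetic series (mostly strong burns with a small fraction of near-misfires) is invented for the example.

```python
import numpy as np

def symbolize_equiprobable(x, p):
    """Assign symbols 0..p-1 using bin edges at the quantiles of x,
    so that each bin holds (nearly) the same number of cycles."""
    x = np.asarray(x, dtype=float)
    edges = np.quantile(x, np.linspace(0.0, 1.0, p + 1)[1:-1])  # p-1 interior edges
    return np.searchsorted(edges, x, side="right")

def symbolize_equispaced(x, p):
    """Assign symbols 0..p-1 using bins of equal width spanning
    the observed range of x."""
    x = np.asarray(x, dtype=float)
    edges = np.linspace(x.min(), x.max(), p + 1)[1:-1]  # p-1 interior edges
    return np.searchsorted(edges, x, side="right")

# Synthetic IMEP-like series (hypothetical values, bar).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(3.0, 0.05, 950), rng.normal(0.2, 0.05, 50)])
rng.shuffle(x)

print("equi-probable counts:", np.bincount(symbolize_equiprobable(x, 4), minlength=4))
print("equi-spaced counts:  ", np.bincount(symbolize_equispaced(x, 4), minlength=4))
```

With the equi-probable rule every symbol occurs about equally often by construction, so any structure must show up in the ordering of symbols. With the equi-spaced rule the rare low-IMEP cycles occupy their own bin (symbol 0), which is why a few unstable cycles shift the partition levels so strongly.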

With both approaches for symbolizing the IMEP time series, our goal was to identify recurrent patterns in the combustion oscillations by determining whether the observed frequencies of certain sequences of symbols were significantly different from others. For understanding cycle-to-cycle variations in individual cylinders, we specified a fixed symbol sequence length (number of engine cycles) to track and constructed time-lagged multivariate representations of the symbol series:
$$\mathrm{SS}_{i} = \{S_{i},\, S_{i+1},\, \ldots,\, S_{i+L-1}\}, \tag{4.1}$$
where SS_{i} is the *i*th symbol sequence (‘symbolic word’) occurring in the symbolized time series, S_{i} are the sequential symbols, and *L* is the specified sequence length (i.e. ‘word’ length). Here, we included symbols from immediately adjacent cycles in each sequence, although in general it is also possible to specify longer intervals (i.e. lags) between successive symbols included in each sequence.

For convenience, we tracked the appearance of each symbol sequence by assigning it a unique decimal value determined by
$$\mathrm{SSV}_{i} = \sum_{j=1}^{L} p^{\,L-j}\, S_{i+j-1}, \tag{4.2}$$
where SSV_{i} is the decimal value assigned to the *i*th symbolic sequence, *p* is the number of partitions (bins) used in the original symbolization, and *L* is defined as above. We note that this method for categorizing symbol sequences constrains the number of possible sequences that can be tracked to
$$n_{\mathrm{SS}} = p^{L}. \tag{4.3}$$
In general, a rigorous approach for selecting values for *p* and *L* for evaluating symbolized time series is still an active area of research [7]. For our purposes here, we determined that values of *p*=4 and *L*=4 provided useful levels of resolution for our IMEP measurements as explained below.
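The encoding of symbol sequences as decimal values can be illustrated with a short Python sketch. It assumes that symbols take the values 0 to *p*−1 and that each word is read as a base-*p* number with the first symbol as the most significant digit; that digit ordering and the function names are assumptions made for this example.

```python
import numpy as np

def sequence_values(symbols, p, L):
    """Decimal value of every length-L word of adjacent symbols.

    Assumption: base-p positional code, first symbol most significant.
    """
    s = np.asarray(symbols, dtype=np.int64)
    n_words = len(s) - L + 1
    values = np.zeros(n_words, dtype=np.int64)
    for j in range(L):
        values += s[j:j + n_words] * p ** (L - 1 - j)
    return values

def sequence_histogram(symbols, p, L):
    """Relative frequency of each of the p**L possible words."""
    values = sequence_values(symbols, p, L)
    counts = np.bincount(values, minlength=p ** L)
    return counts / counts.sum()

# A short alternating series of lowest (0) and highest (3) symbols.
series = [0, 3, 0, 3, 0, 3]
print(sequence_values(series, p=4, L=4))           # words [0,3,0,3] and [3,0,3,0]
print(sequence_histogram(series, p=4, L=4).shape)  # one bin per possible word
```

With *p*=4 and *L*=4 there are 4^4 = 256 possible words, so the histogram has 256 bins, and a word such as [3,3,3,3] maps to the largest value, 255.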

We note that the use of equi-probable bins for tracking the behaviour of individual cylinders makes it possible to determine the statistical significance of sequence frequencies. For extremely long, fully random time series, the frequencies of each possible sequence as defined above should be the same (i.e. 1/*n*_{SS}). Thus, when any sequences occur with frequencies significantly higher than this, it is an indication of non-random behaviour. As our experimental engine measurement series have finite length, we estimated confidence limits for determining significant SS peaks by generating many repeated randomized surrogates of the original symbol series. We then identified those SS peaks which occur with greater frequency than the largest peaks observed for 95% of the surrogates.
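The surrogate procedure described above might be sketched as follows. The sketch assumes that each surrogate is a random shuffle of the original symbol series (which preserves the symbol counts but destroys temporal order) and that the confidence limit is the 95th percentile of the largest histogram peak found across the surrogates. The number of surrogates, the function names and the synthetic periodic test series are all assumptions for illustration.

```python
import numpy as np

def word_frequencies(symbols, p, L):
    """Relative frequency of each length-L word, coded as a base-p number
    with the first symbol most significant (an assumed convention)."""
    s = np.asarray(symbols, dtype=np.int64)
    n_words = len(s) - L + 1
    values = np.zeros(n_words, dtype=np.int64)
    for j in range(L):
        values += s[j:j + n_words] * p ** (L - 1 - j)
    return np.bincount(values, minlength=p ** L) / n_words

def surrogate_limit(symbols, p, L, n_surrogates=200, level=0.95, seed=0):
    """Peak height exceeded by the largest peak of only (1 - level)
    of the shuffled surrogates."""
    rng = np.random.default_rng(seed)
    maxima = [word_frequencies(rng.permutation(symbols), p, L).max()
              for _ in range(n_surrogates)]
    return np.quantile(maxima, level)

# Synthetic, strongly deterministic series: the pattern 0,3,1,2 repeated.
series = np.tile([0, 3, 1, 2], 500)
freqs = word_frequencies(series, p=4, L=4)
limit = surrogate_limit(series, p=4, L=4)
significant = np.flatnonzero(freqs > limit)
print("confidence limit:", limit)
print("significant word values:", significant)
```

For this artificial periodic series only the four words that make up the repeating pattern exceed the limit. For measured IMEP data the same comparison would pick out the peaks that rise above the dashed confidence line in the histograms.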

For understanding synergistic combustion oscillations and synchronization among all four engine cylinders, we specified sequences composed of the symbolic values from each cylinder in a given cycle. These were also treated as in equations (4.1) and (4.2) above, except that *i* now signified the cylinder firing order and *p*=4 and *L*=4. For this engine, the cylinder firing order is 1–3–4–2, so that the first symbol in each sequence (symbolic word) is from cylinder 1, the second value is from cylinder 3, etc. Because the symbolization partitions were not based on an equi-probable division of the original IMEP measurements, it is not trivial to specify explicit confidence limits for the resulting SS peaks. Nevertheless, as discussed below, it was possible to use this method of partitioning to observe important relative shifts in the cylinder-to-cylinder variations as EGR was changed.
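A minimal sketch of the multi-cylinder word construction is given below. It assumes that each cylinder's IMEP series has already been symbolized into values 0 to 3 and that the word for a given cycle is read as a base-4 number with the first-firing cylinder as the most significant digit. The dictionary layout, the function name and the three-cycle example are hypothetical.

```python
import numpy as np

FIRING_ORDER = (1, 3, 4, 2)  # cylinder firing order given in the text

def multicylinder_values(symbols_by_cylinder, p=4, firing_order=FIRING_ORDER):
    """One decimal value per engine cycle, built from the symbols of all
    cylinders taken in firing order.

    symbols_by_cylinder: dict mapping cylinder number -> per-cycle symbols.
    Assumption: base-p code with the first-firing cylinder most significant.
    """
    L = len(firing_order)
    n_cycles = len(symbols_by_cylinder[firing_order[0]])
    values = np.zeros(n_cycles, dtype=np.int64)
    for j, cyl in enumerate(firing_order):
        s = np.asarray(symbols_by_cylinder[cyl], dtype=np.int64)
        values += s * p ** (L - 1 - j)
    return values

# Hypothetical symbols for three consecutive cycles.
symbols = {
    1: [3, 0, 3],
    2: [3, 0, 2],
    3: [3, 0, 3],
    4: [3, 0, 1],
}
print(multicylinder_values(symbols))
```

In this example the first cycle is the word [3,3,3,3] (all cylinders in the highest bin), the second is [0,0,0,0] (all in the lowest bin), and the third is the mixed state [3,3,1,2]. Plotting these values against cycle number gives the kind of cycle-by-cycle view of multi-cylinder states discussed in §5.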

## 5. Discussion of symbolic analysis results

Figures 7 and 8 illustrate example histograms for the temporal symbol sequence patterns occurring in each individual cylinder that were generated by the methods described in §4. The results shown are for cylinders 1 and 3, but are similar to those for the other cylinders. The horizontal axis of each plot represents the decimal value of each possible sequence, and the vertical axis indicates its relative frequency. Some of the highest peaks are labelled with their corresponding symbol sequences to illustrate their relation to IMEP level (0 represents the lowest of the 4 discrete IMEP levels, 3 being the highest). The red dashed horizontal line in the upper part of each frame is the 95% confidence limit for random surrogates. Thus, peaks in the histogram that extend above this line indicate sequences that are likely to be the result of deterministic processes. Even at low EGR levels, one of the strongest oscillation patterns corresponds to alternating higher- and lower-intensity burns.

We observe from the example histograms in figure 7 that even with no imposed external EGR, some of the cylinders exhibited non-random cyclic combustion variations. While the causes of these variations are not yet clear, we conjecture that they may be the result of internal residual gas effects within each cylinder (i.e. internal EGR) or possibly acoustic oscillations in the intake or exhaust manifolds. Whatever their cause, these intra-cylinder non-random variations differ from cylinder to cylinder, suggesting that the processes involved are spatially localized. As EGR levels were increased, the non-random features revealed by the histograms in figure 8 also increased and appeared to become more consistent among the cylinders, revealing their increasing tendency to synchronize. Repeated high-intensity and low-intensity burn sequences also became increasingly common with increasing EGR.

Figure 9 illustrates example symbol sequence histograms generated by combining the IMEP values from all four cylinders simultaneously as described above. In this case, we expanded the vertical axis of the histograms into a log scale in order to be able to see the occurrence of rare multi-cylinder events. Initially with no EGR, it appears that all of the cylinders behaved effectively the same, with high energy outputs falling in the top quartile of their range (i.e. the sequence [3,3,3,3]).

As EGR was systematically increased, however, additional peaks emerged, reflecting combustion sequences in which the cylinders began to collectively form recurring patterns. These peaks continued to grow and other peaks emerged as EGR was increased further, ultimately reaching a point where there were a large number of non-trivial repetitions of different global combustion states occurring in all four cylinders. One important consequence of this observation is that the early stages of synchronization involved complex cylinder-to-cylinder interactions that were not simply periodic. It also suggests that the largest oscillations at high EGR were more complex and subtle than simple periodic oscillations. More specifically, it appears that many of these emerging histogram peaks reflect brief but persistent multi-cylinder states that occurred near global combustion mode shifts. We suspect that these recurring multi-cylinder events might have been acting as triggers for the mode shifts. This provides a much more subtle view of the dynamics that is not readily visible in the original IMEP time series.

As with the univariate symbolization, surrogate time series could be used to estimate empirical confidence limits on the multivariate symbolization. We chose not to do so in this work, partly because the objective of the multivariate symbolization was different, depending on the partitioning. As mentioned above, the equi-probable partitioning is useful in detecting deviations from ‘random’ behaviour, and the deviation with increasing EGR rate was visible in the univariate symbol statistics, which show the progression of a cylinder's dynamics. The equi-spaced symbol statistics are much more sensitive to the onset of combustion instabilities, manifested in cycles with significantly less or no IMEP work (resulting in a distinct symbolic value of 0), as just one unstable cycle changes the partition levels markedly. The multivariate symbolization is used to show which cycles tend to drive the others with the progression of instabilities by examining the largest specific symbol sequences, so the identification of specific sequences, and not just their confidence levels, was the primary aim.

The electronic supplementary material contains a treatment using different partitioning for the multivariate symbolization for all seven experimental EGR conditions. Additionally, it presents a normalized symbolic temporal-asymmetry statistic which is useful in tracking the progression of unstable combustion with increasing EGR rate. Such a metric should prove useful in identifying ‘tipping points’ associated with critical transitions [14].

The complexity of the cylinder-to-cylinder synchronization process is revealed in another type of depiction of the 4-cylinder symbol sequence patterns that we sometimes refer to as symbol sequence spectrograms. Examples are illustrated for 21, 34, and 53% EGR in figures 10–12. In these plots, the 4-cylinder symbol sequence value (SSV) is plotted for each successive cycle. At intermediate EGR levels, this reveals episodic periods where the cylinders briefly synchronize in non-trivial states before diverging again, only to resynchronize into other combinations of distinct states. The strongest sequences of high energy burns and misfires clearly occur at the highest EGR, as expected. Even at this condition, however, it is apparent that the behaviour is not a simple periodicity, and there is a tendency for certain transient combinations of non-equal combustion to occur in different cylinders in the same cycle.

## 6. Summary and conclusion

The above results confirm that symbolic analysis can reveal important details about the nature and causes of cyclic combustion variations occurring in multi-cylinder SI engines operating with external EGR. We have confirmed that different rules for transforming the original measurements into symbol sequences (e.g. equi-probable versus equi-spaced partitions) can be used to probe distinctive aspects of the dynamics. So far, our results indicate that different levels of combustion oscillations arise in individual cylinders in this engine, even when there is no external EGR applied. As EGR is increased, non-random multi-cylinder patterns begin to emerge, suggesting that the earliest stages of cylinder-to-cylinder synchronization may be driven by interactions among the inherent oscillations in individual cylinders. At the highest EGR levels, it appears that the global multi-cylinder oscillations are not just simple phase-locked periodicities. Instead, these oscillations include subtle and complex cylinder-to-cylinder relationships that are not immediately obvious in the original time series.

Although we used cycle-integrated values (IMEP) as the basis of our initial time series in this study, we expect that integration of the original pressure measurements is likely to obscure some features in the detailed combustion trajectories that would be important in developing a more complete understanding of the physics involved. Thus, we recommend that future studies of this type should include time series that resolve intra-cycle details. For example, it should be possible to evaluate symbol sequences derived from in-cylinder pressure measurements made at multiple crank angles within each cycle. As far as we are aware, this level of symbolic analysis has not been significantly pursued among the engine combustion research community.

## Disclaimer

This manuscript has been authored by the Oak Ridge National Laboratory, managed by UT-Battelle LLC under contract no. DE-AC05-00OR22725 with the US Department of Energy. The US Government retains and the publisher, by accepting the article for publication, acknowledges that the US Government retains a non-exclusive, paid-up, irrevocable, worldwide licence to publish or reproduce the published form of this manuscript, or allow others to do so, for US Government purposes.

## Funding statement

This work was sponsored by the Vehicle Technologies Office, Office of Energy Efficiency & Renewable Energy, US Department of Energy, Gurpreet Singh, Ken Howden, Leo Breton, managers.

## Acknowledgements

The authors thank Robert Bosch, LLC for providing the engine and ECU used in this study.

## Footnotes

One contribution of 12 to a theme issue ‘Enhancing dynamical signatures of complex systems through symbolic computation’.

- © 2014 The Author(s) Published by the Royal Society. All rights reserved.