## Abstract

The construction and operation of large-scale quantum information devices presents a grand challenge. A major issue is the effective control of coherent evolution, which requires accurate knowledge of the system dynamics that may vary from device to device. We review strategies for obtaining such knowledge from minimal initial resources and in an efficient manner, and apply these to the problem of characterization of a qubit embedded into a larger state manifold, made tractable by exploiting prior structural knowledge. We also investigate adaptive sampling for estimation of multiple parameters.

## 1. Introduction

Recently, much effort has been put into the construction of large-scale quantum devices operating in the coherent regime. This has been spurred on by the possibilities offered by quantum communication and information processing, from secure transmission of information to the simulation of quantum dynamics, to the solution of currently intractable mathematical problems [1]. Many different physical systems have been proposed as the basic architecture upon which to construct quantum devices, ranging from atoms and ions to photons to quantum dots and superconducting devices. For large-scale commercial applications, it is likely that this will involve scalable engineered and constructed devices with tailored dynamics requiring precision control.

Owing to inevitable manufacturing tolerances and variation, each device will display different behaviour even though they may be nominally ‘identical’. For their operation, they will need to be characterized as to their basic properties, response to control fields, and noise or decoherence [2]. We may also need to know how ideal is the system in the first place—for instance, the effective Hilbert space; what we may assume to be a qubit may have dynamics involving more than two effective levels. Extracting this information efficiently and robustly is crucial.

In a laboratory setting, an experimentalist may have access to many tools with which to study a system, e.g. spectroscopy and external probes. In a production setting, provision of these extra resources may be difficult, expensive or impossible to integrate with the device. It is, therefore, important to understand what sort of characterization can be performed simply using what is available *in situ*. Ideally, we would also like to be able to characterize the performance of a device with as little prior knowledge of its behaviour, e.g. how it responds to control fields, if this is the information which we are trying to obtain in the first place. Characterization using only the *in situ* resources of state preparation and measurement, even where it is possible, is challenging, owing to the increasing complexity of the signals, number of parameters to be estimated and the complexity of reconstructing a valid Hamiltonian from the resulting signal parameters. Robust and efficient methods of data gathering and analysis, preferably as automated as possible, are therefore essential. Here, we review progress in tackling these problems and we present new results for the problem of characterizing the Hamiltonian of a qubit embedded in a larger state manifold.

## 2. System identification paradigms

A standard tool for determining discrete quantum dynamics is *quantum process tomography* (QPT). Assume a completely positive trace-preserving map *Λ* acting on a system state *ρ* by
2.1where *λ*_{j} are the Kraus operators satisfying . QPT is a method by which we can determine {*λ*_{j}} by seeing how initial states of the system evolve under the map. For a complete characterization, both a complete set of initial states {*ρ*_{j}} and quantum-state tomography on the final states are required.^{1} In general, for a *d*-dimensional system, we need to use *d*^{2}−1 input states, and quantum tomography requires either a measurement with at least *d*^{2} outcomes—for instance, symmetric informationally complete positive-operator-valued measures (SIC-POVMs) [5]—or projective measurements in several different bases.

The ability to generate a complete set of initial states is a strong assumption. In many physical systems, it is only possible to directly prepare a set of orthogonal states, e.g. by projective measurement, or even only a single ‘fiducial’ state. The usual assumption that we can generate any state by coherent control of a fiducial state is invalid when we are trying to determine how to control the system in the first place. Equally, most systems can be measured directly only in a single fixed basis, and other measurement bases are assumed to be available through coherent control. This shows that QPT cannot be used as a starting point for quantum system identification in a setting where control has not been already established. We require a mechanism by which we can bootstrap our knowledge and abilities until control can be enacted upon the system.

Assuming the system dynamics can be approximated to a reasonable degree by Hamiltonian dynamics, the first core challenge is the identification of the *intrinsic system Hamiltonian* , which can be specified by the energy eigenstates and eigenvalues (up to rescaling). Optimum protocols for identification of the Hamiltonian dynamics depend on the available resources. One general paradigm introduced in Schirmer *et al.* [6] assumes a situation where we are restricted to preparation and measurement in a single fixed basis {|*e*_{j}〉}, i.e. we can prepare the system in one of the basis states |*e*_{j}〉, allow it to evolve for some time *t* before projectively measuring it in the same basis, and obtaining state |*e*_{k}〉 with some probability *p*_{jk}(*t*). The computational basis in this case can be defined in terms of preparation and measurement, and our task is to characterize the system Hamiltonian in this basis, i.e. obtain the eigenenergies *E*_{j} and the eigenstates |*E*_{j}〉 (up to a global phase), solely from the data traces of *p*_{jk}(*t*).

Hamiltonian characterization in this paradigm has been considered in a number of papers. It has been shown that a generic Hamiltonian for a single qubit can be recovered from preparation and measurement in a fixed basis up to a certain set of phases and an unobservable global energy shift. The extra phases become relevant only when additional resources are available that allow us to initialize the system in non-measurement basis states, or apply control that alters the system Hamiltonian. In the latter setting, it was also shown how the relative phases for two Hamiltonians could be recovered by composite rotations, vaguely reminiscent of Ramsey spectroscopy [7]. If the Hamiltonians do not commute or coincide with the preparation/measurement basis, full control over a single qubit can be realized, and the system and control Hamiltonians can be characterized up to a sign factor [8]. The results can be extended to multi-qubit or higher level systems [9].

Upon characterization of the intrinsic system Hamiltonian, the effect of applying controls must be identified. We can model this by assuming a Hamiltonian *H*(** λ**) that depends on classical control field parameters

**. In the simplest case, we may approximate the response of the system by , where**

*λ**H*

_{0}is the intrinsic system Hamiltonian and

*H*

_{j}are perturbations resulting from the application of control

*λ*

_{j}. This has been considered in earlier studies [6,8] where the effect of multiple control fields on a single qubit was characterized with respect to a reference Hamiltonian. For coherent operation,

*incoherent effects*should be small but they will still need to be characterized. Although complete characterization of the dynamics for open systems is a daunting task, under certain simplifying assumptions on the type of decoherence (e.g. pure dephasing or relaxation in a natural basis), the number of parameters to be estimated can be reduced and the relevant information extracted [10–13].

Although the general characterization paradigm described is quite restrictive, further restrictions, in some cases, must be imposed to deal with limited resources. Characterization of the coupling constants in a spin network when only a subset of the spins at the boundary can be measured and initialized is an example [14–16]. Another is estimation of leakage out of a subspace or coupling to unknown states, where we generally cannot measure the populations of these states directly. General bounds on subspace leakage were derived in Devitt *et al.* [17], based on the Fourier spectrum of the observed Rabi oscillations. This approach is useful when there are potentially many states very weakly coupled to the subspace of interest. In many cases, however, leakage may be due to coupling to a small number of states outside the subspace, e.g. when we encode a qubit using the lowest two states of a slightly anharmonic oscillator. In this case, a control applied to the qubit transition will induce some coupling to the third (nuisance) level, giving rise to unwanted dynamics. Characterization of this coupling allows the design of pulses that can suppress such unwanted excitations.

## 3. Hamiltonian estimation for embedded qubit

Formally, we consider a three-level system subject to a control field resonant with the 1–2 transition (see Leghtas *et al.* [18]), where |1〉 and |2〉 can be regarded as the qubit states. Assuming a constant amplitude field , transforming to a rotating frame and making the rotating wave approximation, this leads to an effective Hamiltonian of the form
3.1We choose a rotating frame where the off-diagonal elements (*d*_{n}) are real and positive. If higher order processes such as two-photon transitions between states 1 and 3 are negligible, we can assume *d*_{3}=0. The objective is to characterize both the qubit transition coupling *d*_{1} as well as the coupling to the nuisance level *d*_{2} and the detuning *δ* as a result of the anharmonicity.

If the system can be initialized in the basis states |*n*〉 for *n*=1,2,3 and we perform complete projective measurements in the same basis, then the probabilities
3.2can be determined for all *k*,ℓ for different times *t*_{j}. However, if we can initialize and measure the system only in the ground state, then only a single population evolution trace *p*_{11}(*t*) is available. For a two-level system, this is sufficient to infer the population of *p*_{22} but the presence of the third level means that *p*_{11}(*t*) gives us only limited information about the population of the other levels.

The existence of a non-zero detuning *δ* complicates the problem substantially. In the absence of anharmonicity, i.e. for *δ*=0, analytic expressions for *p*_{11}(*t*) can be obtained and the problem reduced to a single frequency estimation problem [19]. For *δ*≠0, there is no simple closed form for the signal *p*_{11}(*t*), and the eigenvalues are no longer of the form 0,±*λ*, but are instead *ω*_{12}=*λ*_{2}−*λ*_{1}=*ω*−Δ*ω* and *ω*_{23}=*λ*_{3}−*λ*_{2}=*ω*+Δ*ω*. For small detunings *δ* relative to the coupling strengths *d*_{n}, the frequency splitting Δ*ω* is much smaller than *ω*. To obtain a frequency resolution of Δ*ω* using spectral analysis would require a signal length of at least 1/(Δ*ω*). However, using the structure of the signal and restricting to solutions consistent with our prior knowledge, we can do considerably better. We know that the observed signal must be of the form
3.3

A natural starting point for a maximum-likelihood estimation is thus to choose basis functions *g*_{0}=1, , and , and, following standard techniques, maximize the log-likelihood function [9,20]
3.4where *m*_{b}=4 is the number of basis functions, *N*_{t} is the number of data points and
3.5The elements *h*_{m} of the (*m*_{b},1)-vector **h** are projections of the (1,*N*_{t})-data vector **d** onto a set of orthonormal basis vectors derived from the non-orthogonal basis functions *g*_{m}(*t*) evaluated at the respective sample times *t*_{n}. Concretely, setting *G*_{mn}=*g*_{m}(*t*_{n}), let *λ*_{m} and **e**_{m} be the eigenvalues and corresponding (normalized) eigenvectors of the *m*_{b}×*m*_{b} matrix *GG*^{†} with *G*=(*G*_{mn}), and let *E*=(*e*_{m′m}) be a matrix whose columns are **e**_{m}. Then, we have *H*=*V* *G* and **h**=*H***d**^{†} with , and the corresponding coefficient vector is **a**=**h**^{†}*V* [9].

Figure 1 shows that the log-likelihood function of the data provides strong evidence for a non-zero detuning, even though no peak splitting is detectable in the Fourier spectrum of the signal. For the given input data, the log-likelihood function has a squeezed peak that is narrow in one direction but much broader in the other. The fact that the peak is squeezed in a direction not aligned with a coordinate axis shows that the uncertainties in *ω* and Δ*ω* are not independent. The plot also shows that the width of the peak along the Δ*ω* direction is much greater than that in the *ω* direction.

This is also reflected in the observed uncertainties of the estimates of *ω* and Δ*ω*. If we take the (*ω*_{est},Δ*ω*_{est}) to be the coordinates for which the log-likelihood peaks, then their standard deviations give an indication of the uncertainty in our estimates. Figure 2 shows the standard deviation of *ω*_{est} and Δ*ω*_{est} for 256 simulated experiments each as a function of the number of time samples *N*_{t} and the number of experiment repetitions *N*_{e}, on a logarithmic scale. The plots look qualitatively similar, suggesting a similar scaling, but the scale shows that the uncertainty of the Δ*ω* estimates is about one order of magnitude greater than that of the *ω* estimates. We observe a similar scaling for the estimated amplitudes *a*_{m} for *m*=0,1,2,3 (not shown). The median relative error in the estimated Hamiltonian shows a similar behaviour though with some kinks, and it is interesting to note that the relative errors in the Hamiltonian tend to be larger than the errors in the estimated frequencies and amplitudes of the signal.

Preliminary results suggest that fragility of the reconstruction procedure is responsible for the observed larger spread in the errors of the reconstructed Hamiltonian even if the uncertainty of the estimated signal parameters is quite low as shown in figure 3*a*. The reconstruction procedure can sometimes fail, leading to outliers in the relative Hamiltonian error histogram shown in figure 3*b*, which typically correspond to unphysical Hamiltonians.

If the likelihood function does not have a sufficiently sharp peak, we can adaptively refine the sampling to reduce the uncertainty. A simple adaptive strategy is as follows:

— preliminary sampling: over a pre-chosen sampling period, measure at randomly chosen sampling times and compute the likelihood of various models based on these initial data;

— uncertainty estimate: the uncertainty of the solution can be estimated from the sharpness and relative height of the highest peak in the likelihood plot;

— refinement: from the initial data, choose an ensemble of probable models and calculate the weighted expected variance of their data traces as a function of time;

— new samples: we make additional measurements at times for which the most probable models differ the most and use the new data points to update the estimates of the signal parameters and Hamiltonian; and

— repeat as necessary.

Although similar in spirit to the adaptive Bayesian identification strategy proposed by Wiseman and co-workers for a single parameter Hamiltonian estimation [21], we do not minimize the variance of a single parameter as the Hamiltonian depends on multiple parameters. Another difficulty is that the multi-parameter likelihood function is usually far from Gaussian, and the expectation values of *ω* and Δ*ω* tend to differ substantially from the maximum-likelihood estimate. In this case, using the expectation values and variances with respect to a given parameter is not necessarily a good indicator of the real uncertainty of the model.

We apply this to the above qutrit system. Starting with a 100-point low-discrepancy sampling of the selected time range [0,20], we obtain *N*_{e}=100 measurements per time point. The resulting log-likelihood function, shown in figure 4*a*, has a squeezed peak centred at (*ω*,Δ*ω*)=(1.9468,0.1087). To narrow the peak width, we resample using the initial estimate for (*ω*,Δ*ω*). A simple and computationally cheap strategy is to choose new sample points at integer multiples of *T*/2=*π*/*ω*_{est}, and get a few accurate samples using, for example, *N*_{e}=1000. The idea is that we will be most sensitive to small modulations in the peak heights, owing to Δ*ω* at these times. Indeed, we find that the relative errors for the frequency and amplitude estimates improve substantially (figure 4*b*). Yet, the relative error of the reconstructed Hamiltonians does not always follow the same trend and sometimes actually *increases*. Further analysis shows that this is due to the reconstruction step and the complex dependence of *H* on the estimated parameters.

This suggests that it would be desirable to estimate the Hamiltonian parameters directly from the data, maximizing the likelihood
3.6or its logarithm, and, for convenience, we have defined , , and *δ*=4*ϵ*. Although there is no simple closed form for *p*_{11}(*t*_{j}) in this case, it can be computed numerically and the challenge is finding the global optimum of the three-parameter likelihood function. Without any prior information, we first compute the log-likelihood on a three-dimensional grid of parameter values, find the region where the global optimum is expected and use this information as a starting point of a local optimization routine to find the peak. For the above example, this yields (*Ω*,*α*,*ϵ*)_{opt}=(1.7159,0.9494,0.5340) for the initial data, versus (*Ω*,*α*,*ϵ*)_{opt}=(1.7323,0.9557,0.4999) with the new data (actual (1.7321,0.9553,0.5000)). The relative error of the reconstructed Hamiltonian for the initial data is 0.0363, comparable (even slightly higher) to the estimate obtained using the previous two-step approach. Using the new data, the relative error is substantially lower (3.8672×10^{−4}), if we maximize (3.6) instead of maximizing (3.4) followed by reconstruction. This approach appears suitable for systematic adaptive estimation, which will be investigated further in future work.

## 4. Conclusion

We have outlined the problem of system characterization and why it is an essential basic building block of quantum control for many prospective quantum information-processing devices. We have shown that such a bootstrapping procedure is possible and that much information can be gained through use of limited *in situ* initially present operational capabilities. By exploiting prior knowledge and reasonable assumptions on the structure and behaviour of the system, maximum-likelihood analysis offers efficient and robust characterization and reconstruction of complex systems.

Scalability remains a challenge. In the general setting, slightly increasing the size of the system leads to an explosion in signal complexity [9] and directly applying system identification to three or more qubits is a formidable task. Some complexity reduction can be achieved using Bayesian signal estimation in order to split up frequency and amplitude estimation, though this has some drawbacks in ensuring physically allowed reconstructed Hamiltonians. Extending the Bayesian estimation directly to Hamiltonians would alleviate reconstruction validity, but direct optimization of the likelihood is challenging for more than a few parameters. Hence, there is the need for a similar complexity reduction for Hamiltonian parameter estimation. Exploiting as much structural information about the system is essential in making the problem tractable, especially in the case of restricted resources.

Adaptive estimation for multiple parameters is a ripe area for further exploration. Practical online schemes may require pragmatic methods of determining adaptive measurements, as full Bayesian optimization involves integration over a highly peaked pay-off function in a high-dimensional parameter space. Development of effective yet computationally efficient optimization routines is imperative.

Relaxing more assumptions or expanding the set of resources available would give greater experimental relevance. Preparation and measurement capabilities may vary; for instance, instead of projective measurement, some experimental proposals implement continuous [22], weak or generalized measurements. States may also be prepared by relaxation or adiabatic passage and may not coincide with the measurement basis.

More generally, there are interesting connections between the compressive sensing (CS) and sparse reconstruction paradigm [23,24] and how our model-based system characterization techniques work. Instead of the union of a low-dimensional (linear) sub-spaces model in CS, we have a solution space as the union of low-dimensional manifolds of parameters. An extension of the notion of ‘basis incoherence’ and general techniques for efficient reconstruction from sparse data would be extremely beneficial.

## Footnotes

One contribution of 15 to a Theo Murphy Meeting Issue ‘Principles and applications of quantum control engineering’.

↵1 There are variants of QPT which use ancillas, entangling or other coherent operations which may offer certain advantages [3,4]. However, they all require operational resources beyond what we assume for system characterization.

- This journal is © 2012 The Royal Society