## Abstract

A brief account of the quantum information dynamics and dynamical programming methods for optimal control of quantum unstable systems is given for both open-loop and feedback control schemes, corresponding respectively to deterministic and stochastic semi-Markov dynamics of stable or unstable systems. For the quantum feedback control scheme, we exploit the separation theorem of filtering and control aspects, as in the usual case of quantum stable systems with non-demolition observation. This allows us to start with the Belavkin quantum filtering equation, generalized to demolition observations, and derive the generalized Hamilton–Jacobi–Bellman equation using standard arguments of classical control theory. This is equivalent to a Hamilton–Jacobi equation with an extra linear dissipative term if the control is restricted to Hamiltonian terms in the filtering equation. An unstable controlled qubit is considered as an example throughout the development of the formalism. Finally, we discuss optimum observation strategies to obtain a pure quantum qubit state from a mixed one.

## 1. Introduction

Cybernetics, whose name stems from the Greek for ‘controller’ or ‘governor’, was defined by Norbert Wiener, in his book of that title, as the study of control and communication in the animal and the machine. A more philosophical definition, suggested in 1956 by Louis Couffignal, one of the pioneers of cybernetics, characterizes cybernetics as ‘the art of ensuring the efficacy of action’. So far, cybernetics has been restricted mostly to classical self-organizing systems such as mechanical regulators, electrical networks, biological organisms, neurosystems and social systems described by the classical laws of physics, probability and information. It is based on mathematical systems theory.

*Quantum cybernetics* (QC) deals with such self-organizing mechanical, electrical, biological, neuro- and social systems described by the *quantum laws* of physics, probability and information. In the same way that classical cybernetics is essentially the classical systems theory closely related to the optimal feedback control theory, QC can be described as the quantum theory of optimally observed and feedback-controlled open systems. Thus, the main ingredients of QC are the quantum optimal filtering and quantum feedback control theory based on the quantum stochastic innovation dynamics developed by the author since 1983 [1] and described in a series cited in a recent review paper [2].

Here, we give a brief account of quantum systems theory and optimal control theory in quantum open systems with and without observation. We shall mostly follow the exposition presented in Gough *et al.* [3]. However, we consider the case not only of diffusive but also of counting observations, and not only affine but also concave costs and target functions (see also [2,4]). Such systems can be described by the mathematical theory of conditionally Markov quantum processes governed by the Belavkin master equation. This is illustrated by the simplest such system—the unstable controlled quantum bit with continuous diffusive observation modelled by a single Wiener innovation process.

## 2. Some standard facts and notations

A unified description of classical and quantum systems can be made in terms of the ‘quantum measure space’ consisting of a modular *-algebra with a reference weight ℓ defining a positive linear functional such that ℓ(`A`*)=ℓ(`A`)*. A simple example is the algebra of all bounded operators on a Hilbert space. We shall assume that the algebra is invariant not only with respect to the Hermitian conjugation *, but also with respect to the left and right modular involutions ♯ and ♭, defined for a reference weight (i.e. positive normal linear functional) *φ*(`A`*)=*φ*(`A`)* such that * is isometric with respect to the standard pairing given by *φ*. Then the predual space can be realized by the densities with respect to *φ*, defined as generalized elements affiliated with the weak closure of the algebra, i.e. as in general unbounded sesquilinear forms commuting with it [5].

The state space is realized by the positive mass-one densities *ϱ**=*ϱ*≥0, (*ϱ*|`I`)=1, with its tangent and cotangent spaces. Every state can be parametrized as *ϱ*(`q`)=*ϱ*_{0}−`q` by a *tangent element* `q`. Cotangent elements are the equivalence classes `p`(`X`)=`X` mod *λ*`I` of the Hermitian representatives `X`=`X`*.

### (a) Quantum state geometry

For the semifinite quantum systems with the trace *φ*=tr, we have ♯=*=♭. In this simple case the densities are Hermitian, *ϱ*=*ϱ**, with respect to the pairing 〈*ϱ*,`A`〉:=tr{*ϱ*`A`}≡(*ϱ*|`A`).

### Example 2.1

A single quantum bit is described by the matrix algebra of (2×2) matrices `A`=*α*`I`+*σ*_{a}, where *σ*_{a}=** a**⋅** σ** is decomposed into the Pauli matrices ** σ**=(*σ*_{x},*σ*_{y},*σ*_{z}), with the normalized trace *φ*(`A`)=½tr{`A`}=*α* defining the standard pairing 〈*ϱ*,`A`〉=*α*−** q**⋅** a**. Here, the quantum bit states *ϱ*=½(`I`−*σ*_{q}) are given by the state coordinate `q`=½*σ*_{q}, parametrized by the real vector ** q** from the unit ball |** q**|≤1 with respect to the central state *ϱ*_{0}=½`I`.
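As a numerical sanity check of Example 2.1 (a sketch in our own notation; the helper names are ours, not the paper's), the pairing 〈*ϱ*,`A`〉=*α*−** q**⋅** a** can be verified directly with the Pauli matrices:

```python
import numpy as np

# Pauli matrices
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def sigma(a):
    """sigma_a = a . sigma for a real 3-vector a."""
    return a[0]*sx + a[1]*sy + a[2]*sz

def state(q):
    """Qubit state rho = (I - sigma_q)/2 for |q| <= 1."""
    return 0.5*(I2 - sigma(q))

def pairing(rho, A):
    """<rho, A> = tr(rho A)."""
    return np.trace(rho @ A).real

q = np.array([0.3, -0.2, 0.5])
a = np.array([0.1, 0.7, -0.4])
alpha = 1.2
A = alpha*I2 + sigma(a)

# <rho, A> = alpha - q.a
assert abs(pairing(state(q), A) - (alpha - q @ a)) < 1e-12
```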

### (b) Derivations and Hessians

A (nonlinear) functional *ϱ*↦F[*ϱ*] on the state space (or on the predual) admits a Gâteaux derivative ∇_{ϱ}F[*ϱ*] if (d/d*s*)F[*ϱ*+*s*`q`]|_{*s*=0}=〈`q`,∇_{ϱ}F[*ϱ*]〉 for each tangent element `q`. Note that ∇_{ϱ}F is defined only up to additive terms *λ*`I` if ∇_{ϱ} is defined only on the state space. A Hessian ∇_{ϱ}^{2}F, with values in the cotangent space, is defined similarly by the second-order Gâteaux differential.

### Example 2.2

Let F[*ϱ*]=*f*(** q**) for *ϱ*=½(`I`−*σ*_{q}) in the Pauli matrix basis. Then ∇_{ϱ}F[*ϱ*]=*λ*`I`−*σ*_{∇f(q)}≡−∇*f*(** q**)⋅** σ** modulo *λ*`I`.
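The Gâteaux derivative of Example 2.2 can be checked by finite differences. The following sketch is our own construction, with *f*(** q**)=|** q**|² as an assumed test function and the normalization *ϱ*=(`I`−*σ*_{q})/2; it compares the numerical directional derivative with the pairing against the representative −*σ*_{∇f(q)}:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
paulis = [sx, sy, sz]

def sigma(a):
    return sum(ai*s for ai, s in zip(a, paulis))

def bloch(rho):
    """Recover q from rho = (I - sigma_q)/2 via q_a = -tr(rho sigma_a)."""
    return np.array([-np.trace(rho @ s).real for s in paulis])

f = lambda q: q @ q          # test function f(q) = |q|^2, so grad f = 2q
F = lambda rho: f(bloch(rho))

q = np.array([0.2, 0.1, -0.3])
rho = 0.5*(I2 - sigma(q))
X = -sigma(2*q)              # representative of the gradient class -sigma_{grad f(q)}

dq = np.array([0.05, -0.02, 0.01])
drho = -0.5*sigma(dq)        # tangent perturbation, tr(drho) = 0
s = 1e-6
num = (F(rho + s*drho) - F(rho)) / s   # numerical Gateaux differential
ana = np.trace(drho @ X).real          # <drho, X> pairing
assert abs(num - ana) < 1e-4
```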

### (c) Affine and concave costs

An affine functional G(*u*,*ϱ*)=〈*ϱ*,*G*(*u*)〉 of the state, given by a positive or bounded-from-below operator function *G*(*u*) on a measurable space of controls, is called the *expected cost* of the control *u*. The *minimal expected cost* S[*ϱ*]=inf_{u}G(*u*,*ϱ*) is not affine but a concave finite functional on the convex set of states, in the sense S[*λϱ*_{1}+(1−*λ*)*ϱ*_{2}]≥*λ*S[*ϱ*_{1}]+(1−*λ*)S[*ϱ*_{2}] for any *λ*∈[0,1] and states *ϱ*_{1}, *ϱ*_{2}.

More generally, the concave functional S[*ϱ*] can have values in [−∞,∞) for an unbounded-from-below function *G*(*u*).

### Example 2.3

For the affine cost with *G*(*u*)=|** u**|`I`+*σ*_{u}, the minimal expected cost is S[*ϱ*]=inf_{u}(|** u**|−** q**⋅** u**)=*δ*(** q**), where *δ* is the max-plus indicator function of the unit ball |** q**|≤1, equal to 0 on the ball and −∞ outside it.
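A brute-force minimization illustrates Example 2.3 under the same conventions: the infimum of |** u**|−** q**⋅** u** is 0 inside the unit ball and unbounded below outside it. The search grid and cutoffs below are ours, not the paper's:

```python
import numpy as np

def expected_cost(q, u):
    """<rho, G(u)> = |u| - q.u for G(u) = |u| I + sigma_u and rho = (I - sigma_q)/2."""
    return np.linalg.norm(u) - q @ u

def minimal_cost(q, n_dirs=2000, seed=0):
    """Crude numerical infimum of |u| - q.u over u = r*dir, r in [0, 50]."""
    rng = np.random.default_rng(seed)
    dirs = rng.normal(size=(n_dirs, 3))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    best = 0.0                       # u = 0 gives cost 0
    for r in np.linspace(0.1, 50, 500):
        best = min(best, (r - r*(dirs @ q)).min())
    return best

# |q| <= 1: the infimum is 0 (max-plus indicator of the unit ball)
assert abs(minimal_cost(np.array([0.3, 0.4, 0.0]))) < 1e-9
# |q| > 1: the cost r(1 - q.dir) is unbounded below, so the infimum escapes to -infinity
assert minimal_cost(np.array([0.0, 0.0, 1.5])) < -10
```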

### (d) Legendre–Fenchel transform

The above example, for the affine G with *G*(*u*)=|** u**|`I`+*σ*_{u}, is a special case of the Legendre–Fenchel transform of the convex function *g*(** u**)=|** u**|. The result S[*ϱ*] is well defined as a concave functional for any convex *g* not identically equal to +∞. Moreover, every concave functional S[*ϱ*] on the state space can be obtained as the result of the Legendre transformation for G(*u*,*ϱ*)=*g*(`X`)−〈*ϱ*,`X`〉, uniquely defined by a convex function *g*(`X`)=*g*(`X`+*λ*`I`)−*λ* on the representatives `X`=`X`* for *u*=`p`(`X`). This *g* is found by the inverse Legendre transform. Thus, the relative entropies of the (a) and (b) types are the transforms of some convex *g*_{a} and *g*_{b}, and a similar transform defines another (thermodynamic) relative entropy.
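In standard convex-analysis notation, this correspondence can be summarized as follows (our summary; the signs are chosen to match the affine example above and may differ from the authors' conventions):

```latex
S[\varrho] \;=\; \inf_{X=X^{*}}\bigl\{\, g(X) - \langle \varrho, X\rangle \,\bigr\}
\;=\; -\,g^{*}(\varrho),
\qquad
g(X) \;=\; \sup_{\varrho}\bigl\{\, S[\varrho] + \langle \varrho, X\rangle \,\bigr\},
```

so that the concave S and the convex *g* determine each other; for *g*(** u**)=|** u**| on the qubit this reproduces S[*ϱ*]=inf_{u}(|** u**|−** q**⋅** u**).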

## 3. Quantum conditionally Markov dynamics

Quantum Markov systems under demolition observation can be described as quantum sub-Markov dynamical objects over that obey conditionally Markov stochastic dynamics when conditioned with respect to a time-indexed family of increasing commutative observable algebras of a product system . It can be weakly defined by a hemigroup of normal contracting completely positive (CP) maps on the W*-algebras with such that
where the hemigroup composition law holds for any *s*>0. Because the observable algebras are commutative, generated by a family *v*^{t}={*v*(*r*)|*r*<*t*} of compatible observables *v*(*t*) representing an output classical stochastic process *v*(*ω*,*t*)=*ω*[*v*(*t*)], the hemigroup *ϕ* induces a controlled hemigroup of sub-Markov maps, where *y*=ℓ(*v*^{r})≡*y*^{r} is a feedback control strategy given as a function of the commuting controlling operators. The sub-Markov CP maps are similar to the Markov maps but satisfy a weaker dissipativity condition than the Markovian CP maps *τ*_{r}(*t*), which must obey the conservativity condition *τ*_{r}(*t*,`I`)=`I`. The positive operators describe the decay effect of the unstable object during the process of demolition observation on the time interval [*r*,*t*), and often do not depend on the control *y* up to time *r*.

### (a) Quantum-controlled generator

Quantum atomic systems are, as a rule, unstable, described by non-Markov but conditionally Markov dynamics, and induced, say, by sub-Markov processes, which are Markov if conditioned by the probability of survival of an unstable system. The continuous feedback-controlled sub-Markov dynamics is usually determined by its generator
given by a completely dissipative map controlled by the values *y*(*t*)=(*u*_{1},…,*u*_{n}) of some parameters. For the simple quantum systems, such dynamics is described by the controlled semi-Lindblad generators with the dissipation part
and
prepared for a continuous indirect measurement, say, of the observables affiliated to the system algebra. Here we have assumed that the dynamics is controlled only through the Hamiltonian *H*(*u*)=*H*(*u*)*, while the coupling operator (affiliated to the algebra) determines the decay operator.
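For concreteness, here is a minimal sketch of a controlled generator of Lindblad type (the conservative, trace-preserving case; the paper's semi-Lindblad generators relax conservativity to model decay). The operators and rates below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def lindblad_rhs(rho, H, Ls):
    """Predual Lindblad generator:
    drho/dt = -i[H, rho] + sum_j (L_j rho L_j+ - (1/2){L_j+ L_j, rho})."""
    out = -1j*(H @ rho - rho @ H)
    for L in Ls:
        LdL = L.conj().T @ L
        out += L @ rho @ L.conj().T - 0.5*(LdL @ rho + rho @ LdL)
    return out

sz = np.array([[1, 0], [0, -1]], dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)

H = 0.7*sz                      # controlled Hamiltonian H(u), here with u along z
L = np.sqrt(0.5)*sx             # coupling to the measured channel (illustrative)

rho = np.array([[1, 0], [0, 0]], dtype=complex)
drho = lindblad_rhs(rho, H, [L])

# The conservative Lindblad generator is trace-free on densities
assert abs(np.trace(drho)) < 1e-12
```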

### (b) Deterministic quantum master equations

The density operator *ϱ*_{t} of a quantum state evolves from an initial *ϱ*_{0} by resolving the *controlled master equation* with the predual generator, normalized on the decaying probability of survival *p*_{t}=〈*ϱ*_{t},`I`〉 if initially 〈*ϱ*_{0},`I`〉=1. The renormalized state *ϱ*^{t}=*ϱ*_{t}/*p*_{t} describes the quantum state conditioned by the survival effect up to time *t*. Its evolution is described by the velocity *υ*(*u*,*ϱ*) defining the *deterministic nonlinear filtering equation*.
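The renormalization *ϱ*^{t}=*ϱ*_{t}/*p*_{t} can be illustrated on a toy decay model (the decay operator and rates below are our assumptions; Hamiltonian and fluctuation parts are omitted):

```python
import numpy as np

def submarkov_step(rho, R, dt):
    """Euler step of the sub-Markov master equation drho/dt = -(1/2)(R rho + rho R),
    where R >= 0 is the decay operator."""
    return rho - 0.5*dt*(R @ rho + rho @ R)

gamma = 1.0
R = np.diag([gamma, 0.0]).astype(complex)     # only the excited level decays
rho = np.array([[0.5, 0.2], [0.2, 0.5]], dtype=complex)

dt, T = 1e-4, 1.0
for _ in range(int(T/dt)):
    rho = submarkov_step(rho, R, dt)

p = np.trace(rho).real        # survival probability p_t = <rho_t, I> < 1
rho_cond = rho / p            # renormalized conditional state rho^t

assert p < 1.0
assert abs(np.trace(rho_cond) - 1.0) < 1e-9
```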

### Example 3.1

An unstable quantum bit is described by a controlled Hamiltonian and coupling, together with the decay operator `M`=*μ*`I`+*σ*_{m} with *μ*≥|** m**|+*λ*^{2}/4. Then *υ*(*ϱ*)=*σ*_{u×q}+*υ*_{m}(*ϱ*), where *υ*_{m} is the contribution of the decay operator `M`.

### (c) Output processes and continuous observation

The state of an individual continuously observed quantum system does not coincide with the solution of the deterministic master equation, but depends on the random measurement output *ω* in a causal manner. We allow the output to constitute the generalized trajectories *x*_{j}(*t*,*ω*) of classical noise-like processes with zero expectation, but take a mathematically more convenient approach working with the usual left-continuous trajectories of the independent-increment processes *v*_{j}(*t*). We shall consider for simplicity only the standard diffusive (*ε*_{j}=0) and counting (*ε*_{j}=1) types of *v*_{j}, such that
(d*v*_{j})^{2}=d*t*+*ε*_{j} d*v*_{j}, with as usual (d*t*)^{2}=0. It is, therefore, natural to model this on the Wiener or Poisson probability space with the standard Gaussian or Poisson probability measure, respectively. They can be induced from the vacuum state 〈*I*,*Y*〉_{∅}=〈*δ*_{∅}|*Yδ*_{∅}〉 as a reference on the product algebra represented in Fock space.

The conditionally Markov dynamics induces the output probability measure defined by the output state
for any observable given by a measurable functional of *v*=*v*^{t} and any initial state density. Here, the output state is given for each *t* by the density evolved from the initial one by the predual dynamics.

### (d) Stochastic quantum master equations

The predual dynamical maps are usually determined as resolving a *controlled stochastic master equation*. This equation, first derived for continuous non-demolition measurements in Belavkin [6–8], reads in the general case as
and
where ♭ denotes the modular adjoint (coinciding with * in the case of the trace *φ*=tr) and `C`^{j}=`I`+`L`^{j}. The conditional state satisfies the Belavkin *stochastic nonlinear filtering equation*, which, in the case of *ε*=0, reads as

d*ϱ*^{t}=*υ*(*y*(*t*),*ϱ*^{t}) d*t*+∑_{j}*θ*_{j}(*ϱ*^{t}) d*w*_{j}(*t*)  (3.1)

and

d*w*_{j}(*t*)=d*v*_{j}(*t*)−〈*ϱ*^{t},`L`^{j}+`L`^{j}*〉 d*t*,  (3.2)

where respectively *υ*(*u*,*ϱ*) is the drift of the deterministic filtering equation and

*θ*_{j}(*ϱ*)=`L`^{j}*ϱ*+*ϱ*`L`^{j}*−〈*ϱ*,`L`^{j}+`L`^{j}*〉*ϱ*.
Given *ϱ*^{t_{0}}=*ϱ*, the solution defines a state-valued Markov process with generator D(*u*,*ϱ*), defined in the diffusive case by an elliptic second-order operator and in the jumping case (*ε*=1) by a Feller Laplacian.
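A minimal Euler–Maruyama sketch of a diffusive (*ε*=0) filtering trajectory for a single channel, in the standard density-matrix convention (the operators, rates and step sizes are illustrative assumptions, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(1)

sz = np.array([[1, 0], [0, -1]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)

lam = 0.5
H = 0.3*sy                 # Hamiltonian control term (illustrative)
L = (lam/2)*sz             # measurement coupling, L = L+ here

def drift(rho):
    """Deterministic part: -i[H, rho] + L rho L+ - (1/2){L+L, rho}."""
    LdL = L.conj().T @ L
    return -1j*(H @ rho - rho @ H) + L @ rho @ L.conj().T - 0.5*(LdL @ rho + rho @ LdL)

def theta(rho):
    """Fluctuation coefficient: L rho + rho L+ - <rho, L + L+> rho."""
    M = L @ rho + rho @ L.conj().T
    return M - np.trace(M).real*rho

rho = 0.5*(I2 + 0.4*sz)    # a mixed initial state
dt = 1e-4
for _ in range(20000):
    dw = np.sqrt(dt)*rng.normal()      # innovation increment, (dw)^2 ~ dt
    rho = rho + drift(rho)*dt + theta(rho)*dw
    rho = 0.5*(rho + rho.conj().T)     # keep Hermitian against round-off

# The filtering equation preserves normalization along each trajectory
assert abs(np.trace(rho).real - 1.0) < 1e-6
purity = np.trace(rho @ rho).real
assert 0.3 < purity <= 1.05
```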

## 4. Quantum dynamical programming

The *cost to go* of a feedback control law *y*(*t*)=ℓ(*t*,*v*^{t}), defining an adapted previsible process *y*_{ω} with respect to the innovation *ω*, is
Owing to the statistical interpretation of quantum states,
The optimal average cost on the interval (*t*,*τ*] is
with a terminal term that can be given by any concave functional S[*ϱ*] of the terminal state *ϱ*^{τ}.

### (a) Quantum Hamilton–Jacobi equation

Here, the control strategy *y*=ℓ will be non-random, ℓ={*u*_{j}(*t*)}, as will be any specific cost J[{*u*}]. As for the cost to go at the times *t*<*t*^{′}≤*τ*, one has
Suppose that {*u*^{o}(*r*,*ϱ*):*r*>*t*} is an optimal control when starting in state *ϱ* at time *t*, and denote by {*ϱ*^{r}:*r*∈(*t*,*τ*]} the corresponding state trajectory starting at a state *ϱ* at *t*. Bellman’s optimality principle observes that
The equation is then to be solved subject to

### (b) Pontryagin’s maximum principle

We may rewrite this as the *Hamilton–Jacobi* (HJ) equation
introducing the Pontryagin Hamiltonian as the transform
defined on by the quadratic in `q` functional
which is affine in `p` as independent of the state. This leads to the Hamiltonian boundary value problem, the *Hamilton–Pontryagin problem*, whose solutions with `b`=∇_{ϱ}S[*ϱ*] define the minimal costs by a path integral.

### (c) The filtered quantum Bellman equation

If the filtering equation is used in place of the master equation, we have the diffusive or jump Bellman equation.
This can be rewritten in the generalized HJ form as
in terms of the generalized (Bellman) ‘Hamiltonian’
where the series terminates in the diffusive case *ε*=0. If the coefficients *θ*^{j} do not depend on *u*, this gives
The optimal control strategy coincides in this case with the solution *u*^{o}(`q`,`p`) of the corresponding non-stochastic problem for `q`=*ϱ*_{0}−*ϱ* and `p`=`p`(∇S(*t*,*ϱ*)).

### (d) Linear–convex state costs

Let *C*(** u**), *G*(** u**) and *H*(** u**) be linear in ** u**. The additive real-valued functions *c*(** u**) and *g*(** u**) are assumed to be convex in ** u**, e.g. indicating a constraint of ** u** to a convex compact set, and similarly for *g*. Using the affinity of the Hamiltonian in ** u**, we can describe the Pontryagin Hamiltonian by the usual Legendre transform E(** p**) of the constraint cost function *c*(** u**). It is defined as the extremal value at ** p**=(*p*_{α}) with the components *p*_{α}(*ϱ*,`X`)=〈*ϱ*,`P`_{α}(`X`)〉 given by `P`_{α}(`X`)=`B`_{α}+*i*[`Q`_{α},`X`].

## 5. Optimal feedback control of purification

Note that the set of the extremal points ** u**^{o}(** p**) is not empty, and any extremal point is good for each ** p** as realizing the only possible global maximal value of the concave function ** p**⋅** u**−*c*(** u**). It determines the diffusive Hamilton–Jacobi–Bellman (HJB) equation. In particular, in the case of the constraint |** u**|≤1 with no running cost one has E(** p**)=|** p**|. This ‘zero rest mass’ HJB equation can be obtained as the limit at *μ*↘0 of the ‘relativistic’ HJB equation with E(** p**)=(*μ*^{2}+|** p**|^{2})^{1/2}−*μ* corresponding to the cost gain under the constraint |** u**|≤1, because such E(** p**) is the Legendre transform of *c*(** u**)=*μ*(1−(1−|** u**|^{2})^{1/2}).

### (a) Bequest function for purification

Similarly, one can take other target functionals and obtain the concave bequest functions, where **F**=(`F`_{α}). The first function, in the form S[*ϱ*]=−|** q**| corresponding to `F`_{α}=*σ*_{α}, was used by Wiseman to optimize the quantum bit purification, and this can be done even better with a different choice of **F**.

Equally, one can use the Jacobs bequest function (see [9]) as the Legendre transform of the affine cost function given by *G*(** u**)=** q**⋅** u**. Another possible choice is an entropy, say the von Neumann entropy corresponding to *μ*=2`I`.
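For the qubit, a bequest function of entropy type depends only on |** q**|, since the eigenvalues of *ϱ*=½(`I`−*σ*_{q}) are (1±|** q**|)/2. A quick check (our code, not the paper's):

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def von_neumann_entropy(rho):
    """S(rho) = -tr(rho ln rho), computed from the eigenvalues."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-15]
    return float(-(w*np.log(w)).sum())

def qubit_state(q):
    return 0.5*(I2 - (q[0]*sx + q[1]*sy + q[2]*sz))

# Eigenvalues of rho are (1 +/- |q|)/2, so the entropy depends only on |q|
q = np.array([0.0, 0.48, 0.36])          # |q| = 0.6
lam = np.array([0.8, 0.2])               # (1 +/- 0.6)/2
expected = float(-(lam*np.log(lam)).sum())
assert abs(von_neumann_entropy(qubit_state(q)) - expected) < 1e-10

# Pure states (|q| = 1) have zero entropy; the central state has entropy ln 2
assert von_neumann_entropy(qubit_state(np.array([0.0, 0.0, 1.0]))) < 1e-9
assert abs(von_neumann_entropy(0.5*I2) - np.log(2)) < 1e-10
```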

### (b) Optimal feedback qubit control

Let `L`_{0} be proportional to *σ*_{z}, and let us ignore the effect of the environment by taking *R*_{j}=0 for *j*≠0. We may also take the cost of the constraint |** u**|≤1 and any of the above bequest functions S[*ϱ*].

Explicitly, we have 〈*υ*(*u*,*ϱ*),`p`〉=** v**(*u*,** q**)⋅** p** linear in ** u**, which is maximized by the unit vector ** u**^{o}=** q**×** p**/|** q**×** p**| under the constraint |** u**|≤1. This leads to the Hamiltonian function. Meanwhile, *θ*(*ϱ*)≡(*λ*/2)(*ϱσ*_{z}+*σ*_{z}*ϱ*)−〈*ϱ*,*λσ*_{z}〉*ϱ*, and so the diffusion term of the HJB equation follows.
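The maximizer can be checked numerically. Writing the controlled drift pairing as proportional to ** u**⋅(** q**×** p**) (an illustrative sketch under our conventions), the unit vector along ** q**×** p** beats every other direction:

```python
import numpy as np

rng = np.random.default_rng(7)

def u_opt(q, p):
    """Claimed maximizer of u.(q x p) over |u| <= 1."""
    w = np.cross(q, p)
    return w/np.linalg.norm(w)

q = np.array([0.2, -0.5, 0.4])
p = np.array([1.0, 0.3, -0.7])

gain = lambda u: u @ np.cross(q, p)
best = gain(u_opt(q, p))

# No random unit vector does better
for _ in range(5000):
    u = rng.normal(size=3)
    u /= np.linalg.norm(u)
    assert gain(u) <= best + 1e-12

# The maximal value is |q x p|
assert abs(best - np.linalg.norm(np.cross(q, p))) < 1e-12
```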

### (c) The qubit Hamilton–Jacobi–Bellman equation

With the customary abuse of notation, we write S(*t*,*ϱ*)=S(*t*,** r**) for ** r**=−** q** and *θ*(*ϱ*)=*λ*(−*zx*,−*zy*,1−*z*^{2}) in terms of ** r**=(*x*,*y*,*z*). The Itô term in the HJB equation is then given in terms of the second derivatives S_{xy}=∂^{2}S/∂*x*∂*y*, etc. Putting everything together, we find the HJB equation for optimal qubit control under the constraint |** u**|≤1.
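The Bloch-vector form of *θ* can be verified directly (our check, using *ϱ*=½(`I`+** r**⋅** σ**) with ** r**=−** q**):

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
paulis = [sx, sy, sz]

lam = 0.8
x, y, z = 0.3, -0.4, 0.5
rho = 0.5*(I2 + x*sx + y*sy + z*sz)          # rho = (I + r.sigma)/2, r = -q

# Fluctuation coefficient theta(rho) = (lam/2)(rho sz + sz rho) - <rho, lam sz> rho
theta = (lam/2)*(rho @ sz + sz @ rho) - np.trace(rho @ (lam*sz)).real*rho

# Its Bloch image t_a = tr(theta sigma_a) should equal lam*(-z x, -z y, 1 - z^2)
t = np.array([np.trace(theta @ s).real for s in paulis])
expected = lam*np.array([-z*x, -z*y, 1 - z*z])
assert np.allclose(t, expected, atol=1e-12)

# theta is trace-free, so the filtering equation preserves normalization
assert abs(np.trace(theta)) < 1e-12
```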

## 6. Discussion

Our analysis is based on the fact that the quantum state is a sufficient coordinate not only for closed but also for open unstable quantum systems under the Markov approximation, and this remains true even if the open system is under a continuous demolition observation. However, beyond the lowest-dimensional systems we have to deal with differential equations of high or infinite dimensionality.

Nevertheless, the Bellman principle can then be applied in much the same spirit as for classical states, and we are able to derive the corresponding HJB theory for a wider class of cost functionals than traditionally considered in the literature.

When restricted to the Bloch sphere for the qubit with the cost being a quantum expectation, we recover the class of Bellman equations encountered so far in quantum feedback purification.

## Footnotes

One contribution of 15 to a Theo Murphy Meeting Issue ‘Principles and applications of quantum control engineering’.

- This journal is © 2012 The Royal Society