## Abstract

Prostate cancer is one of the most common malignant neoplasms in men with an overall incidence of approximately 15 per cent during the normal life span. Androgen-deprivation therapy (hormone therapy) is an effective treatment of this disease when progressed to an advanced stage. Despite impressive responses, such treatment when applied on a continuous basis is not curative and eventually culminates in androgen-independent disease. On the other hand, intermittent androgen suppression (IAS) was first conceived as a potential way of delaying progression to androgen-independence, in addition offering the possibility of reducing adverse effects and improving the quality of life. Although the validity of this approach has been confirmed in several clinical studies, the optimal scheduling of the cycles of on- and off-treatment remains to be explored. In the present article, we show that IAS lends itself to mathematical modelling with hybrid dynamical systems and that the model we have developed can be used to select the best strategy for keeping prostate cancer in an androgen-dependent state as long as possible. Our results also suggest that the current way of using IAS exceeds what is necessary for optimal control; in fact, we have found that to achieve optimal control, the amount of therapy (dose and duration of drugs) can be reduced by a factor of one half.

## 1. Introduction

The success of androgen-deprivation therapy (Huggins & Hodges 1941; Huggins *et al.* 1941) is based on the ability to induce apoptosis in prostatic epithelial cells, a cell death process that is regulated by the function of the androgen receptor and results in tumour regression when androgens are withdrawn.

Despite the fact that androgen withdrawal of approximately six months duration results in normal levels of the tumour marker serum prostate-specific antigen (PSA) in about 90 per cent of patients, the benefit of such therapy is usually temporary (Whitmore 1973). This is because under complete androgen withdrawal surviving tumour cells generally progress to an androgen-independent state; it is probably that permanent androgen ablation is the impetus for further growth and progression of androgen-independent cells (Feldman & Feldman 2001). However, if androgen-ablation therapy is stopped before the tumour cells progress to an androgen-independent state, the surviving cells may retain their dependence on androgen. In this regard, since pharmacological castration has proven effective, the androgen-ablation routine can be stopped and re-started at any time as has been demonstrated in a number of clinical studies (Akakura *et al.* 1993; Goldenberg *et al.* 1995; Higano *et al.* 1996; Evens *et al.* 1998; Bruchovsky *et al.* 2006; Seruga & Tannock 2008; Abrahamsson 2010).

Figure 1 shows the clinical data of the behaviour of the serum PSA during intermittent androgen suppression (IAS) covering a period of about 2000 days. It should be noted that the serum PSA was not allowed to exceed 20 ng ml^{−1} during the off-treatment periods of IAS. The database is described in detail elsewhere (Bruchovsky *et al.* 2006). Although the basic protocol is well established, the question whether it is optimal or not remains open to challenge.

In this article, we address the question of how to determine the optimal schedule of the cycles of on- and off-treatment. For this purpose, we develop a mathematical model and apply optimal controlling based on the model. The basic model we investigate here is a hybrid dynamical system (Witsenhausen 1966). A hybrid dynamical system is one with both discrete and continuous variables. In our basic model, a discrete variable is used to represent on- and off-intervals of androgen-withdrawal treatment and continuous variables represent the populations of cancer cells. The model we employ includes the effect of mutual competition between androgen-dependent and -independent cancer cells (Shimada & Aihara 2008). Since the model contains nonlinear terms representing mutual competition, it is hard to obtain an analytical solution of optimal controlling. To overcome this difficulty, we approximate the basic model to a piecewise affine (PWA) system (Sontag 1981; van der Schaft & Schumacher 2001), which belongs to the class of hybrid systems. It is known that many kinds of problems appearing in analysis/control synthesis of PWA systems can be reduced to mixed integer programming (MIP; Bemporad & Morari 1999; Bemporad *et al.* 2000, 2002; Grossmann 2002). Our simulation shows that the resultant optimal schedule reduces both the PSA level and the amount of treatment compared with the conventional IAS therapy.

## 2. Mathematical model

Here, we introduce a mathematical model that explains the dynamical behaviour of prostate cancer (Ideta *et al.* 2008; Shimada & Aihara 2008; Tanaka *et al.* 2010). Let *x*=(*x*_{1}, *x*_{2}) denote the subpopulations of cancer cells where *x*_{1} and *x*_{2} correspond to androgen-dependent cells and androgen-independent cells, respectively. PSA is denoted by *y*=*c*_{1}*x*_{1}+*c*_{2}*x*_{2} where *c*_{1}=*c*_{2}=1 is assumed for the sake of simplicity. Parameter *u* represents androgen withdrawal in the on and off modes, i.e.

The basic mathematical model discussed in this article is the following one proposed by Shimada & Aihara (2008):2.1a2.1band2.1c
Here, we assume that the unit time (the time interval between *t* and *t*+1) of the above dynamics corresponds to four weeks (28 days). Parameters *β*_{1} and *β*_{2} are mutual suppression rates caused by competition between androgen-dependent cells and androgen-independent cells for a nutrient resource inside a prostate tumour. Parameter *α*_{1}(*u*) represents the net growth rate of androgen-dependent cells, namely the difference between the proliferation and the apoptosis rates that depend on the serum androgen concentration. We assume *α*_{1} as2.2while *α*_{2} is the constant growth rate of androgen-independent cells. Since *u* assumes the value of 0 or 1 only, the model (2.2) can express all combinations of the net growth rates of androgen-dependent cells during on-treatment and off-treatment periods. Parameter *m*(*u*) denotes the mutation rate from androgen-dependent cells to androgen-independent cells, which is defined as

Figure 2 shows the vector fields of our model in the state space with *u*=1 and *u*=0 where the parameter setting is as follows: *m*_{0}=0, *m*_{1}=0.01, *β*_{1}=0.2, *β*_{2}=0.2, , *α*^{2}_{1}=−1.2 and *α*_{2}=0.3. The green solid and the red dotted lines in figure 2 are the nullclines and 𝒩_{2}(*u*)={*x*|*x˙*_{2}=0 with given *u*}, respectively.

We can see that the nullclines change depending on whether the treatment is on (*u*=1) or off (*u*=0). This switching of the treatment modes in turn generates switching of the vector fields affording the emergence of a stable area where *x* can be kept in a bounded domain with small *x*_{2} (Ideta *et al.* 2008; Shimada & Aihara 2008; Tanaka *et al.* 2010) as shown in figure 3*a*. This is possible even if a stable equilibrium point never exists in a stationary situation of either permanent androgen withdrawal (*u*=1) or replacement (*u*=0); in fact there are only unstable equilibrium points that are cross points of and 𝒩_{2}.

The trajectory of IAS starting from *x*=(12, 0.25) is shown in figure 3*a*. Androgen withdrawal and replacement are switched according to the rule that (i) if *y* exceeds a given value *r*_{1} during an off-treatment period, we re-start withdrawal, and (ii) if *y* falls below another value *r*_{0} during an on-treatment period, then the dosing for androgen withdrawal is stopped. In figure 3*a*, *r*_{0} and *r*_{1} are set to be *r*_{0}=0.4 and *r*_{1}=13. The lines *y*=*r*_{0} and *y*=*r*_{1} are shown as dotted lines in figure 3*a*; the graph demonstrates that the state is confined to an area between the two dotted lines and also that it converges to a stable cycle. Thus IAS can equate to therapy that keeps the state in a safe stable region by on- and off-treatment cycles. Figure 4*a* shows the temporal changes of PSA under IAS. It confirms that our model can reproduce the dynamical behaviour of PSA similar to the clinically observed data in figure 1.

Next, we generate a relapse phenomenon in our model. In clinical practice, permanent androgen withdrawal terminates in the appearance of recurrent disease in the majority of cases. Figure 4*b* illustrates an example of such relapse behaviour in our model. Here the androgen-dependent cells decrease, while the mutation to androgen-independent cells is enhanced owing to continuous androgen withdrawal; the system goes to relapse owing to an increase in androgen-independent cells as shown in figure 3*b*.

Tanaka *et al.* (2010) discuss other mathematical models such as stochastic models and linear switching models (Ideta *et al.* 2008; Hirata *et al.* 2010). See that article and references therein for more details.

## 3. Optimal scheduling

### (a) Piecewise affine model

To optimize the treatment schedule, we explore the possibility of reducing the system of prostate cancer in equation (2.1) to a simpler version that is more manageable. Here we employ a PWA system (Sontag 1981; van der Schaft & Schumacher 2001). A hybrid system is generally one with both discrete and continuous variables (Witsenhausen 1966) as exemplified in equation (2.1). In particular, PWA systems are of a type where the state space is piecewisely divided into polyhedral regions, each of which is attached with affine dynamics. By introducing auxiliary discrete variables, we can describe discrete transitions between different continuous dynamics; driving a car with gear change is one example of such a PWA system. It is known that many kinds of problems appearing in analysis of PWA systems can be reduced to MIP (Grossmann 2002).

The PWA system modelling allows us to construct an explicit control law (Bemporad *et al.* 2000, 2002). The PWA system model introduced here is basically a PWA approximation of the system (2.1) using the parameter setting described in appendix A. The details of constructing the PWA system can be found in appendix A. Here we roughly sketch that. Let us define a feasible region where *x*^{max}_{1} and are the maximum values of *x*_{1} and *x*_{2}, respectively, with and *x*^{max}_{2}=15. The region is decomposed into polygons as shown in figure 5; splitting into polygonal regions makes the optimization procedure tractable. The decomposition is obtained by splitting the state space along the nullclines and 𝒩_{2} with additional dividing lines so that the dynamics of the PWA system becomes close enough to that of the original nonlinear system (2.1). With the resultant polygons *P*_{1}, …, *P*_{10}, the PWA system which we consider here can be written as3.1

For each *P*_{i} (*i*=1,…, 10), we determine affine dynamics (*A*_{i}, *f*_{i}) that is a local approximation of system (2.1). For this purpose, we employ a criterion of approximation, that is, each affine system well approximates the original nonlinear dynamics (2.1) on the boundary of the polygon. Figure 6 shows an example of dynamical behaviour of PSA in the PWA system. We can see that it faithfully reproduces actual clinical data (figure 1).

### (b) Model predictive control

Based on the model constructed above, the optimal scheduling is computed through model predictive control (MPC; see Bemporad & Morari 1999; Heemels *et al.* 2010 for details). For simplicity, we assume that we can observe the internal state *x*. To actually observe the internal state, an observer system like the Kalman filter is available (Kalman & Bucy 1961; Balluchi *et al.* 2002). Hereafter the optimal control law is designed under this assumption of observability. To obtain the optimal control law (Bemporad & Morari 1999; Bemporad *et al.* 2000, 2002), the system is converted to a discrete-time system. The MPC uses the input sequence as the solution of the following optimization problem:3.2where ||*x*||_{1}=|*x*_{1}|+|*x*_{2}|, *T*=7 (the unit of time which corresponds to *t*=0.6^{1} in equation (A1)) indicates the observation time step of PSA. We can reflect our dosing policy in the optimal scheduling through the weight *R*, which represents the therapeutic gain or loss related to the intensity of treatment as guided by clinical decision. For example, if a patient has an early stage cancer that can be managed conservatively but the side effects of treatment such as hot flushes are severe, then the use of large doses of drug might be avoided to reduce toxicity. In such a case, *R* should be assigned a large value. On the other hand, if a higher priority is given to suppressing the patient’s cancer as much as possible as opposed to avoiding side effects, *R* should be assigned a small value. We solve the MIP problem to minimize the cost function of equation (3.2).

We computed the optimally controlled solutions from an initial state *x*=(13, 2) with *R*=4 and *R*=9000, and illustrate the results in figure 7. Figure 7*a*i, *b*i indicates the temporal changes of *x*_{1} and *x*_{2}. Figure 7*a*ii, *b*ii shows the evolutions of PSA. The red dotted curves in figure 7*a*iii, *b*iii are unsuccessfully treated trajectories resulting in a relapse under permanent androgen withdrawal. The blue solid curves inside the black-arrowed circles, on the other hand, indicate the bounded trajectories when optimal control is realized. Figure 7*a*iv, *b*iv shows the transitions between on- and off-treatment periods. It should be noted that all of these results represent a type of IAS with short periods of on-treatment. When *R* is large as in figure 7*b*, the amount of drug administered is less and peak values of PSA are higher. By comparison, when *R* is small as in figure 7*a*, the amount of drug administered is more and the PSA values are smaller. Also, in this situation, the optimal trajectory moves around the edge of a safety–danger boundary, that is the boundary between the ‘danger area’ where a relapse cannot be avoided and the ‘safety area’ where PSA can be well regulated. By contrast, when *R* is large (figure 7*b*), the trajectory moves in a region of higher *x*_{1} and lower *x*_{2}. In this case, the androgen-independent cells are more suppressed. If we want to make the amount of drug as small as possible but still sufficient for constraining the number of tumour cells inside the safety range and simultaneously offer the possibility of a better quality of life, therapy with a large *R* is favoured. From the shapes of the PSA time series in figure 7*a*ii, *b*ii, it may be assumed that conventional IAS as applied in the clinic can be described as a treatment with a large *R*. This implies that such therapy does not excessively suppress androgen, but rather keeps the PSA level inside a safety region by appropriately reducing the dose intensity of therapy. However, our predictions indicate that there is room to improve the treatment schedule of IAS by optimal control as discussed in §4.

## 4. Discussion

The average amounts of drug dose and PSA over 2240 days are summarized in table 1. The average dosing rate is the rate of days in the on-treatment mode, which is expressed as . Shown in the table are the average dosing rates and the average PSA values under optimal control using the model. The corresponding data were also obtained from the simulation in figure 6 and an actual clinical case (figure 1*b*, patient 2) of IAS. We can see that the optimal control with *R*=9000 yields an average dosing rate that is about one-half of the conventional dose of intermittent therapy (table 1, clinical data). Moreover, it should be noted that even when *R*=4, the average drug dosing rate is lower than that with conventional IAS. With respect to the average level of PSA, it is best when *R*=4 as expected. At this *R* value, the average PSA is about two-thirds of the level achieved in the other treatments cases. This result strongly suggests that predictions based on optimal control can improve cyclic hormone therapy for prostate cancer.

If we impose a high penalty on drug dose (large *R*), the on-treatment is terminated when the PSA level is still relatively high. The result implies that intermittent therapy with short periods of on-treatment that fail to achieve sufficiently low nadirs is still compatible with keeping PSA in the safety range. This is different from actual clinical practice as illustrated in figure 1. On the other hand, if we impose a low penalty on drug dose (small *R*), the resultant data suggest that it is optimal to repeat androgen ablation while PSA is in a lower range taking care not to cross the separatrix in the state space. Our method makes it possible to flexibly derive optimal strategy according to clinical decisions that reflect the patient’s condition by selecting the appropriate value for *R*.

In reality, there is uncertainty about the dynamics of prostate cancer so that the system might not evolve as predicted. Therefore, it is preferable to employ a large *R* to obtain a conservative solution. Another approach to solve the uncertainty is to develop an optimal controlling method based on stochastic models as discussed in Tanaka *et al.* (2010). However, such an undertaking is beyond the scope of the present study, but will be addressed in future work. In conclusion, we believe that the combination of mathematical modelling and optimal control strategies as presented in this article will lead to more sophisticated and individualized treatment regimens for prostate cancer.

## Acknowledgements

The authors would like to thank Dr K. Akakura, Dr G. Tanaka, Dr A. M. Ideta, Dr T. Shimada, Dr Y. Hirata and Dr T. Takeuchi for their valuable discussions. This research is partially supported by the Japan Society for the Promotion of Science (JSPS) through its Funding Programme for World-Leading Innovative R&D on Science and Technology (FIRST Programme).

## Appendix A. Construction of the piecewise affine system

In this appendix, we illustrate how we constructed the PWA system that approximates the original nonlinear dynamical system (2.1). The feasible region where and *x*^{max}_{2}=15 is decomposed into polygons as shown in figure 8.

The decomposition is basically obtained by splitting the state space along the nullclines and 𝒩_{2}. Since is not a straight line when *u*=1, we approximated it as a line {(*x*_{1}, *x*_{2})|*x*_{1}=*α*_{2}/*β*_{2}}, where the *x*_{1} coordinate is that of with *x*_{2}→∞. Splitting lines that are not along the nullclines have been obtained so that the dynamics of the PWA system becomes close enough to that of the original nonlinear system (2.1). We name each polyhedral component *P*_{1},…, *P*_{9}.

For each *P*_{i} (*i*=1,…, 9), we attach affine dynamics (*A*_{i}, *f*_{i}) that is a local approximation of system (2.1). For this purpose, we employ a criterion of approximation, that is, each affine system approximates the original nonlinear dynamics (2.1) on the boundary of the polygon. However, it is impossible to satisfy that condition at every point on the boundary. Therefore, we choose appropriate boundaries where the dynamics coincides. On a boundary where either *x*_{1} or *x*_{2} is constant, it is possible to make the approximated affine dynamical system coincide with the original nonlinear one because the dynamics of equation (2.1) is linear if either *x*_{1} or *x*_{2} is constant. Hereafter, we will illustrate the details of our PWA system modelling.

We write the PWA dynamics asA1whereLet us introduce , and , where is the *x*_{2} coordinate of the horizontal line in . If there is no confusion, we abbreviate as . We can see that on or the dynamics of equation (2.1) is affine with respect to *x*_{2} or *x*_{1}, respectively. Therefore, we can determine *A*_{i} and *f*_{i} to agree with the original one on such a region. In the following, we illustrate the dynamics on each polyhedral region.

—

*P*_{1}. On*P*_{1}, we chose {(*x*_{1},*x*_{2})∈*P*_{1}|*x*_{1}=0} and as the boundaries where the dynamics of the PWA system agrees with the original one. When*x*_{1}=0, the original dynamics isTherefore, we obtain ,*f*^{1}_{1}=0, and*f*^{2}_{1}=0.When , the original dynamics isTherefore, we obtain

*a*^{3}_{1}=*m*(0)−*β*_{2}*x*_{2}, and . However, indicates*x*_{1}does not change temporally in this region, which is far from the original dynamics. To correct this discrepancy, we modified as , where*ϵ*_{1}>0 is a small constant.—

*P*_{2}. On*P*_{2}, we employed and as the boundaries where the dynamics of the PWA system agrees with the original one. Then we obtain by simple calculation as the case of*P*_{1}A2—

*P*_{3}. On*P*_{3}, we employed and as the boundaries where the dynamics of the PWA system agrees with the original one. Then we obtainA3—

*P*_{4}. On*P*_{4}, we employed and as the boundaries where the dynamics of the PWA system agrees with the original one. Then we obtainA4—

*P*_{5}. On*P*_{5}, we chose and as the boundaries where the dynamics of the PWA system agrees with the original one. Then we obtainA5Since*m*(0)=0 in our setting, indicates the temporal change of*x*_{2}vanishes in*P*_{5}. Thus as in the case of*P*_{1}, we modified where*ϵ*_{2}>0 is a small constant.—

*P*_{6}. On*P*_{6}, we employed and {(*x*_{1},*x*_{2})∈*P*_{6}|*x*_{2}=*xˆ*_{2}} as the boundaries where the dynamics of the PWA system agrees with the original one. This is the same situation as in*P*_{3}. Therefore, we obtainA6Hereafter we deal with the situation*u*=1. In this situation, the horizontal line of disappears from the feasible area . Alternatively we consider that is a*x*_{2}coordinate of at*x*_{1}=*x*^{max}_{1}. We can see that is an almost*x*_{2}-constant line in a region of large*x*_{1}. Thus we employ a line as an approximation of 𝒩_{2}.—

*P*_{7}. On*P*_{7}, we employed {(*x*_{1},*x*_{2})∈*P*_{7}|*x*_{1}=0} and as the lines where the dynamics of the PWA system agrees with the original one. Then we obtainA7—

*P*_{8}. On*P*_{8}, we employed and {(*x*_{1},*x*_{2})∈*P*_{8}|*x*_{2}=*x*^{max}_{2}} as the lines where the dynamics of the PWA system agrees with the original one. Then we obtainA8—

*P*_{9}. On*P*_{9}, we employed and {(*x*_{1},*x*_{2})∈*P*_{9}|*x*_{2}=*x*_{2}} as the boundaries where the dynamics of the PWA system agrees with the original one. Then we obtainA9

The vector field and the behaviour of the PWA system obtained above are shown in figures 9 and 10. Next we slightly modify the system to make it more plausible. Our original nonlinear model and the PWA system obtained above cannot demonstrate behaviour of a reduced velocity of PSA decrease, i.e. after the initiation of treatment a rapid decrease of PSA is observed; after a short period of time, the rate of decrease changes and becomes slower. To realize such biphasic behaviour we split *P*_{9} into two polyhedral components at (figure 11). The newly generated components are designated *P*_{9} and *P*_{10}.

Then the dynamics is reset asA10where *ω* is a scaling parameter. We employed *ω*=0.2 and .

The behaviour of PSA in our newly obtained PWA model is shown in figure 6 and more closely mimics the graph for patient 1 in figure 1.

## Footnotes

↵1

*t*=0.6 corresponds to about two weeks. Although a shorter interval allows for more precise control, in reality, it is difficult to switch between the on- and off-treatment modes so frequently.One contribution of 12 to a Theme Issue ‘Theory of hybrid dynamical systems and its applications to biological and medical systems’.

- © 2010 The Royal Society