## Abstract

The identification of production functions from data is an important task in the modelling of economic growth. In this paper, we consider a non-parametric approach to this identification problem in the context of the spatial Solow model which allows for rather general production functions, in particular convex–concave ones that have recently been proposed as reasonable shapes. We formulate the inverse problem and apply Tikhonov regularization. The inverse problem is discretized by finite elements and solved iteratively via a preconditioned gradient descent approach. Numerical results for the reconstruction of the production function are given and analysed at the end of this paper.

## 1. Introduction

A crucial ingredient of every economic growth model is the choice of the production function. A common choice and a good approximation for highly developed countries is a concave production function (e.g. the Cobb–Douglas production function [1]). However, if one wants to account for effects like poverty traps (see [2]), one might want to choose a wider class of production functions, e.g. functions that are convex for low and concave for higher values. One of the first models with a non-concave production function was introduced by Skiba in 1978 [3] (similar ideas for epidemics can be found in [4,5]).

As the shape of the production function has such a great impact on the economic model, we consider the problem of identifying the production function from data about economic development. The underlying economic model is the spatial Solow model (see [6–9]), an extension of the Solow growth model, which Solow introduced in 1956 [10] as the first attempt to describe long-run growth by an analytical model. The idea of including regional components has regained attention in the last two decades through the new economic geography introduced by Krugman [11].

Identifying the production function from (noisy) data is an ill-posed inverse problem in the sense that the solution does not depend continuously on the data. We therefore introduce regularization to stabilize the computations, specifically Tikhonov regularization.

Before discussing the inverse problem, we will introduce the spatial Solow model in §2. In §3, we will formulate the parameter identification problem and examine its regularization. The iterative solution of the problem and the numerical results are described in §4, followed by a brief outlook in §5.

## 2. The spatial Solow model

Following the ideas of Camacho *et al*. [7], we assume a continuous space structure, as it is reasonable to suppose that in modern economies all locations have access to goods. Let *k*(*x*,*t*) denote the capital stock held by the representative household located at *x* at date *t*, for *x*∈*Ω*⊆ℝ^{*n*}, *n*∈{1,2} and *t*≥0. The budget constraint of household *x*∈*Ω* is then modelled as
$$\frac{\partial k}{\partial t}(x,t) - \Delta k(x,t) = A(x,t)\,f(k(x,t)) - \delta\,k(x,t), \tag{2.1}$$
where *A*(*x*,*t*) denotes the technological level at *x* and time *t*, *f* is the production function and *δ* is the depreciation rate. The standard neoclassical production function is assumed to be non-negative, increasing and concave, and to satisfy the Inada conditions, that is,
$$\lim_{k\to 0^{+}} f'(k) = +\infty, \qquad \lim_{k\to+\infty} f'(k) = 0. \tag{2.2}$$
We will depart from the assumptions with respect to concavity in particular around zero as well as the first Inada condition and allow for general convex–concave production functions, an example being
2.3
If we choose *p*>1, *f* will be convex for small values of *k* and concave for large values (figure 1). Such examples of *f* are of particular interest, because they are related to the potential existence of poverty traps and, in turn, to the identification of such traps by solving an inverse problem.
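
The change of curvature described above can be checked numerically. The following is a minimal sketch that assumes one particular S-shaped form, *f*(*k*)=*α*₁*k*^{p}/(1+*α*₂*k*^{p}); this form and the names `production`, `alpha1` and `alpha2` are illustrative assumptions, not necessarily the example (2.3) of the text.

```python
def production(k, p=3.0, alpha1=1.0, alpha2=1.0):
    """Assumed S-shaped example f(k) = alpha1 * k^p / (1 + alpha2 * k^p)."""
    kp = k ** p
    return alpha1 * kp / (1.0 + alpha2 * kp)


def second_difference(f, k, h=1e-3):
    """Central second difference, a numerical stand-in for f''(k)."""
    return (f(k + h) - 2.0 * f(k) + f(k - h)) / (h * h)


def inflection_point(p, alpha2):
    """Capital level where the assumed f switches from convex to concave.

    Follows from f''(k) = 0, i.e. k^p = (p - 1) / ((p + 1) * alpha2), for p > 1.
    """
    return ((p - 1.0) / ((p + 1.0) * alpha2)) ** (1.0 / p)


# Curvature at a small and at a large capital value (default p = 3)
curv_low = second_difference(production, 0.2)
curv_high = second_difference(production, 3.0)
k_star = inflection_point(3.0, 1.0)
```

For *p*>1 the curvature is positive below `k_star` and negative above it, which is the convex–concave shape associated with poverty traps.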

Using an interval of the form *I*=[0,*K*], the set of admissible production functions *F*_{adm} is defined as
2.4
*f*′_{max} being a fixed constant, which can be understood as the maximal growth that an economy is capable of. Underdeveloped countries in particular can have very high growth rates for short time periods (even far more than 100%), but it is clear that for every country the possible growth is still limited.

In addition to (2.1), we assume that the initial capital distribution, *k*(*x*,0), is known and that there is no capital flow to infinity, that is,
2.5
For a bounded *Ω*, see below after equation (2.7).

The technological level is determined via a diffusion equation of the form
2.6a
2.6b
2.6c
with *g*_{A} being either constant, a function depending only on space or a function depending on space as well as on time. Thus the model reads as
2.7
with initial condition *k*(*x*,0)=*k*_{0}(*x*)>0 for *x*∈*Ω*. Condition (2.5) is approximated in this case by the homogeneous Neumann boundary condition, representing no capital flow through the boundary and thereby a closed economy.

We scale this model by introducing the scaled variables *x̃*=*x*/*L* and *t̃*=*δt*. As there is no characteristic quantity for *k*, we will not introduce any scaling for it. Although the model is theoretically set up to have an infinite time horizon, we of course need to choose an endpoint, denoted by *T*, to solve this model numerically. This leads us to the following scaled version of the model (by directly renaming *x̃*,*t̃* to *x*,*t*, respectively):
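$$\frac{\partial k}{\partial t} - d\,\Delta k = g(k,x,t) \quad \text{in } \Omega\times(0,T], \tag{2.8a}$$
$$\frac{\partial k}{\partial \nu} = 0 \quad \text{on } \partial\Omega\times(0,T], \tag{2.8b}$$
$$k(x,0) = k_0(x) \quad \text{in } \Omega, \tag{2.8c}$$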
with *d*=1/*δL*^{2} and *g*(*k*,*x*,*t*)=(*A*(*x*,*t*)/*δ*)*f*(*k*)−*k*.
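
The stated expressions for *d* and *g* can be traced back to a change of variables. The derivation below is a sketch under two assumptions: that the unscaled equation has the form ∂*k*/∂*t*−Δ*k*=*A f*(*k*)−*δk* with unit diffusion coefficient, and that the scaled variables are *x̃*=*x*/*L* and *t̃*=*δt*. Both are inferred from the definitions of *d* and *g* given above.

$$
\begin{aligned}
\frac{\partial k}{\partial t} &= \delta\,\frac{\partial k}{\partial \tilde t},
\qquad
\Delta_{x} k = \frac{1}{L^{2}}\,\Delta_{\tilde x} k,\\[4pt]
\delta\,\frac{\partial k}{\partial \tilde t} - \frac{1}{L^{2}}\,\Delta_{\tilde x} k &= A\,f(k) - \delta\,k\\[4pt]
\Longrightarrow\quad
\frac{\partial k}{\partial \tilde t} - \underbrace{\frac{1}{\delta L^{2}}}_{=\,d}\,\Delta_{\tilde x} k
&= \underbrace{\frac{A}{\delta}\,f(k) - k}_{=\,g(k,x,t)} .
\end{aligned}
$$

Dividing by *δ* is what turns the depreciation term into the plain −*k* in *g* and collects the remaining constants into the single diffusion parameter *d*.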

Using ideas from [12], we can show that the model (2.8) is well posed.

### Theorem 2.1 (existence and uniqueness of classical solutions)

*Let k*_{0}*∈C*^{2}*(Ω),* *and let t*_{0}*>0. Let* *be Lipschitz continuous and A∈C(Ω×[0,T]). Then problem (2.8) has a unique solution k∈C*^{1}*([0,T],C*^{2}*(Ω)).*

### Proof.

The assertion follows by using theorems 12, 13 and 14 in [12, ch. 11] and theorem 3.6 in [9]. ▪

The existence of weak solutions for less regular initial values and *f*∈*F*_{adm} can be verified by an approximation argument:

### Theorem 2.2 (existence and uniqueness of weak solutions)

*Let* *and let T>0, f∈F*_{adm}*, and A∈C(Ω×[0,T]). Then problem (2.8) has a unique weak solution k∈L*^{2}*([0,T],H*^{1}*(Ω))∩H*^{1}*([0,T],H*^{−1}*(Ω)). Furthermore, this solution is globally bounded.*

### Proof.

We can find a globally bounded sequence such that in *L*^{2}(*Ω*) and a sequence such that *f*^{m}≡*f* outside *I* and in . Noticing that the right-hand side *g*(*x*,*t*,*k*)=(*A*/*δ*)*f*(*k*)−*k* is negative for *k* sufficiently large, we can employ a maximum principle and conclude uniform boundedness of *k*^{m} and hence of *k*. Moreover, the boundedness of *f*′ and *f*(0) implies
which allows us to conclude standard *a priori* estimates for *k*^{m} in *L*^{2}([0,*T*],*H*^{1}(*Ω*))∩*H*^{1}([0,*T*],*H*^{−1}(*Ω*)) by multiplying the equation with *k*^{m} and integrating over space and time. Thus, there exists a weakly convergent subsequence *k*^{mℓ}, which converges strongly in *L*^{2}(*Ω*×[0,*T*]) due to the Aubin–Lions lemma (see [13]). The latter and the regularity of *f* can be used to show that indeed , which finally implies that *k* is a bounded weak solution of (2.8). Uniqueness follows with a similar estimate, noticing that also
▪

Furthermore, the solution of problem (2.8) depends continuously on the reaction term *f*, which establishes the well-posedness of (2.8).

### Theorem 2.3 (continuous dependence)

*Let k*_{i} *be a weak solution of (2.8) with f*_{i}*∈F*_{adm} *for i=1,2, such that k*_{i}*(x)∈I a.e. in Ω×[0,T]. We then have*
2.9
*for some constant C depending only on f′*_{max}.

### Proof.

See theorem 3.7 in [9]. ▪

A more detailed analysis of this model can be found in [8], where it is shown how different parameters of the model, such as the depreciation rate and the technology terms, affect the capital distribution over time. Furthermore, the simulation results there show that a convex–concave production function is in fact capable of recreating the effect of poverty traps.

## 3. Identification of production functions

As mentioned above, the choice of the production function is crucial for an economic model, as its shape will greatly influence the capital distribution. However, there are many different possibilities to make this choice and often it is not clear which production function will fit the situation best. A reasonable question would be: Is it possible to identify the underlying (real) production function from data about the capital distribution of some spatial economy?

In general, data about the economic situation, such as the gross domestic product (GDP), of different regions and different countries are readily available. For example, the spatial interdependence among economies is examined on the basis of 204 European regions over a time frame of 23 years in [14]. However, when using real data, it is necessary to adjust and extend the model in order to obtain realistic quantitative results rather than merely qualitative ones (see §5).

We will stick to our qualitative model and assume that we have data, denoted by *k*(*x*,*t*), matching this model and try to identify the corresponding true production function *f*. Of course, we cannot assume to have exact data, because the measurement of economic factors in itself is a difficult task and, thus, we expect to have data that are corrupted by some kind of noise. In our case, we assume additive Gaussian noise, i.e. we have data of the kind
with noise level *ε*. These measurements are obtained in a subset of *Ω*×[0,*T*]. Complete measurements would mean obtaining *k*^{ε} in the full set, but there are at least two relevant examples of measurements in subsets. The first is the case when one only measures capital around a few discrete time steps; the second is when one is able to monitor the capital *k* only in a limited spatial region. In particular, the first is a motivation for using a variational framework for the inverse problem instead of direct natural linearization methods (see [15]). In the second, one would probably also need to consider the identification of the initial value in the inaccessible region.
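
The noise model can be illustrated with a few lines of code. This is a minimal sketch, assuming the noise is independent, zero-mean Gaussian with standard deviation `eps` at every grid point; the exact scaling of the noise used in the experiments is not reproduced here, and the function name and the seeding are illustrative.

```python
import random


def add_gaussian_noise(k, eps, seed=0):
    """Return a noisy copy of the grid k[time][space].

    Each entry receives independent noise eps * N(0, 1); this scaling of the
    noise level is an assumption made for illustration.
    """
    rng = random.Random(seed)  # fixed seed so that experiments are repeatable
    return [[value + eps * rng.gauss(0.0, 1.0) for value in row] for row in k]


# Tiny example grid with two time steps and three spatial nodes
k_exact = [[0.0, 0.5, 1.0], [0.1, 0.6, 1.1]]
k_noisy = add_gaussian_noise(k_exact, eps=0.05)
```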

To put this in mathematical terms, we have the nonlinear parameter-to-solution map *F*:*F*_{adm}→*L*^{2}([0,*T*];*L*^{2}(*Ω*)), mapping the production function *f* to the respective capital distribution *k*_{f}, which solves (2.8),
$$F(f) = k_f. \tag{3.1}$$
The results of §2 directly imply the well-definedness and continuity of *F*. We now want to identify the production function *f* from the noisy data *k*^{ε}, which we expect to be an ill-posed inverse problem. We can formulate this problem in terms of an optimization problem, i.e. we want to minimize
with being a weighted *L*^{2} norm, that is,
3.2
Here, *χ* can be used to account for the above-mentioned case of incomplete data by setting *χ*(*x*,*t*)=0 if there is no measurement available at location *x* and time *t*.
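
The role of *χ* as a mask for missing measurements can be made concrete as follows. This is a sketch under stated assumptions: the weighted norm is approximated by a simple Riemann sum on a uniform grid, and the factor 1/2 in the data term is a common convention that is assumed here rather than taken from the text. The names `weighted_sq_norm` and `misfit` are hypothetical.

```python
def weighted_sq_norm(u, chi, dx, dt):
    """Riemann-sum approximation of the integral of chi * u^2 over space-time."""
    total = 0.0
    for u_row, chi_row in zip(u, chi):
        for u_val, weight in zip(u_row, chi_row):
            total += weight * u_val * u_val
    return total * dx * dt


def misfit(k_model, k_data, chi, dx, dt):
    """Data term 0.5 * ||k_model - k_data||_chi^2 (the factor 0.5 is assumed)."""
    diff = [[a - b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(k_model, k_data)]
    return 0.5 * weighted_sq_norm(diff, chi, dx, dt)


# chi = 0 switches off grid points without measurements
k_model = [[1.0, 2.0], [3.0, 4.0]]
k_data = [[0.0, 0.0], [0.0, 0.0]]
chi = [[1.0, 0.0], [0.0, 1.0]]
value = misfit(k_model, k_data, chi, dx=0.5, dt=2.0)
```

Entries with `chi` equal to zero contribute nothing, so the model is free to take any value there without affecting the data term.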

Let us first discuss the issue of identifiability in the case of perfect data, i.e. given *k*=*F*( *f*). We can rewrite the equation as
which implies that *f*(*K*) is uniquely determined for each *K* such that there exists (*x*,*t*) in the interior of the measurement domain with *k*(*x*,*t*)=*K* and *A*(*x*,*t*)≠0. By a continuity argument, *f* is determined on the closure of the set consisting of such *K*. On the other hand, it is apparent that *f* cannot be determined uniquely on an interval of values not appearing in the measurements.

The conditions of the admissible parameter set *F*_{adm} already act as a regularizer, i.e. in the above formulation the inverse problem is not ill-posed if we look for a least-squares solution. In numerical tests, we will see that in practice additional regularization is not really necessary in situations where the data are good. However, we will nonetheless introduce an additional Tikhonov regularization term, which leads to the optimization problem
3.3
with *f** being some kind of *a priori* guess for *f*. We set the interval *I*=[0,*K*_{max}] for to be the domain of the parameter, such that
As we only have information for this interval, it is clear that it is not possible to identify the parameter outside of this domain. Note however that, for typical noisy data, very large values of *k*^{ε} will be mainly caused by noise, so that Tikhonov-type regularization can be quite useful in such regions, which we shall observe again in the numerical results below. The basic existence result is given by:

### Theorem 3.1

*Let J*_{β} *be defined as in (3.3) and let β≥0. Then there exists a minimizer of J*_{β} *in F*_{adm}.

### Proof.

The admissible set is a closed and compact subset of the space of continuous functions. The continuous dependence result in theorem 2.3 yields the continuity of the data term in the functional, while the lower semicontinuity of the regularization term follows from its convexity. Hence, there exists a minimizer of *J*_{β}. ▪

We cannot expect the functional *J*_{β} to be convex (unless *β* is very large, which is not a relevant case) owing to the nonlinearity of the operator *F*; hence the uniqueness of minimizers is left open.

Note that existence holds even in the case *β*=0. In order to understand the effect of regularization, we examine the Lagrangian
3.4
and the corresponding optimality conditions for the direct problem
3.5
for the adjoint equation
3.6
and for the parameter
3.7
where *f*_{s} denotes the generalized derivative of *f* and *h* is such that *h*(0)=0, *h*_{s}(*s*)≥0 if *f*_{s}(*s*)=0 and *h*_{s}(*s*)≤0 if *f*_{s}(*s*)=*f*′_{max}. If *β*=0, one has to expect a bang–bang-type behaviour in the solution, i.e. several subintervals with either *f*_{s}=0 or *f*_{s}=*f*′_{max}. Note that the last optimality condition for *β*=0 implies (due to the arbitrary choice of *h*) that, if neither of the two bounds is active,
which is not to be expected for most *s*. In the case *β*>0, one observes that *f*−*f** solves a second-order differential equation on *I*, hence more regularity is expected. These findings will be illustrated by numerical experiments in the following.

## 4. Numerical solution and results

In this section, we will show some numerical results of the parameter identification problem introduced above. We will start by giving some details about the simulated dataset used for the calculations and then show the identification results for a constant technological level and for a space-dependent technology term.

### (a) Simulated dataset

For simplicity, we will restrict the computations to the case of one spatial dimension and use a simulated capital distribution on [0,1]×[0,*T*] shown in figure 2, with *L*=50, *T*=150 and *δ*=0.05. An equidistant grid with 250 nodal points has been used for both the space and the time domain, which leads to a spatial-step size Δ*x*=0.2 and a time-step size Δ*t*=0.6. The classic second-order difference quotient has been used to discretize the diffusion. The time derivative in the resulting semi-discrete model has then been discretized by the backward difference quotient, leading to the backward Euler method.
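
A time-stepping scheme of this kind can be sketched in a few lines. The code below is a simplified variant rather than the scheme used for the experiments: diffusion is treated implicitly with the second-order difference quotient, while the reaction term is evaluated at the previous time step (a semi-implicit simplification of the fully implicit backward Euler method). The equation is assumed to be *k*_t−*d k*_xx=*A f*(*k*)−*k* with a constant, already scaled technology level `A`; all function names are hypothetical.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are not used."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def solve_forward(k0, f, A, d, dx, dt, n_steps):
    """Semi-implicit stepping for k_t - d k_xx = A f(k) - k with zero-flux ends."""
    n = len(k0)
    r = d * dt / (dx * dx)
    lower = [-r] * n
    upper = [-r] * n
    diag = [1.0 + 2.0 * r] * n
    # Homogeneous Neumann condition via mirrored ghost nodes
    upper[0] = -2.0 * r
    lower[-1] = -2.0 * r
    k = list(k0)
    history = [k]
    for _ in range(n_steps):
        # Reaction term taken at the old time level (explicit part)
        rhs = [ki + dt * (A * f(ki) - ki) for ki in k]
        k = thomas(lower, diag, upper, rhs)
        history.append(k)
    return history


# Illustrative run: a step-shaped initial capital and a linear f
k_init = [0.0] * 5 + [1.0] * 5
states = solve_forward(k_init, lambda k: 0.5 * k, 1.0, 0.008, 0.1, 0.05, 20)
```

The value `d = 0.008` in the example equals 1/(*δL*²) for the quoted *δ*=0.05 and *L*=50; the grid sizes are chosen only for illustration.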

We will keep these values fixed throughout this whole section. We obtained the data by solving the direct problem with the production function (figure 3*a*)
the initial capital distribution shown in figure 3*b*, a constant technology term *A*(*x*,*t*)=1 and a constant starting point *f**(*k*)=1 for all *k*.

To ensure that we have sufficient data for the reconstruction especially for small values of *k*, we used an initial capital distribution with *k*_{0}(*x*)=0 on parts of the spatial domain.

In the case of incomplete data, it seems natural to choose . However, owing to measurement noise and also in the case of incomplete data, this might underestimate the real size of the interval. Moreover, the variational regularization effectively yields zero derivative of *f* at *k*=*K*_{max}, so one should make sure that *K*_{max} is large enough to avoid artefacts from this boundary condition. In our experiments, it turned out that is a reasonable choice as an upper bound for the interval *I*, hence we here report results with this value.

### (b) Numerical results

To solve the minimization problem (3.3), we will apply the gradient descent algorithm. In [9], it has been shown that the operator *F* is Fréchet differentiable and, hence, *J*_{β} is Fréchet differentiable as well. The computation of the derivative of *J*_{β}( *f*) will require high numerical effort. To reduce this effort, we will not compute the derivative directly but use the directional derivatives of the Lagrangian. We can calculate *J*′_{β}( *f*)*h* by subsequently solving and and evaluating .

Note that solving these equations also includes solving the direct problem as well as the adjoint problem. For both these computations, a different (usually coarser) discretization has been used. In this case, both the space and time domain have been discretized using only 150 nodal points instead of the 250 nodal points used in creating the simulated dataset. To apply the more refined data to the coarser grid, a simple interpolation has been used.

We use a simple steepest descent method and a backtracking line search method to find a minimum of our functional *J*_{β} (for further information on optimization algorithms, see, for example, [16]).
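
The combination of steepest descent and backtracking can be sketched generically. The code below is an illustration on a plain vector of unknowns with an Armijo-type sufficient-decrease test; the parameter names (`step0`, `shrink`, `c1`) and their default values are assumptions, and the functional and gradient are passed in as callables rather than computed from the Solow model.

```python
def steepest_descent(J, grad, x0, step0=1.0, shrink=0.5, c1=1e-4,
                     tol=1e-8, max_iter=500, min_step=1e-12):
    """Minimize J by steepest descent with a backtracking line search."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        g_norm_sq = sum(gi * gi for gi in g)
        if g_norm_sq ** 0.5 < tol:
            break  # gradient is small enough: treat x as stationary
        J_x = J(x)
        step = step0
        trial = [xi - step * gi for xi, gi in zip(x, g)]
        # Backtracking: shrink the step until the Armijo condition holds
        while J(trial) > J_x - c1 * step * g_norm_sq:
            step *= shrink
            if step < min_step:
                return x  # no acceptable step found, keep the current iterate
            trial = [xi - step * gi for xi, gi in zip(x, g)]
        x = trial
    return x


# Illustration on a simple quadratic with minimizer (1, -2)
def quadratic(x):
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 2.0) ** 2


def quadratic_grad(x):
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 2.0)]


x_min = steepest_descent(quadratic, quadratic_grad, [0.0, 0.0])
```

In the identification problem, each evaluation of the functional requires a solve of the direct problem, so the number of backtracking steps directly affects the overall cost.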

However, we want the parameter *f* to fulfil some regularity constraints, i.e. *f* should be in *H*^{1}(*I*). Furthermore, we will fix the boundary values *f*(0)=0 and *f*(*K*_{max})=1, which means that . We get the weak update formula
which, by using integration by parts, leads to
Thus, the update involves the solution of another partial differential equation
with the aforementioned boundary values.
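
The smoothing effect of such an update can be sketched on a uniform grid. The code below assumes, for illustration only, that the update direction *h* solves −*h*″=*r* on *I* with *h*=0 at both end points, where *r* stands for the unsmoothed gradient residual; zero boundary values for *h* keep the fixed boundary values of *f* unchanged. The precise operator used for the update is not reproduced here, and the name `sobolev_gradient` is hypothetical.

```python
def sobolev_gradient(residual, ds):
    """Solve -h'' = residual on a uniform grid with h = 0 at both end points."""
    n = len(residual)
    m = n - 2  # number of interior unknowns
    rhs = [residual[i + 1] * ds * ds for i in range(m)]
    # Forward elimination for the tridiagonal matrix with stencil (-1, 2, -1)
    c, d = [0.0] * m, [0.0] * m
    c[0], d[0] = -0.5, rhs[0] / 2.0
    for i in range(1, m):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] + d[i - 1]) / denom
    # Back substitution; the boundary entries of h stay zero
    h = [0.0] * n
    for i in range(m - 1, -1, -1):
        h[i + 1] = d[i] - c[i] * h[i + 2]
    return h


# Constant residual 2 on [0, 1]: the continuous solution is h(s) = s * (1 - s)
direction = sobolev_gradient([2.0] * 5, ds=0.25)
```

Because the direction is obtained by inverting a second-order operator, oscillations in the residual are damped, which is the preconditioning effect in the gradient method.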

#### (i) Constant technological level

We will give some numerical examples for the reconstruction of the production function (see [8,9]).

We assume we have the given data *k*^{εi}∈[0,1]×[0,*T*] with *i*=1,2,3 (figures 2, 4 and 5) with the respective noise level

The actual noise levels and the relative error *ρ*_{i}=∥*k*_{f†}−*k*^{εi}∥_{L2}/∥*k*_{f†}∥_{L2} are shown in table 1.
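
The relative error *ρ* defined above is straightforward to evaluate on gridded data. The sketch below uses discrete *L*² norms on a uniform grid, for which the grid weights cancel in the ratio; the function name is hypothetical.

```python
import math


def relative_error(k_true, k_noisy):
    """rho = ||k_true - k_noisy|| / ||k_true|| with discrete L2 norms."""
    num = sum((a - b) ** 2
              for row_a, row_b in zip(k_true, k_noisy)
              for a, b in zip(row_a, row_b))
    den = sum(a ** 2 for row in k_true for a in row)
    return math.sqrt(num / den)


# Small worked example on a grid with one time step and two nodes
rho = relative_error([[3.0, 4.0]], [[3.0, 0.0]])
```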

We will start by considering *k*^{ε1} and *k*^{ε2} and first try to reconstruct the production function without additional Tikhonov regularization, i.e. *β*=0. In the case of noise-free data, we are able to identify the production function quite well (figure 6*a*). If we have the data *k*^{ε2}, the reconstruction is similar but not as good as in the first case (figure 6*b*). Note that the peak around *k*=20 is merely an artefact as the true capital distribution does not provide information for larger values of *k*. By adding Tikhonov regularization, we can smooth out this peak and further improve the reconstructions in both cases (figure 7).

Now, we take a look at the data *k*^{ε3}. Table 1 shows that the noise level *ε*_{3} and the relative error *ρ*_{3} are comparatively large, thus suggesting that a good reconstruction might not be possible. Indeed, we see that the reconstruction develops a very strong peak around *k*=1. The steep increase however can still be identified, although no additional regularization is used (figure 8*a*). Even with regularization, we are not able to smooth this area (figure 8*b*). Note that the exact data include an area with low capital and an area with high capital, connected by an area where the amount of capital increases greatly due to the steep increase of the production function. By looking at the data *k*^{ε3}, we see that despite the large noise these two areas can still be distinguished. Thus, the information about the steep increase of the production function is still contained in the data and therefore can be reconstructed. Especially the area with low capital, however, is greatly affected due to the high noise level and a reasonable reconstruction, even with a high amount of additional regularization, is not possible. Keep in mind that we are using a very simple noise model. It is not at all clear that the same noise model holds for low-capital and high-capital regions. A more realistic noise model could greatly influence the quality of the reconstructions for a real dataset.

Table 1 summarizes the results of the above computations.

#### (ii) Space-dependent technological level

We will now look at the case where the technological level *A*(*x*,*t*) is not constant over both space and time but is of the form seen in figure 9*a*. The corresponding capital distribution is shown in figure 9*b*.

In figure 10, we can see that the reconstruction is again quite good even without additional regularization. Also, the reconstruction for higher values is better than in the previous case with constant technology. This is simply due to the fact that higher values for the capital are present in the data (figures 2 and 9*b*). We can further smooth the solution by adding Tikhonov regularization, but as the original reconstruction is already very good, this does not improve the solution much.

The behaviour for noisy data is the same as in the case of constant technology and therefore we will not go into detail on this. Also, having a technology term that depends not only on space but also on time (see (2.6)) has no influence on the reconstruction. In general, the technology term only influences the capital distribution and therefore only indirectly the reconstruction of the production function.

## 5. Conclusion and outlook

We were able to reconstruct the production function from data corresponding to a spatially augmented Solow model with different noise levels. Furthermore, different technology terms have been examined and we have seen that the reconstruction does not depend on the respective technological level. In general, for the production function to be identifiable, it is clear that the dataset needs to provide enough data for the respective capital level. Additional Tikhonov regularization, although not essential, can be used to improve the reconstructions.

An interesting point to examine further is the question of how much data is *enough*. This is especially interesting considering the case where real data are used. Using real data, however, calls for a more realistic model.

Usually, in real datasets, one does find values like the gross value added or the GDP for different countries and/or different regions. Instead of introducing a spatial dimension to the Solow model, often the Solow model is simply applied to every region individually and additional weighting parameters are introduced that represent the interaction between neighbouring regions (see [11,14,17]).

To incorporate real data into a spatially augmented Solow model, one needs to compute the Solow model in two space dimensions. The above-mentioned neighbouring effects also need to be included: for example, borders can influence the diffusion of capital and technological progress between regions, and regions affect each other differently depending on their relative distance. Detailed models of such effects might lead to graph Laplacians or non-local diffusion operators instead of the standard diffusion.

Furthermore, it is not realistic to assume that every region has the same potential for economic growth. At least a division between mobile workers and immobile peasants seems to make sense to describe the situation realistically, resulting in systems of equations. This leads not only to migration effects and worker agglomerations but also to an economy with two different sectors: the manufacturing sector with increasing returns to scale and the agricultural sector with constant returns (see [6]).

Furthermore, we simply assumed additive Gaussian noise in our model. This might of course not be realistic and, using real data, a more realistic noise model might be necessary to reliably reconstruct the production function.

## Funding statement

The work of R.E. has been supported by the German Research Exchange Office DAAD, the work of M.B. has been supported by the German Science Foundation DFG through the project *Regularization with Singular Energies*.

## Acknowledgements

The authors thank Prof. Dr Davide La Torre (University of Milan) for fruitful discussions.

## Footnotes

One contribution of 13 to a Theme Issue ‘Partial differential equation models in the socio-economic sciences’.

- © 2014 The Author(s) Published by the Royal Society. All rights reserved.