## Abstract

We extend the notion of universality of a function, due to Turing, to arbitrary (countable) effective domains, taking care to disallow any cheating on the part of the representations used. Then, we can be certain that effective universal functions cannot simulate non-effective functions.

## 1. Introduction

Alan Turing, in his groundbreaking paper, read in 1936, on the undecidability of the halting problem and the Entscheidungsproblem [1], also invented the notion of a universal machine. He explained the idea as follows:
*The universal computing machine.* It is possible to invent a single machine which can be used to compute any computable sequence. If this machine *I* is supplied with a tape on the beginning of which is written the [standard description] of some computing machine *M*, then *I* will compute the same sequence as *M*.[1, p. 241]

There are two domains that are standard in discussions of computability: (i) finite strings over finite alphabets, for which the Turing-computable partial functions are the effective ones, and (ii) the natural numbers, for which the effective functions are identified with the partial recursive ones. Davis [2] and Rogers [3] have proposed general definitions of universality for Turing machines and for partial recursive functions, respectively.

For other (countable) domains, we will adopt the analogous notion of effectiveness of a model of computation that was developed by Boker & Dershowitz [4]. Armed with the appropriate concept of generic effectiveness (§3), we extend the notion of universality to arbitrary sets of functions over arbitrary domains (§5), while addressing oft-ignored questions of ‘honesty’ of representations of domains (§4), pairings of elements (§6) and encodings of programs (§2). Our main result (theorem 5.2) assures us that—under fairly weak conditions—representations cannot boost computational power; so an effective universal function cannot encompass non-effective functions.

## 2. Function encodings

In the simplest case, a universal function simply ‘carries on its shoulders’ a whole set of functions. The functions we speak of may be partial (unless stated otherwise). In particular, a universal function is meant to simulate both total and partial functions. So universal functions are partial by their very nature, and may be undefined at some points.

Let *Φ* be some (usually, but not necessarily, countably infinite) set of unary functions over a domain *D*. Given a binary function *ψ* over *D*, let *Ψ*={*ψ*_{a}: *a*∈*D*} be the set of all specializations (cross sections) *ψ*_{a}=*λy*. *ψ*(*a*,*y*) of *ψ* in its first parameter. We say that *ψ* is *universal* for *Φ* if , where is the superset relation. This is the same as requiring there to be an *encoding* such that *φ*=*ψ*_{#φ} for all *φ*∈*Φ*. Equality of the two partial functions, the original one *φ* and the simulation *ψ*_{#φ}, means that *φ*(*y*)=*ψ*(#*φ*,*y*) for all *y*∈*D*, where equality (here, as well as later) is ‘Kleene's’; so the two sides must also agree with regard to the values at which either is undefined, in which case, both must be undefined.

We are placing no demands on the encoding (#), only that there be some parameter value (*a*=#*φ*) for which the universal function (*ψ*) exhibits the identical argument-value relation as that of the given function (*φ*). The fundamental idea of universality is that one function can by itself cover a collection of functions, in an *extensional* sense, but not that it necessarily uses the same, or similar, means as the given programs.

If one wants to incorporate functions of greater arity than 1, then one requires a family of universal functions, one for each arity, which we may imagine as one and the same varyadic function. In that case, we demand that *φ*(*y*_{1},…,*y*_{ℓ})=*ψ*(#*φ*,*y*_{1},…,*y*_{ℓ}), for all *φ* of arity ℓ, and values *y*_{1},…,*y*_{ℓ}.

In practice, of course, we are interested in *effective* universal functions—functions that can be expressed as programs. Then, a single universal function *ψ* provides an effective means of computing any and all computable functions *φ*, just by varying *ψ*'s first parameter. We discuss what it means to be effective in §3.

In particular, nothing in our definition explicitly rules out a perverse, non-computable permutation of a standard enumeration of programs, like assigning even codes to total (universally halting) *φ* and odd codes to strictly partial ones. Such an encoding, however, would preclude a universal function *ψ* being effective, because, were it to be, then *λi*. *ψ*_{2i} would be an effective enumerator of programs for all the total recursive functions, an impossibility. If the universal function is itself effective, then no encoding can endow it with the ability to simulate an uncomputable function. This is because the encoding #*φ* is simply a constant; thus, it can only supply finitely much additional information to *ψ*.

In point of fact, normally one is provided with a set of programs (standard descriptions of Turing machines, say) in some formalism (that is, programming language), which specifies the computed functions *Φ*, albeit with infinite duplication (there are always infinitely many programs with the same input–output behaviour). There is no harm in this, as then, from our point of view, #*φ* is the code (e.g. the Gödel number) of any one out of the infinitely many programs that compute *φ*, which, as it happens, is an uncomputable coding. The related notion of an *interpreter* of one programming language in another, on the other hand, would suggest that the interpreter has an effective way of understanding the programs it is ‘interpreting’; in that case, we would want distinct codes for distinctly behaving programs. This way or that, our definition of universality captures the intended extensional containment: a function *ψ* is universal for a set of programs if its specializations *Ψ* form a superset of the functions *Φ* computed by the given set of programs.

## 3. Effectiveness over arbitrary domains

For domains other than strings or numbers, we also need to be able to speak of effectiveness and effective universality. To that end, we need the characterization of effectiveness from Boker & Dershowitz [4]. For one thing, we may presume that all elements of the domain are reachable by means of given effective operations. Were that not the case, then even constant functions would be deemed non-effective. Moreover, all ‘basic’ operations that are provided initially by a model of computation need to be effective. To be sure of that, one may simply require that the domain is generated by (finitely-many free) constructors and that initial operations are programmable from nothing more than said constructors. For example, the positive integers in binary are generated by the constant 1, and two unary operations, ⌢ 0 for × 2, and ⌢ 1 for ×2+1. Thus, an effective model (like a random-access machine) may incorporate basic arithmetic operations that are programmable using these constructors.

We also assume that there is an effective test of equality for the domain. Insisting that the domain be generated by (free) constructors, as opposed to (finitely many) arbitrary operations, which may provide more than one way to refer to the same domain element, precludes the hiding of information in equalities between the values of distinct terms.

Besides the requirement that the given basic operations be programmable, effectiveness for general domains is just what one would expect. This generic notion of effectiveness allows us to talk formally about the effectiveness of operations from one domain to another, by requiring that everything can be built out of constructors for the union *C*∪*D* of the domains. That said and done, in what follows, we need not dwell on programming details.

The precise definition of effectiveness over arbitrary (countable) domains given in Boker & Dershowitz [4] is based on the generic abstract-state machine formalism of Gurevich [5]. States can be any (isomorphism-closed) class of logical structures over some fixed finite vocabulary. Furthermore, transitions must be effectively describable. This is formalized by saying that there is a fixed finite set of terms that ‘determines behaviour’ of the algorithm in a sense made precise by Gurevich [5] and Blass *et al*. [6]. (Transitions must also preserve the domain of states and commute with isomorphism.) In Boker & Dershowitz [4] and Dershowitz & Gurevich [7] have shown how the Church–Turing thesis, namely, that no effective model can do more than a Turing machine can, follows provably from the formalization of the earlier-mentioned characterization. Two other natural characterizations of effectiveness, based on Turing machines [8] and recursive functions (along the lines of Rabin [9] and others), are shown to be equivalent to this one in Boker & Dershowitz [10].

A *successor* function *s*:*C*→*C* is a unary function that enumerates its domain: for some *e*∈*C*. For any finitely generated domain, there clearly are effective successor functions. Just effectively enumerate all terms (or just all constructor terms), thereby enumerating all domain values, while skipping over duplicate values (if required). Therefore, by standard search methods, any effective total injection *f* has an effective inverse *f*^{−1} such that *f*^{−1}(*f*(*x*))=*x* for all *x* in the domain. (Even the inverse of a partial injection *f* can be computed for its domain of definition by a breadth-first search, bounding the number of steps the program for *f*(*x*) is allowed to run on any given potential inverse *x*. This method may also be employed to enumerate the domain when some operations are partial.)

## 4. Domain representations

To relate sets of functions (extensional models of computation) over two different domains, *C* and *D*, we will employ a liberal notion of ‘simulation’, with inputs mapped from *C* to *D* via an arbitrary (not explicitly effective) injective representation *ρ* and outputs by the same, or another, representation *σ*. For example, one typically represents a Platonic number *n*, as used by recursive functions, as a string in unary (1^{n}) or in binary. Conversely, a string can be viewed as a number in a base the size of the alphabet. By the same token, graphs and other data structures can be represented as strings. What we do not really want is for the representation to include non-trivial computational information, like whether a graph has a Hamiltonian cycle.

### Definition 4.1 (Simulation of functions)

For partial functions, and , and (injective) representations , we write and say that *h* *simulates* *g* via *ρ* and *σ*, if *g*=*σ*^{−1}°*h*°*ρ*, qua partial functions, that is, if *σ*(*g*(*x*))=*h*(*ρ*(*x*)) for all *x* in *C*, where equality is Kleene's and *σ* is strict (in the sense that *σ* of undefined is undefined).

### Definition 4.2 (Simulation of models)

For sets of partial functions, and , we write and say that *H* *simulates* *G* via *ρ* and *σ* if for all *g*∈*G*, there exists *h*∈*H* such that . We say that *H* *simulates* *G*, and write simply , if for some choice of representations *ρ* and *σ*.

When *ρ* and *σ* are the identity function, is the superset relation, . Simulation is transitive and reflexive.

When *σ*=*ρ*, it was shown in Boker & Dershowitz [11], proof of theorem 4.7 that *ρ* is necessarily effectively computable (over *C*∪*D*) if there is an effective function that simulates a successor function *s*∈*G*. (The significance of a successor was noted by Shapiro [12].) Even when *σ*≠*ρ*, it turns out that both must be effectively computable, provided that—in addition—the identity function is simulated effectively.

### Proposition 4.3

*If there are effective simulations over domain* *D* *of the successor and identity functions of domain* *C*, *via representations* *ρ* *and* *σ*, *then* *ρ* *and* *σ* *are effectively computable for* *C*∪*D*.

### Proof.

Let *s* be a successor function for *C*, starting from constant *e*∈*C*, and let * i* be the identity. It follows from the definitions that , and

*(*

*i**y*)=

*y*, for all

*x*,

*y*∈

*C*, where the hats indicate the respective simulating functions over

*D*. Putting those together, we have , from which it follows that .

As explained at the end of §3, the inverse of the simulation of the injection is computable for elements of the image *σ*(*C*) of *σ* in *D*. Hence, *ρ* is effectively computable because *ρ*(*e*) is just some constant, while *ρ*(*s*(*x*)) can be computed by a composition of effective functions from *ρ*(*x*).

Furthermore, is computable. ■

When functions take more than one argument, the representation function *ρ* should be extended to tuples: *ρ*〈*x*_{1},…,*x*_{ℓ}〉=〈*ρ*(*x*_{1}),…,*ρ*(*x*_{ℓ})〉. Otherwise, the earlier-mentioned definitions are unchanged.

## 5. Universality across domains

Using the notion of simulation, we arrive at the following generic definition of a universal function.

### Definition 5.1 (Universality)

Let *Φ* be some set of unary functions (over a domain *C*). A binary partial function *ψ* (over domain *D*) is *universal* for *Φ* if *Ψ* (={*λy*. *ψ*(*a*,*y*): *a*∈*D*}) simulates *Φ*.

In other words, *ψ* is universal for *Φ* if there is some injective input representation , and an injective output representation , plus an arbitrary encoding of the functions in *Φ*, such that *ψ*(#*φ*,*ρ*(*x*))=*σ*(*φ*(*x*)), for all *φ*∈*Φ* and *x*∈*C*. Note that if *ψ* is universal () and *ψ*′ simulates *ψ* (hence, ), then *ψ*′ is also universal ( by transitivity) (cf. [13], theorem 1).

Our main result is that an effectively computable universal function can only simulate effective functions, as long as it simulates the constructors of a domain. Thus, the representation cannot in fact provide the universal function with any information that might allow computation of the uncomputable.

### Theorem 5.2

*Let Φ be some set of unary functions over a domain C, including the successor and identity. Then, if there is an effective universal function (over any domain D) for Φ, then all the simulated functions φ∈Φ are also effective.*

### Proof.

Let *ψ* be the universal function. We have *φ*=*σ*^{−1}°*ψ*_{#φ}°*ρ*, for any *φ*∈*Φ*. By proposition 4.3, *ρ* and *σ* are effective because the simulations (, , etc.) of the successor and identity (*s*, * i* and the like) are effective. So

*φ*is an effective composition of effective functions. ■

In other words, the universal function cannot underhandedly simulate harder functions than it itself is capable of computing.

Nevertheless, just because *ρ* is effective does not mean that it cannot hide some information, albeit computable, in the representation, simply by mapping *x*∈*C* to a ‘tuple’ [*x*,*f*(*x*),*g*(*x*),…] in *D*, for finitely many computable functions *f*, *g*, etc., such as Hamiltonianism. (The square brackets here stand for any standard tupling operation that effectively converts a sequence into a single element.) But if there is an effective injective *ρ* and universal function *ψ*, it can be shown that there is also an effective bijective *ρ*′ and universal function *ψ*′, with the latter doing all the hard work. So, restricting the notion of simulation to bijective representations, though that is not the way things are usually done, could make sense.

## 6. Data pairing

One potential problem with the earlier-mentioned definition of a universal function is that some models of computation—like Turing machines—do not take their inputs separately, but, rather, all functions are unary (string-to-string for Turing machines). In such cases, one needs to be able to represent pairs (and tuples) as single elements. One standard pairing function for the naturals (used by Gödel) is the injection 〈*i*,*j*〉:=2^{i}3^{j}. Another, among many, is the (Cantor) bijection , or this one (used by Kalmár and others): 〈*i*,*j*〉:=2^{i}(2*j*+1)−1. For strings, one usually uses an injection like 〈*u*,*w*〉:=*u*;*w*, where ‘;’ is some symbol not in the original string alphabet.

There are several ways to proceed. The pairing function could reside in the source domain *C*, or in the target domain *D*, or in the representation of *C* as *D*. Nevertheless, this need raises a critical issue. Unless we demand that pairing be effective, there could be a machine that does too much, computing even non-effective functions. For example, a naive definition might simply ask that pairing be injective and that *φ*(*x*)=*ψ*〈#*φ*,*x*〉 for all *φ*∈*Φ* and *x*∈*D* (omitting parentheses around the pair). The problem is that an injective pairing could cheat and include the answer in the ‘pair’. In contrast with encodings, which can only provide a finite amount of information regarding a function *φ*, a dishonest pairing 〈#*φ*,*x*〉 may add an infinite quantity, by providing different data for each argument *x* of *φ*. For Turing machines, say, the pair 〈*u*,*w*〉 might be represented as *u*;*w* when machine *u* halts on input *w* and as *u*:*w* (with a colon instead of a semicolon) when it does not. Better yet, one could map 〈#*φ*,*y*〉↦[*φ*(*y*),#*φ*,*y*], where the square brackets are some ordinary tupling function for the domain. Then a putative universal machine could effortlessly ‘compute’ virtually anything, computable or otherwise, just by reading the encoded input.

So, we clearly need for pairing to be effectively computable, as Davis and Rogers also insist. But we are talking about models in which no function takes two arguments; so we might not have an appropriate notion of computable binary function at our disposal. To capture effectiveness of pairing in such circumstances, we demand the existence of component-wise successor functions. Let be a pairing function for *D*. The component-wise successor functions operate as follows: *s*_{1}:〈*a*,*b*〉↦〈*s*(*a*),*b*〉 and *s*_{2}:〈*a*,*b*〉↦〈*a*,*s*(*b*)〉. If *s*, *s*_{1} and *s*_{2} are effective, then we will say that *pairing is effective*. This is because one can program pairing so that , where *z*=*s*^{i}(*e*) and *y*=*s*^{j}(*e*). And if pairing is effective, then its two projections (inverses), *π*_{1}:〈*a*,*b*〉↦*a* and *π*_{2}:〈*a*,*b*〉↦*b*, are likewise effective. (Generate all representations of pairs in a zig-zag fashion, until the desired one is located. What the projections do with non-pairs is left up in the air.)

Another concern is that requiring pairing to be computable is too liberal for the purpose. One does not really want the pairing function to do all the hard real work itself. For example, the mapping could include *φ*(*x*) in the pair, at least for *φ* that are known to be total (like, for the primitive recursive functions, of which there are infinitely many). That would make it a trivial matter to be universal for those functions—just transcribe the answer from the input.

If, in addition to being effective, pairing is bijective, then we will deem it *honest*, since then there is no room for hiding information. For such a pairing with effective projections, there is an effective means of forming a pair 〈*a*,*b*〉 (by enumerating all of *D* until the two projections give *a* and *b*, respectively). Note that, with bijectiveness alone, without effectivity, one could still hide a fair amount of uncomputable information in a bijective mapping. For instance, imagine that 0 is the code of the totality predicate and that the rest of the naturals code the partial-recursive functions in a standard order. Map pairs (*i*+1,*z*) to 3〈*i*,*z*〉, where 〈⋅,⋅〉 is a standard bijective pairing; map (0,*z*) to 3*j*+1 when *z* is the (code of the) *j*th total (recursive) function; map (0,*z*) to 3*k*+2 when *z* is the *k*th non-total (partial recursive) function. Now, let *U* be some standard effective universal function. Then, for *y* divisible by 3, *ψ*(*y*):=*U*(*y*/3) would cover all the partial-recursive functions, whereas *ψ*(*y*):=*y*≡1 (mod 3) would yield the uncomputable totality predicate, when *y* representing (0,*z*) is not divisible by 3.

In the presence of a (not necessarily honest) pairing function, we say that *ψ* is universal if , where, this time, *ψ*_{a}=*λy*. *ψ*〈*a*,*y*〉.

### Definition 6.1 (Unary universality)

Let *Φ* be some set of unary functions (over a domain *C*). A unary function *ψ* (over domain *D*) is *universal* for *Φ*, via injective pairing function 〈⋅,⋅〉 (over *D*), if *Ψ* (={*λy*. *ψ*〈*a*,*y*〉: *a*∈*D*}) simulates *Φ*. If, in addition, pairing is bijective, then we call the universal function *honest*.

That is, *ψ* is universal if *ψ*〈#*φ*,*ρ*(*x*)〉=*σ*(*φ*(*x*)) for *φ*∈*Φ* and *x*∈*C*. Of course, we are interested in the case where both pairing and the universal function are effective.

### Theorem 6.2

*Let Φ be some set of unary functions over a domain C, including the successor and identity. Then, if there is an effective unary universal function (over any domain D) for Φ, via an effective injective pairing, all the simulated functions φ∈Φ are also effective.*

### Proof.

Apply theorem 5.2 to *ψ*′(*z*,*y*):=*ψ*〈*z*,*y*〉. ■

Suppose *Φ*={*φ*_{z}}_{z} is some standard enumeration of (the definitions of) the partial-recursive functions. On the basis of Davis's second definition of a universal Turing machine (which relies on a notion of effective mappings between strings and numbers, namely, recursive in Gödel numberings), Rogers defines the property ‘universal(III)’ of a unary numerical function *ψ* to be that *φ*_{z}(*x*)=*γ*(*ψ*〈*z*,*x*〉) for some effective (recursive) bijection *γ* and effective (but perhaps dishonest) pairing 〈⋅,⋅〉. Let us say that such a *ψ* is *Rogers universal*.

We may conclude the following from the definitions.

### Theorem 6.3

*If a function is Rogers universal, then it is universal in the sense of definition 6.1. Furthermore, there is an honest effective universal function.*

### Proof.

Let *ψ* be the given Rogers-universal function. Take the standard enumeration for the encoding #, identity for the input representation *ρ* and *γ*^{−1} (which is well defined and effective) for the output representation *σ*. Then, *ψ*〈#*φ*_{z},*ρ*(*x*)〉=*ψ*〈*z*,*x*〉=*σ*(*φ*_{z}(*x*)), as required.

For the second part, take some bijective pairing 〈〈⋅,⋅〉〉 with effective projections *π*_{1} and *π*_{2}, and let *ψ*′:=*λr*. *ψ*〈*π*_{1}(*r*),*π*_{2}(*r*)〉 be our honest universal function. Putting those together, we get *ψ*′〈〈*z*,*x*〉〉 =*ψ*〈*z*,*x*〉=*σ*(*φ*_{z}(*x*)). ■

## 7. Discussion

We have generalized the classical notion of universality to apply to arbitrary models over arbitrary domains. Indeed, for every effective model in the sense of Boker & Dershowitz [10], there is a universal program *ψ* over its own domain, which can be expressed as an effective abstract state machine. Moreover, any such *ψ* is universal for every effective model over any other effective domain, via a bijective representation between domains. (A bijection can be induced from the successor functions of the two domains.) We have seen that these universal functions cannot simulate more than can actually be done effectively within the model of computation itself.

Some directions for further thought are the following.

— Extend the notion of universal evaluation to partial evaluations as in Kleene's

*s*_{mn}theorem [14] and the Futamura projections [15].— Consider steppers and interpreters, which take programs and simulate their step-by-step behaviour, in a controlled fashion. Abstract state machines provide a means for capturing the step-by-step behaviour of arbitrary classical algorithms.

— Clarify the relation between a universal abstract state machine and non-effective universal functions [16], §6.

— Consider a Turing machine simulation in which initial tapes may have infinite effective content, rather than being all blank at the ends, an issue that has been raised in the recent Wolfram competition [17].

## Acknowledgements

We thank the referees for their expeditiousness and their helpful suggestions.

## Footnotes

↵† Research done partly while on leave at INRIA-LIAMA (Sino French Lab in Computer Science, Automation and Applied Mathematics), Tsinghua University, Beijing, China.

↵‡ This work was carried out in partial fulfillment of the requirements for a PhD degree at Tel Aviv University.

One contribution of 18 to a Theme Issue ‘The foundations of computation, physics and mentality: the Turing legacy’.

- This journal is © 2012 The Royal Society