## Abstract

We answer a question of Rudnick, largely in the negative, as to whether we have square root cancellation for error terms in moment calculations.

## 1. Background: Lang–Weil

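Start with a finite field *k* and *X*/*k* separated of finite type, which is smooth and geometrically connected, of dimension *n*≥1. The Lang–Weil estimate [1] is the assertion that for variable finite extensions *K* of *k*, we have the estimate

$$\#X(K) = (\#K)^{n} + O\bigl((\#K)^{n-1/2}\bigr).$$

Lang and Weil proved this by using its truth for curves, established by Weil, together with a fibration argument. From a modern point of view, Lang–Weil is best seen as resulting from Grothendieck's Lefschetz trace formula [2], combined with Deligne's estimates [3, Corollary 3.3.4]. For any prime ℓ not the characteristic *p* of *k*, we have

$$\#X(K) = \sum_{i=0}^{2n} (-1)^{i}\,\mathrm{Trace}\bigl(\mathrm{Frob}_{K} \mid H^{i}_{c}(X\otimes_{k}\bar{k}, \overline{\mathbb{Q}}_{\ell})\bigr).$$

One knows that $H^{2n}_{c}$ is one-dimensional, with Frob_{K} acting as (#*K*)^{n}, and, thanks to Deligne, that each $H^{i}_{c}$ is mixed of weight ≤*i* (for any chosen embedding of $\overline{\mathbb{Q}}_{\ell}$ into $\mathbb{C}$).

So the formula becomes

$$\#X(K) = (\#K)^{n} + \sum_{i\le 2n-1} (-1)^{i}\,\mathrm{Trace}\bigl(\mathrm{Frob}_{K} \mid H^{i}_{c}(X\otimes_{k}\bar{k}, \overline{\mathbb{Q}}_{\ell})\bigr)$$

with

$$\Bigl|\sum_{i\le 2n-1} (-1)^{i}\,\mathrm{Trace}\bigl(\mathrm{Frob}_{K} \mid H^{i}_{c}(X\otimes_{k}\bar{k}, \overline{\mathbb{Q}}_{\ell})\bigr)\Bigr| = O\bigl((\#K)^{n-1/2}\bigr).$$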
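As a numerical illustration of the estimate in the simplest case *n*=1, the following sketch counts points on one smooth affine curve over a few prime fields. The curve *y*^{2}=*x*^{3}+*x*+1 and the primes used are choices made here for illustration only; they do not come from the text. For this curve the error #*X*(𝔽_{p})−*p* is bounded by 2√*p* (Hasse's bound), which is the shape of error term Lang–Weil predicts.

```python
from math import sqrt


def count_affine_points(p: int, a: int = 1, b: int = 1) -> int:
    """Number of (x, y) in F_p x F_p with y^2 = x^3 + a*x + b (p an odd prime)."""
    # roots_of[v] = number of y in F_p with y^2 = v
    roots_of = [0] * p
    for y in range(p):
        roots_of[y * y % p] += 1
    return sum(roots_of[(x * x * x + a * x + b) % p] for x in range(p))


def lang_weil_error(p: int, a: int = 1, b: int = 1) -> int:
    """#X(F_p) - p, the error term for a curve (dimension n = 1)."""
    return count_affine_points(p, a, b) - p


if __name__ == "__main__":
    # The curve is smooth for p not dividing 2 * 31 (its discriminant is -16 * 31).
    for p in (101, 1009, 10007):
        err = lang_weil_error(p)
        print(p, err, round(abs(err) / sqrt(p), 3))
```

The printed ratio |error|/√*p* should stay bounded as *p* grows, while the error itself need not.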

## 2. Background: Deligne's equidistribution theorem

How does Deligne's equidistribution theorem relate to this? The situation is that we have a lisse $\overline{\mathbb{Q}}_{\ell}$-sheaf, ℓ≠*p*, on *X* which is pure of weight zero, of rank *r*≥1. Attached to it are its geometric and arithmetic monodromy groups *G*_{geom}⊂*G*_{arith}. These are algebraic groups over $\overline{\mathbb{Q}}_{\ell}$. One knows, again by Deligne, that (the identity component of) *G*_{geom} is semi-simple, cf. [3, Corollary 1.3.9] and its proof, and Theorem 3.4.1 (iii).

Suppose that our sheaf has *G*_{geom}=*G*_{arith}. Embed $\overline{\mathbb{Q}}_{\ell}$ into $\mathbb{C}$, view *G*_{arith} as a group over $\mathbb{C}$, and choose a maximal compact subgroup of the complex Lie group *G*_{arith}($\mathbb{C}$). Then for each finite extension *K*/*k*, and each *x*∈*X*(*K*), (the semi-simplification, in the sense of Jordan normal form, of) the Frobenius conjugacy class Frob_{x,K} meets this compact subgroup in a unique conjugacy class *θ*_{x,K}.

Deligne's equidistribution theorem asserts that as #*K*→∞, the classes {*θ*_{x,K}}_{x∈X(K)} become equidistributed in the space of conjugacy classes of this compact subgroup, for (the direct image of) Haar measure of total mass one, cf. [3, Theorem 3.5.3], [4, Theorem 3.6] and [5, Theorem 9.2.6].

The proof goes along the now usual lines, of estimating the appropriate Weyl sums. More precisely, for each irreducible non-trivial representation *ρ* of *G*_{arith}, we form the corresponding ‘pushout’ sheaf on *X*. By Peter–Weyl, what must be shown is that the large #*K* limit of
vanishes.

This sum is
in which is mixed of weight ≤*i*, and in which the highest term is (the Tate twist by −*n* of) the space of coinvariants of *G*_{geom} in the representation *ρ*. So the leading term vanishes
and we get the estimate

In view of Lang–Weil, we get

An equivalent formulation is this. Take any representation *σ* of *G*_{arith}, and denote by *N*(*σ*) the multiplicity of the trivial representation in *σ*. Thus, *N*(*σ*) is the dimension of the top compactly supported cohomology group *H*_{c}^{2n} of the sheaf attached to *σ*, upon which Frob_{K} operates as the scalar (#*K*)^{n}. Write *σ* as the direct sum of *N*(*σ*) copies of the trivial representation with a finite sum of irreducible non-trivial representations *ρ* of *G*_{arith}, say *σ*=*N*(*σ*)1⊕*τ*, with *N*(*τ*)=0. For *N*(*σ*)1, i.e. for the constant sheaf of rank *N*(*σ*), we have the tautological equality
For the sheaf , whose vanishes, the Lefschetz trace formula gives
By Deligne (and Lang–Weil), this last sum is , so we get
To the extent that the sum has a better estimate, e.g. because some of the cohomology groups *H*_{c}^{i} involved vanish for large *i*, or have lower weight than allowed by Deligne's general theorem that *H*_{c}^{i} has weight ≤*i*, we get a better estimate of the error term.
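For orientation, the chain of estimates described in this section can be written out as follows. This is a sketch in assumed notation, not a quotation of the original displays: $\mathcal{F}$ stands for the lisse sheaf, $\rho(\mathcal{F})$ for the pushout sheaf attached to an irreducible non-trivial representation $\rho$, and $h^{i}_{c}$ for the dimension of $H^{i}_{c}(X\otimes_{k}\bar{k},\rho(\mathcal{F}))$.

```latex
\begin{align*}
\sum_{x\in X(K)} \operatorname{Trace}\rho(\theta_{x,K})
  &= \sum_{i=0}^{2n} (-1)^{i}\,
     \operatorname{Trace}\bigl(\mathrm{Frob}_{K}\mid
       H^{i}_{c}(X\otimes_{k}\bar{k},\rho(\mathcal{F}))\bigr)
  && \text{(Lefschetz trace formula)}\\
H^{2n}_{c}(X\otimes_{k}\bar{k},\rho(\mathcal{F})) &= 0
  && \text{(no coinvariants, } \rho \text{ irreducible non-trivial)}\\
\Bigl|\sum_{x\in X(K)} \operatorname{Trace}\rho(\theta_{x,K})\Bigr|
  &\le \sum_{i\le 2n-1} h^{i}_{c}\,(\#K)^{i/2}
   = O\bigl((\#K)^{n-1/2}\bigr)
  && \text{(Deligne: weight} \le i\text{)}\\
\frac{1}{\#X(K)}\Bigl|\sum_{x\in X(K)} \operatorname{Trace}\rho(\theta_{x,K})\Bigr|
  &= O\bigl((\#K)^{-1/2}\bigr)
  && \text{(Lang--Weil: } \#X(K)\sim(\#K)^{n}\text{)}
\end{align*}
```

The last line is the general error term; any improvement has to come from vanishing, or lower weight, of the groups with *i* close to 2*n*−1.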

## 3. Rudnick's question

Zeev Rudnick raised what is, in hindsight, the obvious question:

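If *n*≥2, when can we do better? When will we get ‘square root cancellation’, i.e. an estimate, for every irreducible non-trivial representation *ρ* of *G*_{arith},

$$\frac{1}{\#X(K)}\sum_{x\in X(K)}\mathrm{Trace}\bigl(\rho(\theta_{x,K})\bigr) = O\bigl((\#K)^{-n/2}\bigr)?$$

Equivalently, when will we get an estimate, for every representation *σ* of *G*_{arith},

$$\frac{1}{\#X(K)}\sum_{x\in X(K)}\mathrm{Trace}\bigl(\sigma(\theta_{x,K})\bigr) = N(\sigma) + O\bigl((\#K)^{-n/2}\bigr)?$$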

## 4. Examples showing a largely negative response

In the following sections, we will give examples in which some *σ*'s have square root cancellation, and in which many others do not.

Fix integers *N*≥*n*≥2, a prime *p*>2*N*+1, and a non-trivial additive character *ψ* of $\mathbb{F}_{p}$. For *K*/$\mathbb{F}_{p}$ a finite extension, *ψ*_{K}:=*ψ*∘Trace_{K/𝔽_{p}} is a non-trivial additive character of *K*. Consider the *n* parameter family of sums, for each *K*, given by
There is a lisse sheaf on the affine space $\mathbb{A}^{n}$ of (*a*_{1},*a*_{2},…,*a*_{n}) whose trace function is given by these sums:
This sheaf is lisse of rank *N* and pure of weight zero. One knows [6, Theorem 19] that for this sheaf we have

### Lemma 4.1

*After passing to a finite extension* *the sheaf* *on* *has*

### Proof.

First extend scalars to . For any finite extension , each Frob_{x,K} has its characteristic polynomial with coefficients in , so in particular has its determinant in . The key point is that this field has a unique place lying over *p*. So has absolute value 1 at each Archimedean place (purity), and is a unit at all finite places of residue characteristic ℓ≠*p* (existence of ℓ-adic cohomology). By the product formula, the determinant must be a unit also at , so is a root of unity of order dividing 2*p*. If we take an extension of odd degree, then the square of each Frob_{x,K} has such a determinant. Thus, we have inclusions

From these inclusions, we certainly have
so there exists an ℓ-adic unit *α* such that after the constant field twist *α*^{deg} of , we have *G*_{geom}=*G*_{arith}, cf. [7, Lemma 3.1]. It remains only to show that any such *α* is a root of unity. [For if *α*^{N}=1, then after extension of scalars from to , we will have *G*_{geom}=*G*_{arith} for .] To see that any such *α* is a root of unity, choose any point . Then both and lie in *G*_{arith}, indeed the latter lies in *G*_{geom}. Comparing determinants, both of which are roots of unity of order dividing 4*p*, we see that *α*^{N} is a root of unity of order dividing 4*p*. ▪

For the remainder of this section, and in the two sections to follow, we work with the sheaf on , with large enough that
We denote by std the given (‘standard’) *N*-dimensional representation of *G*_{arith}, and by std^{∨} the dual representation. We will be concerned with the representations
$$\mathrm{std}^{\otimes A}\otimes(\mathrm{std}^{\vee})^{\otimes B}$$
of *G*_{arith}, for each pair of integers (*A*,*B*) with 0≤*A*,*B*≤*n* (excluding the case *A*=*B*=0, the trivial representation). We denote by *M*_{A,B}
the dimension of the space of invariants in std^{⊗A}⊗(std^{∨})^{⊗B}, and by *M*_{A,B}(*k*)
the ‘empirical moment’

We know that *M*_{A,B} is the large *q* limit of *M*_{A,B}(*k*). Our concern is with estimating the difference *M*_{A,B}(*k*)−*M*_{A,B}.

## 5. Explicit calculation of *M*_{A,B}(*k*)

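For any (*A*,*B*), the empirical moments *M*_{A,B}(*k*) and *M*_{B,A}(*k*) are complex conjugates of each other (after any embedding of $\overline{\mathbb{Q}}_{\ell}$ into $\mathbb{C}$). So we will assume from now on that
$$A \ge B.$$

In the affine space $\mathbb{A}^{A+B}$, with coordinates (*x*_{1},…,*x*_{A}, *y*_{1},…,*y*_{B}), consider the closed subscheme defined by the *n* equations
$$\sum_{j=1}^{A} x_{j}^{\,i} = \sum_{j=1}^{B} y_{j}^{\,i}, \qquad i=1,\dots,n.$$
[In the case *B*=0, this is the closed subscheme defined by the *n* equations
$$\sum_{j=1}^{A} x_{j}^{\,i} = 0, \qquad i=1,\dots,n.$$
]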

### Lemma 5.1

*For* *k* *a finite field of characteristic* *p*>*n*, *the* *k*-*points of this subscheme have the following explicit description*.

*(1) If* *n*≥*A*=*B*>0, *then a point lies in it if and only if the two lists* (*x*_{1},…,*x*_{A}) *and* (*y*_{1},…,*y*_{A}) *are rearrangements of each other, i.e. if and only if the first* *A* *elementary symmetric functions agree on them*.

*(2) If* *n*≥*A*>*B*≥0, *then a point lies in it if and only if the two lists of length* *A*, (*x*_{1},…,*x*_{A}) *and* (*y*_{1},…,*y*_{B},0,0,…,0) (*the second list obtained by padding out the list of* *y*_{i}'*s by appending* *A*−*B* *zeros*) *are rearrangements of each other*.

*(3)* (*a special case of* (*2*) *above*) *If* *n*≥*A* *and* *B*=0, *the only point of this subscheme is* (0,…,0).

### Proof.

Because the characteristic *p*>*n*, for *A*≤*n* the equality of the first *A* Newton symmetric functions is equivalent to the equality of the first *A* elementary symmetric functions. ▪

### Lemma 5.2

*For* *n*≥*A*≥*B*≥0, *but* (*A*,*B*)≠(0,0), *and* *a finite field of characteristic* *p*>*n*, *we have*

### Proof.

Expand each term of the sum defining . By definition, we have
Its *A*th power is then
The *B*th power of its complex conjugate is 1 if *B*=0, and for *B*>0 it is
So is times
Reversing the order of summation, and using orthogonality of characters, we see that is times
From the previous lemma, we know that for a point (*x*_{1},…,*x*_{A},*y*_{1},…,*y*_{B}) in this subscheme, the lists (*x*_{1},…,*x*_{A}) and (*y*_{1},…,*y*_{B},0,0,…,0) are rearrangements of each other. The function inside the additive character vanishes at such a point, so each summand is 1, and hence this last sum is just the number of *k*-points. ▪
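The identity proved here can be tested numerically on a small case. The sketch below rests on assumptions that are not visible in the surviving text: it takes the sums to be −(1/√*q*)∑_{x}*ψ*(*x*^{N+1}+∑_{i}*a*_{i}*x*^{i}), takes *ψ*(*r*)=exp(2π*ir*/*p*), and works over *k*=𝔽_{p} only. Under those assumptions the empirical moment should equal (−1)^{A+B}*q*^{−(A+B)/2} times the number of pairs of lists that are rearrangements of each other after padding with zeros.

```python
import cmath
from itertools import product


def empirical_moment(p, n, N, A, B):
    """Average over a in F_p^n of Trace(a)^A * conj(Trace(a))^B, with the
    ASSUMED normalisation Trace(a) = -(1/sqrt(p)) * sum_x psi(x^(N+1) + sum_i a_i x^i)."""
    psi = [cmath.exp(2j * cmath.pi * r / p) for r in range(p)]
    total = 0
    for a in product(range(p), repeat=n):
        s = sum(
            psi[(pow(x, N + 1, p) + sum(a[i] * pow(x, i + 1, p) for i in range(n))) % p]
            for x in range(p)
        )
        trace = -s / p ** 0.5
        total += trace ** A * trace.conjugate() ** B
    return total / p ** n


def count_rearrangements(p, A, B):
    """Number of (x, y) in F_p^A x F_p^B (A >= B) such that x is a
    rearrangement of y padded with A - B zeros."""
    count = 0
    for y in product(range(p), repeat=B):
        target = sorted(y + (0,) * (A - B))
        count += sum(1 for x in product(range(p), repeat=A) if sorted(x) == target)
    return count


def predicted_moment(p, A, B):
    """Closed form suggested by the orthogonality argument."""
    return (-1) ** (A + B) * count_rearrangements(p, A, B) / p ** ((A + B) / 2)


if __name__ == "__main__":
    p, n, N = 7, 2, 2  # p > 2N + 1 and N >= n >= 2
    for A, B in [(1, 0), (1, 1), (2, 0), (2, 1), (2, 2)]:
        print((A, B), empirical_moment(p, n, N, A, B), predicted_moment(p, A, B))
```

For (*A*,*B*)=(2,2) and *p*=7 both columns should agree with 2−1/7, up to rounding.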

### Proposition 5.3

*For* *n*≥*A*>0, *and* *M*_{A,0}=*M*_{0,A}=0.

### Proof.

The first assertion is immediate from the previous two lemmas, and the second follows because *M*_{A,0} (resp. *M*_{0,A}) is the large *q* limit of *M*_{A,0}(*k*) (resp. of *M*_{0,A}(*k*)). ▪

### Corollary 5.4

*If* *N*=*n*, *the group* *G*_{geom} *for our sheaf* *is*

### Proof.

In the previous section, we have seen that over we have inclusions (remember *N*=*n* here)
Hence is a lisse rank one sheaf on which is of order dividing 2. But the group vanishes, because *p* is odd. So we have inclusions
We must rule out the possibility that *G*_{geom} is *SL*(*n*). But if it were, then Λ^{n}(std) would be a geometrically trivial summand of std^{⊗n}, and *M*_{n,0} would be non-zero. ▪

### Proposition 5.5

*Suppose* *n*≥*A*>*B*>0. *For* *C*:=*A*−*B*, *we have* *M*_{A,B}=0,
*and* *q*^{C/2}*M*_{A,B}(*k*) *has a non-zero large* *q* *limit*.

### Proof.

In this case, 0<*A*−*B*<*n*≤*N*, so already the scalars in *SL*(*N*), namely *μ*_{N}, act by a non-trivial character, namely the *A*−*B*th power of the ‘identical’ character *ζ*↦*ζ*, in the representation
A point in this subscheme is of the form (*x*_{1},…,*x*_{A},*y*_{1},…,*y*_{B}) such that at least *C*:=*A*−*B* of the *x*_{i} vanish, and such that the list of (at most *B*) non-vanishing *x*_{a}s is a rearrangement of the list of non-vanishing *y*_{b}s. Now break up the set of points by the number *d* of distinct non-zero *x*_{a} in a point. There is exactly one point whose *d* is zero. For given *d* with *B*≥*d*≥1, the number of points with *d* distinct non-zero *x*_{a} is the product of (*q*−1)(*q*−2)⋯(*q*−*d*) with a strictly positive combinatorially defined integer, call it *D*(*A*,*B*,*n*,*d*). Thus, we have
Dividing by , we see that
is
▪

### Proposition 5.6

*For* *n*≥*A*≥1, *we have the following results.*

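*(1) For* *A*=1, *M*_{1,1}(*k*)=1 *and* *M*_{1,1}=1.

*(2) For* *A*=2, *we have* *M*_{2,2}(*k*)=2−1/*q*.

*(3) For* *n*≥*A*≥3, *we have*
$$M_{A,A}(k) = A! - \frac{A(A-1)\,A!}{4}\cdot\frac{1}{q} + O\!\left(\frac{1}{q^{2}}\right).$$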

### Proof.

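Assertion (1) is immediate from the fact that, for *A*=*B*=1, the subscheme is the diagonal *x*_{1}=*y*_{1}, which has *q* points. Assertion (2) is immediate from the fact that, for *A*=*B*=2, the number of points is 2*q*(*q*−1)+*q*=2*q*^{2}−*q*.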

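For *n*≥*A*≥3, we break up the set of points by the number *d* of distinct coordinates *x*_{a} in a point. The number of points with precisely *d* distinct *x*_{a}'s is the product of *q*(*q*−1)⋯(*q*−*d*+1) with a strictly positive combinatorially defined integer, call it *D*(*A*,*A*,*n*,*d*).

We have *D*(*A*,*A*,*n*,*A*)=*A*! and
$$D(A,A,n,A-1) = \binom{A}{2}^{2}(A-2)!.$$
[The term $\binom{A}{2}^{2}$ is to specify on each side the placement of the double root, and the term (*A*−2)! is to specify the reordering of the *A*−2 simple roots.]

So looking at the two highest order terms, we have that the number of points is
$$A!\prod_{i=0}^{A-1}(q-i) + \binom{A}{2}^{2}(A-2)!\prod_{i=0}^{A-2}(q-i) + O(q^{A-2}).$$
Expanding out $\prod_{i=0}^{A-1}(q-i)$, we get
$$q^{A} - \binom{A}{2}q^{A-1} + O(q^{A-2}).$$
Thus the number of points is
$$A!\,q^{A} - \frac{A(A-1)\,A!}{4}\,q^{A-1} + O(q^{A-2}).$$
Dividing through by *q*^{A} gives the assertion. ▪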
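The counting in this proof is easy to check by machine for small *A* and *q*. The sketch below counts pairs of lists of length *A* over a set of *q* elements that are rearrangements of each other, and compares the result with the two highest order terms *A*!·*q*(*q*−1)⋯(*q*−*A*+1) and C(*A*,2)^{2}(*A*−2)!·*q*(*q*−1)⋯(*q*−*A*+2). The function names are chosen here for illustration.

```python
from collections import Counter
from itertools import product
from math import comb, factorial, prod


def count_rearranged_pairs(q: int, A: int) -> int:
    """Number of pairs (x, y) of length-A lists over a q-element set
    such that y is a rearrangement of x."""
    # Group the lists by their underlying multiset; a class of size c
    # contributes c * c ordered pairs.
    classes = Counter(tuple(sorted(x)) for x in product(range(q), repeat=A))
    return sum(c * c for c in classes.values())


def falling(q: int, d: int) -> int:
    """q (q - 1) ... (q - d + 1): ordered choices of d distinct values."""
    return prod(q - i for i in range(d))


def two_leading_terms(q: int, A: int) -> int:
    """Contribution of points with d = A and d = A - 1 distinct coordinates."""
    return (factorial(A) * falling(q, A)
            + comb(A, 2) ** 2 * factorial(A - 2) * falling(q, A - 1))


if __name__ == "__main__":
    for q in (5, 7, 11):
        print(q, count_rearranged_pairs(q, 3), two_leading_terms(q, 3))
```

For *A*=3 the only remaining stratum is *d*=1, which contributes exactly *q* points, so the two printed columns should differ by *q*.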

## 6. Cohomological consequences

We have seen in Lemma 5.2 that, up to a factor ±*q*^{−(A+B)/2}, *M*_{A,B}(*k*) is a polynomial in *q*, in principle quite explicit. A natural question is the extent to which we can infer from such information the vanishing, or non-vanishing, of various cohomology groups. Here are some results along this line.

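Let us begin with the fact that *M*_{1,1}(*k*)=1. By the Lefschetz Trace Formula, this is equivalent to
$$\sum_{i}(-1)^{i}\,\mathrm{Trace}\bigl(\mathrm{Frob}_{k}\mid H^{i}_{c}\bigr) = q^{n},$$
where the cohomology groups are those of the sheaf attached to std⊗std^{∨}. Already the trace on the *H*_{c}^{2n} is *q*^{n}. This suggests that *H*_{c}^{i} vanishes for *i*≠2*n*. We will now show that this is in fact the case. Here is an equivalent formulation.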

The sheaf has a direct sum decomposition in which is the subsheaf of endomorphisms of trace zero. The fact that is thus equivalent to

### Lemma 6.1

*The cohomology groups* *all vanish*.

### Proof.

Compute the cohomology via the Leray spectral sequence for the projection

It suffices to show that all the vanish. By proper base change, it suffices to do this fibre by fibre. On the fibre over the point , say with values in some finite extension , we have the polynomial
and our sheaf on this fibre is the (naive) Fourier Transform of . So the restriction of to this fibre is geometrically irreducible, and its *M*_{1,1}(*k*) is 1, by the same calculation as above. Therefore, the restriction to this fibre of has no (because on this fibre is geometrically irreducible), and the alternating sum of traces of Frob_{k} on its is zero. On the other hand, its vanishes (because is lisse on an open curve), and hence its must vanish, as all powers of Frob_{k} have trace zero on this . ▪

At the opposite extreme, we have the following result.

### Lemma 6.2

*For* *n*≥*A*≥1, *the cohomology group*
*is non-zero and its subspace of highest weight* 2*n*−1 *is non-zero*.

### Proof.

This is immediate from Proposition 5.5. First it gives the vanishing of the *H*_{c}^{2n}. Then it tells us that the alternating sum of traces of Frobenius on the remaining cohomology groups
is *O*(*q*^{n−1/2}), and that after division by *q*^{n−1/2}, its large *q* limit is non-zero. By Deligne, the *H*_{c}^{i} for *i*<2*n*−1 have lower weight, so we get the asserted non-vanishing of the weight 2*n*−1 part of the *H*_{c}^{2n−1}. ▪

### Lemma 6.3

*For* *n*≥*A*≥2, *the weight* 2*n*−2 *part of*
*is non-zero, and has dimension at least* *A*(*A*−1)*A*!/4, *but its weight* 2*n*−1 *part vanishes*.

### Proof.

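By Proposition 5.6, we have
$$\sum_{i\le 2n-1}(-1)^{i}\,\mathrm{Trace}\bigl(\mathrm{Frob}_{k}\mid H^{i}_{c}\bigr) = -\frac{A(A-1)\,A!}{4}\,q^{n-1} + O(q^{n-2}).$$
This already shows that the weight 2*n*−1 part of *H*_{c}^{2n−1} vanishes. If we look at the parts of weight 2*n*−2, only *H*_{c}^{2n−1} and *H*_{c}^{2n−2} are possibly non-zero, and we get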
We rewrite this as
which gives the asserted result. ▪

## 7. Another example

We fix an odd integer *n*≥3, and a prime *p* not dividing *n*(*n*−1). We consider, in characteristic *p*, the two parameter family of hyperelliptic curves
$$y^{2} = x^{n} + ax + b$$
over the open set of $\mathbb{A}^{2}$, parameters (*a*,*b*), where the discriminant Δ(*a*,*b*) of *x*^{n}+*ax*+*b*, namely
$$\Delta(a,b) = (-1)^{n(n-1)/2}\bigl((-1)^{n-1}(n-1)^{n-1}a^{n} + n^{n}b^{n-1}\bigr),$$
is invertible. For this family of curves, its *H*^{1} along the fibres, Tate twisted by 1/2, is a lisse sheaf on this open set of rank 2*g*=*n*−1 which is pure of weight zero. Its trace function at a point (*a*,*b*) with values in a finite extension *k*/$\mathbb{F}_{p}$, with *q*:=#*k*, is given by
$$-\frac{1}{\sqrt{q}}\sum_{x\in k}\chi_{2}(x^{n}+ax+b).$$
Here *χ*_{2} denotes the quadratic character of *k*^{×}, extended by zero to all of *k*. To define √*q*, we fix a choice of √*p* in $\overline{\mathbb{Q}}_{\ell}$ and then define √*q* to be the appropriate power of √*p*.

One knows that for this sheaf, we have *G*_{geom}=*G*_{arith}=*Sp*(*n*−1), cf. [8, Theorem 5.4 (1)]. In particular, the standard representation is irreducible, and hence *M*_{1,0}=0, i.e. the top cohomology group *H*_{c}^{4} vanishes. Moreover, we have

### Lemma 7.1

*For any finite extension* *k*/$\mathbb{F}_{p}$, *M*_{1,0}(*k*)=0.

### Proof.

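By definition, *M*_{1,0}(*k*) is −1/(*q*(*q*−1)√*q*) times the sum
$$\sum_{(a,b)\in k^{2},\ \Delta(a,b)\neq 0}\ \sum_{x\in k}\chi_{2}(x^{n}+ax+b),$$
where *χ*_{2} is the quadratic character of *k*^{×}, extended by zero. If this sum extended over all (*a*,*b*) in *k*^{2}, it would vanish; simply reverse the order of summation, i.e. write it as
$$\sum_{x\in k}\sum_{a\in k}\sum_{b\in k}\chi_{2}(x^{n}+ax+b)$$
and note that the innermost sum vanishes.

So it remains to show that
$$\sum_{(a,b)\in k^{2},\ \Delta(a,b)=0}\ \sum_{x\in k}\chi_{2}(x^{n}+ax+b) = 0.$$

The condition Δ(*a*,*b*)=0 is the condition
$$(n-1)^{n-1}a^{n} + n^{n}b^{n-1} = 0,$$
which we rewrite as
$$(-a/n)^{n} = (b/(n-1))^{n-1}.$$
This means precisely that (−*a*/*n*,*b*/(*n*−1)) is of the form (*t*^{n−1},*t*^{n}) for a unique *t*∈*k*. So our sum is
$$\sum_{t\in k}\sum_{x\in k}\chi_{2}\bigl(x^{n}-nt^{n-1}x+(n-1)t^{n}\bigr).$$
For *t*=0, the inner sum becomes $\sum_{x\in k}\chi_{2}(x^{n})$, which vanishes because *n* is odd. For *t*≠0, we use the fact that *x*^{n}−*nt*^{n−1}*x*+(*n*−1)*t*^{n} is homogeneous in *x*,*t* of degree *n*, so we write it as *t*^{n}(*X*^{n}−*nX*+*n*−1) with *X*:=*x*/*t*. The sum over *t*≠0 becomes
$$\sum_{t\neq 0}\sum_{X\in k}\chi_{2}\bigl(t^{n}(X^{n}-nX+n-1)\bigr),$$
which is the product
$$\Bigl(\sum_{t\neq 0}\chi_{2}(t^{n})\Bigr)\Bigl(\sum_{X\in k}\chi_{2}(X^{n}-nX+n-1)\Bigr),$$
in which the first factor vanishes (again because *n* is odd). ▪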
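The exact vanishing can be confirmed by brute force over small prime fields. The sketch below is an illustration under stated assumptions: it works over *k*=𝔽_{p} only, computes the quadratic character by Euler's criterion, and tests Δ(*a*,*b*)=0 through the standard trinomial discriminant, up to sign. It then sums the quadratic character of *x*^{n}+*ax*+*b* over all (*a*,*b*) with Δ(*a*,*b*)≠0 and all *x*.

```python
def chi(v: int, p: int) -> int:
    """Quadratic character of F_p (p an odd prime), extended by zero."""
    v %= p
    if v == 0:
        return 0
    return 1 if pow(v, (p - 1) // 2, p) == 1 else -1


def disc_is_zero(a: int, b: int, n: int, p: int) -> bool:
    """True when the discriminant of x^n + a x + b vanishes in F_p."""
    value = (-1) ** (n - 1) * (n - 1) ** (n - 1) * pow(a, n, p) + n ** n * pow(b, n - 1, p)
    return value % p == 0


def first_moment_sum(n: int, p: int) -> int:
    """Sum of chi(x^n + a x + b) over x in F_p and (a, b) with nonzero discriminant."""
    total = 0
    for a in range(p):
        for b in range(p):
            if disc_is_zero(a, b, n, p):
                continue
            total += sum(chi(pow(x, n, p) + a * x + b, p) for x in range(p))
    return total


if __name__ == "__main__":
    # n odd, p an odd prime not dividing n (n - 1)
    for n, p in [(3, 5), (3, 7), (3, 11), (5, 7)]:
        print(n, p, first_moment_sum(n, p))
```

Each printed sum should be 0, which is the statement that the empirical first moment vanishes exactly, not merely up to an error term.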

In fact, we have the following explanation of this vanishing.

### Lemma 7.2

*The cohomology groups* *all vanish*.

### Proof.

The idea is simply to imitate, cohomologically, the argument given above.

We first define a sheaf on all of which agrees with our previously defined on and whose trace function at any point is

For this, we consider the sheaf on the of (*x*,*a*,*b*), with the understanding that this sheaf has been extended by zero across the points where *x*^{n}+*ax*+*b*=0. For the projection of onto given by *pr*(*x*,*a*,*b*):=(*a*,*b*), vanishes for *i*≠1 (check fibre by fibre). The Tate-twisted sheaf is the desired .

We wish to show that all the groups vanish. Using the excision long exact sequence we are reduced to showing the vanishing of all the groups and of all the groups .

To show the vanishing of the groups , we note first that, from the construction of as (a Tate twist of) the only non-vanishing , namely the *R*^{1}, we have
To show that these groups vanish, we use the projection *pr*_{1,2} of onto given by (*x*,*a*,*b*)↦(*x*,*a*).

For this projection, all the vanish, as one sees looking fibre by fibre (the cohomological version of summing over *b*).

To show that the groups all vanish, we use the construction of once again, this time to write
where the in question is that of (*x*,*t*). By excision on this , it suffices to treat separately the open set , coordinates *x*,*t* and the line *t*=0. On this line, with coordinate *x*, we are looking at the groups
which all vanish. On the product , we make the (*t*,*x*/*t*) substitution to write our sheaf as the external tensor product of on the first factor with on the factor. The vanishing then results from Künneth, because on the factor with coordinate *t* all the groups vanish (again because *n* is odd). ▪

Thanks to a marvelous formula of Davenport–Lewis, we do have square root cancellation for *M*_{2,0}(*k*).

### Lemma 7.3

*We have* *M*_{2,0}(*k*)=1+*O*(1/*q*).

## 8. A third example

We fix an even integer *n*≥4, and a prime *p* not dividing *n*(*n*−1). We consider, in characteristic *p*, the two parameter family of hyperelliptic curves
$$y^{2} = x^{n} + ax + b$$
over the open set of $\mathbb{A}^{2}$, parameters (*a*,*b*), where the discriminant Δ(*a*,*b*) of *x*^{n}+*ax*+*b*, namely
$$\Delta(a,b) = (-1)^{n(n-1)/2}\bigl((-1)^{n-1}(n-1)^{n-1}a^{n} + n^{n}b^{n-1}\bigr),$$
is invertible. For this family of curves, its *H*^{1} along the fibres, Tate twisted by 1/2, is a lisse sheaf on this open set of rank 2*g*=*n*−2 which is pure of weight zero. Its trace function at a point (*a*,*b*) with values in a finite extension *k*/$\mathbb{F}_{p}$, with *q*:=#*k* and *χ*_{2} the quadratic character of *k*^{×} extended by zero, is given by
$$-\frac{1}{\sqrt{q}}\Bigl(1+\sum_{x\in k}\chi_{2}(x^{n}+ax+b)\Bigr).$$
One knows [8, Theorem 5.17 (1)] that for this sheaf, we have *G*_{geom}=*G*_{arith}=*Sp*(*n*−2). In particular, the standard representation is irreducible, and hence *M*_{1,0}=0, i.e., the top cohomology group *H*_{c}^{4} vanishes. However, in contradistinction to the case when *n* is odd, we have the following lemma.

### Lemma 8.1

*We have* *M*_{1,0}(*k*)=−1/√*q*+*O*(1/*q*).

### Proof.

Here the discriminant Δ(*a*,*b*) vanishes precisely when

in other words when (*a*,*b*) is of the form (*a*,*b*)=(*nt*^{n−1},(*n*−1)*t*^{n}) for a unique . Thus, there are *q* points in at which Δ vanishes. By definition, is times the sum
If this sum extended over all points (*a*,*b*) in , it would be *q*^{2} (from summing the term 1); the sum over all (*a*,*b*,*x*) of vanishes (for each (*a*,*x*), sum over *b*).

The sum over the points where Δ vanishes is the sum
In this second sum, the sum over the points (0,*x*) is *q*−1 (because *n* is even). For each *t*≠0, we write
with *X*:=*x*/*t*. Because *n* is even, for each *t*≠0 the sum over *x* of is independent of *t*, equal to the quantity
So all in all, the sum over the points where Δ vanishes is

So is times the quantity

One checks easily that the polynomial *x*^{n}+*nx*+*n*−1 has no triple roots, and that its unique double root is *x*=−1. We readily compute that
Thus *P*_{n−2}(*x*) is square free. As *x*^{n}+*nx*+*n*−1 vanishes at *x*=−1, we have
The value of *P*_{n−2}(*x*) at *x*=−1 is *n*(*n*−1)/2 (L'Hôpital's rule), so we get
with
Here is the trace of Frob_{k} on *H*^{1} of the complete non-singular model of the hyperelliptic curve *y*^{2}=*P*_{n−2}(*x*) of genus (*n*−4)/2. In particular,

Thus, is times the quantity Thus, ▪
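The computation in this proof can be checked numerically. Following the steps above, the sum of 1+∑_{x}*χ*(*x*^{n}+*ax*+*b*) over the points with Δ≠0 should equal (*q*−1)(*q*−1−*S*), where *S* is the sum over *X* of *χ*(*X*^{n}+*nX*+*n*−1). The sketch below tests this over *k*=𝔽_{p} only; the function names, and writing *χ* for the quadratic character, are choices made for this illustration.

```python
def chi(v: int, p: int) -> int:
    """Quadratic character of F_p (p an odd prime), extended by zero."""
    v %= p
    if v == 0:
        return 0
    return 1 if pow(v, (p - 1) // 2, p) == 1 else -1


def disc_is_zero(a: int, b: int, n: int, p: int) -> bool:
    """True when the discriminant of x^n + a x + b vanishes in F_p."""
    value = (-1) ** (n - 1) * (n - 1) ** (n - 1) * pow(a, n, p) + n ** n * pow(b, n - 1, p)
    return value % p == 0


def sum_over_smooth_locus(n: int, p: int) -> int:
    """Sum of 1 + sum_x chi(x^n + a x + b) over (a, b) with nonzero discriminant."""
    total = 0
    for a in range(p):
        for b in range(p):
            if not disc_is_zero(a, b, n, p):
                total += 1 + sum(chi(pow(x, n, p) + a * x + b, p) for x in range(p))
    return total


def special_sum(n: int, p: int) -> int:
    """S = sum over X of chi(X^n + n X + n - 1)."""
    return sum(chi(pow(X, n, p) + n * X + n - 1, p) for X in range(p))


def predicted(n: int, p: int) -> int:
    """(q - 1)(q - 1 - S), the value obtained by removing the discriminant locus."""
    return (p - 1) * (p - 1 - special_sum(n, p))


if __name__ == "__main__":
    # n even, p an odd prime not dividing n (n - 1)
    for n, p in [(4, 5), (4, 7), (4, 11), (6, 7)]:
        total = sum_over_smooth_locus(n, p)
        moment = -total / (p * (p - 1) * p ** 0.5)
        print(n, p, total, predicted(n, p), round(moment, 4), round(-p ** -0.5, 4))
```

The last two printed columns compare the empirical first moment with −1/√*q*; they should differ by a quantity of size about 1/*q*, which is why the cancellation here is weaker than square root cancellation.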

### Lemma 8.2

*The cohomology group* *H*_{c}^{4} *vanishes, but the weight* 3 *part of* *H*_{c}^{3} *is one-dimensional, and* Frob_{k} *acts on it as* *q*^{3/2}.

### Proof.

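The vanishing of the *H*_{c}^{4} is the fact that *M*_{1,0}=0. By the Lefschetz trace formula, *M*_{1,0}(*k*) is 1/(*q*(*q*−1)) times the two term sum
$$\mathrm{Trace}\bigl(\mathrm{Frob}_{k}\mid H^{2}_{c}\bigr) - \mathrm{Trace}\bigl(\mathrm{Frob}_{k}\mid H^{3}_{c}\bigr).$$
From our estimate that *M*_{1,0}(*k*)=−1/√*q*+*O*(1/*q*), we see that this sum is −*q*^{3/2}+*O*(*q*). As *H*_{c}^{i} is mixed of weight ≤*i*, we get the asserted result. ▪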

### Remark 8.3

The reader may be concerned by the apparent sign ambiguity in the statement above, that the eigenvalue of on the weight three part of is *q*^{3/2}. Here is a more intrinsic way to say this. Instead of , consider the sheaf which is the *H*^{1} along the fibres of our family of curves *y*^{2}=*x*^{n}+*ax*+*b*. In terms of , we defined to be the one-half Tate twist , which involved a choice of and a consequent determination of . The sheaf is pure of weight one, the cohomology group is mixed of weight less than or equal to 4, and what is being asserted is that its weight 4 part is one-dimensional, with acting as *q*^{2}.

Exactly as in Lemma 7.3, the Davenport–Lewis formula gives square root cancellation for *M*_{2,0}(*k*).

### Lemma 8.4

*We have* *M*_{2,0}(*k*)=1+*O*(1/*q*).

### Proof.

By definition, *M*_{2,0}(*k*) is (1/*q*)(1/(*q*(*q*−1))) times the sum
Expanding the square, this is
If these last two summations extended over all , the first would vanish, and the second would be *q*^{2}(*q*−1) by the Davenport–Lewis formula. So our sum is

The summands for (*a*,*b*)=(0,0) are *q*−1 and (*q*−1)^{2}, respectively, so both are *O*(*q*^{2}). For each of the *q*−1 summands with (*a*,*b*)≠(0,0) but Δ(*a*,*b*)=0, the polynomial *x*^{n}+*ax*+*b* has precisely *n*−1 roots, of which *n*−2>0 are simple roots. In particular, this polynomial is not geometrically a square, so the Weil bound gives
So all in all, the total contribution of the Δ=0 terms is *O*(*q*^{2}), and our sum over is *q*^{2}(*q*−1)+*O*(*q*^{2}). Dividing through by *q*^{2}(*q*−1) gives the asserted result. ▪
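The identity used in this proof, that the sum over all (*a*,*b*) of the square of ∑_{x}*χ*(*x*^{n}+*ax*+*b*) equals *q*^{2}(*q*−1), can be checked directly on small prime fields. The sketch below does so over *k*=𝔽_{p} only; the function names, and writing *χ* for the quadratic character, are choices made for this illustration.

```python
def chi(v: int, p: int) -> int:
    """Quadratic character of F_p (p an odd prime), extended by zero."""
    v %= p
    if v == 0:
        return 0
    return 1 if pow(v, (p - 1) // 2, p) == 1 else -1


def second_moment_sum(n: int, p: int) -> int:
    """Sum over all (a, b) in F_p^2 of (sum_x chi(x^n + a x + b))^2."""
    total = 0
    for a in range(p):
        for b in range(p):
            inner = sum(chi(pow(x, n, p) + a * x + b, p) for x in range(p))
            total += inner * inner
    return total


if __name__ == "__main__":
    for n, p in [(3, 5), (4, 5), (4, 7), (6, 11)]:
        print(n, p, second_moment_sum(n, p), p * p * (p - 1))
```

One way to see why the two columns agree: for *x*≠*y* the map (*a*,*b*)↦(*x*^{n}+*ax*+*b*, *y*^{n}+*ay*+*b*) is a bijection of *k*^{2}, so the off-diagonal terms cancel, and each of the *q* diagonal terms contributes *q*^{2}−*q*.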

## Footnotes

One contribution of 8 to a Theo Murphy meeting issue ‘Number fields and function fields: coalescences, contrasts and emerging applications’.

- © 2015 The Author(s) Published by the Royal Society. All rights reserved.