Continued Fractions and Modular Forms

Ilan Vardi

IHES

Algorithms Seminar

April 3, 2000

[summary by Cyril Banderier]

A properly typeset version of this document is available in postscript and in pdf.

If some fonts do not look right on your screen, this might be fixed by configuring your browser (see the documentation here).

Abstract
This incursion into the realm of elementary and probabilistic number theory of continued fractions, via modular forms, allows us to study the alternating sum of coefficients of a continued fraction, thus solving the longstanding open problem of their limit law.



1   Introduction

For the readers of these proceedings,1 it is not a secret anymore that the continued fraction expansion of p/q, the Jacobi symbol (p/q), or the gcd of two integers (p,q), or even Gauss' lattice reduction algorithm cover phenomena of similar computational complexity. However for continued fractions, two distinct cases have to be considered: the continuous and the discrete case. The discrete case deals with continued fraction expansions of rational numbers whereas the continuous case deals with continued fractions of real (irrational) numbers.

For the continuous model, given the apparatus of ergodic theory, many basic results on continued fractions fall as application of more general theorems (see Chapter 9 in Paul Lévy's book [11]). Ergodic theory, which was guessed by Maxwell and formulated by Boltzmann, concerns itself principally with quantifying how points in a continuous space evolve under iteration of a transformation. The ergodic theorem (due to Birkhoff in 1931) states that for almost all initial points x0 of the continuous space E with measure l,
 
lim
n®+¥
1
n
n
å
j=1
f ( Tj(x0) ) = ó
õ
 


E
f(ydl(y).

For the discrete model, there is some kind of effect that precludes the use of ergodic theory. At least, results from the continuous model may serve as a heuristic for guessing corresponding facts about the discrete world.

What one would need in order to make this heuristic rigorous is a kind of ``ergodic theory with an error term.'' This is to some extent afforded by the introduction of Ruelle operators (see the works by Brigitte Vallée in 1995) and of modular forms (see the works by Ilan Vardi in 1987--1993) in the discrete model.

The main object of this lecture is the alternating sum of coefficients of a continued fraction. A motivation for studying this is the evaluation of Legendre symbol (d/c), which essentially expresses whether d is a perfect square modulo c. This symbol can be evaluated using the Euclidean reduction (d/c) = (dmod c/c) (where c, d are odd) and the quadratic reciprocity law
æ
ç
ç
è
c
d
ö
÷
÷
ø
= (-1)(c-1)(d-1)/4 æ
ç
ç
è
d
c
ö
÷
÷
ø
.
In fact, it was shown by Rademacher [12] that
æ
ç
ç
è
d
c
ö
÷
÷
ø
=(-1)
(3 - a - d + c å (-1)i ai)/4
 
,
where d/c=[0,a1,...,ar] (brackets stand for the expansion in continued fraction) and 0<a,d<c, adº1mod c with c and r odd. Note that d-1mod c can itself be computed using the Euclidean algorithm.

There is also a geometrical motivation: the alternating sum expresses the number of times that a geodesic winds around the cusp of a modular surface.

In the continuous case, Guivarc'h and Le Jan [7] established that the average alternating sum converges to a Cauchy distribution with characteristic function exp(-p|t|/(2ln 2)). For the discrete case, the stumbling block is that even the expected asymptotics estimated s(d,c)~12/p2ln cln ln c is unproved (s(d,c) stands for the sum of the coefficient of the continued fraction of d/c). The problem remained open until Vardi found another approach (see [15]) via Dedekind sums.

2   Dedekind Sums

The Dedekind sum appears to be have been mistakenly defined and instead should have been defined as the alternating sum of continued fraction coefficients. Historically, the Dedekind sum is defined for relatively prime integers d and c as
s(d,c)=
c-1
å
h=1
((hd/c))((h/c)),
with the notation ((x)) = 0 (if x is an integer), and x - ë xû - 1/2 otherwise.

This sum was introduced by Dedekind in 1876 while editing Riemann's collected works. He used this sum to express the functional equation of the Dedekind h function
h(z) = e
p i z/12
 
¥
Õ
n=1
(1 - e
2p i n z
 
)
which, he proved, satisfies
   ln h æ
ç
ç
è
az+b
cz+d
ö
÷
÷
ø
=
ì
ï
ï
í
ï
ï
î
ln h(z) +
1
2
ln (cz+d) +
p i
12
æ
ç
ç
è
-3+
a+d
c
-12s(d,c) ö
÷
÷
ø
,
for c > 0,
lnh(z)+
p ib
12
,
when c=0,
    (1)
where Á(z)>0, g=(
 a b 
 c d 
), and a, b, c, d are integers satisfying ad-bc=1. Note that
ln h(z) =
p iz
12
-
¥
å
m,n=1
e
2p imn z
 
m
,
so ln h is holomorphic for Á(z)>0. Using the functional equation (1) Dedekind proved a fundamental identity for Dedekind sums, namely the reciprocity law
s(c, d) =
c
d
+
d
c
+
1
cd
- s(d, c).

Note that the definition of the Dedekind sum gives that s(d, c) = s(dmod c, c) and the reciprocity law relates the value of s(d,c) to s(c,d). It follows that s(d,c) can be computed by using the Euclidean algorithm, so it should be expressible in terms of the continued fraction expansion of d/c. In fact, this is the statement of a result found independently by three authors in 1977:
Theorem 1  [Barkan [2], Hickerson [9], Knuth [10]]   If [0,a1,a2,...,ar] is the regular continued
fraction expansion of
d/c with r odd (with d<c and 0<a<c such that adº 1 mod c) then
s(d,c)=
1
12
æ
ç
ç
è
-3+
a+d
c
-
r
å
i=1
(-1)iai ö
÷
÷
ø
.

It remains to find the distribution of the values of the Dedekind sums when c and d range over large intervals. Vardi did so by using Paul Lévy's theorem (details in Section 4) and for this, needed to justify several approximations. To this aim, let us recall a few facts about modular forms and Kloosterman sums, since these objects appeared to be the key to the asymptotic analysis.

3   Modular Forms

The group SL(2,Z) acts on the upper half complex plane H by (
 a b 
 c d 
z=az+b/cz+d. One now considers subgroups G of SL(2,Z) containing every matrices of SL(2,Z) congruent to the identity matrix in SL(2,Z/NZ). Every such group G has a fundamental domain: an open set DÌ H such that for all zÎ H there is at most one g Î G with g(z)Î D and at least one g Î G with g(z)Î D.

Definition. A modular form of weight k is a holomorphic function on H satisfying:

Definition. A non-holomorphic modular form of weight r and multiplier system c is a function f(z) on H satisfying:
1') f(gz)=c(g) æ
ç
ç
è
cz+d
|cz+d|
ö
÷
÷
ø
r



 
f(z)   (for gÎ G),  and  2')   ó
õ
ó
õ
 
D
| f(x+iy) | 2
dxdy
y2
<¥.

Condition 2') shows that the non-holomorphic modular forms form a Hilbert space L2(D, c, r) under the Petersson inner product
á f,gñ =ó
õ
ó
õ
 
D
f(z)g(z)yr
dxdy
y2
.
The Kloosterman sum (introduced by Kloosterman in 1927 in a refinement of the Hardy--Littlewood circle method) is defined by
S(m,n,c)=å e
2p i(ma+nd)/c
 
,
where the sum ranges over d<c for adº1mod c and gcd(d,c)=1.

In 1948, André Weil proved the estimate S(m,n,c)=O(c1/2+e). Asymptotics of sums of Kloosterman sums is a vivid subject, e.g., this ``kloostermania'' recently succeeded [5] to prove that there are infinitely many numbers of the form x2+y4. Sums of Kloosterman sums exhibit strong cancellations that can be estimated by making use of modular forms.

Generalized Kloosterman sums (for some subgroup G of SL(2,Z)) are defined by
S(m,n,c,c,G)=åc(g) e
2p i ( (m-a)a+(n-a)d ) /(qc)
 
,
where the sums ranges over g=(
 a c 
 c d 
)Î G with 0 < a < qc and 0 < d < qc. In the sum, q is the smallest integer such that (
 1 q 
 0 1 
)Î G and a is defined by e-2p i a=c (
 1 q 
 0 1 
) with 0£a<1.

Goldfeld and Sarnak's formulation (see [6]) of Kuznetsov's trace formula gives
 
å
c < N
S(m, n, c, c, G) =
 
å
1/2 < sj < 1
tj(m, n, c, G)
N
2sj-1
 
2sj
+ O(N
b/3 + e
 
),     (2)
where b is the best constant that can be put in the estimate S(m, n, c,c, G) = O(cb +e), and the sum is over exceptional eigenvalues sj (defined hereafter) of the operator Dr=y2(2/ x2+2/ y2)-iry/ x. Since Dr has a self-adjoint extension to L2(D, c, r), its spectrum is discrete and real: there is a sequence of eigenvalues going to infinity, with only a finite set of negative eigenvalues which correspond to holomorphic modular forms if r is an even integer. The non-negative eigenvalues are simple except the case l = 1/4, which could have multiplicity 2.

According to Selberg's notation, one writes an eigenvalue as l = s(1-s), with Â(s) ³ 1/2. It follows that there is a finite number of exceptional eigenvalues for which l < 1/4. An exceptional eigenvalue corresponds to s > 1/2, while the other eigenvalues have Â(s) = 1/2 (note the analogy with the Riemann hypothesis).

For a given non-holomorphic Poincaré series Pm (see [4]), the Petersson product á Pm,ujñ gives the mth Fourier coefficient rj(m) of the eigenfunction uj (which is a modular form) associated to the eigenvalue sj (1-sj). This allows to make explicit the tj's of the formula (2):
tj(m, n, c, G) =
q2rj(m)rj(n) ( p2 (m-a)(n-a)/q2 )
1-sj
 
 
G(sj+r/2) G(2sj-1)
(-i)rpG(sj-r/2)

Finally, the following theorem provides the link between a sum of generalized Kloosterman sums and a sum whose asymptotics allows the application of Lévy's theorem:
Theorem 2  [Vardi [14]]  
e
p i r/2
 
 
å
0 < d < c
gcd(d, c) = 1
e
2p i r s(d, c)
 
=S ( 1,1,c,cr,SL(2,Z) ) ,
where cr=e2p i r (a+d/c-3-12s(d,c)).

4   Limiting Distribution

One says that an arithmetic function f(n) has a limiting distribution F(x) if
 
lim
N®¥
1
N
| n<N:f(n)<x } | =F(x).
In other words, one takes a histogram of values of the function f(n) and looks at its shape. One method of showing that an arithmetic function has a limiting distribution is due to Paul Lévy (see examples in [13]).

Theorem 3  [Paul Lévy]   If there exists a function g(t) continuous in 0 such that
 
lim
N®¥
1
N
 
å
n<N
eitf(n)=g(t),
then f(n) has a limiting distribution F(x) satisfying g(t)=ò-¥¥eit x dF(x).
This is simply the probabilist's terminology for the Fourier transform; g is the characteristic function of the distribution. In order to prove the limiting distribution result for Dedekind sums (and thus for alternating sums of continued fraction coefficients) one applies Lévy's theorem to s(d, c)/ln c. What we want to prove is the estimate
 
lim
N®¥
 
å
0<d<c<N
gcd(d,c)=1
e
its(d,c)/ln c
 
| { 0<d<c<N:gcd(d,c)=1 } |
=e
-|t|/(2p)
 
,     (3)
where the right-hand side corresponds to the characteristic function of the Cauchy distribution
ó
õ
¥


-¥
a eity
a2 + y2
 dy = e
-a|t|
 
,
with a = 1/(2p). The well-known estimate [8]
| { 0<d<c<N:gcd(d,c)=1 } | =
3N2
p2
+O(Nln N)
shows that the sought estimate (3) can be rewritten as
 
å
0< d < c <N
gcd(d, c) = 1
e
it s(d, c)/ln c
 
~ e
-|t|/(2p)
 
3N2
p2
.
Proving such a formula presents a number of technical difficulties. For example, one would like to remove the absolute values on the right hand side, and the bothersome 1/ln c term in the exponential. The first point is solved since s(c-d, c) = - s(d, c) so that the left-hand side is independent of the sign of t. Consequently, one may restrict attention to t>0. The second point is solved noting that the ln function does not vary very much, and that for most values of c < N, ln c is almost equal to ln N. An estimate obtained by the continued fraction formula for Dedekind sums and the subsequent upper bound of S(d/c) £ (ln N)3/2 +e for almost all d < c < N, shows that-5mm
 
å
0< d < c <N
gcd(d, c) = 1
e
it s(d, c)/ln c
 
=
 
å
0< d < c <N
gcd(d, c) = 1
e
it s(d, c)/ln N
 
+ O æ
è
N2(ln N)
-1/5 +e
 
ö
ø
.
See [15] for details. The problem is therefore reduced to showing that
 
å
0< d < c <N
gcd(d, c) = 1
e
it s(d, c)/ln N
 
~ e
-t/(2p)
 
3N2
p2
,    t > 0.    &nb