I wrote an introductory note about elliptic curves and modular functions. At the beginning it was supposed to be just an introduction to an article about a very cool proof by Siegel. However, in the meantime I broke my finger (quite severely, the bone is in many pieces…) and it’s still hard for me to use a keyboard.

Accordingly, the first few paragraphs aren’t very relevant, as they describe my admiration of Siegel’s proof.

I use Unicode for mathematical notation. Enjoy :-)

I got back to Warsaw, having spent Easter in my hometown. The first lecture I attended after the break was “Modular Forms” and the lecturer served us a big chunk of juicy mathematics. Namely, he presented Carl Ludwig Siegel’s proof of the transformation formula for the Dedekind eta function.

It’s not that this proof is extremely difficult. It isn’t; it’s understandable for anybody after a basic course in complex analysis. Rather, it’s extremely tricky, and, because it’s relatively short (thus easy to follow), this trickiness is what makes it a cool thing.

I’d like to write down what the theorem is and how Siegel proved it. (Or rather, I’ll use it as an excuse to write a summary of what I’ve learned so far :-)

First of all, the proof is really tricky. It is so tricky that after getting the general idea I decided that it’s impossible for a normal human being to come up with something like that. If you come to the same conclusion after reading this post then it’s perhaps worth keeping in mind that Siegel’s proof wasn’t the first proof of the result I’ll describe in a moment. AFAIK, previous proofs were (much) longer, but they were more straightforward. Still, because of such things I wonder whether First Class Mathematicians are actually normal human beings…

Let me review, for introduction and motivation, what I’ve learned. (If you want to focus on the meat, skip a few paragraphs.) What we’re generally interested in are elliptic curves. For our purposes an elliptic curve is a 2-dimensional torus with a chosen complex structure (not an almost complex structure). As usual in mathematics, what we really care about are just isomorphism classes of objects, and in our case the appropriate isomorphism is obviously a biholomorphism (a diffeomorphism whose differential commutes with multiplication by i = √-1) (and “obviously” means here nothing but “by definition”). By some neat theorem every elliptic curve (“torus with a chosen complex structure”) is isomorphic to a torus of the form ℂ/Λ, where Λ is some lattice in ℝ²=ℂ.

(Of course, if we were only interested in topological properties of a torus it wouldn’t matter which lattice Λ we choose (say, whether we choose the lattice generated by vectors (1,0) and (0,1) or by (1,0) and (1,1)). ℂ/Λ is topologically always the same torus. However, the complex structure is different; you can visualize this by imagining a single torus with two complex structures coming from the above two lattices and asking yourself what the multiplication by i does on a tangent plane of this torus. Answer: for the first lattice multiplication by i is a rotation by π/2, for the second it’s some other linear transformation.)
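To make the last remark concrete, here is a minimal Python sketch (my own illustration, not part of the lecture). It assumes we identify the topological torus with ℝ²/ℤ² by sending the standard basis to the chosen lattice generators, so that multiplication by i, seen in lattice coordinates, becomes B⁻¹∙J∙B, where J is the rotation by π/2 and B has the generators as columns. The helper names are hypothetical, and the inverse helper only handles bases of determinant 1, which covers the two lattices mentioned above.

```python
def multiply(m1, m2):
    """Product of two 2x2 matrices given as tuples of rows."""
    (a, b), (c, d) = m1
    (e, f), (g, h) = m2
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def inverse_unimodular(m):
    """Inverse of a 2x2 integer matrix of determinant 1 (assumption of this sketch)."""
    (p, q), (r, s) = m
    if p * s - q * r != 1:
        raise ValueError("this sketch only handles determinant 1")
    return ((s, -q), (-r, p))


# Multiplication by i on R^2 = C: (x, y) -> (-y, x), a rotation by pi/2.
ROTATION = ((0, -1), (1, 0))


def structure_in_lattice_coordinates(gen1, gen2):
    """Matrix of multiplication by i written in the basis (gen1, gen2)."""
    basis = ((gen1[0], gen2[0]),
             (gen1[1], gen2[1]))  # generators as columns
    return multiply(inverse_unimodular(basis), multiply(ROTATION, basis))


square = structure_in_lattice_coordinates((1, 0), (0, 1))
sheared = structure_in_lattice_coordinates((1, 0), (1, 1))
print(square)   # the plain rotation
print(sheared)  # a different linear map, which still squares to -identity
```

For the first lattice one gets the rotation itself; for the second the matrix is ((-1, -2), (1, 1)), which is not a rotation but still squares to minus the identity, as any complex structure must.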

For this reason, the name “elliptic curve” is really reserved for tori of the form ℂ/Λ (not tori with some weird complex structure). I adopt this convention from now on (and when I write “torus” it means “elliptic curve” as well, unless I explicitly state that I mean a topological torus (a torus without complex structure)).

(You may argue that elliptic curve is a stupid name for a torus – a torus is not a curve but something 2-dimensional. Well, the reason is that a torus is 1-dimensional if regarded as a complex manifold. You may have heard that an elliptic curve is a set of points in ℝ² given by an equation y² = a∙x³ + b∙x + c, or similar – such a set of points indeed looks like an honest 1-dimensional curve. To obtain such a curve from a 2-dimensional torus with complex structure one has to embed (holomorphically) this torus in ℂP² (the complex projective plane) and look at the intersection of the torus with some affine part of ℂP².)

Now let’s wonder when two such tori, ℂ/Λ and ℂ/Λ’, are isomorphic. It’s easy to see that if Λ’ is a rotation of Λ then the tori are isomorphic. Also (which is only a very small ε less obvious), if Λ’ is a stretching of Λ then the tori are isomorphic. These two properties are summarized by saying that if Λ’=αΛ, α∊ℂ, α≠0, then ℂ/Λ ≃ ℂ/Λ’. The converse is also true and not hard to understand if one knows that an isomorphism between two elliptic curves must come from an ℝ-linear isomorphism of ℂ (and this also follows from the aforementioned neat theorem (I believe so :-)).

It follows that every elliptic curve is isomorphic to one of the form ℂ/Λ for Λ generated by two complex numbers (previously I wrote “vectors” instead of “complex numbers” in this context) of which the first is equal to 1 and the second to some other complex number τ∊ℂ. Additionally, we can take τ to be from the upper half-plane (because we don’t care about orientation on our tori, so that ℂ/⟨1,β⟩ ≃ ℂ/⟨1,β*⟩ (⟨1,β⟩ is the lattice generated by 1 and β, β* is the complex conjugate of β, ≃ is induced by an ℝ-linear transformation of ℂ, complex conjugation), even though the lattices ⟨1,β⟩ and ⟨1,β*⟩ are in general different. Let’s adopt the convention that generators are always written in anticlockwise order).

Notice that τ is not something assigned to a torus ℂ/Λ, but rather to some chosen generators u,v of Λ (namely τ=v/u). If we change generators of Λ then we’ll also change τ. It’s straightforward to check that if the new generators are v’=a∙v+b∙u, u’=c∙v+d∙u then τ will change in the following manner: τ’=(aτ+b)/(cτ+d) (check it yourself). Notice also that, because u’ and v’ are also generators of Λ, again written in anticlockwise order, the matrix
a b
c d
is an element of SL2(ℤ). So we have an action of a group SL2(ℤ) on an upper half-plane.
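Here is a small numerical sanity check of this action, written as a Python sketch of my own (the function names and the sample matrices are just illustrative choices). It checks two things for sample data: that a matrix of determinant 1 sends the upper half-plane to itself, via the identity Im(A∙τ) = Im(τ)/|cτ+d|², and that acting with a product of matrices is the same as acting twice, which is what “action of a group” means.

```python
def act(matrix, tau):
    """Apply tau -> (a*tau + b) / (c*tau + d) for matrix ((a, b), (c, d))."""
    (a, b), (c, d) = matrix
    return (a * tau + b) / (c * tau + d)


def multiply(m1, m2):
    """Product of two 2x2 matrices given as tuples of rows."""
    (a, b), (c, d) = m1
    (e, f), (g, h) = m2
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def determinant(matrix):
    (a, b), (c, d) = matrix
    return a * d - b * c


# Two sample elements of SL2(Z) and a sample point of the upper half-plane.
A = ((2, 1), (1, 1))
B = ((1, 3), (0, 1))
tau = complex(0.3, 0.7)

# The upper half-plane is preserved: Im(A.tau) = Im(tau) / |c*tau + d|^2.
image = act(A, tau)
(_, _), (c, d) = A
expected_imaginary_part = tau.imag / abs(c * tau + d) ** 2

# Compatibility with matrix multiplication: (A*B).tau = A.(B.tau).
composed = act(multiply(A, B), tau)
nested = act(A, act(B, tau))

print(image, expected_imaginary_part)
print(composed, nested)
```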

We can now do something VERY cool, which I hear mathematicians often do and which I always find a bit exciting. Instead of looking at and investigating properties of a single elliptic curve, we look at the set of all elliptic curves. We have a morphism from this set to H/SL2(ℤ), where H is the upper half-plane and the action of SL2(ℤ) on it is the one just described. This morphism takes an elliptic curve ℂ/Λ, chooses some pair of generators (u,v) of Λ and sends the elliptic curve to the orbit which contains the τ associated to (u,v) (it’s easy to see that τ=v/u).

Now, we cooked up the action of SL2(ℤ) on H precisely in such a way that this morphism is well-defined – that is, it doesn’t depend on the choice of generators of Λ. Additionally, because two elliptic curves ℂ/Λ and ℂ/Λ’ are isomorphic precisely when Λ’=αΛ, we see that non-isomorphic elliptic curves are mapped to different points of H/SL2(ℤ). (And, of course, every point of H/SL2(ℤ) is an image of some elliptic curve.)

Summarizing, isomorphism classes of elliptic curves are precisely the points of H/SL2(ℤ) (more precisely: there is a bijection between…). Cool! We have given the set of isoclasses of elliptic curves additional geometric structure in a very natural way! (Natural is the key word; there are plenty of bijections between the set of isoclasses of elliptic curves and your favorite geometrical object X, but for most of these bijections there is no way to translate properties of X into useful information about isoclasses of elliptic curves.)

Let me compare the situation we are in to the one often encountered in classical mechanics: there we investigate the motion of some set of objects in ℝ³. So we cook up a space (called “phase space”) whose points are in correspondence with all possible configurations of our objects (for example, the phase space of two particles is ℝ¹² – a point of this ℝ¹² encodes positions and velocities (which are 3-vectors) of both particles). Now, real-valued functions on the phase space are sometimes called “observables” (however, more often this word is used for something in quantum mechanics which I unfortunately can’t understand), because they correspond to things we can observe (for example, in this ℝ¹² the function that sends a point to its first coordinate may correspond to observing the first spatial coordinate of the first particle; the function that sends a point of ℝ¹² to the sum of squares of the last six coordinates (perhaps with some coefficients, if the masses of both particles are not equal to 2) corresponds to observing the kinetic energy of the whole system).
I wouldn’t hesitate to call functions on a phase space “properties of a system” or “modular functions” – modular comes from the word “moduli” which means (more or less :-) “property” (not “This is my property, so go away or I’ll shoot you.” but “Properties of this kind of plants are interesting.”).

Similarly, we call functions on ℍ/SL2(ℤ) “modular functions” or “elliptic modular functions” to emphasize that we’re interested in moduli of elliptic curves (“in properties of elliptic curves”). Before I give precise definitions, let me guess what some of you might think right now: “Ok, so functions on ℍ/SL2(ℤ) are like observables, ℍ/SL2(ℤ) itself is like a phase space and elliptic curves are like configurations of a mechanical system. So maybe we can go further with this analogy and ask what on the elliptic curves part of the diagram is like a mechanical system?”.

Most curiously, there is a group of physicists that pursue this point of view! They call themselves “String theorists”, they’re a little bit weird and AFAIU they claim something like this: the basic things in the universe are not particles or whatever but tiny “strings” – 1-dimensional compact manifolds (1-dimensional compact manifold = finite number of circles). If we draw the “movement” of such a string in spacetime we get a two-dimensional manifold. For example, if we are dealing with some peaceful string that doesn’t change its spatial position, we’ll get a pipe in spacetime. However, a 1-circle string can change into a 2-circle string at some point in time – if this happens we’ll get “pants”. Also, a 1-circle string can change itself into a 2-circle string and then back into a 1-circle string – what we’ll get in this situation is topologically a torus without 2 disks. Also, it may happen that at moment 0 there is no string, at moment 1/4 there is a string consisting of one circle, later on it changes itself into two circles, then back to one circle, and at moment 3/4 it disappears again – in this case the 2-dimensional manifold we get in spacetime is a torus.

Although it’s not very important from the string theory point of view, we now focus on this last example, so that we make a connection with elliptic curves. Physicists say that the string at any given moment has some internal tension which gives rise to a complex structure on the resulting torus. And yes, they also say that two movements of a string (that is, two 2-dimensional tori with complex structures) are pretty much the same from the physical point of view if the tori are isomorphic. The phase space of all possible movements of a string (movements of this specific type: appear, change into two circles, back into one circle, disappear) is therefore ℍ/SL2(ℤ). We can roughly end our analogy by saying that a “movement of a string” is like a mechanical system. In this case, elliptic modular functions could be interpreted as “properties of a motion of a string” (or just “observables” :-).

I hope that what I wrote in the last 2 paragraphs is not total nonsense :-).

Let’s get back to reality. The theorem Siegel proved roughly states that some explicitly given function on ℍ (upper half-plane) is actually almost a modular function (that is, that this function is almost invariant under the action of SL2(ℤ) on ℍ). Let’s start with precise definitions and the statement of the theorem.

Suppose we have a meromorphic function f: ℍ –> ℂ (ℍ denotes the upper half-plane). We’ll say that f is a “modular function” iff the following two conditions hold:
a) f is invariant under the action of SL2(ℤ) on ℍ. That is, for every matrix A∊SL2(ℤ) and for every τ∊ℍ we have f(A∙τ)=f(τ).
b) There exists m∊ℤ such that the Fourier series of f is as follows: f(τ) = Σ_{n≥m} a(n)∙e^(2πinτ).

Note that by a) f is a periodic function with period 1 (because one has a matrix
1 1
0 1
in SL2(ℤ) which sends every τ∊ℍ to τ+1), so f has some Fourier series – b) requires that this series doesn’t have infinitely many negative terms.

Clearly, a modular function gives rise to a meromorphic function ℍ/SL2(ℤ) –> ℂ. However, we’ll also be interested in functions on ℍ which don’t give rise to functions ℍ/SL2(ℤ) –> ℂ but to sections of vector bundles over ℍ/SL2(ℤ). That’s the reason for the following definition:

Suppose we have a meromorphic function f: ℍ –> ℂ and k∊ℕ (so k=0,1,…). We’ll say that f is a “modular function of weight 2k” iff the following two conditions hold:
a) For every matrix A∊SL2(ℤ) and for every τ∊ℍ we have f(A∙τ)=(cτ+d)^(2k)∙f(τ), where (c d) is the lower row of the matrix A.
b) There exists m∊ℤ such that the Fourier series of f is as follows: f(τ) = Σ_{n≥m} a(n)∙e^(2πinτ).

Note that (cτ+d)^(2k) = 1 for (c d) = lower row of the matrix
1 1
0 1
so modular functions of weight 2k are periodic with period 1 and again the above remark applies. Also, we see that “modular functions of weight 0” are just “modular functions”.

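Let’s consider the complex line bundle Λ of complex differential forms over ℍ (of course, it’s isomorphic to the trivial bundle). Its sections are forms g(z)∙dz, where g is some complex-valued function on ℍ. We can also consider the kth tensor power of this bundle: Λ^(⊗k). Now, it’s straightforward to check that a modular function f of weight 2k gives rise to a section of Λ^(⊗k) which is invariant under the action of SL2(ℤ), namely f(z)∙dz^k (here dz^k = dz⊗dz⊗…⊗dz). The point is that d(A∙z) = (cz+d)^(-2)∙dz, so the factor (cz+d)^(2k) picked up by f is cancelled by the factor (cz+d)^(-2k) picked up by dz^k.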

We now define the star of the evening: the Dedekind eta function, η. It is a function ℍ –> ℂ defined by the equation
η(τ) = e^(πiτ/12)∙Π_{n≥1} (1 – e^(2πinτ)).

(Let me remind you that the group SL2(ℤ) is generated by the two matrices
1 1
0 1
and
0 -1
1 0
which act on ℍ by sending τ to τ+1 and sending τ to -1/τ, respectively. It’s straightforward to check that, to verify that a given meromorphic function is modular of weight 2k, it’s enough to check a) only for the above 2 matrices (and to check b), of course).)
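The reason why checking a) on generators is enough can be illustrated numerically. Write j(A,τ) = cτ+d for the factor appearing in a) (the name j is my own shorthand here). This factor satisfies j(A∙B, τ) = j(A, B∙τ)∙j(B, τ), so if a) holds for A and for B, it also holds for the product A∙B. The Python sketch below is only an illustration with hypothetical helper names: it checks this identity for the two generators and for one sample word in them.

```python
def act(matrix, tau):
    """Apply tau -> (a*tau + b) / (c*tau + d) for matrix ((a, b), (c, d))."""
    (a, b), (c, d) = matrix
    return (a * tau + b) / (c * tau + d)


def multiply(m1, m2):
    """Product of two 2x2 matrices given as tuples of rows."""
    (a, b), (c, d) = m1
    (e, f), (g, h) = m2
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def determinant(matrix):
    (a, b), (c, d) = matrix
    return a * d - b * c


def factor(matrix, tau):
    """The factor j(A, tau) = c*tau + d from condition a)."""
    (_, _), (c, d) = matrix
    return c * tau + d


def cocycle_gap(m1, m2, tau):
    """|j(m1*m2, tau) - j(m1, m2.tau) * j(m2, tau)|, which should be ~0."""
    left = factor(multiply(m1, m2), tau)
    right = factor(m1, act(m2, tau)) * factor(m2, tau)
    return abs(left - right)


T = ((1, 1), (0, 1))    # tau -> tau + 1
S = ((0, -1), (1, 0))   # tau -> -1/tau

tau = complex(-0.4, 1.3)
word = multiply(S, multiply(T, multiply(T, S)))  # the sample word S*T*T*S

print(cocycle_gap(S, T, tau), cocycle_gap(T, S, tau), cocycle_gap(word, S, tau))
```

Raising the identity to the power 2k gives exactly the compatibility needed to pass from the two generators to any product of them.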

One easily checks that η is meromorphic and that it fulfills b). We’ll investigate to what extent η fulfills a).

Unfortunately, η is not a modular function. Indeed, one easily sees that η(τ+1) = e^(πi/12)∙η(τ). However, this is still a nice transformation law – the 24th power of η still has a chance to be a modular function of weight 12. Indeed, this is the case, as (among other things) the following theorem shows:

Theorem With A Tricky Proof By Siegel: η(-1/τ) = (-iτ)^(1/2)∙η(τ).
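Before believing a tricky proof it is reassuring to test the statement numerically. The Python sketch below is my own illustration: it truncates the infinite product defining η after a fixed number of factors (the cut-off of 100 is an arbitrary assumption, fine when the imaginary part of τ is not tiny) and takes the square root to be the principal branch, which is the one that is positive for τ=i. It then measures how far both transformation laws, and the weight 12 law for the 24th power, are from holding at one sample point.

```python
import cmath
import math


def eta(tau, terms=100):
    """Dedekind eta function via its product, truncated after `terms` factors."""
    if tau.imag <= 0:
        raise ValueError("tau must lie in the upper half-plane")
    q = cmath.exp(2j * math.pi * tau)
    product = 1 + 0j
    q_power = 1 + 0j
    for _ in range(terms):
        q_power *= q            # q_power is now q^n
        product *= 1 - q_power
    return cmath.exp(1j * math.pi * tau / 12) * product


tau = complex(0.3, 0.8)

# eta(tau + 1) = e^(pi*i/12) * eta(tau)
shift_gap = abs(eta(tau + 1) - cmath.exp(1j * math.pi / 12) * eta(tau))

# eta(-1/tau) = (-i*tau)^(1/2) * eta(tau), principal branch of the square root
inversion_gap = abs(eta(-1 / tau) - cmath.sqrt(-1j * tau) * eta(tau))

# 24th power: eta(-1/tau)^24 = tau^12 * eta(tau)^24, since (-i)^12 = 1
weight12_gap = abs(eta(-1 / tau) ** 24 - tau ** 12 * eta(tau) ** 24)

print(shift_gap, inversion_gap, weight12_gap)
```

All three gaps should come out at the level of floating-point rounding error. This is of course no proof, just a check that no sign or factor went astray.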

Before the tricky proof, let me once more remark that from this it indeed follows that the 24th power of η is a modular function of weight 12 (this is straightforward). However, this theorem has many other consequences (which one can find, for example, in Apostol’s book; the following proof can also be found there, in the 3rd chapter).

Tricky Proof By C.L. Siegel: Check it out in Apostol’s book :-)