I wrote an introductory note about elliptic curves and modular functions. At the beginning it was supposed to be just an introduction to an article about a very cool proof by Siegel. However, in the meantime I broke my finger (quite severely, the bone is in many pieces…) and it’s still hard for me to use a keyboard.
Accordingly, the first few paragraphs aren’t very relevant, as they describe my admiration for Siegel’s proof.
I use Unicode for mathematical notation. Enjoy :-)
I got back to Warsaw, having spent Easter in my hometown. The first lecture I attended after the break was “Modular Forms” and the lecturer served us a big chunk of juicy mathematics. Namely, he presented Carl Ludwig Siegel’s proof of the transformation formula for the Dedekind eta function.
It’s not that this proof is extremely difficult. It isn’t; it’s understandable for anybody after basic course in complex analysis. Rather, it’s extremely tricky, and, because it’s relatively short (thus easy to follow), this trickiness is what makes it a cool thing.
I’d like to write what is the theorem and how Siegel proved it. (Or rather, I’ll use it as an excuse to write a summary of what I’ve learned so far :-)
First of all, the proof is really tricky. It is so tricky that after getting the general idea I decided that it’s impossible for a normal human being to come up with something like that. If you come to the same conclusion after reading this post then it’s perhaps worth keeping in mind that Siegel’s proof wasn’t the first proof of the result I’ll describe in a moment. AFAIK, previous proofs were (much) longer, but they were more straightforward. Still, because of such things I wonder whether First Class Mathematicians are actually normal human beings…
Let me review, for introduction and motivation, what I’ve learned. (If you want to focus on the meat, skip a few paragraphs.) What we’re generally interested in are elliptic curves. For our purposes an elliptic curve is a 2-dimensional torus with a chosen complex structure (not an almost complex structure). As usual in mathematics, what we really care about are just isomorphism classes of objects, and in our case the appropriate isomorphism is obviously a biholomorphism (a diffeomorphism whose differential commutes with multiplication by i = √-1) (and “obviously” means here nothing but “by definition”). By some neat theorem every elliptic curve (“torus with a chosen complex structure”) is isomorphic to a torus of the form ℂ/Λ, where Λ is some lattice in ℝ²=ℂ.
(Of course, if we were only interested in topological properties of a torus it wouldn’t matter which lattice Λ we choose (say, whether we choose the lattice generated by vectors (1,0) and (0,1) or the one generated by (1,0) and (0,2)). ℂ/Λ is topologically always the same torus. However, the complex structure is different; you can visualize this by imagining a single torus with two complex structures coming from the above two lattices and asking yourself what multiplication by i does on a tangent plane of this torus. Answer: in the coordinates given by the generators, for the first lattice multiplication by i is a rotation by π/2, for the second it’s some other linear transformation.)
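Here is a small Python sketch of that visualization, in case you want to play with it. It writes multiplication by i as a 2×2 real matrix in the basis formed by two generators u, v of a lattice. The function name `mult_by_i_in_basis` and the sample generators are my own choices for illustration, nothing standard.

```python
def mult_by_i_in_basis(u, v):
    """Matrix of z -> i*z on R^2 = C, written in the real basis (u, v).

    u and v are complex numbers that are linearly independent over R.
    Column j of the result holds the coordinates of i*(j-th basis vector).
    """
    det = u.real * v.imag - u.imag * v.real
    if det == 0:
        raise ValueError("u and v do not span a lattice")

    def coords(z):
        # Solve z = x*u + y*v for real x, y (Cramer's rule).
        x = (z.real * v.imag - z.imag * v.real) / det
        y = (u.real * z.imag - u.imag * z.real) / det
        return x, y

    p, q = coords(1j * u)   # i*u = p*u + q*v
    r, s = coords(1j * v)   # i*v = r*u + s*v
    return ((p, r), (q, s))


# Generators 1 and i: multiplication by i is the usual rotation by pi/2.
square = mult_by_i_in_basis(1 + 0j, 1j)

# Generators 1 and 2i: in lattice coordinates it is no longer a rotation.
stretched = mult_by_i_in_basis(1 + 0j, 2j)

# Generators 1 and 1+i: same lattice as the first one, different basis,
# so the matrix changes although the elliptic curve does not.
sheared = mult_by_i_in_basis(1 + 0j, 1 + 1j)

print(square, stretched, sheared)
```

Whatever the basis, the matrix squares to minus the identity, as it must for a complex structure.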
For this reason, the name “elliptic curve” is really reserved for tori of the form ℂ/Λ (not tori with some weird complex structure). I adopt this convention from now on (and when I write “torus” it means “elliptic curve” as well, unless I explicitly state that I mean a topological torus (a torus without complex structure)).
(You may argue that elliptic curve is a stupid name for a torus – a torus is not a curve but something 2-dimensional. Well, the reason is that a torus is 1-dimensional if regarded as a complex manifold. You may have heard that an elliptic curve is a set of points in ℝ² given by an equation y² = a∙x³ + b∙x + c, or similar – such a set of points indeed looks like an honest 1-dimensional curve. To obtain such a curve from a 2-dimensional torus with complex structure one has to embed (holomorphically) this torus in ℂP² (the complex projective plane) and look at the intersection of the torus with some affine part of ℂP².)
Now let’s wonder when two such tori, ℂ/Λ and ℂ/Λ’, are isomorphic. It’s easy to see that if Λ’ is a rotation of Λ then the tori are isomorphic. Also (which is only a very small ε less obvious), if Λ’ is a stretching of Λ then the tori are isomorphic. These two properties are summarized by saying that if Λ’=αΛ for some nonzero α∊ℂ, then ℂ/Λ ≃ ℂ/Λ’. The converse is also true and not hard to understand if one knows that an isomorphism between two elliptic curves must come from an ℝ-linear isomorphism of ℂ (and this also follows from the aforementioned neat theorem (I believe so :-))).
It follows that every elliptic curve is isomorphic to one of the form ℂ/Λ for Λ generated by two complex numbers (previously I wrote “vectors” instead of “complex numbers” in this context), of which the first is equal to 1 and the second to some other complex number τ∊ℂ. Additionally, we can take τ to be from the upper half-plane (because we don’t care about orientation on our tori, so that ℂ/⟨1,β⟩ ≃ ℂ/⟨1,β*⟩ (β* is the complex conjugate of β, ⟨1,β⟩ denotes the lattice generated by 1 and β, and ≃ is induced by an ℝ-linear transformation of ℂ, complex conjugation), even though the lattices ⟨1,β⟩ and ⟨1,β*⟩ are in general different. Let’s adopt the convention that generators are always written in anticlockwise order).
Notice that τ is not something assigned to a torus ℂ/Λ, but rather to some chosen generators u,v of Λ (namely τ=v/u). If we change the generators of Λ then we’ll also change τ. It’s straightforward to check that if the new generators are v’=a∙v+b∙u, u’=c∙v+d∙u then τ will change in the following manner: τ’=(aτ+b)/(cτ+d) (check it yourself). Notice also that, because u’ and v’ are also generators of Λ, again written in anticlockwise order, the matrix
a b
c d
is an element of SL2(ℤ). So we have an action of a group SL2(ℤ) on an upper half-plane.
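If you’d rather let a computer “check it yourself”, here is a minimal numerical sketch. It assumes the convention τ = v/u and new generators v’ = a∙v + b∙u, u’ = c∙v + d∙u; the helper names (`mobius`, `change_generators`, `mat_mul`) and the sample numbers are made up for this illustration.

```python
def mobius(A, tau):
    """Action of A = ((a, b), (c, d)) on tau: (a*tau + b) / (c*tau + d)."""
    (a, b), (c, d) = A
    return (a * tau + b) / (c * tau + d)


def change_generators(A, u, v):
    """New generators (u', v') with v' = a*v + b*u and u' = c*v + d*u."""
    (a, b), (c, d) = A
    return c * v + d * u, a * v + b * u


def mat_mul(X, Y):
    """Product of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = X
    (e, f), (g, h) = Y
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


A = ((2, 1), (1, 1))        # determinant 1, so A is in SL2(Z)
B = ((1, 3), (0, 1))        # translation tau -> tau + 3
u, v = 1 + 0j, 0.3 + 0.8j   # generators of a sample lattice
tau = v / u

u_new, v_new = change_generators(A, u, v)
tau_new = v_new / u_new

# tau computed from the new generators agrees with the Moebius formula.
print(abs(tau_new - mobius(A, tau)))

# The new tau stays in the upper half-plane: Im(tau') = Im(tau) / |c*tau + d|^2.
print(tau_new.imag, tau.imag / abs(A[1][0] * tau + A[1][1]) ** 2)

# It really is a group action: (A*B).tau = A.(B.tau).
print(abs(mobius(mat_mul(A, B), tau) - mobius(A, mobius(B, tau))))
```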
We can now do something VERY cool, which I hear mathematicians often do and I always find a bit exciting. Instead of investigating properties of a single elliptic curve, we look at the set of all elliptic curves. We have a morphism from this set to ℍ/SL2(ℤ), where ℍ is the upper half-plane and the action of SL2(ℤ) on it is the one just described. This morphism takes an elliptic curve ℂ/Λ, chooses some pair of generators (u,v) of Λ and sends the elliptic curve to the orbit which contains the τ associated to (u,v) (it’s easy to see that τ=v/u).
Now, we cooked up the action of SL2(ℤ) on ℍ precisely in such a way that this morphism is well-defined – that is, it doesn’t depend on the choice of generators of Λ. Additionally, because two elliptic curves ℂ/Λ and ℂ/Λ’ are isomorphic precisely when Λ’=αΛ, we see that non-isomorphic elliptic curves are mapped to different points of ℍ/SL2(ℤ). (And, of course, every point of ℍ/SL2(ℤ) is the image of some elliptic curve.)
Summarizing, isomorphism classes of elliptic curves are precisely points of ℍ/SL2(ℤ) (more precisely: there is a bijection between…). Cool! We have given the set of isoclasses of elliptic curves additional geometric structure in a very natural way! (natural is the key word; there are plenty of bijections between the set of isoclasses of elliptic curves and your favorite geometrical object X but for most of these bijections there is no way to translate properties of X into useful information about isoclasses of elliptic curves)
Let me compare the situation we are in to one often encountered in classical mechanics: there we investigate the motion of some set of objects in ℝ³. So we cook up a space (called “phase space”) whose points are in correspondence with all possible configurations of our objects (for example, the phase space of two particles is ℝ¹² – a point of this ℝ¹² encodes positions and velocities (which are 3-vectors) of both particles). Now, real-valued functions on the phase space are sometimes called “observables” (however, more often this word is used for something in quantum mechanics which I unfortunately can’t understand), because they correspond to things we can observe (for example, in this ℝ¹² the function that sends a point to its first coordinate may correspond to observing the first spatial coordinate of the first particle; the function that sends a point of ℝ¹² to the sum of squares of its last six coordinates (perhaps with some coefficients, if the masses of both particles are not equal to 2) corresponds to observing the kinetic energy of the whole system).
I wouldn’t hesitate to call functions on a phase space “properties of a system” or “modular functions” – modular comes from the word “moduli” which means (more or less :-) “property” (not “This is my property, so go away or I’ll shoot you.” but “Properties of this kind of plants are interesting.”).
Similarly, we call functions on ℍ/SL2(ℤ) “modular functions” or “elliptic modular functions” to emphasize that we’re interested in moduli of elliptic curves (“in properties of elliptic curves”). Before I give precise definitions, let me guess what some of you might think right now: “Ok, so functions on ℍ/SL2(ℤ) are like observables, ℍ/SL2(ℤ) itself is like a phase space and elliptic curves are like configurations of a mechanical system. So maybe we can go further with this analogy and ask what on the elliptic curves part of the diagram is like a mechanical system?”.
Most curiously, there is a group of physicists that pursue this point of view! They call themselves “String theorists”, they’re a little bit weird and AFAIU they claim something like this: basic things in the universe are not particles or whatever but tiny “strings” – 1-dimensional compact manifolds (1-dimensional compact manifold = finite number of circles). If we draw a “movement” of such a string in spacetime we get a two dimensional manifold. For example, if we are dealing with some peaceful string that doesn’t change its spatial position, we’ll get a pipe in spacetime. However, a 1-circle string can change into a 2-circle string at some point in time – if this happens we’ll get “pants”. Also, a 1-circle string can change itself into a 2-circle string and then back into a 1-circle string – what we’ll get in this situation is topologically a torus without 2 disks. Also, it may happen that at moment 0 there is no string, at moment 1/4 there is a string consisting of one circle, later on it changes itself into two circles, then back to one circle, and at moment 3/4 it again disappears – in this case the 2-dimensional manifold we get in spacetime is a torus.
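To make the two examples of observables concrete, here is a tiny sketch in Python. I assume a point of the phase space is stored as a 12-tuple: positions of both particles first, then their velocities. That layout and the function names are just my choice for illustration.

```python
def first_position_coordinate(state):
    """Observable: first spatial coordinate of the first particle."""
    return state[0]


def kinetic_energy(state, masses=(2.0, 2.0)):
    """Observable: kinetic energy, sum over particles of (m/2) * |velocity|^2.

    state is a 12-tuple (x1, y1, z1, x2, y2, z2, vx1, vy1, vz1, vx2, vy2, vz2).
    With both masses equal to 2 this is just the sum of squares of the
    last six coordinates.
    """
    v1 = state[6:9]
    v2 = state[9:12]
    return (0.5 * masses[0] * sum(c * c for c in v1)
            + 0.5 * masses[1] * sum(c * c for c in v2))


state = (1.0, 0.0, 0.0, 0.0, 2.0, 0.0, 1.0, 2.0, 3.0, 0.0, -1.0, 0.5)
print(first_position_coordinate(state))
print(kinetic_energy(state))
print(kinetic_energy(state, masses=(1.0, 4.0)))
```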
Although it’s not very important from string theory point of view, we now focus on this last example, so that we make a connection with elliptic curves. Physicists say that the string in any given moment has some internal tension which gives rise to a complex structure on the resulting torus. And yes, they also say that a movement of a string (that is, 2-dimensional torus with a complex structure) is pretty much the same from the physical point of view if tori are isomorphic. Phase space of all possible movements of a string (movements of this specific type: appear, change into two circles, back into one circle, disappear) is therefore a ℍ/SL2(ℤ). We can roughly end our analogy with saying that “movement of a string” is like a mechanical system. In this case, elliptic modular functions could be interpreted as “properties of a motion of a string” (or just “observables” :-).
I hope that what I wrote in last 2 paragraphs is not a total nonsense :-).
Let’s get back to reality. The theorem Siegel proved roughly states that some explicitly given function on ℍ (the upper half-plane) is actually almost a modular function (that is, that this function is almost invariant under the action of SL2(ℤ) on ℍ). Let’s start with precise definitions and the statement of the theorem.
Suppose we have a meromorphic function f: ℍ –> ℂ (ℍ denotes the upper half-plane). We’ll say that f is a “modular function” iff the following two conditions hold:
a) f is invariant under the action of SL2(ℤ) on ℍ. That is, for every matrix A∊SL2(ℤ) and for every τ∊ℍ we have f(A∙τ)=f(τ).
b) There exists m∊ℤ such that the Fourier series of f is as follows: f(τ) = Σ_{n≥m} a(n)∙e^(2πinτ).
Note that by a) f is a periodic function with period 1 (because one has a matrix
1 1
0 1
in SL2(ℤ) which sends every τ∊ℍ to τ+1), so f has some Fourier series – b) requires that this series has only finitely many terms with negative index.
Clearly, modular function gives rise to a meromorphic function ℍ/SL2(ℤ) –> ℂ. However, we’ll be interested also in functions on ℍ which don’t give rise to functions ℍ/SL2(ℤ) –> ℂ but to sections of vector bundles over ℍ/SL2(ℤ). That’s a reason for the following definition:
Suppose we have a meromorphic function f: ℍ –> ℂ and k∊ℕ (so k=0,1,…). We’ll say that f is a “modular function of weight 2k” iff the following two conditions hold:
a) For every matrix A∊SL2(ℤ) and for every τ∊ℍ we have f(A∙τ)=(cτ+d)^(2k)∙f(τ), where (c d) is the lower row of the matrix A.
b) There exists m∊ℤ such that the Fourier series of f is as follows: f(τ) = Σ_{n≥m} a(n)∙e^(2πinτ).
Note that (cτ+d)^(2k) = 1 for (c d) = lower row of the matrix
1 1
0 1
so modular functions of weight 2k are periodic with period 1 and again the above remark applies. Also, we see that “modular functions of weight 0” are just “modular functions”.
Let’s consider the complex line bundle Λℂ of complex differential forms over ℍ (of course, it’s isomorphic to the trivial bundle). Its sections are forms g(z)∙dz, where g is some complex valued function on ℍ. We can also consider the kth tensor power of this bundle: Λℂ^(⊗k). Now, it’s straightforward to check that a modular function f of weight 2k gives rise to a section of Λℂ^(⊗k) which is invariant under SL2(ℤ), namely f(z)∙dz^k (here dz^k = dz⊗dz⊗…⊗dz); the point is that d(A∙z) = dz/(cz+d)², so the factor (cz+d)^(2k) coming from f cancels.
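The “straightforward to check” part boils down to one identity: for A in SL2(ℤ) with lower row (c d), the derivative of z ↦ A∙z is 1/(cz+d)². Here is a rough numerical sanity check of that identity using a central finite difference (the names `mobius` and `numerical_derivative`, the sample matrix and the step size are assumptions of this sketch, and a finite difference is of course no proof).

```python
def mobius(A, z):
    """Action of A = ((a, b), (c, d)) on z: (a*z + b) / (c*z + d)."""
    (a, b), (c, d) = A
    return (a * z + b) / (c * z + d)


def numerical_derivative(f, z, h=1e-6):
    """Central difference; for holomorphic f this approximates f'(z)."""
    return (f(z + h) - f(z - h)) / (2 * h)


A = ((2, 1), (1, 1))      # determinant 1
z = 0.3 + 0.8j            # a point of the upper half-plane
c, d = A[1]

approx = numerical_derivative(lambda w: mobius(A, w), z)
exact = 1 / (c * z + d) ** 2
print(abs(approx - exact))

# Consequence for weight 2k: if f(A.z) = (c*z + d)^(2k) * f(z), then
# f(A.z) * (d(A.z)/dz)^k = f(z), i.e. f(z) dz^k is invariant.
k = 6
print(abs((c * z + d) ** (2 * k) * exact ** k - 1))
```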
We now define the star of the evening: the Dedekind eta function, η. It is a function ℍ –> ℂ defined by the equation
η(τ) = e^(πiτ/12)∙Π_{n≥1}(1 – e^(2πinτ)).
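If you want to see some actual values of η, here is a minimal Python sketch that simply truncates the infinite product. The function name `eta` and the cutoff `terms=200` are arbitrary choices of mine; the truncation is only reasonable when the imaginary part of τ is not too small, because then e^(2πinτ) decays quickly.

```python
import cmath


def eta(tau, terms=200):
    """Dedekind eta via the truncated product
    e^(pi*i*tau/12) * prod_{n=1..terms} (1 - e^(2*pi*i*n*tau)).

    Requires Im(tau) > 0; accuracy degrades as Im(tau) approaches 0.
    """
    if tau.imag <= 0:
        raise ValueError("tau must lie in the upper half-plane")
    q = cmath.exp(2j * cmath.pi * tau)
    product = 1 + 0j
    q_power = q                      # holds q^n, starting with n = 1
    for _ in range(terms):
        product *= 1 - q_power
        q_power *= q
    return cmath.exp(1j * cmath.pi * tau / 12) * product


print(eta(1j))             # on the imaginary axis eta is real and positive
print(eta(0.3 + 0.8j))
```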
(Let me remind that a group SL2(ℤ) is generated by two matrices
1 1
0 1
and
0 -1
1 0
which act on ℍ by sending τ to τ+1 and sending τ to -1/τ, respectively. It’s straightforward to check that to check if given meromorphic function is modular of weight 2k it’s enough to check a) only for above 2 matrices (and check b), of course).)
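For the curious, here is a hedged sketch of why these two matrices generate everything: a Euclidean-algorithm style reduction that writes a given matrix of SL2(ℤ) as a word in S (the matrix sending τ to -1/τ) and T (the matrix sending τ to τ+1). The names `decompose` and `evaluate` and the word format are invented for this example; this is one of several possible ways to do the reduction.

```python
S = ((0, -1), (1, 0))
S_INV = ((0, 1), (-1, 0))
IDENTITY = ((1, 0), (0, 1))


def mat_mul(X, Y):
    """Product of two 2x2 integer matrices given as nested tuples."""
    (a, b), (c, d) = X
    (e, f), (g, h) = Y
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def decompose(A):
    """Write A in SL2(Z) as a word: a list of ("T", n) and ("S", n) factors,
    read left to right, whose product is A."""
    (a, b), (c, d) = A
    if a * d - b * c != 1:
        raise ValueError("matrix is not in SL2(Z)")
    word = []
    while c != 0:
        q = a // c
        # A = T^q * S^(-1) * (S * T^(-q) * A); the new lower-left entry
        # is a - q*c, which is strictly smaller than c in absolute value.
        word.append(("T", q))
        word.append(("S", -1))
        a, b, c, d = -c, -d, a - q * c, b - q * d
    # Now c = 0 and a = d = +1 or -1.
    if a == 1:
        word.append(("T", b))
    else:
        word.append(("S", 2))        # S^2 = -identity
        word.append(("T", -b))
    return word


def evaluate(word):
    """Multiply out a word produced by decompose."""
    M = IDENTITY
    for letter, n in word:
        if letter == "T":
            M = mat_mul(M, ((1, n), (0, 1)))
        else:
            base = S if n > 0 else S_INV
            for _ in range(abs(n)):
                M = mat_mul(M, base)
    return M


A = ((3, -5), (-7, 12))
word = decompose(A)
print(word)
print(evaluate(word) == A)
```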
One easily checks that η is holomorphic on ℍ (in particular meromorphic). Condition b) cannot hold literally for η itself, which is not quite periodic, but it does hold for η²⁴. We’ll investigate to what extent η fulfills a).
Unfortunately, η is not a modular function. Indeed, one sees easily that η(τ+1) = e^(πi/12)∙η(τ). However, this is still a nice transformation law – the 24th power of η still has a chance to be a modular function of weight 12. Indeed, this is the case, as (among other things) the following theorem shows:
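A quick numerical sanity check of this transformation law, as a sketch: η is approximated by a truncated product (the helper `eta`, the cutoff and the sample point τ are my own choices), and we compare η(τ+1) with e^(πi/12)∙η(τ).

```python
import cmath


def eta(tau, terms=200):
    """Truncated product for the Dedekind eta function, Im(tau) > 0."""
    q = cmath.exp(2j * cmath.pi * tau)
    product = 1 + 0j
    q_power = q
    for _ in range(terms):
        product *= 1 - q_power
        q_power *= q
    return cmath.exp(1j * cmath.pi * tau / 12) * product


tau = 0.3 + 0.8j
lhs = eta(tau + 1)
rhs = cmath.exp(1j * cmath.pi / 12) * eta(tau)
print(abs(lhs - rhs))

# The 24th power kills the phase factor, so eta^24 has period 1.
print(abs(eta(tau + 1) ** 24 - eta(tau) ** 24))
```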
Theorem With A Tricky Proof By Siegel: η(-1/τ) = (-iτ)^(1/2)∙η(τ) (with the branch of the square root that is positive for positive arguments).
Before the tricky proof, let me once more remark that from this it indeed follows that the 24th power of η is a modular function of weight 12 (this is straightforward). However, this theorem has many other consequences (which one can find for example in Apostol’s book; the following proof can also be found there, in the 3rd chapter).
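No substitute for the proof, but here is a numerical sketch of what the theorem claims. It assumes the principal branch of the square root (which is what `cmath.sqrt` computes; -iτ has positive real part for τ in ℍ, so there is no branch trouble) and a truncated product for η. The helper `eta` and the sample point are again my own choices.

```python
import cmath


def eta(tau, terms=200):
    """Truncated product for the Dedekind eta function, Im(tau) > 0."""
    q = cmath.exp(2j * cmath.pi * tau)
    product = 1 + 0j
    q_power = q
    for _ in range(terms):
        product *= 1 - q_power
        q_power *= q
    return cmath.exp(1j * cmath.pi * tau / 12) * product


tau = 0.3 + 0.8j

# The transformation formula itself.
lhs = eta(-1 / tau)
rhs = cmath.sqrt(-1j * tau) * eta(tau)
print(abs(lhs - rhs))

# Weight 12 for the 24th power: (-i*tau)^12 = tau^12 because (-i)^12 = 1.
lhs24 = eta(-1 / tau) ** 24
rhs24 = tau ** 12 * eta(tau) ** 24
print(abs(lhs24 - rhs24))
```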
Tricky Proof By C.L. Siegel: Check it out in Apostol’s book :-)

7 comments
May 22, 2007 at 8:41 pm
John Armstrong
Whew! A lot there. First off, good job in the explanation. I do have a couple notes, though.
I think mentioning orientation here is a red herring. You can just rotate around your generating vectors until one hits the x-axis and the other is in the upper half-plane. If one is on the axis and the other is in the lower plane, keep rotating. If both are on the axis you don’t have a lattice.
Oh, but you do understand them. You just don’t know it yet!
You’re looking at the algebra of real-valued (measurable) functions on a manifold. Just move over to the algebra of self-adjoint operators on a Hilbert space and presto! operators.
Don’t believe me yet? The real-valued functions contain a subalgebra of characteristic functions. These reflect the boolean (sigma-)algebra of measurable subsets — property states. Now just move over to the algebra of projection operators on a Hilbert space to get the nonboolean algebra of quantum states.
Try Jeffrey Bub’s Interpreting the Quantum World for more.
Next, there’s something screwey in your Fourier series formula in condition (b) of modularity. No biggie, really.
And your ending.. it reminds me of the “you’re not a monk” joke.
May 24, 2007 at 11:56 am
sirix
I fixed modularity condition, thanks for pointing this out.
As to observables, I’ll definitely check Bub’s book – details of what you say are still unclear to me:
On one hand I have a manifold with functions – these are classical observables.
On the other hand I have one quantum observable – self-adjoint operator A and morphism of algebras:
measurable functions —> self-adjoint operators (f goes to f(A))
Where did I get A from? If I have a function f, what is f(A)?
May 24, 2007 at 3:29 pm
John Armstrong
Let me try to rephrase. I’m not talking about a homomorphism, but an analogy.
Property states in classical mechanics are points in phase space. In quantum mechanics they’re rays in a Hilbert space. Predicates — yes/no questions like “is the particle in the box” — are measurable subsets of phase space classically. In quantum mechanics they’re subspaces of the Hilbert space.
Now we change a bit. Instead of talking about the measurable subsets of phase space we can talk about their characteristic functions, which take only 0 and 1 as values. On the quantum side we switch from subspaces to their projection operators, which have only 0 and 1 as eigenvalues. These correspond to observing the result of a yes/no question like “is the particle in the box?” or “is the cat alive?”
Predicates — regions of phase space in the classical picture — can be expressed with inequalities of measurable functions. To ask if the particle is in the box is to ask if a collection of inequalities describing the sides of the box is satisfied. Thus we’re interested in observing the values of real-valued measurable functions on phase space. When we move to the quantum picture, the analogous role to real-valued functions on a measure space is played by self-adjoint operators on the Hilbert space.
And those self-adjoint operators are what quantum mechanics calls “observables”.
May 26, 2007 at 11:08 am
thomas1111
Nice work for the Unicode notation! This looks interesting, I’ll try to read it more carefully one day.
Now, if you’re interested by the story about classical and quantum observables I would recommend reading the papers by Alfonso Gracia-Saz, especially this one and references therein.
May 27, 2007 at 9:20 pm
sirix
John: Sorry for this misunderstanding… The very same day (!) I read your comment I learned a little about spectral theory – and there one has a homomorphism from measurable functions to self-adjoint operators, for any a priori chosen self-adjoint operator (function f goes to operator f(A) – one defines it easily on polynomials and then…). So, I got really excited by what you wrote – I thought there was some quantum mechanical side of what I’ve learned.
thomas1111: thanks for links. As to unicode, I made myself a keyboard mapping so that virtually all symbols are obtained by alt+letter or alt+shift+letter (it’s easy to do such thing in linux).
May 27, 2007 at 9:23 pm
sirix
thomas: And if you ever read it I’ll appreciate any comment :-). I’m unsure whether “string theory part” makes sense – I’d love if somebody checked whether it does.
May 27, 2007 at 10:04 pm
John Armstrong
No need to be sorry. If people understood everything they wouldn’t need me to teach :D
Seriously, though, I’m glad to hear of the happy coincidence. I admit I still don’t know exactly how deep that rabbit-hole goes, but it really helped me understand the logic of quantum mechanics to think of everything in terms of the change from the boolean lattice of subsets of phase space to the non-boolean lattice of subspaces of a Hilbert space of states.