“The 2456th Century was matter-oriented, as most Centuries were, so he had a right to expect a basic compatibility from the very beginning. It would have none of the utter confusion (for anyone born matter-oriented) of the energy vortices of the 300s, or the field dynamics of the 600s.”
(Isaac Asimov, The End of Eternity)
It is an inevitable feature of the introduction of any long-form proof mining paper that at the some point it mentions that the field was “suggested” by Georg Kriesel in the 1950s, under the name of proof unwinding, and then “given maturity” by Ulrich Kohlenbach and his collaborators starting in the 1990s, a fact which has always reminded me of the end of the Asimov passage quoted above. It is equally inevitable that every field of study experiences a moment following its inception when knowledge is low and ambition is high — see, for example, Antarctic exploration or artificial intelligence.
Let us see what this means for our subject at hand.
We start not with Kreisel, but with Hilbert himself, who established proof theory at the start of the twentieth century in order to realize a foundation for all of mathematics which one could secure by reliable means (if we wanted to start even earlier and more ambitiously, we could have done so with Frege and his program of reducing mathematics to logic, whatever that may mean). It is commonly said that Hilbert’s program went up in flames with the advent of Gödel’s incompleteness theorems, and less commonly said that it was somewhat salvaged by Gentzen and his followers in the art of ordinal analysis. There exist, though, features of his proof-theoretic work that have survived more or less unchanged, such as his goal of clarifying the role that ideal objects play in mathematical proofs, for example via the higher-order recursion later made manifest by System , which was already present in his aborted attempt of giving a proof of the continuum hypothesis from this 1926 paper (an English translation is available in this collection); in addition, the quantitative kind of results that would come to characterize proof mining were already present in his school — as the current version of the field’s mythology tells us — in Grete Hermann’s work on polynomial ideals from that same year, which would later lead to the theory of Gröbner bases and to modern combinatorial and computational commutative algebra.
As I said in the first post of this series, this was the moment when Kreisel came with his explicit program of applying proof-theoretic tools to concrete mathematical proofs. There is much to say about his research program and his proposed methods in the 1996 book Kreiseliana — which I heartily recommend for many reasons — and so I had to make a choice about what I could cover in a single post that would nevertheless not feel incomplete. Given the title of this series, I will state the specific mathematical results that the program has aimed to analyze, and try to give an idea on why those statements could be considered logically tractable.
The simplest statement that Kreisel had in mind was a purely numerical one of the sort I talked about in the first post, namely Littlewood’s result that there is a natural number for which
is positive, where
is the prime-counting function, while
is the logarithmic integral — by using suitable rational approximations one can easily see that this is a
statement — even
. Unfortunately, Kreisel’s efforts on finding upper bounds on that
were quite inconclusive (they were found in the meanwhile by other means), but the most significant aspect here was him pointing out that the task could in principle be done without any additional mathematical machinery.
A more structural statement (of the sort in the previous post) was the weak form of Hilbert’s Nullstellensatz, which essentially states that for every finite set of polynomials over an algebraically closed field with no common zero, one can write as a linear combination of them with polynomial coefficients (intuitively, a sort of “completeness theorem” for the “logic of polynomials”). The relevant quantitative problem was to give bounds on the coefficient polynomials in terms of just the number and degree of the polynomials one starts with. That this can be fit into the scheme I presented last time — and thus that computable bounds exist — was first shown by Abraham Robinson in this paper. Kreisel then showed that those bounds can be made primitive recursive. Again unfortunately, Hermann’s work that I linked above had already produced doubly exponential bounds on the problem (and later it was shown that even exponential ones exist).
Another algebraically-flavoured statement targeted by Kreisel (and for which again Robinson showed that it has the required logical form) was that of Artin’s solution to Hilbert’s seventeenth problem, stating that each polynomial in variables of degree
which is defined over a real-closed field in which every positive element can be expressed as the sum of
squares and which only takes non-negative values can be written as the sum of
squares of rational functions, and the problem is to find a bound for
in terms of
,
and
. In contrast to the above examples, this led to a highly productive back-and-forth between logical and algebraic methods, which is chronicled in detail in Charles Delzell’s contribution to Kreiseliana.
Among contributions of others to Kreisel’s program, I would firstly mention Girard’s work on the finitary van der Waerden theorem, which states that for any natural numbers and
there is a
such that for any partition into
sets of
there is an
and
,
with
such that for every
,
is in the
’th set of the partition (i.e. the
’th set contains an arithmetic progression of length
). It is easy to see that everything after the
in this statement is computable by quite elementary means, and thus one expects to be able to find bounds on the
in terms of
and
(the exact minimum value of
is called the van der Waerden number). In Girard’s Proof Theory and Logical Complexity, Vol. 1 (the second volume may be found at this location), he attempts two such analyses by cut elimination of the topological proof of the theorem due to Furstenberg and Weiss. The bounds he obtained, though, are of similar (Ackermannian) complexity to the ones evident from the original proof of van der Waerden. Smaller bounds were later obtained by non-logical means: first Shelah gave primitive recursive ones, then Gowers succeeded in reducing them to being just multiply exponential; on the other hand, Kohlenbach’s student Philipp Gerhardy managed to present a clearer account of the quantitative content inherent in the topological proof, now having the guidance of the general logical metatheorems.
The only attempt of those times that went beyond the barrier was that of Horst Luckhardt, who produced in the mid-1980s an innovative analysis of Roth’s theorem. The theorem states that for any algebraic irrational number
with minimal polynomial of degree
and any
there are only finitely many
’s such that
is an irreducible fraction whose distance to
is smaller than
, and the problem here would be to bound the number of those
’s in terms of
,
,
and the height of
(the last of which being defined as e.g. in this paper of Bombieri). This he obtained by first producing a form of the statement as in the conclusion of Herbrand’s theorem — not guaranteed to exist, as that theorem only applies to proofs in pure first-order logic — and then producing the bound as a consequence of the existence of some “growth conditions” on the corresponding Herbrand terms — again, not guaranteed to exist, as, for example, Kreisel, when first suggesting this avenue, had only thought of weaker growth conditions. Almost immediately, however, Bombieri and van der Poorten obtained similar bounds using more ad hoc methods, and their work was quickly generalized to more abstract statements by other researchers.
In an appendix to Takeuti’s 1987 second edition of his proof theory monograph, Kreisel noted that “in the late seventies I learnt of many areas (-functions, Galois cohomology, ergodic theory, topological dynamics), where mathematicians wanted to unwind proofs of
-theorems, but were not able to do so without logical guidance”. At the same time, however, Kohlenbach was starting to apply this kind of tools in an entirely new direction.
I will speak more about the Kohlenbach era in the next post, but I believe this early work of his is a better fit here, thematically speaking. His focus was on best approximation theory, namely proofs of the uniqueness of such best approximations. For example, in the classical theory of Chebyshev approximation, if one denotes the supremum norm by and — for any
— the class of real polynomials of degree at most
by
, one has that for any continuous
and any
, there is a unique
such that
What Kohlenbach did in his PhD thesis (written under the supervision of Luckhardt) was to obtain explicit so-called moduli of uniqueness, i.e., roughly speaking, functions such that for any
and
as in the above, any
and any
,
that have the corresponding approximation errors
and
within
of the desired minimum, one can then be sure that
. This work was published in these two papers, which also left the analyses of some related statements as open problems: the modulus of uniqueness for the case of
-approximation was found in 2001 by Kohlenbach together with his first PhD student Paulo Oliva, while the case of Chebyshev approximation with bounded coefficients was recently solved by myself.
It was a widely held belief back then that useful mathematical applications would require proof interpretations able to treat even stronger logical systems than first- or second-order arithmetic, which led Friedrich (another student of Luckhardt, previously mentioned here) and Kohlenbach to work on higher forms of bar recursion. Kohlenbach’s work in approximation theory, however, which was almost purely of a proof-theoretical nature in its methods, showed that in practice only very weak foundational axioms are used, which in turn only lead to functions of very low complexity (a phenomenon which is now called proof-theoretic tameness).
This post has been the most difficult for me to write so far, as it dealt with quite a number of facts which I could not easily recite off the top of my head. Next time we shall dwell closer to the present and thus to my own lived experience, as we shall see how this whole enterprise of unwinding proofs came to be what we know it to be today: proof mining.
1 thought on “What proof mining is about, Part III: The Heroic Age of Unwinding Proofs”