After the developments recounted at the end of the previous post, proof mining lay dormant for a number of years; and it was during this period that it got its current name, as per the suggestion of Dana Scott. It was not just a renaming, but a refinement of the concept: Kohlenbach defines proof mining as (quoting from memory) “that kind of unwinding which properly concerns itself with issues of representation”. Representations in mathematics are a topic of which Sam has previously written here; and their role in contemporary proof mining is due to the fact that many of the advances have concerned nonlinear analysis, so one has to represent in some way real numbers and the metric spaces one works with (see Chapter 4 of Kohlenbach’s monograph), or, at least, slightly bypass the latter by means of the abstract types we talked about two posts ago. I will not dive too much into representation details, except perhaps to note that equality and order on real numbers are expressed by universal statements, and that quantifiers on real numbers may frequently be judiciously replaced by those on naturals.
Now, a ubiquitous kind of statement in analysis is sequential convergence: if is a metric space with distance function , is a sequence in and is a point in , then the fact that converges to is expressed by the following statement:
One has to get rid in some way of that pesky that feels exogenous to the situation (roughly speaking, it is an unknown and thus cannot be simply added as a constant to the language), and one does it mainly in two possible ways: either it is known from the start to be an easily expressed element (e.g. the number zero, as we shall see below), or it is not, and then one replaces convergence with the equivalent — and also — Cauchyness. The latter is also equivalent in the sense of what proof mining would want to obtain from a proof of the statement displayed above, namely a rate of convergence — a bound for the in terms of the — since that may be trivially interchanged with the similarly defined rate of Cauchyness.
What is actually a problem is that the desired formulas for bounds may not be guaranteed to exist, even when it comes to proofs of simple statements such as “every bounded monotone sequence is convergent” — in which case, the counterexamples — computable sequences with no computable rate of convergence — are known as Specker sequences (and Sam has also talked about those recently); and if you remember the first two posts, the relevant metatheorems only guaranteed the extraction of bounds for statements of level up to .
Thus, when we start analyzing convergence proofs, we find that they fall naturally into two classes: the ones that are to some degree constructive, such that we might cleverly exploit that feature in order to actually obtain a rate of convergence, or — as in the above — the ones where we know from the start that there cannot be a computable (or even a uniform) rate, so we are forced to look after something weaker.
I will now try, a bit differently than in the previous post, to give an accurate impression of how the statements that proof mining has analyzed in the last two decades look like, and of the main difficulties encountered. Since the endeavour can be said to have undergone an explosion in that period, I cannot possibly cover every single result — instead, I shall strive for legibility and talk only about those that illustrate most plainly the above two classes, those that represented the most significant advances, and those that can provide the necessary context and momentum.
Kohlenbach’s first foray into abstract nonlinear analysis concerned the asymptotic regularity of Krasnosel’skii-Mann iterations in Banach spaces. Let us see what this means. If is a convex subset of a Banach space, is a self-mapping of and is a sequence in , then a Krasnosel’skii-Mann iteration associated to this data is a sequence in such that for each , . I will now skip a lot over the history of the study of such iterations, but the main points here are that Ishikawa has shown in 1976 that if is nonexpansive (i.e. for all , , ), is bounded, and there is a such that for all , , then
while Groetsch had already shown in 1972 that in the case where the Banach space is uniformly convex, the latter two conditions on the sequence may be replaced by .
The above statement is usually called asymptotic regularity (of with respect to ) — note that it is not a new category of statements: it is plainly a statement of convergence, just of a different sequence than the main one , and proving it is usually the first step in the proving the (weak or strong) convergence of to a fixed point of . This may be made intuitive by noticing that while convergence signifies the fact that becomes closer to a fixed point, asymptotic regularity means that becomes closer to being a fixed point, to the idea of it.
Of course, given that it is a statement, it is, as per the discussion above, not guaranteed that one could find a corresponding bound for it — a “rate of asymptotic regularity”. What is specific to Krasnosel’skii-Mann iterations is that the sequence is in this case nonincreasing, and thus a bound for the apparently weaker statement
which may be extracted (as the statement is ) and is generally called a modulus of liminf, is already a rate of asymptotic regularity.
This extraction is what Kohlenbach did in 2001-2003 in a series of three papers, namely he obtained corresponding rates of asymptotic regularity for the iterations in Groetch’s and Ishikawa’s theorems, depending in addition on an upper bound on the diameter of , a modulus of uniform convexity for the space (in the case of Groetsch’s result), and witnesses for the respective conditions on the sequence . We note here that some rates had been previously obtained using deeper functional-analytic results by Kirk and Martinez-Yanez and by Baillon and Bruck in the case where is constant; in the general case, Kohlenbach’s proof-theoretically extracted rates remained the tightest known — namely exponential — until they were dethroned in 2014 by quadratic rates obtained using Bernoulli sum techniques.
The transition from the liminf statement to the asymptotic regularity one may be more elaborate than in the above while still remaining constructive. This is the case with the generalization of Groetsch’s above result to Ishikawa iterations due to Tan and Xu, which was analyzed by my advisor Laurențiu Leuștean in two papers (see also the later, similar analysis due to myself regarding the parallel algorithm). There, in order to obtain a rate of asymptotic regularity out of the modulus of liminf, one analyzes that second, constructive part of the proof as one would analyze an intuitionistic one, leading to a so-called hybrid approach to proof mining (evoking the hybrid functional interpretations of Dan Hernest and Paulo Oliva).
A rate of asymptotic regularity may even help us find a rate of convergence for the sequence in the special case where it may be proven that has a unique fixed point, call it (e.g. when belongs to one of the various classes that generalize the class of contractions); see also Section 4.1 of this survey by Kohlenbach and Oliva. Namely, one first extracts from that proof a modulus of uniqueness (see the end of the previous post), and then if is the rate of asymptotic regularity, then for any , we have that and hence . Thus, is a rate of convergence of to . Recently, this idea has been generalized to the idea of a modulus of regularity in this paper by Kohlenbach together with Genaro López-Acedo and Adriana Nicolae.
I’ve said before that in the cases where a computable rate of convergence is known not to exist, there still is something weaker that we may settle for. That something is an equivalent statement which is in a generalized sense — what is known in logic as the Kreisel no-counterexample interpretation or the Herbrand normal form, the dual of the much better-known Skolem normal form.
In the case of asymptotic regularity, which is a special case of convergence towards zero, the transformed sentence looks like this:
and a corresponding rate (a bound on the in terms of the , and other parameters) was first obtained for iterations of quasi-nonexpansive mappings by Kohlenbach and his student Branimir Lambov in this paper. More importantly, though, in the case where one deals with the convergence of the sequence itself, where as I said, the statement which one analyzes is the one of Cauchyness, the corresponding Herbrandization (which first appeared in this paper of Kohlenbach where he treats convergence in compact subsets using the concept of a modulus of total boundedness, an argument then vastly generalized in this paper of him together with Leuștean and Nicolae) looks like this:
This is the famous “metastability” statement, which was independently rediscovered by Terence Tao in 2007 during his work on multiple ergodic averages, introduced at the time to the public in this blog post and named as such by Jennifer Chayes. Kohlenbach’s general logical metatheorems of proof mining, which also came out in the mid-2000s, provide a guarantee that out of proofs formulated in quite powerful logical systems one can extract corresponding rates of metastability, and in addition, given the modularity of the Gödelian functional interpretation, if one has to deal with a genuine statement which arises as a further consequence of the Cauchyness statement, the only quantitative information which is needed in order to extract its corresponding bound is already contained in the rate of metastability.
With the advent of the metatheorems, the way was now open towards finding applications that would make serious use of these polished tools. The first target was the classical mean ergodic theorem of von Neumann in the usual formulation due to Riesz, which states that if is a Hilbert space, is a linear operator such that for all , , then for any , we have that the corresponding sequence of ergodic averages , where for each ,
is convergent. It was Avigad, Gerhardy and Towsner who first managed to analyze Riesz’s proof in order to obtain a rate of metastability, while Kohlenbach and Leuștean later analyzed a simpler proof due to Birkhoff in this paper.
A crucial feature of both of the above analyses is that they do not aim to analyze the proof as it is (i.e. as formulated in a system stronger than first-order arithmetic which would need bar recursion for its interpretation) but rather attempt to reduce the strength of the principles involved by arithmetizing them, in order, on one hand, to obtain a much more readable rate and on the other, to show that those strong principles are not actually needed in practical mathematics, which ties into the phenomenon of proof-theoretic tameness that I mentioned last time. Another instance of this has been the extraction of rates of metastability from the proofs of theorems of Browder and Wittmann, where an essential lemma miraculously turns out (as pointed out by Eyvind Briseid) to have an almost trivial interpretation (a proof-theoretic explanation for this has very recently been put forward by Ferreira, Leuștean and Pinto in this paper, by means of the bounded functional interpretation of Ferreira and Oliva). Similarly, the extension of Wittmann’s theorem to CAT(0) spaces due to Saejung, whose proof used very strong set-theoretic principles through the use of Banach limits, was shown by Kohlenbach and Leuștean to be provable using simpler combinatorial ideas (and later this interpreted proof was by itself generalized to CAT(κ) spaces by Leuștean and Nicolae). Unfortunately, in Kohlenbach’s only rate of metastability extracted from a proof of weak convergence, namely that of Baillon’s theorem, the use of bar recursion could only be eliminated post factum by outside means such as ordinal analysis.
What is now considered to be a “borderline case” for proof-theoretic tameness (see these slides) is the recent analysis by Kohlenbach and myself of the strong convergence theorem of Reich, a generalization of Browder’s theorem mentioned above. Even though we have managed to express the proof entirely within first-order arithmetic, the extraction is the most complex one considered so far that takes place entirely within that proof-theoretic strength and even later considerations only place the obtained formula within . It remains to be seen whether some future simplification will be able to yield a rate of primitive-recursive complexity.
In contrast to that, the last couple of years have also seen a return to the earlier goal of finding rates of asymptotic regularity, a highly compelling goal given that it concerns a particularly well-motivated statement — among those efforts, I would mention the analysis by Kohlenbach of Bauschke’s solution of the so-called “zero displacement conjecture”, where even though the proof uses strong set-theoretic principles via the use of classical results such as Minty’s theorem, its analysis shows that these are inconsequential to the computation of the actual rate, which turns out to be polynomial of degree eight.
As we’ve seen in the CAT(κ) example above, research in proof mining can easily spill over into the fields whose results it is analyzing. For example, firmly nonexpansive operators — a class of operators is fundamental to convex optimization — were first introduced in a nonlinear context in this paper by Ariza-Ruiz, Leuștean and López-Acedo. Later, these ideas were extended in order to introduce jointly firmly nonexpansive families of mappings, a tool via which one may formulate an abstract version of a highly-used algorithm of convex optimization known as the proximal point algorithm (the notion was recently revisited here).
Last but not least, I would like to mention some contributions of the American school of proof mining — in addition to the result on ergodic theory mentioned above. Specifically, Henry Towsner has obtained more intricate quantitative results on Banach spaces such as this one, while together with William Simmons (see these two papers) he has extended previous quantitative work on commutative algebra and even dived into differential algebra.
The upcoming part will be the final one in this series, so if you have any suggestions of topics that you are curious about or that you think I have overlooked, now is the time to tell me; anyway, I’ll try to cover everything that I feel I haven’t covered so far and should have — including prospects for future research and how does one go about actually doing proof mining.
2 thoughts on “What proof mining is about, Part IV: Kohlenbach & friends”
Hi Andrei, thanks for the interesting (series of) post(s), – it is quite useful for the non-proof-mining-focussed-proof-theorist! Could you add some dates of publication throughout the series of posts? It would be good to have an idea of the chronology. I chased a few, e.g. Kohlenbach’s *2001* series of papers, and certainly in that case it was interesting to know that his proof-extraction bound remained the best known for 13 years!
I wanted to dig deeper into the appropriate notion of here, for the non-specialist. I suppose that the higher-typed setting means that you deal with sentences where the quantifiers may range over objects of higher types, e.g. the real numbers or functions, – I suppose we can consider these sentences(?). For instance the equation display under the “Oh Mann, das ist Skolem!” photo uses such quantifiers. I would like to check some intuitions:
Hello Anupam, thanks for the appreciation!
* I’ve tried to present everything roughly in the historical order, but I only put minimal chronological signposts here in order to be able to jump around a bit when it’s thematically convenient. Maybe I’ll add some more.
* Yes, they are – this is why I said “in a generalized sense” and why I didn’t put any superscripts. Also, there are some type restrictions when it comes to interpreting theories with strength greater than PA.
* You’re right about the bounded quantifier thing.
* Yes, all the theories involved are weakly extensional (full extensionality may be dealt with sometimes by an elimination technique due to Luckhardt – see Sect. 10.4 in Kohlenbach’s book) and one needs in addition to prove extensionality in some cases for the additional objects involved (there are instances where it turned out that some condition added by analysts to the mappings was exactly what was needed in order to show some weak form of extensionality – look for “Condition E” and “Condition W” in Kohlenbach’s papers). I did not say anything about this, firstly in order not to complicate things and secondly because I haven’t done personally any research into that so I may not be that knowledgeable.
(The ul tag in the comment editor doesn’t seem to work for me…)