James Franklin
British Journal for the Philosophy of Science 38 (1987), pp. 1-18
(An updated version of this article appears as chapter 15 of An Aristotelian Realist Philosophy of Mathematics and separately in A. Aberdein and I. Dove, eds, The Argument of Mathematics, Springer, 2013, pp. 11-29.)
Abstract: Mathematicians often speak of conjectures as being confirmed by evidence that falls short of proof. For their own conjectures, evidence justifies further work in looking for a proof. Those conjectures of mathematics that have long resisted proof, such as Fermat's Last Theorem and the Riemann Hypothesis, have had to be considered in terms of the evidence for and against them. It is argued here that it is not adequate to describe the relation of evidence to hypothesis as `subjective', `heuristic' or `pragmatic', but that there must be an element of what it is rational to believe on the evidence, that is, of non-deductive logic.
The occurrence of non-deductive logic, or logical probability, in mathematics is an embarrassment. It is embarrassing to mathematicians, used to regarding deductive logic as the only real logic. It is embarrassing for those statisticians who wish to see probability as solely about random processes or relative frequencies: surely there is nothing probabilistic about the truths of mathematics? It is a problem for philosophers who believe that induction is justified not by logic but by natural laws or the ‘uniformity of nature’: mathematics is the same no matter how lawless nature may be. It is awkward even for proponents of non-deductive logic. If non-deductive logic deals with logical relations weaker than entailment, how can such relations hold between the necessary truths of mathematics?
Previous work on this topic has therefore been rare. There is one notable exception, the pair of books by the mathematician George Polya on Mathematics and Plausible Reasoning [1954]. Despite their excellence, they have been little noticed by mathematicians, and even less by philosophers. Undoubtedly this is largely because of Polya’s unfortunate choice of the word ‘plausible’ in his title – ‘plausible’ has a subjective, psychological ring to it, so that the word is almost equivalent to ‘convincing’ or ‘rhetorically persuasive’. Arguments that happen to persuade, for psychological reasons, are rightly regarded as of little interest in mathematics and philosophy. Polya in fact made it clear, however, that he was not concerned with subjective impressions, but with what degree of belief was justified by the evidence ([1954] I p. 68). This will be the point of view argued for here, and there will be frequent reference to Polya’s books.
Non-deductive logic deals with the support, short of entailment, that some propositions give to others. If a proposition has already been proved true, there is of course no longer any need to consider non-conclusive evidence for it. Consequently, non-deductive logic will be found in mathematics in those areas where mathematicians consider propositions which are not yet proved. These are of two kinds. First there are those that any working mathematician deals with in his preliminary work before finding the proofs he hopes to publish, or indeed before finding the theorems he hopes to prove. The second kind are the long-standing conjectures which have been written about by many mathematicians, but which have resisted proof.
It is obvious on reflection that a mathematician must use non-deductive logic in the first stages of his work on a problem. Mathematics cannot consist just of conjectures, refutations and proofs. Anyone can generate conjectures, but which ones are worth investigating? Which ones are relevant to the problem at hand? Which can be confirmed or refuted in some easy cases, so that there will be some indication of their truth in a reasonable time? Which might be capable of proof by a method in the mathematician’s repertoire? Which might follow from someone else’s theorem? Which are unlikely to yield an answer until after the next review of tenure? The mathematician must answer these questions to allocate his time and effort. But not all answers to these questions are equally good. To stay employed as a mathematician, he must answer a proportion of them well. But to say that some answers are better than others is to admit that some are, on the evidence he has, more reasonable than others, that is, are rationally better supported by the evidence. This is to accept a role for non-deductive logic.
The area where a mathematician must make the finest discriminations of this kind – and where he might, in theory, be guilty of professional negligence if he makes the wrong decisions – is as a supervisor advising a prospective Ph.D. student. It is usual for a student beginning a Ph.D. to choose some general field of mathematics and then to approach an expert in the field as a supervisor. The supervisor then chooses a problem in that field for the student to investigate. In mathematics, more than in any other discipline, the initial choice of problem is the crucial event in the Ph.D.-gathering process. The problem must be unsolved, and not being worked on by someone better placed to solve it first; but most importantly, it must be tractable, that is, probably solvable, or at least partially solvable, by a few years’ work at the student’s level.
It is recognised that of the enormous number of unsolved problems that have been or could be thought of, the tractable ones form a small proportion, and that it is difficult to discern which they are. The skill in non-deductive logic required of a supervisor is high. Hence the advice to Ph.D. students not to worry too much about what field or problem to choose, but to concentrate on finding a good supervisor.
It is also clear why it is hard to find Ph.D. problems that are also interesting.
It is not possible to dismiss these non-deductive techniques as simply ‘heuristic’ or ‘pragmatic’ or ‘subjective’. Although these are correct descriptions as far as they go, they give no insight into the crucial differences among techniques, namely, that some are more reasonable and consistently more successful than others. ‘Successful’ can mean ‘lucky’, but ‘consistently successful’ cannot. ‘If you have a lot of lucky breaks, it isn’t just an accident’, as Groucho Marx said (Chandler [1980], p. 560). Many techniques can be heuristic, in the sense of leading to the discovery of a true result, but we are especially interested in those which give reason to believe the truth has been arrived at, and justify further research. Allocation of effort on attempted proofs may be guided by many factors, which can hence be called ‘pragmatic’, but one must still distinguish the factors which make success more likely from those, such as sheer stubbornness, which do not. Opinions on which approaches are likely to be fruitful in solving some problem may differ, and hence be called ‘subjective’, but the beginning graduate student is not advised to pit his subjective opinion against the experts’ without good reason. Damon Runyon’s observation on horse-racing applies equally to courses of study: ‘The race is not always to the swift, nor the battle to the strong, but that’s the way you bet’ (Fadiman [1955], p. 794). An example where the experts agreed on their opinion, and were eventually proved right, is the classification of finite simple groups, described in the next section.
It is true that similar remarks could also be made about any attempt to see rational principles at work in the evaluation of hypotheses, not just those in mathematical research. In scientific investigations, various inductive principles obviously produce results, and are not simply dismissed as pragmatic, heuristic or subjective. Yet it is common to suppose that they are not principles of logic, but work because of natural laws (or the principle of causality, or the regularity of nature). This option is not available in the mathematical case. Mathematics is true in all worlds, chaotic or regular; any principles governing the relationship between hypothesis and evidence in mathematics can only be logical.
In modern mathematics, it is usual to cover up the processes leading to the construction of a proof, when publishing it – naturally enough, since once a result is proved, any non-conclusive evidence that existed before the proof is no longer of interest. This was not always the case. Euler, in the eighteenth century, regularly published conjectures which he could not prove, with his evidence for them. He used, for example, some daring and obviously far from rigorous methods to conclude that the infinite sum
1 + 1/4 + 1/9 + 1/16 + ...
(where the numbers on the bottom of the fractions are the successive squares of whole
numbers) is equal to the prima facie unlikely value π²/6. Finding that the two expressions agreed to seven decimal places, and that a similar method of argument led to the already proved result
1 - 1/3 + 1/5 - 1/7 + ... = π/4
Euler concluded, ‘For our method, which may appear to some as not reliable enough, a great confirmation comes here to light. Therefore, we shall not doubt at all of the other things which are derived by the same method’. He later proved the result (Polya [1954], I, pp. 18-21). A translation of another of Euler’s publications devoted to presenting ‘such evidence … as might be regarded as almost equivalent to a rigorous demonstration’ of a proposition is given as a chapter in Polya’s books ([1954], I, pp. 91-98).
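Euler’s numerical comparison is easy to repeat today. The following is only a sketch in Python (Euler, of course, had no computer and used clever acceleration of the slowly converging series; the crude partial sums below need very many terms to show several decimal places of agreement):

```python
# Compare crude partial sums of the two series with pi^2/6 and pi/4.
import math

def sum_reciprocal_squares(terms):
    """Partial sum 1 + 1/4 + 1/9 + ... + 1/terms^2."""
    return sum(1.0 / n**2 for n in range(1, terms + 1))

def leibniz_series(terms):
    """Partial sum 1 - 1/3 + 1/5 - 1/7 + ... (the first `terms` summands)."""
    return sum((-1)**k / (2 * k + 1) for k in range(terms))

print(sum_reciprocal_squares(10**6), math.pi**2 / 6)   # agree to about six decimals
print(leibniz_series(10**6), math.pi / 4)              # agree to about six decimals
```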
Even today, in fact, mathematicians occasionally mention in print the evidence that led to a theorem. Since the introduction of computers, and even more since the recent use of symbolic manipulation software packages, it has become possible to collect large amounts of evidence for certain kinds of conjectures (Andrews [1984], pp. 18 & 24).
At present, it is usual to delay publication until proofs have been found. This rule is broken only in work on those long-standing conjectures of mathematics which are believed to be true, but have so far resisted proof. Three of these are especially famous. To non-mathematicians, the best known are probably the conjectures in number theory called Fermat’s Last Theorem and Goldbach’s Conjecture. The evidence for these will be mentioned below. The most notable to mathematicians themselves, however, is undoubtedly the Riemann Hypothesis.
Riemann stated in his [1859] that he thought it ‘very likely’ that
All the roots of the Riemann zeta function (with certain trivial exceptions) have real part equal to ½.
This is the Riemann Hypothesis. The precise meaning of the terms involved is not very difficult to grasp (account in Edwards [1974]) but for the present purpose it is only necessary to observe that this is a simple universal proposition like ‘all ravens are black’. It is also true that the roots of the Riemann zeta function have a natural order, so that one can speak of ‘the first million roots’ (there are infinitely many roots). Once it became clear that the Riemann Hypothesis would be very hard to prove, it was natural to look for evidence of its truth (or falsity). The simplest kind of evidence would be ordinary induction: Calculate as many of the roots as possible and see if they all have real part ½. This is in principle straightforward (though in practice computational mathematics is difficult, since one needs to devise subtle algorithms which save as much calculation as possible, so that the results can go as far as possible). Such numerical work was begun by Riemann, and was carried on later with the results below:
Worker | Number of roots found to have real part ½
Gram (1903) | 15
Backlund (1914) | 79
Hutchinson (1925) | 138
Titchmarsh (1935/6) | 1,041
‘Broadly speaking, the computations of Gram, Backlund and Hutchinson contributed substantially to the plausibility of the Riemann Hypothesis, but gave no insight into the question of why it might be true’ (Edwards [1974], p. 97).
The next investigations were able to use electronic computers, and the results were:
Worker | Number of roots found to have real part ½
Lehmer (1956) | 25,000
Meller (1958) | 35,337
Lehman (1966) | 250,000
Rosser, Yohe & Schoenfeld (1968) | 3,500,000
Brent (1979) | 81,000,001
(Table from Brent et al. [1982].)
In recent years, work with the latest computers has proceeded with a collaboration between R. P. Brent, in Canberra, and van de Lune, te Riele and Winter, in Amsterdam. The latest result in print is from the Dutch team [1983]: the first 300,000,001 roots of the Riemann zeta function have real part ½. They promise further results. It is one of the largest inductions in the world.
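A toy version of this kind of verification can be run with the mpmath Python library (a sketch only: the serious computations use far subtler algorithms, and a genuine verification must also count the total number of roots up to a given height, which is omitted here). Roots on the critical line show up as sign changes of the real-valued Riemann-Siegel function Z(t):

```python
# Count sign changes of the Riemann-Siegel Z function on a grid up to t_max;
# each sign change marks a root of the zeta function with real part exactly 1/2.
from mpmath import mp, siegelz

mp.dps = 25  # working precision in decimal digits

def roots_on_line(t_max, step=0.1):
    count = 0
    prev = siegelz(step)
    t = 2 * step
    while t <= t_max:
        cur = siegelz(t)
        if prev * cur < 0:      # Z(t) changes sign: a root on the critical line
            count += 1
        prev = cur
        t += step
    return count

print(roots_on_line(100))       # prints the number of roots located below height 100
```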
Besides this simple inductive evidence, there are some other reasons for believing that Riemann’s Hypothesis is true (and some reasons for doubting it). Among the reasons in favour is Denjoy’s probabilistic argument (referred to in the quotation from Edwards below). It is known that if a fair coin is tossed n times then, for any ε > 0, with probability one the number of heads minus the number of tails in the n tosses grows less rapidly than n^(1/2+ε). (Recall that n^(1/2) is another notation for √n.)
Now we form a sequence of ‘heads’ and ‘tails’ by the following rule: Go along the sequence of numbers and look at their prime factors. If a number has two or more prime factors equal (i.e., is divisible by a square), do nothing. If not, its prime factors must be all different; if it has an even number of prime factors, write ‘heads’. If it has an odd number of prime factors, write ‘tails’. The sequence begins:
2 = 2 : T     3 = 3 : T      4 = 2² : –     5 = 5 : T      6 = 2×3 : H     7 = 7 : T
8 = 2³ : –    9 = 3² : –     10 = 2×5 : H   11 = 11 : T    12 = 2²×3 : –   13 = 13 : T
14 = 2×7 : H  15 = 3×5 : H   16 = 2⁴ : –    17 = 17 : T    …
(– marks a number divisible by a square, for which nothing is written.)
So the sequence of ‘tosses’ begins: T T T H T H T T H H T
The resulting sequence is of course not ‘random’ in the sense of ‘probabilistic’, since it is totally determined. But it does look ‘random’ in the sense of ‘patternless’ or ‘erratic’ (such sequences are common in number theory, and are studied by the branch of the subject called, perhaps misleadingly, ‘probabilistic number theory’; see Kubilius [1964].) From the above, it is likely that
For any ε > 0, the number of heads minus the number of tails in the first n ‘tosses’ in this sequence grows less rapidly than n^(1/2+ε).
This statement is equivalent to Riemann’s Hypothesis. Edwards comments, in his book on the Riemann zeta function,
One of the things which makes the Riemann hypothesis so difficult is the fact that there is no plausibility argument, no hint of a reason, however unrigorous, why it should be true. This fact gives some importance to Denjoy’s probabilistic interpretation of the Riemann hypothesis which, though it is quite absurd when considered carefully, gives a fleeting glimmer of plausibility to the Riemann hypothesis (Edwards [1974], p. 268).
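The ‘coin tossing’ sequence just described is easy to generate. A minimal sketch in Python (not part of the original discussion): write +1 for ‘heads’ and −1 for ‘tails’ over the square-free numbers and observe how slowly the running total grows compared with √n.

```python
# Heads (+1) for an even number of distinct prime factors, tails (-1) for odd,
# nothing for numbers divisible by a square; track the running total.
def prime_factors(n):
    """Prime factors of n, with multiplicity."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def heads_minus_tails(limit):
    total = 0
    for n in range(2, limit + 1):
        f = prime_factors(n)
        if len(set(f)) == len(f):                 # square-free: all prime factors distinct
            total += 1 if len(f) % 2 == 0 else -1
    return total

for limit in (10**3, 10**4, 10**5):
    print(limit, heads_minus_tails(limit), round(limit ** 0.5))
```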
In the balance against the Riemann Hypothesis there are the following arguments. First, Riemann’s own reasons for confidence appear to have rested on the behaviour of what is now called the Riemann-Siegel formula; but later computations turned up roots that come remarkably close to violating the Hypothesis, discoveries which deprive any such argument of its force. Edwards writes:
The discoveries … completely vitiate any argument based on the Riemann-Siegel formula and suggest that, unless some basic cause is operating which has eluded mathematicians for 110 years, occasional roots ρ off the line [i.e., with real part not ½] are altogether possible. In short, although Riemann’s insight was stupendous it was not supernatural, and what seemed ‘probable’ to him in 1859 might seem less so today (Edwards [1974], p. 166).
This is an example of the non-deductive rule given by Polya, ‘Our confidence in a conjecture can only diminish when a possible ground for the conjecture is exploded’ (Polya [1954], II, p. 20).
The second argument is by analogy: a similar accumulation of numerical evidence once misled number theorists about the estimate of the number of primes less than x by the logarithmic integral ∫ dt/log t (taken from 2 to x). If tables are drawn up for the number of primes less than x and the values of this integral, for x as far as calculations can reach, then it is always found that the number of primes less than x is actually less than the integral. On this evidence, it was thought for many years that this was true for all x. Nevertheless Littlewood proved that this is false. While he did not produce an actual number for which it is false, it appears that the first such number is extremely large – well beyond the range of computer calculations. Edwards comments
In the light of these observations, the evidence for the Riemann hypothesis provided by the computations of Rosser et al. … loses all its force (Edwards [1974], p. 269).
This seems too strong a conclusion, since the relevance of Littlewood’s discovery to Riemann’s Hypothesis is far from clear. But it does give some reason to suspect that there may be a very large counterexample to Riemann’s Hypothesis, although there are no small ones.
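The behaviour Littlewood’s theorem concerns is easy to observe in the range accessible to direct computation. A minimal sketch (a simple sieve and a crude numerical integral, nothing like the computations Edwards refers to): throughout any range like this, the count of primes below x stays below the logarithmic integral.

```python
# Tabulate pi(x), the number of primes <= x, against the logarithmic integral
# of 1/log t from 2 to x, for a few values of x.
import math

def prime_counts(limit):
    """Sieve of Eratosthenes; returns counts[x] = number of primes <= x."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for q in range(p * p, limit + 1, p):
                is_prime[q] = False
    counts, running = [0] * (limit + 1), 0
    for x in range(limit + 1):
        running += is_prime[x]
        counts[x] = running
    return counts

def log_integral(x, steps=100000):
    """Crude midpoint-rule value of the integral of 1/log t from 2 to x."""
    h = (x - 2) / steps
    return sum(h / math.log(2 + (k + 0.5) * h) for k in range(steps))

counts = prime_counts(10**6)
for x in (10**3, 10**4, 10**5, 10**6):
    print(x, counts[x], round(log_integral(x)))   # the prime count is the smaller in each case
```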
It is plain, then, that there is much more to be said about Riemann’s Hypothesis than, ‘It is neither proved nor disproved’. Without non-deductive logic, though, nothing more can be said.
The situation with Fermat’s Last Theorem and Goldbach’s Conjecture is similar, and the evidence for them can be described briefly.
Fermat’s Last Theorem, asserted by Fermat about 1637, is
It is impossible to find whole numbers X, Y, Z and n, with n ≥ 3, such that Xⁿ + Yⁿ = Zⁿ (with X, Y, Z not zero) (see Edwards [1977]).
Calculations can be undertaken to check that Xⁿ + Yⁿ ≠ Zⁿ for small values of X, Y, Z and n. It is often possible, however, for fixed values of the exponent n, to deal with all the (infinitely many) X, Y, Z at once. Fermat proved (with one of the many proofs of number theory that have survived for centuries without any counterexamples, monster-barring, concept-stretching or other approved aids) that there can be no solutions if n = 4. Other values were added later, and Kummer in 1847 showed that the Fermat equation has no solutions whenever n is a "regular" prime. The condition for being "regular" is rather complicated; it is believed (but not proved) that most primes are regular, and the first non-regular prime is 37. Later, it was found that if there is a counterexample to Fermat’s Last Theorem, then this implies certain conditions on n which can be checked by calculation. The best result in this direction is that of Wagstaff [1978], who checked by computer:
Fermat’s Last Theorem is true for all n up to 125,000.
It is also useful to divide Fermat’s Last Theorem into two cases according to some rather complicated conditions on X, Y, Z and n. The simpler first case has been checked by computer for n up to 6 × 10⁹ (Lehmer [1981]).
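Checking the Fermat equation directly for small values, as mentioned at the start of this discussion, is trivial by modern standards; a brute-force sketch (in no way resembling Kummer’s or Wagstaff’s methods):

```python
# Search exhaustively for X, Y, Z <= max_base and 3 <= n <= max_exp
# with X^n + Y^n = Z^n; no such solutions exist.
def fermat_counterexamples(max_base, max_exp):
    found = []
    for n in range(3, max_exp + 1):
        nth_powers = {z**n: z for z in range(1, max_base + 1)}
        for x in range(1, max_base + 1):
            for y in range(x, max_base + 1):
                s = x**n + y**n
                if s in nth_powers:
                    found.append((x, y, nth_powers[s], n))
    return found

print(fermat_counterexamples(200, 10))   # prints [] -- no counterexamples in this range
```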
In a letter to Euler of 1742, Goldbach conjectured that
Every even number (except 2) is the sum of two primes.
This conjecture is still neither proved nor disproved. Various consequences of it have been proved (Polya [1954], II, p. 210) and, remarkably, connections have appeared between Goldbach’s Conjecture and Riemann’s Hypothesis. Hardy and Littlewood proved in 1924 that a generalisation of Riemann’s Hypothesis and a certain estimate implied that most even integers are the sum of two primes. Vinogradov in [1937] showed that every sufficiently large odd integer is the sum of three primes, and these methods were soon adapted to show Hardy and Littlewood’s result without any assumptions. In [1948] Renyi found that every even number is the sum of a prime and an ‘almost prime’ (a number with few prime factors). Linnik in [1952] showed that Riemann’s Hypothesis itself implied a proposition relevant to Goldbach’s Conjecture. Results on the problem are still sometimes found, but there do not seem to have been dramatic advances in the last thirty years.
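The inductive evidence for Goldbach’s Conjecture can likewise be extended by routine computation. A minimal sketch, checking every even number up to ten thousand (the conjecture has of course been verified far beyond any such range):

```python
# List the even numbers up to `limit` (if any) that are NOT a sum of two primes.
def sieve(limit):
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for q in range(p * p, limit + 1, p):
                is_prime[q] = False
    return is_prime

def goldbach_failures(limit):
    is_prime = sieve(limit)
    return [even for even in range(4, limit + 1, 2)
            if not any(is_prime[p] and is_prime[even - p]
                       for p in range(2, even // 2 + 1))]

print(goldbach_failures(10**4))   # prints [] -- every even number tested is a sum of two primes
```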
A last mathematical example of the central role of non-deductive inference is provided by the classification of finite simple groups, one of the great co-operative efforts of modern pure mathematics. As a case study, it has the merit that the non-deductive character of certain aspects was admitted rather explicitly by the principals. This happened because of the size of the project. Since so many people were involved, living in different continents and working over some years, it was necessary to present partial findings in print and at conferences, with explanations as to how these bore on the overall results hoped for.
Groups are one of the basic abstract entities of mathematics, having uses in describing symmetry, in classifying the various kinds of curved surfaces and in many other areas. (‘Group’ is a technical term, and does not mean ‘collection’ as it does in English.) To read the following it is only necessary to know that the number of elements of a (finite) group is called its order, and that certain groups which cannot be decomposed into smaller ones are called simple groups: they are, roughly speaking, the building blocks out of which all finite groups are constructed.
A fundamental question is then: how many different finite simple groups are there? And what is the order of each? (One needs of course to prove the answers are correct.) It is these questions that were attacked by the classification of finite simple groups project.
The project proper covers the twenty years from 1962 to 1981 inclusive. Groups had been studied in the nineteenth and early twentieth centuries, and various finite simple groups were found. It was discovered that most of them fell into a number of infinite families. These families were quite well described by the mid-1950s, with some mopping-up operations later. There were, however, five finite simple groups left over from this classification, called the Mathieu groups after their discoverer in the 1860s. Around 1960 it was not known whether any more should be expected, or, if not, how much work it might take to prove that these were the only possible ones.
The field was opened up by the celebrated theorem of Feit and Thompson (of Cornell and Chicago Universities respectively) in [1963]:
The order of any finite simple group is an even number.
Though the result is easy to state and understand, their proof required an entire 255-page issue of the Pacific Journal of Mathematics. This theorem is a consequence of the full classification result (since if one knew all the finite simple groups, one could simply check that the order of each of them was even). It thus appeared that if the full classification could be found at all, it would be a vast undertaking.
In fact, the final step in the answer was added in February, 1981. The full proof is spread over some 300 to 500 journal papers, taking up somewhere between 5,000 and 10,000 pages (Gorenstein [1982], p. 1).
We wish to examine the logical situation as the proof developed, noting particularly the confidence – justified as it happened – that the workers in the field had in the answer long before the end was reached.
It turned out that the five Mathieu groups were not the only ‘sporadic’ groups, as groups outside the infinite families came to be called. The first new one was discovered by Zvonimir Janko [1966], in Canberra, and excitement ran high as researchers applied many methods and discovered more. The final tally of sporadic groups stands at 26. These ‘discoveries’ had in many cases a strong non-deductive aspect, as explained by Daniel Gorenstein, of Rutgers, who later became the father figure of the project and leading expert on how it was progressing:
Another aspect of sporadic group theory makes the analogy with elementary particle theory even more apt. In a number of cases (primarily but not exclusively those in which computer calculations were ultimately required) "discover" did not include the actual construction of a group – all that was established was strong evidence for the existence of a simple group G satisfying some specified set of conditions X. The operative metamathematical group principle is this: if the investigation of an arbitrary group G having property X does not lead to a contradiction but rather to a "compatible" internal subgroup structure, then there exists an actual group with property X. In all cases, the principle has been vindicated; however, the interval between discovery and construction has varied from a few months to several years (Gorenstein [1982], pp. 3-4).
Michael Aschbacher, another leader of the field in the 1970s, distinguished three stages for any new group: discovery, existence and uniqueness.
I understand a sporadic group to be discovered when a sufficient amount of self-consistent information about the group is available … Notice that under this definition the group can be discovered before it is shown to exist … Of course the group is said to exist when there is a proof that there exists some finite simple group satisfying P … (Aschbacher [1980], pp. 6-7).
Some groups attracted more suspicion than others; for example that discovered by Richard Lyons was for some time habitually denoted Ly? and spoken of in such terms as, ‘If this group exists, it has the following properties …’ (Tits [1969/70], p. 204). Lyons entitled his original paper ‘Evidence for a new finite simple group’ (Lyons [1972]). A similar situation arose with another of the later groups, discovered by O’Nan. His paper, ‘Some evidence for the existence of a new finite simple group’, was devoted to finding ‘some properties of the new simple group G, whose existence is pointed at by the above theorems’ (O’Nan [1976], p. 422).
The rate of discovery of new sporadic groups slowed down after 1970, and attention turned more to the problem of showing that there were no more possible. At a conference at the University of Chicago in 1972 Gorenstein laid out a sixteen-point program for completing the classification (Gorenstein [1979]). It was thought over-optimistic at the time, but immense strides were soon made by Aschbacher, Glauberman and others, more or less following Gorenstein’s program.
The turning point undoubtedly occurred at the 1976 summer conference in Duluth, Minnesota. The theorems presented there were so strong that the audience was unable to avoid the conclusion that the full classification could not be far off. From that point on, the practicing finite group theorists became increasingly convinced that the "end was near" – at first within five years, then within two years, and finally momentarily. Residual skepticism was confined largely to the general mathematical community, which quite reasonably would not accept at face value the assertion that the classification theorem was "almost proved" (Gorenstein [1982], pp. 5-6).
(Notice that ‘almost proved’ indeed does not mean anything in deductive logic. With hindsight, one can say that a theorem was almost proved when most of the steps in the proof were found; but before a proof is complete, there can only be good reason to believe that a sequence of existing steps will constitute most of a future proof.)
By the time of the conference at Durham, England in 1978 (described in its Proceedings as on ‘the classification of simple groups, a programme which is now almost complete’) optimism ran even higher. At that stage existence and uniqueness had been proved for 24 of the sporadic groups, leaving two ‘for which considerable evidence exists’ (Collins [1980], p. 21). One of these was successfully dealt with in 1980 (‘four years after Janko’s initial evidence for such a sporadic group’ (Gorenstein [1982], p. 110)), and attention focussed on the last one, known as the ‘Monster’ because of its immense size (order about 10⁵⁴).
That the search for sporadic groups was not totally haphazard can be seen from the remarkable simultaneous realization by Fischer in West Germany and Griess in the United States in 1974 that there might be a simple group having a covering group … (Gorenstein [1982], p. 92).
Consequences of the existence of this group were then studied:
Soon after the initial "discovery", Griess, Conway and Norton noticed that every nontrivial irreducible character of a group G of type F₁ has degree at least 196,883 and very likely such a group G must have a character of this exact degree. Indeed, on this assumption, Fischer, D. Livingstone and Thorne eventually computed the full character table of such a group G (Gorenstein [1982], pp. 126-7).
Aschbacher, lecturing at Yale in 1978, said ([1980], p. 13):
When the Monster was discovered it was observed that, if the group existed, it must contain two new sporadic groups (the groups denoted by F3 and F5 in Table 2) whose existence had not been suspected up to that time. That is, these groups were discovered as subgroups of the Monster. Since that time the groups F3 and F5 have been shown to exist. This is analogous to the situation in the physical sciences where a theory is constructed which predicts certain physical phenomena that are later verified experimentally. Such verification is usually interpreted as evidence that the theory is correct. In this case, I take the existence of F3 and F5 to be very good evidence that the Monster exists.
He added, concerning the prospects of discovering more sporadic groups:
My belief is that there are at most a few groups yet to be discovered. If I were to bet, I would say no more (pp. 14-15).
Gorenstein’s survey article of 1978 (Gorenstein [1979]) contains perhaps the experts’ last sop to deductivism, the thesis that all logic is deductive. He wrote:
At the present time the determination of all finite simple groups is very nearly complete. Such an assertion is obviously presumptuous, if not meaningless, since one does not speak of theorems as "almost proved" (pp. 50-1).
To the deductivist, the fact that most steps in a proposed proof are completed is no reason to believe that the rest will be.
Undeterred, however, Gorenstein went on to say:
The complete proof, when it is obtained, will run to well over 5,000 journal pages! Moreover, it is likely that at the present time more than 80% of those pages exist (p. 51).
The assertion that the classification is nearly complete is really a prediction that the presently available techniques will be sufficient to deal with the problems still outstanding. In its support, we cite the fact that, with two exceptions, all open questions are open because no one has yet examined them and not because they involve some intrinsic difficulty (p. 52).
A year after the Durham conference, the experts assembled again at Santa Cruz, California, in a mood of what could only be called supreme confidence. Gorenstein’s survey opened with the remark:
My aim here is to present a brief outline of the classification of the finite simple groups, now rapidly nearing completion (Gorenstein [1980], p. 3).
Another contributor to the conference began his talk:
Now that the problem of classifying finite simple groups is probably close to completion … (Hunt [1980], p. 507).
What concern there was was less about the completion of the project than about what to do next; the editor of the conference proceedings began by commenting, ‘In the last year or so there have been widespread rumors that group theory is finished, that there is nothing more to be done’ (Mason [1980], p. xiii). These became something more than rumours when the New York Times Week in Review (June 22, 1980) headlined an article ‘A School of Theorists Works Itself Out of a Job.’
All this confidence proved justified. Griess was able to show the existence of the Monster, and finally, in 1981, Simon Norton of Cambridge University completed the proof of the uniqueness of the Monster (Gorenstein [1982], p. 1).
3 AXIOMATISATION OF NON-DEDUCTIVE LOGIC
The correctness of the above arguments is not affected by the success or failure of any attempts to formalise, or give axioms for, the notion of non-deductive support between propositions. Many fields of study, such as geometry in the time of Pythagoras or pattern-recognition today, have yielded bodies of truths while still resisting reduction to formal rules. Even so, it is natural to ask whether the concept is easily formalisable. This is not the place for detailed discussion, since the problem has nothing to do with mathematics, and has been dealt with mainly in the context of the philosophy of science. The axiomatisation that has proved serviceable is the familiar axiom system of conditional probability: if h (for ‘hypothesis’) and e (for ‘evidence’) are two propositions, P(h,e) is a number between 0 and 1 inclusive expressing the degree to which h is supported by e, which satisfies
P(not-h, e) = 1 − P(h, e)
P(h′, h & e) × P(h, e) = P(h, h′ & e) × P(h′, e)
While some authors (Carnap [1950], Stove [1973]) have been satisfied with this system, others (Keynes [1921], Koopman [1940]) have thought it too strong to attribute an exact number to P(h,e) in all cases, and have weakened the axioms accordingly. Their modifications are essentially minor.
Needless to say, command of these principles alone will not make anyone a shrewd judge of hypotheses, any more than perfection in deductive logic will make him a great mathematician. To achieve fame in mathematics, it is only necessary to string together enough deductive steps to prove an interesting proposition, and submit the results to Inventiones Mathematicae. The trick is finding the steps. Similarly in non-deductive logic, the problem is not in knowing the principles, but in bringing to bear the relevant evidence.
The principles nevertheless provide some help in deciding what evidence will be helpful in confirming the truth of a hypothesis. It is easy to derive from the above axioms the principle
If h & b implies e, but P(e,b) < 1, then P(h,e & b) > P(h,b).
If h is thought of as hypothesis, b as background information, and e as new evidence, this principle can be expressed as ‘The verification of a consequence renders a conjecture more probable’, in Polya’s words ([1954], II, p. 5). He calls this the ‘fundamental inductive pattern’; its use was amply illustrated in the examples in section 2 above. Further patterns of inductive inference, with mathematical examples, are given in Polya ([1954], ch. XII-XIII).
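The derivation of the fundamental inductive pattern from the two axioms is short enough to give here (a sketch; it uses the additional assumption, implicit in the pattern, that P(e, b) is not zero, and the fact, noted below, that the probability is 1 when the evidence entails the hypothesis). Reading the second axiom with b in place of e as the given evidence, and taking h′ to be e,

\[
P(e, h \,\&\, b)\, P(h, b) \;=\; P(h, e \,\&\, b)\, P(e, b).
\]

Since h & b entails e, P(e, h & b) = 1, and so

\[
P(h, e \,\&\, b) \;=\; \frac{P(h, b)}{P(e, b)} \;>\; P(h, b),
\]

because 0 < P(e, b) < 1.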
There is one point that needs to be made precise especially in applying these rules in mathematics. If e entails h, then P(h,e) is 1. But in mathematics, the typical case is that e does entail h, though this is perhaps as yet unknown. If, however, P(h,e) is really 1, how is it possible in the meantime to discuss the (non-deductive) support that e may give to h, that is, to treat P(h,e) as not equal to 1? In other words, if h and e are necessarily true or false, how can P(h,e) be other than 0 or 1?
The answer is that, in both deductive and non-deductive logic, there can be many logical relations between two propositions; some may be known and some not. To take an artificially simple example in deductive logic, consider the argument
If all men are mortal, then this man is mortal
All men are mortal
Therefore, this man is mortal
The premises entail the conclusion, certainly, but there is more to it than that. They entail the conclusion in two ways: firstly, by modus ponens, and secondly by instantiation from the second premise alone. More complicated and realistic cases are common in the mathematical literature. Feit and Thompson’s proof that all finite simple groups have even order, occupying 255 pages, was simplified by Bender [1970]. That means that Bender found a different, and shorter, logical route from the definition of ‘finite simple group’ to the proposition, ‘All finite simple groups have even order’ than the one known to Feit and Thompson.
Now just as there can be two deductive paths between premises and conclusion, so there can be a deductive and non-deductive path, with only the latter known. Before the Greeks’ development of deductive geometry, it was possible to argue
All equilateral (plane) triangles so far measured have been found to be equiangular
This triangle is equilateral
Therefore, this triangle is equiangular
There is a non-deductive logical relation between the premises and the conclusion; the premises support the conclusion. But when deductive geometry appeared, it was found that there was also a deductive relation, since the second premise alone entails the conclusion. This discovery in no way vitiates the correctness of the previous non-deductive reasoning, or casts doubt on the existence of the non-deductive relation.
4 THE PROBLEM OF INDUCTION IN MATHEMATICS
That non-deductive logic is used in mathematics is important first of all to mathematics. But there is also some wider significance for philosophy, in relation to the problem of induction, or inference from the observed to the unobserved.
It is common to discuss induction using only examples from the natural world, such as, ‘All observed flames have been hot, so the next flame observed will be hot’ and ‘All observed ravens have been black, so the next observed raven will be black’. This has encouraged the view that the problem of induction should be solved in terms of natural laws (or causes, or dispositions, or the regularity of nature) that provide a kind of cement to bind the observed to the unobserved. The difficulty for such a view is that it does not apply to mathematics, where induction works just as well as in natural science.
Examples of this were given above in section 2, in connection with Riemann’s Hypothesis, but let us take a particularly straightforward case:
The first million digits of π are random
Therefore, the second million digits of π are random.
(‘Random’ here means ‘without pattern’, not ‘probabilistically generated’.)
The number π has the decimal expansion
3.14159265358979323846264338327950288419716939937...
There is no apparent pattern in these numbers. The first million digits have in fact been written down (Guilloud and Bouyer [1974]). Inspection of these digits reveals no pattern, and computer calculations can confirm this impression. It can then be argued inductively that the second million digits will likewise exhibit no pattern. This induction is a good one (indeed, everyone believes that the digits of π continue to be random indefinitely, though there is no proof), and there seems to be no reason to distinguish the reasoning involved here from that used in inductions about flames or ravens. But the digits of π are the same in all possible worlds, whatever natural laws may hold in them, or fail to. Any reasoning about π is also rational or otherwise, regardless of any empirical facts about natural laws. Therefore, induction can be rational independently of whether there are natural laws.
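The kind of inspection described can be sketched with the mpmath Python library (ten thousand digits rather than a million, for speed): compute the digits and tally how often each appears; a patternless expansion should give roughly equal counts.

```python
# Tally the first ten thousand decimal digits of pi.
from collections import Counter
from mpmath import mp

mp.dps = 10010                   # working precision: a little over ten thousand digits
digits = str(mp.pi)[2:10002]     # drop the leading "3." and keep 10,000 decimal digits
print(Counter(digits))           # each digit 0-9 should occur roughly a thousand times
```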
This argument does not show that natural laws have no place in discussing induction. It may be that mathematical examples of induction are rational because there are mathematical laws, and that the aim with scientific examples is to find some substitute, such as natural laws, which will take the place of mathematical laws in accounting for the continuance of regularity. But if this line of reasoning is pursued, it is clear that simply making the supposition, ‘There are laws’, is of little help in making inductive inferences. No doubt mathematics is completely lawlike, but this does not help at all in deciding whether the digits of π continue to be random. In the absence of any proofs, induction is needed to support the law (if it is a law), ‘The digits of π are random’, rather than the law giving support to the induction. Either ‘The digits of π are random’ or ‘The digits of π are not random’ is a law, but in the absence of knowledge as to which, we are left only with the confirmation the evidence gives to the first of these hypotheses. Thus consideration of a mathematical example reveals what can be lost sight of in the search for laws: laws or no laws, non-deductive logic is needed to make inductive inferences.
It is worth noting that there are also mathematical analogues of Goodman’s ‘grue’ paradox. Let a number be called ‘prue’ if its decimal expansion is random for the first million digits and consists entirely of 6s thereafter. The predicate ‘prue’ is like ‘grue’ in not being projectible. ‘π is random for the first million digits’ is logically equivalent to ‘π is prue for the first million digits’, but this proposition supports ‘π is random always’, not ‘π is prue’. Any solution to the ‘grue’ paradox must allow projectible or ‘natural’ properties to be found not only in nature but also in mathematics.
These considerations illustrate Polya’s remark that non-deductive logic is better appreciated in mathematics than in the natural sciences ([1954], II, p. 24). In mathematics there can be no confusion over natural laws, the regularity of nature, approximations, propensities, the theory-ladenness of observation, pragmatics, scientific revolutions, the social relations of science or any other red herrings. There are only the hypothesis, the evidence and the logical relations between them.
REFERENCES
Andrews, G. E. [1984]: ‘On the Wall Polynomials and the L-M-W Conjectures’, J. Australian Math. Soc., Ser. A, 37, pp. 17-26.
Aschbacher, M. [1980]: The Finite Simple Groups and Their Classification. Yale University Press.
Bender, H. [1970]: ‘On the Uniqueness Theorem’, Illinois J. Math., 14, pp. 376-84.
Brent, R. P., Van de Lune, J., Te Riele, H. J. J. and Winter, D. T. [1982]: ‘On the Zeros of the Riemann Zeta Function in the Critical Strip (II)’, Mathematics of Computation, 39, pp. 681-88.
Carnap, R. [1950]: Logical Foundations of Probability. Routledge.
Chandler, C. [1980]: Hello, I Must Be Going: Groucho Marx and His Friends. Sphere.
Collins, M. J. (ed.) [1980]: Finite Simple Groups II. Academic.
Cooperstein, B. and Mason, G. (eds.) [1980]: The Santa Cruz Conference on Finite Groups, Proc. of Symposia in Pure Mathematics, vol. 37. Amer. Math. Soc.
Davis, P. J. and Hersh, R. [1981]: The Mathematical Experience. Birkhauser. Ch. 8.
Edwards, H. M. [1974]: Riemann’s Zeta Function. Academic.
Edwards, H. M. [1977]: Fermat’s Last Theorem. Springer.
Fadiman, C. [1955]: The American Treasury. Harper.
Feit, W. and Thompson, J. [1963]: ‘Solvability of Groups of Odd Order’, Pacific J. Math., 13, pp. 775-1029.
Gorenstein, D. [1979]: ‘The Classification of Finite Simple Groups (I)’, Bull. Amer. Math. Soc. New Series, 1, pp. 43-199.
Gorenstein, D. [1980]: ‘An Outline of the Classification of Finite Simple Groups’, in Cooperstein and Mason [1980], pp. 3-28.
Gorenstein, D. [1982]: Finite Simple Groups. Plenum.
Guilloud, J. and Bouyer, M. [1974]: 1,000,000 de décimales de π. Commissariat à l’Énergie Atomique.
Hunt, D. C. [1980]: ‘A Computer-based Atlas of Finite Simple Groups’, in Cooperstein and Mason [1980], pp. 507-10.
James, R. D. [1949]: ‘Recent Progress in the Goldbach Problem’, Bull. Amer. Math. Soc., 55, pp. 246-60.
Janko, Z. [1966]: ‘A New Finite Simple Group With Abelian 2-Sylow Subgroups and Its Characterization’, J. Algebra, 3, pp. 147-86.
Keynes, J. M. [1921]: A Treatise on Probability. Macmillan.
Kolata, G. B. [1976]: ‘Mathematical Proofs: The Genesis of Reasonable Doubt’, Science, 192, pp. 989-90.
Koopman, B. O. [1940]: ‘The Axioms and Algebra of Intuitive Probability’, Annals of Mathematics, 42, pp. 269-92.
Kubilius, J. [1964]: Probabilistic Methods in the Theory of Numbers. Amer. Math. Soc. Translations of Mathematical Monographs, no. 11.
Lehmer, D. H. [1981]: ‘On Fermat’s Quotient, Base Two’, Mathematics of Computation, 36, pp. 289-90.
Linnik, Yu. [1952]: ‘Some Conditional Theorems Concerning the Binary Goldbach Problem’, Izv. Akad. Nauk SSSR, 16, pp. 503-20.
Lyons, R. [1972]: ‘Evidence for a New Finite Simple Group’, J. Algebra, 20, pp. 540-69.
Mason, G. [1980]: Preface to Cooperstein and Mason (1980).
O’Nan, M. [1976]: ‘Some Evidence for the Existence of a New Finite Simple Group’, Proc. London Math. Soc. 32, pp. 421-79.
Polya, G. [1954]: Mathematics and Plausible Reasoning (vol. I, Induction and Analogy in Mathematics, and vol. II, Patterns of Plausible Inference). Princeton University Press.
Renyi, A. [1962]: ‘On the Representation of an Even Number as the Sum of a Prime and an Almost Prime’, Amer. Math. Soc. Translations (2), 19, pp. 299-321.
Riemann, B. [1859]: ‘On the Number of Primes Less Than a Given Magnitude’, translated in Edwards [1974], pp. 299-305.
Stove, D. C. [1973]: Probability and Hume’s Inductive Scepticism. Oxford University Press.
Tits, J. [1969/70]: ‘Groupes Finis Simples Sporadiques’, Séminaire Bourbaki, in Springer Lecture Notes no. 180.
Van de Lune, J. and Te Riele, H. J. J. [1983]: ‘On the Zeros of the Riemann Zeta Function in the Critical Strip (III)’, Mathematics of Computation, 41, pp. 759-67.
Wagstaff, S. S. [1978]: ‘The Irregular Primes to 125000’, Mathematics of Computation, 32, pp. 583-91.
Weil, A. [1948]: Courbes Algébriques et Variétés Abéliennes. Hermann.
Epilogue
The belief that Fermat’s Last Theorem was true was confirmed by the celebrated proof of Andrew Wiles and Richard Taylor, published in 1995. Riemann’s Hypothesis remains unproved, but belief in it remains undiminished.
There is now a journal Experimental Mathematics, some of the papers in which include experimental evidence for conjectures. The Centre for Experimental and Constructive Mathematics at Simon Fraser University (www.cecm.sfu.ca) is a leader in the field. There has been some general discussion, but no overarching attempt at a theory of the confirmation of conjectures in mathematics.
Borwein, J., Borwein, P., Girgensohn, R. and Parnes, S. [1996]: ‘Making Sense of Experimental Mathematics’, Mathematical Intelligencer, 18 (4), pp. 12-18.
Epstein, D., Levy, S. and de la Llave, R. [1992]: ‘About This Journal’, Experimental Mathematics, 1, pp. 1-13.