Cryonics: The Probability of Rescue

Probability concepts often seem simple, and indeed not only experts but ordinary people apply them usefully every day; they are indispensable to survival, let alone successful living.

But at the same time, some of the most fundamental concepts are so subtle and slippery that even the greatest names in mathematics have often been confused.

When we try to apply probability to cryonics, we are combining two concepts, both very poorly understood by almost everyone, including the “experts”–biologists and mathematicians.

The following is an attempt to summarize and clarify the evidence for eventual reversibility of freezing and other damage to cryonic suspension patients, and to put it in a probability-theory framework. Laymen and scientists face two different problems in understanding this approach.

For the laymen, the greatest obstacle is just their own humility; they have often been trained to disbelieve in their own powers of analysis and judgment. It is useful to remember that the Commander-in-Chief of the U.S. armed forces is usually a layman (the President), and that the CEOs of major corporations selling technology are seldom engineers; yet they assess and understand technical advice (in broad outline) and make the ultimate decisions. Non-scientists reading the material below will find some of it too technical, but most of it understandable, if not easy. In particular, they will learn why their intuitions are often correct.

Scientists will face a harder job, viz., breaking the habit of preferring precise but irrelevant numbers over imprecise but pertinent data. What this means will become apparent as we proceed.

The first part of the discussion, laying the groundwork, may be the most difficult for both laymen and scientists (for different reasons). I hope the former will not be put off by the mathematics, which they can skip, and that the latter will pay attention.

Some scientists–usually honest–have forgotten their ethics when discussing cryonics, and made totally irresponsible and unfounded statements about the alleged “low probability” of success–without ever making a probability calculation, or offering any basis for such a calculation! I hope some of them will reconsider.

FOUNDATIONS OF PROBABILITY THEORY

(These notes are adapted from original work I did about forty-five years ago; the work was reviewed and stamped kosher by experts in the field.)

Approach A: von Mises

One of the best known “frequency” theories of probability is that advocated by Richard von Mises. [1] According to him, the probability of an event, with reference to an experiment, is defined as the limiting relative frequency (in the sense of measure theory) of occurrence of the event in an infinite sequence of identical experiments; the sequence has the randomness property that the results are “indifferent to place selection.” (Lay readers, please don’t get nervous; what this means will gradually become clear, at least in outline.)
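For readers who like to see such things concretely, here is a minimal sketch of the two ideas in the definition, using a finite simulated sequence in place of the infinite one. The function names, the seed, and the two particular selection rules are my own illustrative assumptions, not anything taken from von Mises. A “place selection” picks out a subsequence by a rule that may look at a trial’s position and at earlier results, but never at the result of the trial being selected.

```python
import random

def relative_frequency(outcomes):
    """Fraction of trials in which the event (coded as 1) occurred."""
    return sum(outcomes) / len(outcomes)

def select_every_other(outcomes):
    """Place selection by position only: keep trials 0, 2, 4, ..."""
    return outcomes[::2]

def select_after_head(outcomes):
    """Place selection by history: keep each trial that immediately follows a 1."""
    return [cur for prev, cur in zip(outcomes, outcomes[1:]) if prev == 1]

# A long but finite stand-in for the idealized infinite sequence.
rng = random.Random(0)  # fixed seed, an arbitrary choice for repeatability
tosses = [rng.randint(0, 1) for _ in range(100_000)]

freq_all = relative_frequency(tosses)
freq_every_other = relative_frequency(select_every_other(tosses))
freq_after_head = relative_frequency(select_after_head(tosses))

print(freq_all, freq_every_other, freq_after_head)
```

If the sequence is random in the required sense, all three frequencies should come out close to one another; a sequence such as 0, 1, 0, 1, ... would have an overall frequency of 1/2 but would fail the first selection badly.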

That neither infinite sequences nor identical experiments are ever encountered in the real world does not disturb von Mises; he points out that theories generally apply idealized mathematical models to nonideal situations, and that his theory meets the tests of useful description and prediction.

He considers that the technical term probability can apply only to mass phenomena or repetitive events, and never to “isolated” events or to intensity of belief as met in common parlance. “…if one speaks of the probability that the two poems known as the Iliad and the Odyssey have the same author, no reference to an infinite sequence of cases is possible and it makes no sense to assign a numerical value to such a ‘possibility.’” [2]

Remarks:

a) The experiments cannot be identical, even for idealized sequences, since if the experiments were truly identical so would be the results, giving rise to certainty rather than probability. (This is on the classical level, not the quantum level.) What is actually required for the idealized sequence is that the experiments be identical on the operational level; but it is also necessary that the operator’s control be imperfect.

b) One of the points slurred over in the exposition by von Mises is that one can only estimate, but never know, any probability. For example, even if a perfectly symmetrical coin were available, no finite sequence of trials could do more than suggest that the probability p for “heads” is in the neighborhood of 1/2. To leap from this estimate to the conclusion that “p actually is exactly 1/2” would require an additional postulate of an ad hoc character.
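A rough numerical sketch may make remark (b) more tangible. It compares how well two nearby hypotheses, p = 0.50 and p = 0.51, account for a perfectly balanced record of tosses. The function name and the particular numbers are illustrative assumptions of mine; the comparison is done with logarithms only to avoid arithmetic underflow on long sequences.

```python
from math import exp, log

def log_likelihood_ratio(heads, tosses, p_a, p_b):
    """Log of P(data | p_a) / P(data | p_b) for a binomial record.

    The binomial coefficient is the same under both hypotheses and cancels.
    """
    tails = tosses - heads
    return heads * log(p_a / p_b) + tails * log((1 - p_a) / (1 - p_b))

# A perfectly balanced record, at two different lengths.
ratio_100 = exp(log_likelihood_ratio(50, 100, 0.50, 0.51))
ratio_10000 = exp(log_likelihood_ratio(5_000, 10_000, 0.50, 0.51))

print(ratio_100)    # barely above 1: a hundred tosses cannot separate the two values
print(ratio_10000)  # noticeably larger, yet still far from decisive
```

Even the longer record only favors 1/2 over 0.51 by a modest factor, and it would do no better against 0.501. The data narrow the neighborhood; they never single out a point.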

c) Leibnitz is said to have objected to frequency theories on the ground that one could never exactly prescribe the conditions of the experiment. The answer of the frequentists seems to be that the application of a mathematical model is generally inexact. But von Mises’ theory is extraordinarily vague with respect to application; in fact, this vagueness seems severely to restrict its usefulness; and it has led to the contention by Jeffreys [7], Koopman [4], and many others that the theory really relies on intuition for its application, so that it is not a true frequency theory at all.

In particular, von Mises does not anywhere answer the important question: when is “reference to an infinite sequence of cases” possible? In other words, by what criterion is one to distinguish “repetitive” from “isolated” events? This point may be worth elaborating with a couple of examples:

The toss of a die is specifically admitted by von Mises as a typical repetitive event; yet by what criterion can it be so regarded? The criterion cannot be the fact that dice have often been tossed: for it is clear that the toss of–e.g.–a 12-sided die would be equally a “repetitive” event, although such a thing may never have existed before and might be destroyed after one toss.

The criterion cannot be merely the range of one’s imagination. Von Mises would regard the question of life on Mars as outside the theory; yet it may be as easy to imagine a population of planets Mars–or of Milky Way galaxies for that matter–as to imagine a sequence of tosses of 12-sided dice.

Since frequentists are generally definite enough about whether a particular event is within the scope of the theory, they are certainly applying some criterion, although they do not seem to know what it is. In the synthesis later on, I claim to make this criterion explicit, and thereby to extend the scope of the theory.

The key lies in this question: just how does one obtain information (as in the case of the 12-sided die) from a hypothetical experiment?

Approach B: Doob

An approach somewhat different from that of v. Mises may be found in what is sometimes called the “classical frequency” theory of probability; names associated with this are Doob, Neyman, Feller, and Cramer.

A mathematical model is set up on the basis of measure theory. Following Cramer [3], whose ideas for our purpose are similar enough to those of Doob, we may paraphrase the chief axiom of the theory thus:

AXIOM: “To any random variable X in the n-dimensional space R^n there corresponds a set function P(S) uniquely defined for all Borel sets S in R^n, such that P(S) represents the probability of the event X ∈ S; the function P(S) is a non-negative and additive set function such that P(R^n) = 1.”

O.K., not many readers are at home with Borel sets. Again, don’t get nervous; we’ll just say that pursuing this approach gives us an interpretation of the “limit” to which a relative frequency is said to tend.

But it turns out that the application of this theory also is a matter of judgment and guesswork. In the case of a die, for example, the axiom tells us that certain probabilities exist such that P(ace) + P(deuce) + P(three) + P(four) + P(five) + P(six) = 1,

but what these numbers may be, or whether they are all the same, the theory cannot say.
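The point can be checked mechanically. In the sketch below (the names and the particular “loaded” numbers are hypothetical, chosen only for illustration), a fair assignment and a heavily loaded one both pass the test imposed by the axiom, which therefore cannot choose between them.

```python
from fractions import Fraction

FACES = ("ace", "deuce", "three", "four", "five", "six")

def satisfies_axiom(distribution):
    """True if every face gets a non-negative probability and the total is 1."""
    return (
        set(distribution) == set(FACES)
        and all(p >= 0 for p in distribution.values())
        and sum(distribution.values()) == 1
    )

# Two very different assignments (the loaded one is an invented example).
fair = {face: Fraction(1, 6) for face in FACES}
loaded = dict(zip(FACES, [Fraction(1, 10)] * 5 + [Fraction(1, 2)]))

print(satisfies_axiom(fair), satisfies_axiom(loaded))
```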

Hence some of the same criticisms made of the v. Mises theory apply here also. We have a new notion of limit, and a new and more powerful calculus of probability, but with respect to application–finding the underlying probabilities–we find the same vagueness.

Approach C: Laplace

That definition of probability most commonly called “classical” is associated chiefly with the names of Pascal, Fermat, and Laplace. It states that if an experiment can have any of n “equipossible” and mutually exclusive outcomes, and if m of these outcomes are “favorable” to the event in question, then the probability of the event, referred to the experiment, is the ratio m/n.
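As a small illustration of the m/n rule, the sketch below simply counts favorable cases among cases assumed to be equipossible. The helper name and the two examples are mine; the assumption of equipossibility is built in by treating every listed outcome alike.

```python
from fractions import Fraction

def laplace_probability(outcomes, is_favorable):
    """Ratio m/n of favorable outcomes to all outcomes, taken as equipossible."""
    outcomes = list(outcomes)
    favorable = sum(1 for outcome in outcomes if is_favorable(outcome))
    return Fraction(favorable, len(outcomes))

die = range(1, 7)

# One die: three of the six faces are even.
p_even = laplace_probability(die, lambda face: face % 2 == 0)

# Two dice: six of the thirty-six ordered pairs sum to seven.
two_dice = [(a, b) for a in die for b in die]
p_seven = laplace_probability(two_dice, lambda roll: sum(roll) == 7)

print(p_even, p_seven)  # 1/2 1/6
```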

This definition does not seem “circular”, as claimed by v. Mises, because the notion of “equally likely” is less general, and more primitive, than the Laplacian notion of numerical probability. The definition is certainly incomplete, however, since it leaves “equally likely” as an undefined term. In philosophical language, the definition has only a “constitutive” aspect, and no “epistemic” aspect.

The definition is also said to be weak in that it cannot apply to those cases (e.g. a loaded die) where no breakdown into equally likely cases seems possible. However, the criticism seems to me ill founded. What we essentially have here is simply another starting point for a probability calculus, a viewpoint slightly different from that of v. Mises or Doob; with respect to application, this definition seems hardly weaker than the latter two; there are ways in which the Laplacian notion can be doctored up to include such experiments as tosses of a biased die. (One might say, e.g., that the die was equally likely to be in any of several specified positions as it approached the ground; this specification would favor the heavy side.)

The three theories so far considered all recognize that there must be some sort of correspondence between probabilities and relative frequencies, but none of them gives a criterion for actually assigning a probability to a physical event–not even the theory of v. Mises, although he claims to give the physical events first consideration.

The Laplacian definition is also said to be ambiguous when the possible outcomes are infinitely numerous. Mood e.g. points out that the probability that an integer drawn at random from all the integers will be even depends on the ordering of the integers. However, this objection at most seems academic rather than practical. After all, there does not seem to be any way to draw an integer at random from all the integers! (Even if you could, in the sense of measure theory there would be a zero probability of drawing one small enough to write down!)
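The dependence on ordering is easy to exhibit. The sketch below, restricted to the positive integers for simplicity (my assumption, as are the function names and the particular rearrangement), lists every integer exactly once in two different orders and measures the fraction of even numbers among the first terms of each.

```python
from itertools import count, islice

def natural_order():
    """1, 2, 3, 4, ..."""
    return count(1)

def two_odds_then_one_even():
    """1, 3, 2, 5, 7, 4, ...: every positive integer still appears exactly once."""
    odds = count(1, 2)
    evens = count(2, 2)
    while True:
        yield next(odds)
        yield next(odds)
        yield next(evens)

def fraction_even(ordering, n_terms):
    """Fraction of even numbers among the first n_terms of the ordering."""
    terms = islice(ordering, n_terms)
    return sum(1 for term in terms if term % 2 == 0) / n_terms

f_natural = fraction_even(natural_order(), 30_000)
f_rearranged = fraction_even(two_odds_then_one_even(), 30_000)

print(f_natural, f_rearranged)
```

In the natural order half the terms are even; in the rearranged order only a third are, although the two lists contain exactly the same numbers.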

In short, the Laplacian definition seems to me to rest on essentially equal footing with the two previously mentioned theories with regard to logical adequacy. From the practical standpoint, there are many instances where the application of the Laplace rule is the quickest and easiest resort.

Approach D: Koopman

There is a (minority) school, mostly in England, holding that the frequency theory of probability is inadequate; some names which have been associated with this thesis are Keynes, Koopman, Good, Jeffreys, Kendall, and de Finetti.

B.O. Koopman [4], e.g., has produced a detailed axiomatic theory, based on intuition, which seems to be in many respects as practical as v. Mises’ frequency theory and yet broader.

His axioms are based on what he calls “laws of thought,” which he claims are “not subject to experimental verification.” Using two axioms, together with a “body of beliefs,” he is able to construct probability statements about all kinds of events. In general, these statements are non-statistical and non-numerical, but for a certain class of problems Koopman deduces a numerical theory which, for many important applications, is in agreement with frequency theory.

On the epistemic side, Koopman’s position is essentially as follows: “The intuitive thesis in probability holds that both in its meaning and in the laws which it obeys, probability derives directly from the intuition, and is prior to objective experience; … and it holds that all the so-called objective definitions of probability depend for their effective application to concrete cases upon their translation into the terms of intuitive probability.”

What he means is closely related to the remarks made in connection with the v. Mises theory. In the case of a die, for example, the frequency theory does not give any explicit criterion which allows us to regard the toss as a repetitive event; neither does it give us any justification for taking the probability of a deuce e.g. as exactly 1/6. In practice, we rely on intuition, any frequency correspondence we may be able to exhibit being only an approximate one.

From one point of view, therefore, the difference between the Koopman “intuitive” theory and the v. Mises “frequency” theory is not so great after all. They are built on different formal bases; but in each case, in applications one must start with given probabilities, neither theory giving any explicit way to assign a value to an initial probability. The v. Mises theory supposedly restricts the events considered to outcomes of sequences of experiments; but, as pointed out by Feller, these are only conceptual experiments–and we are not told just what is and what is not “conceivable”.

Further remarks:

Kendall [5] has said that the intuitive attitude toward probability is one which “. . . takes probability as ‘a degree of rational belief’… and does not attempt to analyze it into simpler ideas,” and this is true of the Koopman view. The question is, can intuition be analyzed into simpler ideas, or is intuition indeed “prior to objective experience”?

To maintain that intuition cannot be analyzed seems philosophically and psychologically naive; and in fact many studies are available which show just how certain kinds of “intuition” are molded by experience. Intuition varies from one person to another, from time to time in a given individual, and from age to age with respect to mankind as a whole. Yesterday’s obvious truth becomes today’s fallacy, and today’s difficult lesson becomes tomorrow’s truism. In short, it seems consistent with modern ideas to assert, in flat contradiction to Koopman, that intuition derives from experience, and can in principle be explicitly analyzed in terms of experience. In fact, I claim below to show how probability intuitions arise. (This really isn’t very mysterious, or shouldn’t be.)

SYNTHESIS

Reconciliation of “Frequency” Probability and “Personal” Probability

We seek to show that the dominant frequency theory of probability can be applied in an extremely simple and natural way so as to include “single events” and “subjective” probability. It will then be possible to assign a definite (although not necessarily precise) numerical probability to any event of whatever kind: this probability will be at once objective (frequency) and subjective (dependent on state of knowledge).

(We are considering the epistemic or operational definition of probability, and not the calculus of probability or axioms of combination.)

We start with the v. Mises definition of probability of an event with reference to an experiment: the probability is the postulated limiting relative frequency of occurrence of the event in an infinite sequence of “identical” experiments whose results are indifferent to place selection. (The experiments are “identical” on the operational level.) Now let us note the important features of the application of this definition.

1. The postulated limiting frequency is never known exactly. Unless one adopts the dubious alternative of special postulates, such as ergodic principles, one cannot do more than estimate the probability to a finite number of decimal places on the basis of experience.

(Even when tossing a perfect coin, one cannot assume that the nervous system of the person doing the tossing yields perfectly random results; but experience indicates that the magnitude of any shove delivered by human muscles, below a certain level of accuracy, is random or nearly so.)

It is important to note here that the accuracy with which experience allows us to estimate a probability is of no theoretical importance: a probability of 0.5 +/- 0.4 is just as much in the scope of the theory as a probability of 0.5 +/- 0.0000001.

2. In practice, the experiments are not identical even on the operational level. Coins are asymmetrical; mortality tables are subject to revision; etc.

It is important to note therefore that any sequence of experiments remains in the scope of the theory so long as the experiments are sufficiently nearly identical to allow useful calculation.

3. Although frequentists allege that “isolated events” are outside the theory, there are at least two kinds of such events which they save from isolation:

a) Suppose I make a new toy, the 12-sided die mentioned earlier, toss it once and then destroy it. The event that on this toss a particular face will land up is in a sense an isolated event, since experience contains no sequence of such tosses. We therefore seek a larger class of experiments to save the event from isolation; in fact we take cognizance of our experience with symmetrical bodies and random shoves, and this does the trick.

b) “Derivative” probabilities are also recognized. E.g., the event “Phogbound will be reelected senator” is an isolated event; but it is essentially equivalent to the event “A majority of voters favor Phogbound,” and the probability of this latter event can be estimated by sampling.

4. We emphasize again what we stated in the definition: a probability refers not only to an event but also to an experiment in a sequence. Since specifying a particular experiment and sequence is equivalent to specifying a state of knowledge, we see that ordinary frequency probability has a subjective aspect. For examples see below.

Next, a short discussion of certain contentions by frequentists and subjectivists will clear the way for our general approach.

Frequentists say that the vague common parlance notion of “probability”, which applies to single events, hypotheses, etc., is outside the theory. They seem to overlook the fact that this common parlance notion has enormous practical success every day for everyone, and that this success should be capable of analysis. In fact, I assert that this success is due to the (non-explicit) application of just the theory here presented.

Subjectivists say that intuition is at the root of probability, and that intuition is prior to experience. This, as noted, seems to be a psychologically naive position: intuition is varied, intuition is educable, and in fact intuition derives mainly (although unconsciously) from experience.

An example will illustrate our method, and should clarify almost everything for those who have been uneasy with the stilted language of mathematics.

Suppose that next year an inter-conference football game is scheduled between Wayne State University and Michigan State University. We shall find three different “probabilities” that Wayne will win–each objective and with a frequency interpretation, but of course posited on different states of knowledge or different experiments.

Bettor A from Alabama knows nothing about Michigan teams, but he does know that the Associated Press poll has picked Michigan State (by a margin not specified), and that in several years this poll has consistently picked about 80% winners, give or take. For A, therefore, the probability that Wayne will win is about 20% or 0.20 or 1/5, and refers to the following experiment: pick the team chosen by the A.P. poll, and in the long run you will be right about 80% of the time.

Bettor B is a visiting Bantu with no knowledge of American football or polls. For him, the probability that Wayne will win is 1/2, and refers to the following experiment: pick a team by some arbitrary system, perhaps the team wearing a color you like better; in the long run, you will pick about 50% winners.

Bettor C is a coach who “rates” Michigan State four touchdowns better than Wayne; he further knows that, over a period of years, only 5% of four-touchdown underdogs, so rated by him, have won. For him, then, the probability that Wayne will win is 0.05 or 1/20, referred to the experiment indicated.

We see that by suitably examining experience one can find several probabilities for the same event; it is of obvious importance to choose that one which is based on the most appropriate experience (the most appropriate sequence of experiments).
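The three bettors can be put side by side in a few lines. In this sketch the record counts are hypothetical, invented only so that the relative frequencies match the percentages quoted above; what matters is that the same event receives a different probability from each recorded sequence.

```python
from fractions import Fraction

# Hypothetical records for each bettor's sequence of experiments:
# (games recorded, games won by the team in Wayne's position).
records = {
    "A": (200, 40),   # teams picked against by the A.P. poll
    "B": (200, 100),  # teams not chosen by an arbitrary system
    "C": (200, 10),   # four-touchdown underdogs as rated by the coach
}

def relative_frequency(games, wins):
    """Recorded relative frequency of the event in one sequence of experiments."""
    return Fraction(wins, games)

p_wayne_wins = {
    bettor: relative_frequency(games, wins)
    for bettor, (games, wins) in records.items()
}

for bettor, p in p_wayne_wins.items():
    print(bettor, p, float(p))
```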

As another example we choose an extreme instance specifically cited by v. Mises as outside the theory, and show that there is at least one way in which it can be treated on exactly the same basis as any other event, viz., the event “The Iliad and the Odyssey have the same author.”

We might make a list of eminent literary historians, tabulate their performance records with reference to disputes, as similar to this as possible, which have finally been settled, and then canvass their opinions. Thus we might arrive at a probability p for the event “the prevailing opinion is right,” referred to the following experiment: when a question of literary history arises, consult this group of experts and adopt the consensus opinion; in the long run, you will be right 100 p percent of the time.

Note that, in examining experience to construct the Kollektiv (v. Mises’ word for the set of events or sequence of experiments), we must look for instances as “similar as possible” to the one in question, and it might be thought that often one would be unable to find enough sufficiently similar instances. This is not the case. By considering ever broader classes of experiments, one can always find a suitable sequence of recorded experiments which are “identical” with respect to some sufficiently loose operational criterion.

In practice, one will often be confronted with the following dilemma: whether to use a sequence of experiments which are only roughly similar with respect to some relatively strict criterion; or to use a sequence of experiments which are very nearly identical, but with respect to a much looser criterion. In the football example above, bettor C, the coach, might have gradually changed his rating system over the years, so that the indicated recorded sequence contains experiments by no means identical; while bettor B, the Bantu, refers to an ideal sequence of indistinguishable experiments–and yet the Kollektiv constructed by C is clearly superior.

Scientists tend to love precisely quantified information, and with good reason; but sometimes they forget that the relevance of the information is more important than its precision.

Note, finally, that there is no difference at all, in principle, between constructing a Kollektiv for an historical hypothesis or, for example, for a combination at cards–even though such great minds as that of von Mises have been confused on this point. The difference is merely that in the latter case (cards) it is more obvious which Kollektiv is most appropriate, and it is one with minimal difference in the experiments; relevance and precision here happily coincide, but in most problems of real life this is not the case.

Deciding which body of experience is most appropriate is not difficult in principle: that body of recorded experiments is most appropriate which bears the most detailed resemblance to the experiment at hand. In the football example above, the Coach’s experience was obviously the most suitable (if available) because it most nearly approximated a sequence of games Wayne-vs.-Michigan State-1953. The Bantu’s sequence, on the other hand, was random team-vs.-random team. The Bantu’s probability number is much less uncertain (indeed it is ideally precise), but also much less appropriate, corresponding to a much less relevant body of knowledge.

Briefly to recapitulate thus far:

We define probability with reference to a v. Mises Kollektiv (except that ours is always finite, while v. Mises’ is infinite). To apply the definition in a particular case, we must find a recorded sequence of experiments which are sufficiently similar by some suitable criterion. This can be done in many ways (corresponding to different bodies of knowledge), but usually one can find a most appropriate way; in principle, one always can. The recorded relative frequency of occurrence of the event in this sequence gives the value of p, and no better value can be found without introducing new postulates.

(It should be carefully noticed that the rough, experimental value p is NOT an approximation to some precisely defined but unknown “real” value of p. There is no such “real” value, just as there are no infinite sequences of experiments conducted under precisely defined conditions–leaving out of account the question of physical statistics at the level of elementary particles. The value of p is almost always, necessarily, and inherently vague to some extent …. In estimating population means by sampling, it is true that a population mean can have a well defined value in certain cases, namely cases where the population is unchanging–but this is a rarity in the real world and of very little interest.)

Thus we treat frequency probability and personal probability on a single objective basis, such that we can assign a definite (although imprecise) numerical probability to any event of whatever kind. But “whatever kind” still needs some clarification, as does the question of the uncertainty in the number p, and certain other old questions of doctrine.

Some Probability Notions Clarified

In this section, arguments are abridged even more drastically, and sometimes only conclusions are stated. Those interested in the full discussion may write the author.

A random experiment–following Cramer [3]–is just one whose outcome cannot be surely predicted. Hence randomness is a quality of the observer as well as of the experiment.

Single Experiments: Since we define a probability as a relative frequency, what is meant by the probability of the outcome of an isolated experiment?

Von Mises says that although actuarial statistics may show that only about 1% of Americans die in their fortieth year, it is meaningless to speak about the probability of a particular individual dying during the year. He will live or he will die; he will not live 99% (at least according to pre-cryonics ideas). Yet it is obvious that if we select such a person at random, we have reason for considerable confidence that he will live. In fact, in contrast to that of v. Mises, our formulation above for the probability of an event permits it to be unique in some respects, yet also demands that it be considered part of a set.

Past Events & States of Nature. A great deal has been written about the distinction between “random variables” and “states of nature,” and whether it is possible to make “true probability statements” about the latter. Mood [9] and Neyman [10] for example claim a distinction between “random variables” and “unknown constants” or states of nature. For example, past events (the notion goes) cannot have probabilities; they either occurred or they did not.

In my view there is no such proper distinction. The outcome of every random experiment is equally a “random variable” in that it is not known in advance; likewise, every such outcome is equally a “state of nature,” in that (from a deterministic viewpoint, on the classical physics level) it is unalterably fixed in the structure of the world. To keep the discussion brief, I’ll cite just one example, which should be convincing.

Everyone agrees that coin-tossing is within the scope of the theory: the future fall of a coin is a random variable. But what about a coin already tossed but not yet inspected? A coin already tossed represents a past event and a state of nature–yet from the standpoint of the observer, there is no difference whatever between a future toss and a past (but still unknown) toss. The observer’s bet will be the same in either case.

Similar remarks apply to sample means, point estimation, confidence intervals, and to all questions of a priori versus a posteriori probabilities. All are treated on the same basis.

The Principle of Insufficient Reason. Disagreements persist about the “principle of indifference,” “principle of cogent reason,” “equal distribution of ignorance,” etc. My own view is one I have not seen expressed by any single writer, although its elements are scattered through the literature. The idea has much practical importance.

Thomas Bayes published his famous essay [6] in 1764, containing a theorem about the probability of an hypothesis as a function of the outcome of an experiment or test. One element of the formula is the a priori probability of the hypothesis. (After a bit we’ll see what this means, with an example.) Many writers consider this a priori probability to be, in general, impossible to ascertain or even meaningless, hence useless, and they avoid this very powerful theorem.

Often those who shy away from a Bayes estimate of an unknown parameter will use, instead, a “maximum likelihood” estimate due to R.A. Fisher–but it turns out that this is the same as the estimate obtained by maximizing the Bayes a posteriori probability, IF THE A PRIORI DISTRIBUTION IS UNIFORM, and this is an implicit application of the “principle of insufficient reason”–i.e., assuming all outcomes equally likely, in the absence of any information. In confidence interval estimation, also, since confidence intervals usually have maximum likelihood estimates as their centers, anyone who uses them is ACTING just AS THOUGH he believed in the principle of insufficient reason.
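The equivalence can be verified numerically in a simple case. The sketch below assumes a coin-tossing (binomial) experiment with invented data, and searches a grid of candidate values of p; the function names and the “peaked” prior are my own illustrative choices. With a uniform prior the Bayes posterior is proportional to the likelihood, so the two maxima must fall at the same place, whereas a non-uniform prior moves the answer.

```python
def likelihood(p, heads, tosses):
    """Binomial likelihood of p, omitting the constant binomial coefficient."""
    return p**heads * (1 - p) ** (tosses - heads)

def uniform_prior(p):
    """The principle of insufficient reason: every value of p weighted alike."""
    return 1.0

def peaked_prior(p):
    """A hypothetical non-uniform prior concentrated near p = 1/2."""
    return (p * (1 - p)) ** 10

def best_on_grid(score, steps=1000):
    """Grid value of p in [0, 1] that maximizes the given score."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=score)

heads, tosses = 7, 10  # hypothetical data

mle = best_on_grid(lambda p: likelihood(p, heads, tosses))
map_uniform = best_on_grid(lambda p: uniform_prior(p) * likelihood(p, heads, tosses))
map_peaked = best_on_grid(lambda p: peaked_prior(p) * likelihood(p, heads, tosses))

print(mle, map_uniform, map_peaked)
```

The first two results coincide; the third is pulled toward 1/2 by the prior.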

Finally we note, with Jeffreys [7], that in practice one cannot avoid using the principle of insufficient reason, in the following sense. Whenever one seeks a probability, one looks for relevant information, and he bases his estimate exclusively on this. But this is the same as saying that one begins with the principle of insufficient reason, and then modifies his judgment according to the information at hand!

This is more than a play on words. Consider a probability that is “well known” on the basis of a very long series of observations–perhaps the probability of heads on a coin toss, which we tried to show was based on total human experience with giving random shoves to symmetrical bodies. In the strict sense, this “well known probability” is merely a sample mean, and we should estimate the “true” probability by use of Bayes’ formula. But we cannot do this, because there does not exist any larger background of experience in terms of which we might define an a priori probability. We therefore fall back on the principle of indifference in the form of a maximum likelihood estimate, and this estimate equates the sample mean with the population mean, fixing the probability (and leaving it slightly blurred).

In a manner of speaking, then, what we do–because there is nothing else to do–is this: we assume that the portion of the world that we know is a representative sample of the whole. The interpretation of this principle, however, can be very tricky.

APPLICATIONS

In the period of more than forty-five years since I did this work, it apparently has remained customary for statistical studies to favor the “maximum likelihood” approach, and many biologists and sociologists, for example, do not know any other. This leads to many absurdities and even economic waste.

For example, consider the claim of Dr. Bob-a-loo, the shaman, who says his chants can bring rain in the desert–and suppose that, in a test observation, rain does indeed follow the chant, although the U.S. weather service assessed a very small chance of rain. How are we to judge the probability that the medicine man can influence the weather?

Using the usual maximum likelihood approach, it might be calculated, say, that the observed result had less than a 1% chance of occurrence on a random basis. Hence, a statistician might say, the claim is “validated, at the 0.99 significance level”–even though common sense tells us this is balderdash.

Common sense is saved–and your intuition vindicated–by throwing out the Neyman-Pearson approach with its “Type I and Type II errors” and appealing to Bayes, even though the Bayes formula requires an estimate of the a priori probability that the shaman is a rain-maker. It’s true that this probability is not accurately known, but we do know that it is very small–certainly much smaller than 0.01. (We know this because, in all the history of the world, despite many claims, there is no proven case of chanting or dancing affecting the weather.) Using some appropriately small number for the a priori probability–perhaps the number 0.000000000001–the a posteriori probability, even after the one positive observation, turns out very small, and we dismiss the claim. (Actually, we wouldn’t even have bothered watching the demonstration.) … It is true, of course, that a long succession of successes would make us take the shaman seriously, but one or two or a few will not.
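For readers who want to see the arithmetic, here is a minimal sketch of the Bayes calculation in Python. The a priori value is the one suggested in the text; the two likelihoods (a genuine rain-maker always succeeds, and rain follows a chant by chance 1% of the time) are assumptions made only for illustration, and the function name is my own.

```python
def posterior(prior, p_obs_if_true, p_obs_if_false, successes=1):
    """Bayes' formula for a claim after `successes` independent positive observations."""
    like_true = p_obs_if_true ** successes
    like_false = p_obs_if_false ** successes
    numerator = prior * like_true
    return numerator / (numerator + (1.0 - prior) * like_false)

PRIOR = 1e-12             # a priori probability that the shaman is a rain-maker (from the text)
P_RAIN_IF_REAL = 1.0      # assumption: a genuine rain-maker always succeeds
P_RAIN_BY_CHANCE = 0.01   # assumption: chance of rain on a random basis

after_one = posterior(PRIOR, P_RAIN_IF_REAL, P_RAIN_BY_CHANCE, successes=1)
after_ten = posterior(PRIOR, P_RAIN_IF_REAL, P_RAIN_BY_CHANCE, successes=10)
print(f"after 1 success:    {after_one:.3e}")
print(f"after 10 successes: {after_ten:.8f}")
```

Under these assumptions each success multiplies the odds by only a factor of one hundred, so a single observation leaves the a posteriori probability tiny, while a long unbroken run eventually overwhelms even a very small a priori value.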

Something similar happens in studies of “extra-sensory perception,” ESP. If a “success” occurs that would have been unlikely on a random basis, the ESP researcher may assert that the “faculty” (of telepathy, psychokinesis, etc.) is established, at a “significance level” of 0.99, or 0.999, or 0.9999, or whatever.

Again, the approach was inappropriate; a Bayes approach should have been used, with some extremely small number for the a priori probability that the faculty exists. (Again, in the whole history of the world, despite innumerable citations, there is not a single proven instance of any such faculty, as far as I have been able to determine.)

Actually, those who use the Neyman-Pearson or similar “significance level” methods are only kidding themselves, because they are left with the problem of deciding the appropriate level–and this decision, if correctly made, is equivalent to making an estimate of the a priori probability. One might as well do it explicitly, and get the advantage of the Bayes method.

In the world of commerce, it still appears common to use the Neyman-Pearson approach, which I have proven to be inferior. In quality testing, for example, a simple-minded “significance level” test after sampling might show quality to be too low. But if the manufacturer has a good reputation, and there are other favorable circumstances such as details of production design, a favorable a priori estimate might yield a Bayes calculation showing satisfactory quality.

People tend to be afraid of a priori estimates when they are only rough guesses. But a rough guess, properly used, is much more useful than an exact datum inappropriately applied. (Think again about the football example: Bettor B’s number was the most precise, but Bettor C’s was by far the most useful.)

A Suggested Estimate for the Exponential Life Parameter

In 1953 I applied these ideas to a practical technical and commercial problem, that of estimating the mean life of vacuum tubes by sampling in ordered observations. While vacuum tubes have mostly gone down the tubes, the “exponential life parameter” still has its uses.

The following little problem is not meant for the average reader, but is intended to abash partly-educated biologists. (The average reader is invited back to the next section, however.)

The expressions that follow will look a bit unfamiliar and clumsy to mathematicians. The main reason for using this unconventional notation is to reduce downloading time. Any reader who wants to take the trouble can, of course, convert to conventional notation by using the following equivalences:

h = theta; Int [f] = integral of f from 0 to infinity; sigma 1, r [xi] = sum from 1 to r of xi. So:

If the underlying distribution is

(1) f(x; h) = (1/h) exp (-x/h) (x > 0, h > 0)

then the maximum likelihood estimate for h based on the first r out of n observations is [8]

(2) HF = N/r, where N = sigma 1, r [xi] + (n – r) xr

Although this estimate is unbiased, minimum variance, efficient, and sufficient, there is an important sense in which it is not the best possible estimator.

The maximum likelihood estimate is that estimate which maximizes the probability density for (h; x1,…,xr) on the assumption that all values of h are equally likely a priori. But this assumption is very unrealistic; e.g. in tube testing it is certainly reasonable to assume that a moderate value for the mean life is more likely than a very large or a very small value, and the estimate should be weighted accordingly.

I shall therefore suggest a plausible a priori distribution, and a new estimate based thereon, indicate some of its properties, and show in what way it is “better” than the maximum likelihood estimate.

The following assumed a priori distribution for h is certainly a better approximation to the “true” a priori distribution than is the uniform; it also has a plausible look and contains room for adjustment depending on experience in the industry, and it is simple to handle:

(3) g(h) = [a/h^b] exp[-c/h] (a > 0, c > 0, b > 1)

where b and c are location and scale parameters respectively and a is the normalization factor.

Maximizing the probability density

(4) f1 (h; x1,…,xr) = {f2 (x1,…,xr; h) g(h)}/{Int [f2 (x1,…,xr; h) g(h) dh]}

gives the new estimate

(5) HB = {sigma 1,r [xi] + (n – r) xr + c}/(r + b) = (HF + c/r) [r/(r + b)]

with a p.d.f. which can be shown to be

(6) h (HB) = {[r + b]/[r (r – 1)!]} (r/h)^r [HB(r + b)/r – c/r]^(r – 1) exp {-(r/h) [HB(r + b)/r – c/r]}

The mean of this estimate turns out to be

(7) E(HB) = [r/(r + b)] [h + c/r]

so that while HB is not unbiased in general for small samples, it becomes unbiased for large samples; we see also that when h = c/b, HB is unbiased for samples of any size, and h = c/b maximizes g(h).
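For readers who want to see where (5) comes from, the following is a short derivation in conventional notation, keeping h for the mean life. It assumes the standard joint density of the first r ordered observations out of n drawn from (1); the symbol S is shorthand introduced here for the total observed life and is not part of the original notation.

```latex
% S = \sum_{i=1}^{r} x_i + (n-r)\,x_r   (total observed life; shorthand introduced here)
\begin{align*}
f_2(x_1,\dots,x_r;h) &= \frac{n!}{(n-r)!}\,h^{-r}\exp(-S/h), \\
f_2(x_1,\dots,x_r;h)\,g(h) &\propto h^{-(r+b)}\exp\!\big(-(S+c)/h\big), \\
\frac{d}{dh}\ln\big[f_2\,g(h)\big] &= -\frac{r+b}{h}+\frac{S+c}{h^{2}} = 0
\;\Longrightarrow\; H_B=\frac{S+c}{r+b}.
\end{align*}
```

The denominator of (4) does not depend on h, so it plays no part in the maximization. Setting b = c = 0, which corresponds to treating all values of h as equally likely, gives back S/r, the maximum likelihood estimate.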

The variance of HB is easily shown to be

(8) Var HB = [r^2/(r + b)^2] [h^2/r] = [r^2/(r + b)^2] Var HF

which is smaller than the variance of HF!
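The mean (7) and variance (8) can be checked by simulation. What follows is a rough Monte Carlo sketch in Python, not part of the original work: the values of n, r, h, b and c are hypothetical, chosen only so that the prior mode c/b sits near, but not at, the true mean life.

```python
import random
import statistics

def estimates(sample_size, r, h, b, c, rng):
    """Return (HF, HB) from the first r ordered lifetimes out of sample_size."""
    lifetimes = sorted(rng.expovariate(1.0 / h) for _ in range(sample_size))
    observed = lifetimes[:r]
    # Total observed life: r failures plus (n - r) units still running at the r-th failure.
    total_life = sum(observed) + (sample_size - r) * observed[-1]
    hf = total_life / r                # maximum likelihood estimate, eq. (2)
    hb = (total_life + c) / (r + b)    # prior-weighted estimate, eq. (5)
    return hf, hb

rng = random.Random(1953)              # fixed seed so the run is repeatable
n, r = 20, 5
h_true, b, c = 1000.0, 3.0, 2400.0     # hypothetical values; prior mode c/b = 800

pairs = [estimates(n, r, h_true, b, c, rng) for _ in range(20000)]
hf_values = [p[0] for p in pairs]
hb_values = [p[1] for p in pairs]

mean_hb = statistics.fmean(hb_values)
var_hf = statistics.pvariance(hf_values)
var_hb = statistics.pvariance(hb_values)

print("mean HB simulated:", round(mean_hb, 1), " eq. (7):", r / (r + b) * (h_true + c / r))
print("var HF simulated: ", round(var_hf), " h^2/r:", h_true ** 2 / r)
print("var HB / var HF:  ", round(var_hb / var_hf, 4), " eq. (8):", (r / (r + b)) ** 2)
```

Because HB is a linear function of HF, the variance ratio matches (8) essentially exactly in any run; the simulated mean and the variance of HF agree with the formulas only to within sampling error.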

The important feature of HB is that it does not throw away the information that moderate values of h are more likely a priori than extreme values.

Let us review once more what the above means:

If tubes are produced in batches, and mean life varies from batch to batch, and a steady buyer wants to estimate as well as possible on the average, then the Modernized Bayes approach is indicated; a less satisfactory substitute would be a confidence interval approach using maximum likelihood estimation.

If however some important decision, such as the letting of a contract or installation of military equipment, rests (for some reason) on a single sampling, then the situation is essentially different. Here the required probability must apply not to over-all repeated sampling, but just to that sub-sequence of samples yielding exactly those observations actually obtained! Hence the Classical Bayes approach is required, and any other is a poor substitute. In such a case one must not be deterred by the fact that the a priori distribution of h is known only roughly; one must not throw away important information merely because that information is less exact than one might wish!

Aspects of Probability Concerning Cryonics

We have said that it is possible to calculate a probability (indeed, usually many probabilities, corresponding to different states of knowledge) for any event of whatever kind. But we have also pointed out that observed probabilities (those taken directly from the most appropriate experience) always have some uncertainty–sometimes so much that the number has little value; and subsidiary or derived probabilities–those calculated from the observed probabilities by the rules of combination–have even more uncertainty. Sometimes the end result is simply p = 0.5 ± 0.5, which is the equivalent of a shrug.
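A small sketch may make the growth of uncertainty concrete. It assumes, purely for illustration, three independent events whose observed probabilities are each known only as 0.8 ± 0.1, and combines them by the product rule; the figures and the function name are hypothetical.

```python
def combine_all(intervals):
    """Interval for the probability that every one of several independent events occurs."""
    low, high = 1.0, 1.0
    for lo_i, hi_i in intervals:
        low *= lo_i
        high *= hi_i
    return low, high

observed = [(0.7, 0.9)] * 3   # hypothetical: three observed probabilities, each 0.8 +/- 0.1
low, high = combine_all(observed)
midpoint = (low + high) / 2
half_width = (high - low) / 2
print(f"derived probability: {midpoint:.3f} +/- {half_width:.3f}")
print(f"relative uncertainty: observed {0.1 / 0.8:.1%}, derived {half_width / midpoint:.1%}")
```

The derived interval is wider, relative to its midpoint, than any of the observed intervals that went into it, and each further step of combination widens it again.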

But it must be emphasized, first, that an unknown probability (one which has not yet been investigated), or a highly uncertain one, is not the same as a small probability. Saying the odds are unknown is NOT the same as saying the odds are adverse.

But we are not admitting even that the odds are unknown. Stick around.

We are talking, of course, about the chance of eventual rescue of patients frozen today, or in the near future, by relatively crude methods.

There are several aspects to the estimate. First, we disregard the chance of global nuclear war or other world-wide catastrophe, which might destroy both the dead and the living: in most respects we do not plan our individual lives with this in mind, after all. (But the Cryonics Institute, for example, does make plans intended to minimize risk in limited nuclear war.)

Second, we disregard the business risks of any particular organization (although, again, the Cryonics Institute for example strives for extreme prudence, with never any debt.) In almost any industry, some will dive and some will thrive.

Third, we disregard the risks of politics, assuming that in the U.S.A. at least there will persist a sufficient degree of personal freedom to allow our movement to exist.

In all three of the aforesaid risk areas, it should be again noted, we are dealing not with fixed probabilities, but with conditions in flux and subject to feedbacks: we are dealing with MOTIVATED BEHAVIOR, and in this area no one has yet developed adequate statistical tools. In other words, when things start to look bad, usually somebody does something about it, and pretty soon things look better again.

The remaining area of interest is the scientific chance of rescue–the chance of development of skills sufficient to reverse freezing damage and senile debility as well as other deficits in the frozen patient, allowing restoration to youthful good health with memory and personality largely intact.

To persuade a lay reader, gradually, that (s)he has or can acquire the competence to assess questions of technical feasibility–this is not easy, but I think not impossible. Remember the Commander-in-Chief and the CEOs. It may help to remind ourselves of the astonishing feats we accomplish every day.

Intuition and Probability

While intuition is certainly fallible, it is also educable, and in many areas all of us routinely rely on it for life-and-death estimates.

Think about crossing the street. Small children, and dogs and cats, may not think about it, or forget about it, and die. But most of us have learned to gauge our chances in traffic very quickly and very efficiently–and we don’t need the slightest knowledge of mathematics in any explicit or conscious sense.

In a sense, gauging traffic is an awesome calculation. We have to estimate the speeds and trajectories sometimes of several vehicles, as well as our own speed and agility: a computer directing anti-tank fire and evasive maneuvers in a similar situation has only been available in recent years–and in some respects the human brain (even in unconscious behavior!) is still far ahead of the computer’s capabilities.

Some types of mental analog computation involve much more abstract thought than the computation of traffic patterns. Think for a moment about the prediction by Goddard and Tsiolkovsky, in the early part of the twentieth century, that there would be moon rockets. Why did these rocket experts know there would be moon rockets, while most other experts–even at a much later date–doubted or denied it?

The answer is that they looked at the central issue and at the sweep of history, ignoring some of the troublesome details, including the expense.

It is high-school simple (once a Goddard has shown you how) to prove mathematically that a rocket can reach the moon: all you need is enough explosive and a way of controlled release. But in the early part of the century many ingredients of practical importance were missing–high quality refractories and electronic instrumentation, for example, as well as the economic resources. (It took a national effort by the world’s wealthiest country to accomplish it.) But anyone with a sense of history had to know that these details were only a matter of time and determination. There was always a question about the political feasibility of a moon rocket; there was never a question of its scientific or engineering feasibility.

In assuming that the petty details would in time be worked out, Goddard and Tsiolkovsky relied on intuition; they did not make explicit calculations of probability. But their view of history entailed an implicit calculation which–however rough–was relevant. They had a better grip on reality than the fusspot experts whose aberrant intuition focused on little, temporary obstacles of no long-term importance.

For another example of educated intuition, consider Leonardo da Vinci, whose “inventions” included the airplane, in a manner of speaking.

Leonardo did not really design a working flying machine, of course. He only dabbled in concepts, and in any case the materials and technologies were not available to implement them. But he saw to the center of the problem. Could a flying machine be built? Of course, since flying machines (birds for example) already existed; the air obviously provides support to a suitably designed frame, and the rest is detail. Given time, Leonardo would have worked out the details too, including innovations in materials and power supply.

Most experts’ failure of intuition occurs because they are unconsciously tied to a short time frame: if we can’t do it soon (in the framework of a professional career), we can’t do it. But frozen patients have a longer time frame; they can wait.

Unlimited Wealth–Not “Probable” but Certain

Many people admit the goals of cryonics are achievable in principle, but they put a low estimate on the probability or feasibility–often on the basis of economics.

John W. Campbell, late editor of Astounding science fiction magazine, said he could not conceive of repair after rupture by freezing of every single cell in the body. (This does not occur, but his ignorance is not the point here.) He didn’t think cell repair physically impossible; he just considered the repair job too monumentally difficult, i.e., expensive. But the fact is that (in the indefinite future) expense is no object. Fact–not conjecture.

Eric Drexler, Conrad Schneiker, and many others have given rather detailed arguments to show that automated repair mechanisms will become available at some date–machines that are self-replicating, self-improving, and “intelligent” to any necessary degree. This implies not only the physical capacity for microscopic repair, but also the economic capacity, since such machines will represent unlimited wealth. (Raw materials abound; it is organization of matter and energy that is critical.)

But unlimited wealth does not depend on the invention of any particular new devices. Its basis already exists! Specifically, the exponential growth of wealth is already with us; some people will recognize it under the name of compound interest.

Obscuring the fact, unfortunately, is the fog of destructive tendencies in human society, which can hide or even destroy any amount of productivity. Wars, heedless breeding, and assorted insanities have kept the standard of living orders of magnitude below what technology has made possible. But the exponential growth machine already exists. (The “peace dividend” has contributed much to U.S. prosperity in recent years.)

(The existence of the money machine is obscured not only by outright ignorance and recklessness, but also by milder forms of imprudence–in capitalist societies, for example, by mindless frivolity. There is staggering waste in the constant churn of fashion and proliferation of fads and choices: do we really need a new model auto every year, let alone dozens of them? But this will have to be cured by education and maturation; the socialist planned economy is worse.)

We NOW have the technology and natural resources (even on a world-wide basis) to give everyone a decent living (with family planning), and a good deal left over for investment. Much of this investment should take the form of research and development. This alone would GUARANTEE an exponential upward spiral, wealth growing without bound. Again, the underlying assumption is that political disasters will not ruin everything; we are concerned with the scientific probability of success. In the U.S., real income per capita does grow every year, albeit unevenly. The compound interest djinn exists even without robots. Any amount of money will eventually be available for the repair, revival, rejuvenation, and rehabilitation of the frozen patients.

This money growth will take two forms: first, the dollar growth in the trust funds or organizational funds of the patients through compound interest over the years (assuming that organizations such as the Cryonics Institute are successful in keeping the ravages of occasional inflation at bay); second, and more important, the exponential growth of productivity will make everything cheaper relative to income. (How many people could afford an air conditioner in 1936?)
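The arithmetic of both forms of growth is simple enough to sketch in a few lines of Python. Every figure below (the size of the fund, the cost of the repair task, a 3% real return, 2% productivity growth) is a hypothetical placeholder chosen only to show how the calculation runs, not an estimate of actual costs or returns.

```python
import math

def years_to_reach(principal, target, annual_rate):
    """Years of compounding needed for principal to grow to target."""
    return math.log(target / principal) / math.log(1.0 + annual_rate)

def relative_cost(years, productivity_growth):
    """Cost of a fixed task relative to income after `years` of productivity growth."""
    return (1.0 + productivity_growth) ** (-years)

# Hypothetical figures, chosen only to illustrate the arithmetic.
fund = 30_000.0
repair_cost = 30_000_000.0      # a task costing 1000 times the fund today
real_return = 0.03
productivity_growth = 0.02

wait_years = years_to_reach(fund, repair_cost, real_return)
cost_after_200 = relative_cost(200, productivity_growth)
print(f"years for the fund to grow 1000-fold at 3%: {wait_years:.0f}")
print(f"relative cost of the task after 200 years at 2%: {cost_after_200:.4f}")
```

The two effects compound each other: the fund grows while the task becomes cheaper relative to income, so the waiting time is shorter than either calculation alone would suggest.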

This is without counting on self-replicating machines. With such machines, the curve of exponential growth becomes nearly vertical.

Summing Up

“Probability” Derives from “Experience”

The main thrust of our argument has been that, in attempting to estimate any probability, one must abstract or summarize as much as possible of the most relevant experience. This is scientific, not the myopic obsession with precise but irrelevant data one sees so often.

For a summary of experience bearing on cryonics, please review the six pages following (reprinted in many issues of The Immortalist). (On the Web site, go to Contents, then Principles of Experience.)

We can recapitulate in slightly different terms:

1. In the modern era, not a single goal of science, so far as I know, has been shown impossible (although some have proven more difficult than expected, and others become irrelevant). Odds-on for success. See below for clarification and expansion of this.

2. Many cells survive even uncontrolled freezing, and there have been partial successes with freezing mammalian brains: odds-on that, even in freeze-damaged brains, injury is limited and reparable.

3. The Precedent Principle, the Feinberg Principle, and the prospect of nanotechnology assure us that atom-by-atom manipulation of tissue (frozen or not) will allow construction or reconstruction, in finest detail, of any human configuration known, designable, or capable of inference. (For a discussion of prospective molecular repair technology, see the Drexler citation on the last page.) Odds-on that you can be restored.

In still other words, from the whole sweep of history, and our best understanding of the way the world works, we conclude the “gamble” of cryonics is odds-on in your favor: the probability of success (from the standpoint of technical feasibility) is much closer to one (certainty) than to zero. The number may be imprecise, but it is the best and most scientific estimate available.

Congratulations! You have just won a ticket to forever, transfers included.

P.S. If “forever” is a little too long, focus instead on just the next century or two, then reconsider.

ADDENDUM

Goals of Technology–The Record

This is added to clarify and expand the application of probability to the problem of repair of cryoinjured patients.

To estimate a probability, we need a recorded sequence of experiments “similar” to the one at hand. By broadening our definition of “similarity” as much as necessary, we can always do this. The broader the criteria of similarity, the less precise the estimate will be, to be sure–but even very broad criteria can still yield useful information. (See the football example given earlier.)

Now we want to estimate the probability of success in repairing and reviving frozen cryonics patients. Nothing very similar has ever been attempted, so we loosen criteria until we get to “attempts to achieve difficult new technology.” There have been many such, especially in the last few centuries. Can we compile statistics on how many have succeeded and how many have failed and how many are still open?

My own sense of history tells me this explicit calculation is unnecessary, but we can at least look at a few indicators.

First, let’s try to think of attempts which have failed, using reasonable definitions. An “attempt” means a serious effort of competent people in their area of competence. “Attempt” is also defined with reference to ends, not means. “Failed” means abandoned by all serious people.

One place to start might be the patent office record of patents that were refused because the examiners were convinced the gadget didn’t work or could never work.

Gadgets that could never work? A prime and recurrent example might be a “perpetual motion machine”–something that violates the first or second law of thermodynamics, yielding “free” energy. But gadgets like this are NOT submitted by serious, competent people; they don’t count.

Gadgets that don’t work as submitted? Ornithopters might be an example–machines that fly by flapping their wings. The ones actually built either didn’t work at all, or else worked only for a very short time and then crashed. But these don’t count either, for two reasons.

First, a practical flapping-wing machine may yet be built. Birds and bats and insects and (formerly) pterosaurs fly by flapping their wings; and with future materials and power plants and stabilizing systems, larger machines may also.

Second, we must look at the ends and not just the means. Before there were any heavier-than-air flying machines, there were several possibilities. These included ornithopters, planes with airscrews, jets, and rockets. All were once considered impossible or forever impractical, but the last three have been realized. The end–flying–has been achieved. Only one of the means–ornithopters–has not been successful, and that one still may be one day.

Hold on–what about Lysenko? The Russian biologist–who claimed acquired traits could be inherited–was regarded as “serious” only in the Soviet Union, but that was a large constituency. Shouldn’t that count as a failed goal of technology? Well, aside from the “iffy” characterization of him as serious, at worst only the means failed, not the ends. The ends were to breed better plants, and that is being done apace–even if Lysenko’s methods have been discarded and discredited.

Maybe I haven’t tried hard enough, but I have not thought of a single example of a serious goal of technology that has failed and remains without serious advocates.

On the other side, what about goals of technology that were once thought by most people to be impossible or forever impractical, but that in fact were achieved? –or that were previously not even IMAGINED, and yet were achieved? They are many and notorious, including:

Abrasives. Air conditioning and two-way heat pump. Air to air missiles. Algebras. Alloys. Alphabet. Anaesthesia. Analgesics. Analog computers. Anatomies. Antibiotics. Antidepressants. Armies. Artificial insemination. Asepsis in surgery. Assembly line. Auger. Automated factories. Automatic gene sequencing. Automobiles. Adze. Axe.

Babies from frozen embryos. Baking soda, baking powder. Ball bearings. Banking. Bellows. Blacksmithing. Blood and urine analysis. Books. Bow and arrow. Brazing. Bricks. Bridles & saddles. Bubble level. Butter. Buttons.

Calendar. Camera. Canals. Catalysts. Cathode ray tube. Central heating. Centrifuge. Chain. Cheeses. Chess program that can defeat a chess champion. Chunnel. Cities. Cloning. Clothing. Coders and decoders. Codes of law. Cogwheels. Comb. Compass for drawing. Condoms. Contact lenses. Coolidge tube. Corporations. Cosmetics. Cotton gin. Counting. Cryogenics. Cryopreservation of blood and other tissue. Cryosurgery. CT and NMR scanners.

Dental amalgams. Dental floss. Dentist’s drill. Deodorants. Depilatories. Desalinization plants. Dietetics. Digital computers. Digital disk recorder. Directories. Dirigibles. Distillation. Drawing. Drugs. Dynamite.

Earth satellites. Electric generator. Electric lights. Electric motor. Electric shaver. Electrophoresis. Elevators. Endoscopy. Epoxy. Eyeglasses.

Facial tissues. Fake fat. Fake sugar. False teeth. Farming. Fax. Fertilizers. Fiberoptics. File (paper). File (tool). Fingerprints. Fire making. Fishhook. Fletching. Flint chipping. Flying machine. Flypaper. Forceps. Fork lift. Bulldozer. Forks. Freeze drying. Furnace.

Genetic engineering. Genome mapping. Geometries. Glues & adhesives. Goldsmithing. Growth factors. Gyroscopes.

Hair care. Hammer. Horse shoes. Host mothers. Hotels.

Insecticides. Insulin. Intercontinental missiles. Interferons. Internal combustion engine. Internet databases. In-vitro fertilization.

Jet propulsion.

Kevlar. Knife. Knitting. Knots.

Laminates. Language. Laparoscopy. Lathe. Lawnmowers. Laxatives. Letters of credit. Logics. Lubricants. Lying.

Mace. Magnetic tape recorder. Magnets. Manuals of operation. Maps. Masers & lasers & holograms. Metallurgy. Microscope. Microtome. Mining. Mirror. Money. Monoclonal antibodies. Moon rocket. Movies, including animation, special effects, and 3-D. Musical instruments.

Nails. Nanotechnology, just beginning. Nations. Navigation and location by satellite. Navigation systems. Needle & thread. Newspapers. Nuclear energy. Nuclear fusion, still in the works.

Oars & paddles. OCR. Offshore drilling rigs. Organ transplants. Oscilloscope.

Paint. Painting. Paper. Paving machines. Paving. Pendulum. Phonograph. Piping & tubing. Planing tool. Planting machines. Plastics. Pliers. Pneumatics. Pockets. Poetry. Polygraph. Potter’s wheel. Pressure pump. Prostheses for limbs. Protein sequencing. Pulley.

Quantum computers, just beginning. Quantum chemistry.

Radar. Radiation therapy and chemotherapy. Radio telescopes. Radio. Railroads. Refractories. Rivets. Roads. Roller bearings. Rope. Ruler.

Safety razor. Sails. Sandpaper. Saw. Scanning tunneling microscope, atomic force microscope, etc. Schools. Scintillation counters. Scissors. Screw machine. Screws. Sculpture. Scythe. Seismic prospecting. Semaphore. Sewers. Sextant. Shoes. Sickle. Skin grafts. Skyscrapers. Sledges. Smart bombs. Smelting. Soap. Soldering. Solvents. Sonar. Sonograms. Soup. Spectroscope. Spinning and weaving machines. Spoons. Stapling machines. Steam power. Stove. Stylus. Submarines. Surveying. Sutures. Sword. Synthetic hormones. Syringe.

Tables of organization. Tattoos. Telegraph. Telephone. Telescope. Television. Tempering steel. Thermometer. Tilling machines. Tissue typing. Toilet. Toilet paper. Tongs. Tooth brush. Tooth paste. Transatlantic cable. Transgenesis. Transistors. Transit. Traps & snares. Travois. Triodes. Turbines. Typewriters.

Use of radioisotopes.

Vaccines. Vacuum pump. Virtual reality. Vise. Vitamins.

Wallpaper. Watches and clocks. Water pump. Welding. Well-digging. Wheel. Wrenches. Writing.

X-ray diffraction techniques. X-rays. Xerography.

Yurt making. Yo-Yo.

Ziploc. Zipper. Zeppelin.

Sure, some of the above are arguably minor or redundant. Still, I think they were all reasonably important in their time, and not obvious or easy ahead of time. But a great many have doubtless been omitted. The important thing is that there have been very many successful projects of technology, and very few or none that have failed.
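One hedged way to turn such a record into a number is Laplace's rule of succession, which is Bayes' formula with a uniform a priori distribution on the unknown success rate. This is a technique swapped in here for illustration, not a calculation made in the original, and the tallies below are hypothetical placeholders rather than a count of the list above.

```python
from fractions import Fraction

def rule_of_succession(successes, failures):
    """Posterior mean success probability under a uniform prior (Laplace's rule)."""
    return Fraction(successes + 1, successes + failures + 2)

# Hypothetical tallies of "serious goals of technology"; not a count of the list above.
for successes, failures in [(10, 0), (100, 0), (100, 5)]:
    estimate = rule_of_succession(successes, failures)
    print(f"{successes} successes, {failures} failures -> {float(estimate):.3f}")
```

Even with a handful of failures admitted for the sake of argument, a long record of successes keeps the estimate much closer to one than to zero; how broad the criteria of similarity should be remains a matter of judgment that no formula settles.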

If you can think of a failure that qualifies according to our criteria, please let us know.

REFERENCES

1. v. Mises, Richard: Notes on the Mathematical Theory of Probability and Statistics, Harvard U. Press, 1946.

2. v. Mises, Richard: “On the Foundations of Probability and Statistics,” Annals of Math. Stat. 12 (1941), p. 191.

3. Cramér, Harald: Mathematical Methods of Statistics, Princeton U. Press, 1951.

4. Koopman, B.O.: “The Axioms and Algebra of Intuitive Probability,” Annals of Mathematics 41 (1940), p. 269.

5. Kendall, M.G.: “On the Reconciliation of Theories of Probability,” Biometrika 36 (1949), p.101.

6. Bayes, Thomas: “An Essay toward Solving a Problem in the Doctrine of Chances,” Philosophical Transactions, 1764.

7. Jeffreys, H.: Theory of Probability, Clarendon Press, Oxford, 1939.

8. Epstein, Benjamin and Sobel, Milton: “Some Tests Based on the First r Ordered Observations Drawn from an Exponential Distribution,” Wayne State U. Technical Report No. 1, 1952.

9. Mood, Alexander McF.: Introduction to the Theory of Statistics, McGraw-Hill, 1950.

10. Neyman, Jerzy: Lectures & Conferences on Mathematical Statistics and Probability, U.S. Dept. of Agriculture, 1952.

11. Doob, J.L.: “Probability as Measure,” Ann. Math. Stat. 12 (1941).