Applications in science, medicine, and operations research
We may assess or interpret probabilities in different ways according to the context. But, as David Hand wrote in his Statistics: A Very Short Introduction, ‘... the calculus is the same’, i.e. how probabilities are manipulated does not change.
Keep in mind the central ideas of the subject: the Addition and Multiplication Laws; independence; the Laws of Large Numbers linking frequencies to objective probabilities; Gaussian distributions when summing random quantities; other frequently arising distributions; means and variances as useful summaries.
We may not expect our knowledge of the relevant probabilities to have the precision available for the examples in the previous chapter, but an approximate answer to the right question can be a reliable guide to good decisions. As statistician George Box said: ‘All models are wrong, but some are useful.’
The next two chapters illustrate applications, loosely grouped under the chapter titles.
Brownian motion, and random walks
In 1827, the botanist Robert Brown observed that pollen particles suspended in liquid move around, apparently at random. Nearly eighty years later, Albert Einstein gave an explanation: the particles were constantly being buffeted by the molecules in the liquid. This movement is, of course, in three dimensions, but to build a satisfactory model, we first consider movement just along a straight line.
Suppose that each step is a jump of some fixed distance, sometimes left and sometimes right, independently each time. This notion is termed a random walk. The position after many jumps depends only on the difference between the numbers of jumps in each direction; the mean and variance of the distance from the start point are proportional to the number of jumps made.
Make a delicate computation: over a fixed time period, increase the frequency of the jumps, and decrease the distance jumped. With the correct balance between these two factors, the limit becomes continuous motion, the random distance moved having (by the Central Limit Theorem) a Gaussian distribution whose mean and variance are both proportional to the length of the time period. If movements left or right are equally likely, the mean will be zero.
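To make the scaling concrete, here is a minimal sketch in Python; the jump counts, step sizes, and sample sizes are illustrative choices of mine, not anything prescribed above. With n jumps of size 1/√n in a fixed time period, the displacement keeps variance one and settles towards a Gaussian shape.

```python
import random
import statistics

def random_walk_displacement(n_jumps: int, step: float) -> float:
    """Sum of n_jumps independent moves of +/- step, each direction equally likely."""
    return sum(random.choice((-step, step)) for _ in range(n_jumps))

# Over a fixed time period, make the jumps more frequent but smaller.
# With step = 1/sqrt(n), the variance of the final displacement stays at 1,
# and by the Central Limit Theorem its distribution approaches a Gaussian.
for n in (10, 100, 1_000):
    samples = [random_walk_displacement(n, n ** -0.5) for _ in range(5_000)]
    print(f"{n:5d} jumps: mean ≈ {statistics.mean(samples):+.3f}, "
          f"variance ≈ {statistics.variance(samples):.3f}")
```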
Einstein’s explanation for Brown’s observations is that particles move in three dimensions, movement in each dimension following a Gaussian law for the reasons given. He made predictions about how atoms and molecules behave, provoking experiments that removed any lingering doubts about their existence.
The term ‘Brownian motion’ ought to be reserved for the actual movement of particles in a liquid, but it is also often used for this mathematical model of that movement.
Random numbers
The phrase ‘random number’ refers to one of two ideas. First, as in ideal games with dice or roulette wheels, one number from a finite list is chosen, all of them being equally likely. Second, as in the notion of snapping a stick at a random point, some point in a continuous interval is chosen, no part of that interval being favoured over another. The facility to choose long sequences of such random numbers, each value being independent of all the others, has many applications, as the next section will illustrate.
In 1955, a splendid book One Million Random Digits was published. It follows its title exactly: page after page of the digits zero to nine, grouped in blocks for ease of reading, but successive digits are entirely unpredictable – whatever the recent sequence of digits, you have one chance in ten of guessing the next one. Today, modern computers have built-in software to achieve the same ends. An initial value (the seed) is fed in, and a fixed mathematical formula produces the next value, which acts as a new seed, and so on. There is nothing random about this process at all, and if the same initial seed is used, the same sequence is generated. But, with a cunning choice of this mathematical formula, the sequence generated passes a battery of statistical tests and looks, to all intents and purposes, as though it were random. The term pseudo-random sequence is used.
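As a toy illustration of the seed-and-formula idea, here is a sketch of one classic recipe, a linear congruential generator with the well-known Lehmer (‘minimal standard’) constants; real systems use more elaborate formulas, but the principle, and the repeatability from a fixed seed, are the same.

```python
def lcg(seed: int):
    """A toy pseudo-random generator: each output becomes the seed for the next.

    The constants are the classic Lehmer ('minimal standard') choices; real
    systems use more elaborate formulas, but the principle is identical.
    """
    modulus, multiplier = 2**31 - 1, 16807
    state = seed
    while True:
        state = (multiplier * state) % modulus
        yield state / modulus            # a value between 0 and 1

gen = lcg(seed=12345)
print([round(next(gen), 4) for _ in range(5)])
# Feeding in the same seed again reproduces exactly the same sequence.
```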
No matter how much care is taken in this process, there will always be some lingering fear that hidden flaws in the method used will matter in the use to which the numbers are put. With that caveat, and relying on the experience of a large number of respected scientists, I am prepared to act as though my computer produces acceptable sequences of random numbers on demand. (The obvious danger of insider fraud means that these methods have no place in choosing numbers in Lotteries, or in UK Premium Bonds.)
Monte Carlo methods
How many different numbers will appear on 37 consecutive spins of a standard European roulette wheel? In theory, it could be anything between one and 37, but those extremes would occur very rarely; what is the most likely number of different numbers?
When this problem was first put to me, I did not immediately see an easy way to solve it. There are 37³⁷ (a number with 59 decimal digits) possible outcomes of spinning the wheel 37 times, and when you try to write down all the ways in which, say, 28 different numbers could arise, you quickly lose enthusiasm. A more appealing approach was to perform a so-called Monte Carlo simulation.
Here, the computer’s stream of random numbers was used to simulate the outcomes of 37 spins of a wheel, after which the computer counted how many different numbers had arisen. This process was repeated one million times, leading to 24 different numbers on 203,739 occasions, while 23 arose just 199,262 times.
The nearest rivals, 22 or 25 numbers, each happened fewer than 160,000 times. The Law of Large Numbers says that the frequencies of the different outcomes will settle down to their respective probabilities, and these figures essentially settled the matter: the most likely result is that 24 different numbers will arise, and the chance of this is just over 20%.
Days later, I kicked myself for not spotting a standard way to solve the problem! I could calculate the exact probability of getting X different numbers in 37 spins, for any value of X, confirming the conclusion described above. But this does not invalidate the use of simulation to attack this sort of problem – quick and dirty answers can be useful. Indeed, the fact that the simulation gave answers consistent with the exact calculation boosted my general faith that the computer’s random number generator was behaving as intended.
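A sketch of such a simulation takes only a few lines; the version below uses fewer trials than the million quoted above, so its percentages will wobble a little more.

```python
import random
from collections import Counter

def distinct_numbers(spins: int = 37, pockets: int = 37) -> int:
    """Spin a 37-pocket wheel `spins` times and count the distinct results."""
    return len({random.randrange(pockets) for _ in range(spins)})

trials = 100_000      # the text used one million; fewer trials just means more wobble
counts = Counter(distinct_numbers() for _ in range(trials))
for k in sorted(counts):
    print(f"{k:2d} distinct numbers: {counts[k] / trials:6.2%}")
```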
A more serious use of Monte Carlo methods occurs in polymer chemistry. A molecule consists of a large number of atoms, connected along a randomly twisting chain. Atoms can occur only at places on an evenly spaced lattice and, crucially, no two atoms can be in the same place. How far is it likely to be from one end of the molecule to the other?
We can think of the atoms as being at the places visited by some drunkard, staggering around at random on a three-dimensional lattice for a while, but somehow never visiting the same place twice. Without the requirement that no place be revisited, mathematical experts can make good progress, but that restriction seems to complicate the problem beyond theoretical attack.
However, even a semi-competent computer programmer can write a sensible simulation of this complex, twisting chain and, by making one million, ten million, even a billion repetitions, obtain an answer as precise as is required. (Recall de Moivre’s work: precision increases only as the square root of the size of the simulation.)
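A crude sketch of such a simulation is below. It grows each chain one step at a time and abandons any walk that traps itself; this naive growth rule slightly biases the sample, and serious polymer studies use weighted or pivot methods instead, but it conveys the idea. The chain length and number of repetitions are arbitrary choices.

```python
import random

MOVES = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def self_avoiding_walk(n_steps: int):
    """Grow an n-step walk on the cubic lattice that never revisits a site.

    Returns the squared end-to-end distance, or None if the walk traps itself.
    """
    position, visited = (0, 0, 0), {(0, 0, 0)}
    for _ in range(n_steps):
        for dx, dy, dz in random.sample(MOVES, len(MOVES)):   # directions in random order
            candidate = (position[0] + dx, position[1] + dy, position[2] + dz)
            if candidate not in visited:
                position = candidate
                visited.add(candidate)
                break
        else:
            return None                      # every neighbour already visited
    return sum(c * c for c in position)

results = [self_avoiding_walk(30) for _ in range(20_000)]
completed = [r for r in results if r is not None]
rms = (sum(completed) / len(completed)) ** 0.5
print(f"{len(completed)} chains completed; "
      f"root-mean-square end-to-end distance ≈ {rms:.2f}")
```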
Or suppose you want to estimate the area of an irregularly shaped leaf. Draw a rectangle around the leaf, and then simulate the positions of a large number of points scattered at random within that rectangle. Your estimate comes from multiplying the area of the whole rectangle by the proportion of points that fall within the leaf’s boundaries.
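A sketch of that leaf-area estimate follows, with an ellipse standing in for the digitized leaf outline so that the answer can be checked against the exact area.

```python
import math
import random

def inside_leaf(x: float, y: float) -> bool:
    """Stand-in for a digitized leaf outline: an ellipse with semi-axes 3 and 1."""
    return (x / 3) ** 2 + y ** 2 <= 1

rect_area = 6 * 2                  # bounding rectangle: -3..3 by -1..1
points, hits = 200_000, 0
for _ in range(points):
    x, y = random.uniform(-3, 3), random.uniform(-1, 1)
    hits += inside_leaf(x, y)      # True counts as 1, False as 0
print("estimated area ≈", round(rect_area * hits / points, 3))
print("exact ellipse area =", round(math.pi * 3 * 1, 3))   # for comparison
```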
As a final illustrative application, suppose Paul is hoping to set up a new petrol filling station. If he installs four pumps, the minimum viable number, there will be room for up to eight other cars to wait in a queue; each extra pump removes two waiting spaces, so if he installed the maximum of eight pumps, there would be nowhere to wait. To work out how many pumps will maximize his profits, he can carry out simulations of what would happen if he installed four, five, six, seven, or eight pumps.
As well as the installation costs, the running costs, and the profit margins, he would need to know the rate of arrival of potential customers and the distribution of the time it takes from a car pulling up at a pump until the pump becomes free. He should also take account of the chance that a potential customer would drive past if no pump were free, or the queue too long. All these figures are relatively easy to find or estimate, and it is far cheaper to make computer simulations than to experiment physically with various numbers of pumps over several months. Since he can use the same initial seed each time, he can run all his simulations under precisely the same conditions, thereby improving the comparison between the different estimates of profits.
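Here is one possible sketch of such a simulation. All the numbers in it (arrival rate, mean service time, profit per car) are invented placeholders, and installation and running costs are left out; the point is only the mechanism, including re-seeding the generator so that each option is compared under similar conditions. The waiting spaces follow the rule in the text: four pumps leave room for eight cars, and each extra pump removes two spaces.

```python
import heapq
import random

def simulate_station(pumps: int, hours: float = 2_000.0,
                     arrivals_per_hour: float = 30.0,
                     mean_service_hours: float = 0.1,
                     profit_per_car: float = 5.0) -> float:
    """One run: profit from cars actually served (fixed costs ignored)."""
    waiting_spaces = 8 - 2 * (pumps - 4)
    pump_free = [0.0] * pumps        # when each pump next becomes free
    on_site = []                     # departure times of cars still on the forecourt
    served, t = 0, 0.0
    while t < hours:
        t += random.expovariate(arrivals_per_hour)           # next arrival
        while on_site and on_site[0] <= t:                    # cars that have left
            heapq.heappop(on_site)
        if len(on_site) >= pumps + waiting_spaces:
            continue                                          # driver goes elsewhere
        start = max(t, heapq.heappop(pump_free))              # first pump to free up
        finish = start + random.expovariate(1 / mean_service_hours)
        heapq.heappush(pump_free, finish)
        heapq.heappush(on_site, finish)
        served += 1
    return served * profit_per_car

for pumps in range(4, 9):
    random.seed(1)     # same seed for every option, for a fairer comparison
    print(pumps, "pumps: profit ≈ £", round(simulate_station(pumps)))
```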
Why ‘Monte Carlo’? Despite the obvious connection between random numbers and casino games, the name was originally just a code word to refer to the use of these methods in military matters, including the early development of nuclear bombs.
Errors in codes
Morse code demonstrates how to transmit messages using only two symbols, 0 and 1 say. But some symbols may get corrupted, so that an initial 0 arrives as a 1, and vice versa. Even with a low error rate, the message received could mean something quite different from that sent. How best to deal with this?
Suppose each symbol sent has, independently, some small probability of being corrupted. We could repeat the symbols, but a moment’s thought shows that sending 00 and 11 rather than 0 and 1 respectively is no use at all: if 01 or 10 arrives, it is a pure guess as to whether 00 or 11 had been sent. We’ll guess correctly half the time, but doubling up on the symbols sent means that we should expect twice as many errors, so these factors largely cancel out. But consider transmitting 000 or 111 instead of 0 or 1.
Using ‘majority vote’ to decode messages, all of {000, 100, 010, 001} will be interpreted as 0, and the other four possibilities as 1. If just 1% of symbols sent are corrupted, then when 000 is sent, the binomial distribution shows that there is a 99.97% chance that one of those four sequences arrives. That means that the error rate drops from 1% to 0.03%, a factor of over thirty. We would do even better if each digit were repeated five times, but at the expense of the message length. The best choice will depend on the size of the inherent error rate, and the speed of transmission.
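The binomial calculation behind these figures can be checked in a few lines; the repetition counts below are illustrative.

```python
from math import comb

def error_rate_after_repetition(p: float, n: int) -> float:
    """Chance a majority vote over n copies decodes wrongly, when each copy is
    independently corrupted with probability p (n odd)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range((n + 1) // 2, n + 1))

for n in (1, 3, 5):
    rate = error_rate_after_repetition(0.01, n)
    print(f"send each digit {n} time(s): error rate {rate:.4%}")
```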
Amniocentesis
Prospective parents (and statisticians) Juanjuan Fan and Richard Levine considered whether or not Fan should undergo an amniocentesis – a test to see whether the foetus she was carrying might have Down’s Syndrome. Their experience can act as a template for others in a similar situation.
Knowledge of Fan’s age, and a simple blood test, put the chance of Down’s at 1 in 80; an ultrasound image was encouraging, leading to a use of Bayes’ Rule that reduced the chance to 1 in 120. Amniocentesis is an invasive test – a hollow needle is inserted into the abdomen to extract a sample of amniotic fluid; if the extra copy of chromosome 21 that is characteristic of Down’s is present, it will certainly be detected, but the test has a risk, estimated in this case at 1 in 200, of causing a miscarriage. Should parents, who would choose an abortion if Down’s were present, take this test?
Fan and Levine reached their decision by the logical process of maximizing their expected utility. The worst possible outcome, miscarriage of a foetus without Down’s, was assigned utility zero; the best outcome, birth without Down’s, was given utility one. To give birth to a Down’s child, having opted out of amniocentesis, was allocated utility x, while to take the test and find Down’s was given a somewhat greater utility, y. (The possibility of a miscarriage is irrelevant in this last case, as the foetus would be aborted anyway.)
The expected utilities with and without the test can then be compared. The test should be taken if the first exceeds the second, which, in this case, reduces to requiring that y should exceed (119/200) + x; in round figures, y must be bigger than 0.6 + x.
If Fan and Levine had felt that the utility of discovering the presence of Down’s, and thus having an abortion, was below 0.6, there would never be any point in taking the test. And the higher the utility they attached to having a child, albeit with Down’s, the higher that threshold would become. If that utility were above 0.4, the threshold would exceed unity, the highest utility on the scale, so they should never take the test.
Choosing appropriate values for x and y requires some thought; and if the basic parameters – a 1 in 200 chance of a miscarriage through the test, a 1 in 120 chance of Down’s without the test – were different, the final criterion would change. (See the Appendix.) Plainly, if the chance of Down’s were less than the chance of a miscarriage, it would never be rational to take the test (yes?).
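A sketch of the comparison is below. The probabilities are those quoted above; the particular values of x and y are invented, purely to show the calculation.

```python
def expected_utilities(p_downs: float = 1 / 120, p_miscarriage: float = 1 / 200,
                       x: float = 0.2, y: float = 0.9):
    """Expected utilities with and without the test.

    Utilities: 1 = healthy birth, 0 = miscarriage of a healthy foetus,
    x = Down's birth without the test, y = taking the test and finding Down's.
    The default x and y are illustrative only.
    """
    without_test = (1 - p_downs) * 1 + p_downs * x
    with_test = p_downs * y + (1 - p_downs) * (1 - p_miscarriage) * 1
    return with_test, without_test

with_test, without_test = expected_utilities()
print("take the test?", with_test > without_test)
threshold = (1 - 1 / 120) * (1 / 200) / (1 / 120)   # = 119/200
print("y must exceed", round(threshold, 3), "+ x")
```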
Fan and Levine discussed their dilemma, and their agreed choice of utilities led them to opt for the test. There was a happy ending: no extra chromosome, and no miscarriage.
Haemophilia
Haemophilia is the general name given to a group of disorders that prevent blood from clotting if a cut occurs. The clotting agent is found on the X chromosome, and the chance it is missing is under 1 in 5,000. As females have two X chromosomes, they would suffer from haemophilia only if it is absent from both copies, giving a chance of under 1 in 25 million, but males possess only one X(and one Y chromosome), so almost all instances of the disease are in males.
If males do have haemophilia, this becomes known well before they have a chance to father children, but a female may unknowingly have one normal X and one without the clotting agent. Such females are termed carriers, and the chance they pass on the affected gene to any child is 50%. A daughter who receives this affected gene becomes a carrier, a son will suffer from haemophilia. That Queen Victoria was a carrier is certain, as her son Leopold was a haemophiliac, and at least two of her five daughters were carriers. She had three other sons who did not suffer that disease.
Suppose Betty has a brother who suffers from this disease; Betty has several children, including Anne. What is the chance that Anne is a carrier?
To answer this question, it is enough to find the probability that Betty is a carrier; Anne’s chance will always be half that figure. We know that Betty’s mother is a carrier, as Betty’s brother is a haemophiliac. From this information alone, the probability Betty is a carrier is 50%. And if any of Anne’s brothers have the disease, Betty would be certain to be a carrier, so we look at the case when all Anne’s brothers are healthy.
This situation is tailored for Bayes’ Rule. Let Guilt mean that Betty is a carrier: since the initial probability of Guilt is 50%, the prior odds are unity. If Betty is Innocent (not a carrier), the probability of the Evidence (all brothers healthy) is plainly 100%. But if Betty is Guilty, each brother has, independently, a 50% chance of escaping the faulty gene, so each healthy brother halves the Likelihood Ratio. Turning the posterior odds into probabilities, the chance Betty is a carrier is successively 1/3, 1/5, 1/9, 1/17, ..., according as she has one, two, three, four, ... brothers, all healthy.
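The same sequence of figures drops out of a few lines of code, following the odds argument above.

```python
def carrier_probability(healthy_brothers: int) -> float:
    """Chance Betty is a carrier, given that all her sons (Anne's brothers) are
    healthy.  Prior odds are 1, since her mother is certainly a carrier."""
    prior_odds = 1.0
    likelihood_ratio = 0.5 ** healthy_brothers   # each healthy son halves the odds
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

for n in range(1, 5):
    p_betty = carrier_probability(n)
    print(f"{n} healthy brother(s): Betty {p_betty:.4f}, Anne {p_betty / 2:.4f}")
```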
[Figure 10: Family relationships]
For your own entertainment, suppose that Anne also has sisters, some of whom have sons, and none of these nephews of Anne are sufferers. How does that affect the chance Anne is a carrier? Check your answer against that given in the Appendix.
Epidemics
The phrase ‘herd immunity’ expresses the fact that if sufficiently many people are vaccinated, then should a trace of infection enter the community, no epidemic will occur. Thus even those who were not vaccinated are very unlikely to catch the disease. Why should this be so, and how do we discover what ‘sufficiently many’ means?
Typically, those infected may transmit the disease to others, but those who recover have acquired immunity. So we label people as one of Susceptible, Infected, or Removed. The latter are those immune from infection because of vaccination, recovery, physical isolation, or death. It may seem callous to give these four outcomes the same word, but the brutal truth is that, so far as the spread of the epidemic is concerned, they really are equivalent!
To see how an epidemic might develop, let S be the number susceptible and I the number infected. Multiplying these two numbers together gives the total number of possible encounters between an Infected and a Susceptible. Crowded urban populations will be in close contact more often than scattered rural populations of the same size, and the probability of the Susceptible becoming Infected through an encounter will depend on the infectiousness of the disease. Overall, the probability that, during a tiny time period, some Susceptible individual becomes Infected will be of the form β*S*I, where β depends on both the infectiousness and how much people mix.
In this same tiny time period, any Infected person may move into the Removed category. Hence the probability that Infecteds reduce by one member will be proportional to the number infected, so takes the form γ*I, where γ depends on how fast those infected recover, are isolated, or die.
We have found the respective chances of an increase, or a decrease, in the number Infected. The balance between these two probabilities, β*S*I and γ*I, determines whether an epidemic occurs. There is a good analogy with gambling. If the game is loaded against you, each bet is more likely to reduce your capital than increase it: your capital follows a random walk, with an inexorable drift towards zero. But if the game favours you, provided that bad luck does not bankrupt you early, the random walk ushers you far enough from zero to outrun any losing streak. To win a large sum, it is necessary, but not sufficient, for the game to be in your favour.
In the epidemic context, this means that the only time an epidemic (= large fortune) might occur is when any change in the number infected is more likely to be an increase than a decrease. In symbols, β*S*I must exceed γ*I, which is the same as asking that S, the number of Susceptibles, exceeds the ratio γ/β, a quantity termed the threshold for this population. This is exactly what we were looking for: even if the disease enters the population, an epidemic can only occur when the number susceptible exceeds this threshold.
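A sketch of a stochastic version of this model shows the threshold at work. The values of β and γ below are invented, chosen so that the threshold γ/β is 200 susceptibles, and ‘large outbreak’ is an arbitrary cut-off.

```python
import random

def outbreak_size(S: int, I: int = 1, beta: float = 0.0005, gamma: float = 0.1) -> int:
    """Run one stochastic SIR epidemic and return the total number ever infected."""
    total_infected = I
    while I > 0:
        infection_rate, removal_rate = beta * S * I, gamma * I
        # The next event is an infection with probability proportional to its rate.
        if random.random() < infection_rate / (infection_rate + removal_rate):
            S, I, total_infected = S - 1, I + 1, total_infected + 1
        else:
            I -= 1
    return total_infected

threshold = 0.1 / 0.0005                  # gamma / beta = 200 susceptibles
for S0 in (100, 400):                     # one population below, one above the threshold
    sizes = [outbreak_size(S0) for _ in range(2_000)]
    big = sum(size > S0 / 4 for size in sizes) / len(sizes)
    print(f"S0={S0} (threshold {threshold:.0f}): "
          f"chance of a large outbreak ≈ {big:.0%}")
```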
William Kermack and Anderson McKendrick presented this result in 1927. Epidemics can be avoided by keeping the number susceptible below the threshold, which can be achieved in two distinct ways. First, vaccinate to reduce the number susceptible. Second, find means to increase the threshold. Being a ratio, a threshold increases when the numerator increases – we speed up recovery rates, or isolate infected people faster – or when the denominator decreases – we may be able to reduce the infectiousness, or we can ensure that people mix less, by closing schools temporarily, or postponing sports events where large crowds would gather. We can also assess the likely size of the benefits of these different responses, and so judge which are worth pursuing.
The same principles apply to controlling epidemics in animals. The first step taken to end a foot-and-mouth outbreak in cattle is usually to restrict cattle movements, which increases the threshold size by reducing the denominator. This is often accompanied by mass slaughter(not available with diseases in humans!), which not only reduces the size of the susceptible population, but also increases the numerator in the threshold.
This analysis also explains why we should expect epidemics of childhood diseases at fairly regular intervals. After an epidemic reduces the size of the susceptible population to below the threshold, new births, with insufficient vaccination, gradually take the size above the threshold, so creating the conditions for the next outbreak. The longer the interval between epidemics, the higher the vulnerable population, and the more severe will any epidemic be.
Knowing about probabilities may not cure diseases, but it can help mitigate their effects.
Batch testing
The army wishes to identify which of 1,000 potential recruits may be vulnerable to a certain disease, and thus unfit to serve. A blood test is available, but it costs £50 each time. Can the job be done for less than £50,000?
Provided that only a fairly small proportion will prove vulnerable, the answer is‘yes’. Choose some number K, and pool blood samples from K recruits; then test this pooled sample. If the result is negative, then all those who contributed blood are clear, and need no more testing; otherwise, with a positive result, at least one person in the group would test positive, so we use K more tests, one for each of them, to settle the matter. If we are lucky, one test will suffice, but we might have to make K+1 tests. We hope to make fewer tests than if each person is tested individually.
The best choice of how many samples to pool depends on the probability of a positive test. Suppose this is 1%. Then if we pool ten samples, there will be a positive sample among them about 10% of the time, while the chance all are negative is about 90%. So we will need only one test about 90% of the time, but eleven tests 10% of the time. That leads to about two tests on average. Pooling ten samples reduces the mean cost for those ten recruits from £500 to £100. If we split the 1,000 recruits into 100 groups of size ten, and pool their samples, we expect to save 80% of the initial estimate of £50,000!
More refined calculations show that, when the probability of a positive test is indeed 1%, we would do slightly better pooling groups of eleven, rather than ten, but the difference is very marginal. However, the best choice of the size of a pooling group is quite sensitive to the probability of a positive test. If we expect 2% of recruits to test positive, costs are minimized if we pool eight blood samples; with 5% the best choice is to pool five samples, while if 10% will test positive, pooling four samples turns out best.(Once again, use of the binomial distribution led to these answers.)
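A short sketch reproduces these best pool sizes by minimizing the expected number of tests per person; searching group sizes only up to 50 is an arbitrary cap of mine.

```python
def expected_tests_per_person(p: float, k: int) -> float:
    """Mean number of tests per person when samples are pooled in groups of k
    and each person is positive independently with probability p."""
    p_group_positive = 1 - (1 - p) ** k
    return 1 / k + p_group_positive       # one pooled test, plus k more if it is positive

for p in (0.01, 0.02, 0.05, 0.10):
    best_k = min(range(2, 51), key=lambda k: expected_tests_per_person(p, k))
    cost = 1000 * 50 * expected_tests_per_person(p, best_k)
    print(f"p={p:.0%}: pool {best_k} samples, "
          f"expected cost for 1,000 recruits ≈ £{cost:,.0f}")
```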
In World War II, this simple idea saved the USA about 80% of its initial expected costs.
Airline overbooking
Even though they must compensate passengers who are bumped off flights for which they have paid, airlines routinely sell more tickets than a plane’s capacity. Simple economics is the reason: the cost to the airline of making the flight hardly changes with the number of passengers, but each empty seat is lost revenue. Expecting that not everyone who has booked a particular flight will show up, how does an airline work out the best amount of overbooking?
Suppose that the plane has one hundred seats, at a fare of £100 per seat, but a passenger is paid £200 if they have to be turned away because the flight is full. The airline needs a good estimate of the probability that a passenger who has booked will actually turn up. For charter flights to holiday destinations, this probability will be close to 100%, but it will be rather lower for passengers who have more flexibility in their travel plans. Frequency data will help airlines to estimate these chances.
Perhaps each passenger who books has, independently, an 80% chance of turning up for the flight. If 120 tickets are sold, total revenue is £12,000, and, although only 96 passengers will turn up on average, there is a chance, around 15%, that more than 100 show up, and at least one passenger must be left behind. (These numbers again come from using the binomial distribution.) The mean amount to be refunded because of overbooking in this case turns out to be £80. Selling five more tickets raises the revenue by £500, and the total mean refund increases by only £275, so this policy is expected to be more profitable. The most profitable policy, on average, comes from selling 128 tickets – compared with selling 125, the extra revenue of £300 just outweighs the mean extra refund cost of £295, while 129 tickets would be slightly worse than this.
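The arithmetic can be checked with a short sketch that computes expected profit as ticket revenue minus expected compensation, using the figures quoted above.

```python
from math import comb

def expected_profit(tickets: int, seats: int = 100, fare: float = 100.0,
                    compensation: float = 200.0, p_show: float = 0.8) -> float:
    """Revenue from ticket sales minus the expected compensation paid to
    passengers bumped because more than `seats` people show up."""
    expected_bumped = sum(comb(tickets, k) * p_show**k * (1 - p_show)**(tickets - k)
                          * (k - seats)
                          for k in range(seats + 1, tickets + 1))
    return tickets * fare - compensation * expected_bumped

for n in (120, 125, 128, 129, 130):
    print(n, "tickets: expected profit £", round(expected_profit(n)))
```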
It is more realistic to suppose that some passengers are more likely to turn up than others, and that groups of passengers who book together will either all turn up, or none of them will. But these details can be incorporated into the model, and airlines will continue to sell seats until the expected cost of compensation exceeds the extra revenue.
Queues
One of the best-developed applications of probability is the study of queues of various kinds. The initial impetus came from attempts to understand congestion in telephone lines – the work of the Danish telephone engineer Agner Erlang is honoured by the use of his name as the unit of the volume of telephone traffic. Queuing theory contributed to the success of the Berlin airlift of 1948/9, and the systematic study of queues flourished during the next twenty years.
David Kendall introduced notation, with the format A/B/n, now universally accepted as a shorthand way of describing queues where customers arrive singly. The first component, A, refers to the distribution of the time between customer arrivals, while B describes the distribution of the time it takes to serve a customer, and n is the number of servers.
For example, in the expression D/D/3, ‘D’ is short for deterministic, meaning that there is no randomness at all. Customers arrive on the dot at fixed time intervals, all service times also have exactly the same length, and there are three servers. This queue would be of little interest in the field of probability, as there is no variability. But suppose there is a huge number of potential customers, each of whom has a tiny chance of turning up in a given short time interval, so that customers arrive at some overall average rate, but completely at random. Here the symbol M is used, to honour Andrey Markov, so M/D/2 means that customers arrive at random, and select either of two servers, who each take some fixed time to do their job.
We want to know how queues will behave. The main questions are how long do customers have to wait, how frequently are servers just sitting on their thumbs, and what can we do to improve matters? The ‘servers’ may be the intensive care beds, while ‘customers’ are patients needing that care.
If customers arrive every five minutes on average, and there are three servers, then unless the mean service time is less than fifteen minutes, a queue of indefinite length will build up, and the whole operation is unsustainable. So we must assume that the mean service time, taking account of the number of servers, is less than the mean time between arrivals. The ratio of these two mean times is termed the traffic intensity, some figure between zero and unity.
In an ideal world, customers would face only short queues and the server would be busy nearly all the time. But these two requirements are diametrically opposed. Take a simple case with one server and customers arriving at random. If the traffic intensity were 0.9, calculations show that we might expect about five waiting customers on average, and an empty queue about 10% of the time. If the intensity rose to 0.98, the server would be unoccupied just 2% of the time, but the mean queue length would shoot up to 25. Most customers would consider this a worse arrangement. Unless servers have enough ‘idle time’, customers will get angry, leave the system, or do both.
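The figures quoted are consistent with a single server, arrivals at random, and service times all of the same fixed length; that reading is an assumption on my part. A sketch under it, using Lindley's waiting-time recursion and Little's law, is below; the averages settle slowly when the intensity is close to one, so the second line will wobble a little from run to run.

```python
import random

def simulate_queue(intensity: float, service_time: float = 1.0,
                   customers: int = 500_000):
    """One server, arrivals at random (Poisson), every service taking a fixed time.

    Returns the mean number in the queue (counting anyone being served) and the
    fraction of time the server stands idle.
    """
    arrival_rate = intensity / service_time
    wait, total_wait, clock = 0.0, 0.0, 0.0
    for _ in range(customers):
        gap = random.expovariate(arrival_rate)       # time since the previous arrival
        wait = max(0.0, wait + service_time - gap)   # Lindley's recursion
        total_wait += wait
        clock += gap
    mean_wait = total_wait / customers
    mean_in_system = arrival_rate * (mean_wait + service_time)   # Little's law
    idle_fraction = 1 - customers * service_time / clock
    return mean_in_system, idle_fraction

for rho in (0.9, 0.98):
    queue, idle = simulate_queue(rho)
    print(f"intensity {rho}: mean queue ≈ {queue:.1f}, server idle ≈ {idle:.0%}")
```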
The queue behaviour depends on much more than just the traffic intensity. Other things being equal, the more variable the service time, the longer you can expect the queue to be. With several servers, it matters whether they operate as in my railway station, where one central queue feeds up to six servers, or as in my supermarket, where I choose which aisle to join, and stay there. In some situations, e.g. ambulance calls, some customers may have higher priority. Many queues use the ‘First come, first served’ rule, but when non-perishable goods are stored on a shelf awaiting use, ‘Last come, first served’ may apply. Some queues feed other queues, the servers may work at different speeds, bunches of customers could arrive together. Eagle-eyed queuing theorists have found answers to the central questions under almost any realistic model you can think of.