At the beginning of this book, I noted that some aspects of probability appear, at first sight, to defy common sense. Examples have arisen as the story has unfolded. Here are some other circumstances where intuition can be misleading, but, with sufficient care, these apparent contradictions can be explained. The subject of probability is wholly free from real paradoxes. But although ideas of probability can help us make sensible decisions, we may also find that even thinking about the probabilities of certain events might lead to uncomfortable dilemmas.
Parrondo’s paradox
Graham Greene’s novel Loser Takes All is a fine read, but based on a false premise: that there is some clever mathematical way of combining bets on a roulette wheel to give the player, rather than the house, an advantage. On the contrary: mathematics has proved that, when all individual bets favour the house, no combination whatsoever can turn matters round and favour the player. Sorry, folks.
Juan Parrondo has shown that you have to be very precise in how you formulate a general claim that, whenever all bets favour one side, it is impossible to combine bets so that the other side has an advantage. I describe here a modification of his idea, due to Dean Astumian, who described a simple game played on a board with five slots, shown in Figure 11. (This is not a serious game. It was constructed merely to make this point.)
11. The board for Astumian’s game
You need some way of generating a random event that will occur 1% of the time: perhaps a bag with 99 White balls and one Black ball, or a spinner that is equally likely to come to rest on any one of its one hundred sides. To begin the game, place a token on the slot marked ‘Start’. Every move will take the token one step left, or one step right, and you win if the token reaches Win before it hits Lose.
There are two basic sets of rules, call them Andy and Bert. With Andy, from Start you always move to Left, and from Right you always move to Win. From Left, you use the spinner to give a 1% chance of moving to Lose, and a 99% chance of moving back to Start. With Bert, the spinner is used from Start to give a 99% chance of moving to Right, and a 1% chance of going to Left. From Right, you always return to Start, while from Left it is the same as in Andy – the spinner gives a 1% chance of moving to Lose, a 99% chance of returning to Start.
Analysis of these games is simple. In Andy, there is no provision ever to reach Right; you shuffle around between Start and Left, until random chance takes you from Left to Lose. In Bert, you usually shuffle between Start and Right, with occasional visits to Left. Eventually, on one of these sojourns to Left, random chance takes you to Lose. In either game, the chance of reaching Win is zero.
For the new game, Chris, you also need a fair coin. At each turn, toss this coin: if it shows Heads, use the rules of Andy, if it shows Tails, use the rules of Bert.
It turns out that your winning chance in Chris exceeds 98%! It is easy to see why you are strong favourite: if ever you get to Left, you are overwhelmingly likely to return to the safety of Start. From Start, you play Bert half the time, with its 99% chance of getting to Right; and in Right, you play Andy half the time, inevitably winning.
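The three games are simple enough to check by simulation. Here is a minimal sketch in plain Python, following the rules exactly as stated above: Andy and Bert never reach Win, while the coin-flip mixture Chris wins roughly 98% of the time.

```python
import random

# States: "start", "left", "right"; "win" and "lose" end the game.

def step_andy(state, rng):
    # Andy: Start -> Left; Right -> Win; Left -> Lose (1%) or Start (99%).
    if state == "start":
        return "left"
    if state == "right":
        return "win"
    return "lose" if rng.random() < 0.01 else "start"

def step_bert(state, rng):
    # Bert: Start -> Right (99%) or Left (1%); Right -> Start;
    # from Left, the same as in Andy.
    if state == "start":
        return "right" if rng.random() < 0.99 else "left"
    if state == "right":
        return "start"
    return "lose" if rng.random() < 0.01 else "start"

def play(game, rng):
    # game is "andy", "bert", or "chris" (a fair-coin mixture of the two).
    state = "start"
    while state not in ("win", "lose"):
        if game == "chris":
            rule = step_andy if rng.random() < 0.5 else step_bert
        else:
            rule = step_andy if game == "andy" else step_bert
        state = rule(state, rng)
    return state == "win"

rng = random.Random(1)
trials = 100_000
for game in ("andy", "bert", "chris"):
    wins = sum(play(game, rng) for _ in range(trials))
    print(game, wins / trials)
```

Running this shows win rates of exactly zero for Andy and Bert, and close to 0.98 for Chris.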
Following either Andy or Bert, you must lose: flip between these games at random, and you win nearly every time! Framing a mathematical theorem that excludes examples like this, but confirms that Greene’s plot rests on shaky ground, requires very precise language!
2+2=4, or 2+2=6?
Suppose we carry out Bernoulli trials with a fair coin, i.e. each toss, independently, is equally likely to be Heads or Tails. A typical outcome will be HHTHTTTHT… The mean number of tosses until Heads appears is two; but what is the mean number of tosses until we see HT, or HH?
The intuitive answer is four, as we expect to wait two throws for the first symbol, then another two throws for the second. And the mean number of throws until we see HT is indeed four, but this is not the case for HH. For that pattern, the mean number of throws is six!
The reason for the difference is that, to get HT, it is correct to argue that we expect to take two throws to get the H, then another two to get the T that completes the pattern. And Two plus Two equals Four. But for HH, after we have the first H, the next throw will be T half the time, and we must begin again – all throws up to that point will have been wasted. The algebra leading to the correct answer is in the Appendix.
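A short simulation makes the difference vivid: toss a fair coin until a given pattern first appears, and average the number of tosses over many repetitions.

```python
import random

def tosses_until(pattern, rng):
    # Toss a fair coin until the sequence first ends with the pattern.
    seen = ""
    while not seen.endswith(pattern):
        seen += "H" if rng.random() < 0.5 else "T"
    return len(seen)

rng = random.Random(0)
trials = 100_000
for pattern in ("HT", "HH"):
    mean = sum(tosses_until(pattern, rng) for _ in range(trials)) / trials
    print(pattern, round(mean, 2))  # HT averages close to 4, HH close to 6
```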
Between H and T, each is equally likely to appear first; what about between HH and HT? Again, each is equally likely to arise before the other, since we must wait for the first Head, and then the next throw determines the answer. However, between HH and TH, the latter is three times as likely to appear first! The reason is simple: the sequence will begin with HH one-quarter of the time, but unless this happens, it is inevitable that TH appears first. (Think about it.)
The game Penney-ante is based on the above ideas. You invite your opponent to select any of the eight possible triples like HHT, or THT, etc., that might occur in three consecutive throws of a fair coin. You select a different one, a neutral person tosses the coin repeatedly, and the winner is the person whose triple is seen first. Despite the apparent generosity of allowing your opponent to have first pick, this game favours you – if you know what you are doing. Whatever she chooses, you can select a triple that will appear before hers at least 2/3 of the time! The winning recipe is in the Appendix.
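Both races can be checked with the same few lines of code: generate tosses until one of the two patterns appears, and record which came first. As one illustration of Penney-ante (the full recipe stays in the Appendix), THH is matched against HHH; THH wins about 7 games in 8, since HHH can only come first if the very first three tosses are all Heads.

```python
import random

def first_to_appear(pat_a, pat_b, rng):
    # Toss a fair coin until one pattern appears; return the winner.
    seen = ""
    while True:
        seen += "H" if rng.random() < 0.5 else "T"
        if seen.endswith(pat_a):
            return pat_a
        if seen.endswith(pat_b):
            return pat_b

rng = random.Random(0)
trials = 100_000
for a, b in (("HH", "TH"), ("HHH", "THH")):
    b_first = sum(first_to_appear(a, b, rng) == b for _ in range(trials))
    print(f"{b} beats {a} in {b_first / trials:.1%} of games")
```

The first race comes out near 75% for TH, the second near 87.5% for THH.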
Give me a clue…
(1) Three double-sided cards of identical shape and size are placed in a bag. One card is Blue on both sides, one is Pink on both sides, the last is Pink on one side, Blue on the other. One card is selected at random, and one side of it is exposed, and seen to be Pink. Is the other side more likely to be Pink or Blue? Or are the chances equal? Over to you – answer below.
(2) Careful counting shows that a bridge hand of thirteen cards, dealt from a well-shuffled deck, will contain two or more Aces about 26% of the time. You deal out a hand to Lucy. To the question ‘Does your hand contain at least one Ace?’, she answers ‘Yes’. On a separate occasion, you deal a hand to Tina, and ask ‘Does your hand contain the Ace of Spades?’ She also responds ‘Yes’. Which of the two hands is the more likely to contain two or more Aces? Or are the chances equal? See below.
(3) Suppose that, among 1,000 males and 1,000 females, all with satisfactory qualifications, 480 males but only 240 females gain admission to a university. A clear case of sex discrimination – men are twice as likely as women to be admitted?
The answers? With the Pink/Blue cards, seeing a Pink side plainly eliminates the double Blue card. All three cards were equally likely; just two remain, Pink/Pink and Pink/Blue. With one of these cards, the reverse side is Pink, but with the other card, the reverse side is Blue. It looks as though Pink and Blue are equally likely.
That reasoning is sloppy: Pink is twice as likely as Blue, and you can check this by doing this experiment a dozen or so times. Better, note that the cards have three Pink sides between them, and all of those are equally likely to be the side seen. But only one Pink side has Blue on the reverse – twice, a Pink side also has Pink on the reverse. (You could use Bayes’ Rule, but that would be sledgehammer and nut territory.)
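A dozen trials by hand would settle it; a simulation makes the point faster. Each card is modelled as an (exposed, reverse) pair, a card and a side are picked at random, and we condition on the exposed side being Pink.

```python
import random

CARDS = [("Blue", "Blue"), ("Pink", "Pink"), ("Pink", "Blue")]

def pink_reverse_rate(trials, rng):
    # Among draws where the exposed side is Pink, how often is the
    # reverse side also Pink?
    shown_pink = reverse_pink = 0
    for _ in range(trials):
        card = rng.choice(CARDS)
        if rng.random() < 0.5:
            card = (card[1], card[0])    # flip which side faces up
        shown, reverse = card
        if shown == "Pink":
            shown_pink += 1
            reverse_pink += reverse == "Pink"
    return reverse_pink / shown_pink

print(pink_reverse_rate(100_000, random.Random(0)))  # close to 2/3
```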
As an impoverished graduate student, Warren Weaver, one of the founders of Information Theory, taught other students the usefulness of understanding probability by consistently winning money from them when playing this game.
With the deck of cards, we know both times that the hand has at least one Ace, and many people would suggest that Tina and Lucy are equally likely to hold two or more Aces – all Aces are equally likely, so why should Tina confessing to the Spade Ace in particular make any difference? Reject those thoughts, and do the counting properly.
For Lucy, among the hands with at least one Ace, we find the proportion that have two or more – it is about 37%. For Tina, along with the Ace of Spades, she has twelve more cards, chosen at random from the remaining fifty-one. About 56% of the time, these include another Ace: Tina is far more likely than Lucy to have two or more Aces.
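The counting can be done exactly with binomial coefficients, assuming a standard 52-card deck and 13-card hands.

```python
from math import comb

hands = comb(52, 13)
no_ace = comb(48, 13)        # hands containing no Ace
one_ace = 4 * comb(48, 12)   # hands containing exactly one Ace

at_least_one = hands - no_ace
at_least_two = hands - no_ace - one_ace

# Lucy: P(two or more Aces | at least one Ace)
print(round(at_least_two / at_least_one, 2))      # 0.37

# Tina: she holds the Ace of Spades plus 12 cards from the other 51;
# P(at least one of the remaining three Aces among those 12)
print(round(1 - comb(48, 12) / comb(51, 12), 2))  # 0.56
```

As a cross-check, `at_least_two / hands` comes out at about 0.26, the unconditional figure quoted earlier.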
Your suspicious mind rightly tells you that the answer to the third question is ‘No’. For suppose that, in the English department, 20% of 950 women who applied, and 10% of the 50 men, were admitted; in Business Studies, all 50 women gained admission, but only half the 950 men. Do the sums: 240 women and 480 men were successful, but, in each department , the success rate for women was twice that for men. Any discrimination was against men, not women!
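The department-level arithmetic can be laid out in a few lines, using exactly the figures in the text.

```python
# dept: (women applied, women's rate, men applied, men's rate)
DEPTS = {
    "English":  (950, 0.20, 50, 0.10),
    "Business": (50, 1.00, 950, 0.50),
}

women_in = men_in = 0
for name, (w, w_rate, m, m_rate) in DEPTS.items():
    women_in += w * w_rate
    men_in += m * m_rate
    # In each department, the women's rate is double the men's.
    print(name, f"women {w_rate:.0%}, men {m_rate:.0%}")

print("admitted:", round(women_in), "women,", round(men_in), "men")
```

The totals are 240 women and 480 men, even though women fare better in every department.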
Indeed, in real life, from thousands of applications to Berkeley’s Graduate School, 44% of males but only 35% of females were admitted. However, when the applications were broken down into the separate departments, there was hardly any difference between the admission rates of men and women. But admission rates did vary between departments, and female applicants were concentrated in those departments that admitted a smaller proportion of both sexes.
This counter-intuitive result is an example of Simpson’s Paradox. It shows the perils of working with proportions, rather than absolute numbers, and it turns up all over the place.
It is far more than a mere curiosity. You have no justifiable claim to be numerate unless you know what it is about.
Do you want to know?
I have argued that probability is the key to making decisions under uncertainty, and I will not retreat from that. But the ability to know probabilities more precisely, and in new circumstances, throws up some uncomfortable dilemmas.
It is now possible for individuals to have their own entire genetic code sequenced, but Nobel Laureate James Watson and Harvard psychologist Steven Pinker have both opted not to know which version of a gene known as APOE they carry. Having one copy of the epsilon4 version of this gene quadruples the chance of developing Alzheimer’s disease, while having two copies is associated with a twenty-fold increase in the chance. (Paradoxically, having this epsilon4 version is also associated with benefits during one’s youth.) Another Nobel Laureate, Craig Venter, knows that he does have one copy of epsilon4. One research laboratory has the policy of never disclosing to volunteers their APOE status, on the grounds that, with current knowledge, there is no treatment available to mitigate bad effects.
But some commercial companies may be very interested in your APOE status, indeed in your whole genome. If your genetic composition suggests a high probability of an early death, they may be willing to give greatly enhanced annuity rates – but may also demand much higher medical premiums. Companies who have full genetic information on an individual might ‘offer’ a bespoke service, tailored exactly to the life prospects of the client.
John and Tom are both 65 years old, and will each use £15,000 to buy an annuity; normal life expectancy is, say, 15 years, but John’s genes suggest 10 years longer, Tom’s 10 years less. Ignoring genetics, Company A offers both the same sum, £1,000 per year. But Company B uses the genetic information, and offers Tom £3,000 per year, but John only £600 per year. Recall the aphorism that, in the long run, averages rule. Both men would accept their higher offer, so Company A should expect to pay out £25,000 to men like John, a loss of £10,000 each time, while Company B expects to pay £15,000 to Tom and his ilk, and so break even. Company A will collapse, B will survive.
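The long-run averages behind the two companies’ fates can be set out explicitly, using the figures above.

```python
PREMIUM = 15_000
YEARS = {"John": 25, "Tom": 5}            # expected years of payments

offers_a = {"John": 1_000, "Tom": 1_000}  # Company A ignores genetics
offers_b = {"John": 600, "Tom": 3_000}    # Company B uses it

# Each man takes whichever company offers him more per year.
a_clients = [n for n in YEARS if offers_a[n] >= offers_b[n]]
b_clients = [n for n in YEARS if offers_b[n] > offers_a[n]]

loss_a = sum(offers_a[n] * YEARS[n] for n in a_clients) - PREMIUM * len(a_clients)
loss_b = sum(offers_b[n] * YEARS[n] for n in b_clients) - PREMIUM * len(b_clients)
print("A loses", loss_a, "per round; B loses", loss_b)
```

Company A attracts only the long-lived Johns and loses £10,000 on each; Company B attracts the Toms and breaks even.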
If the only viable insurance companies are those like B, we can expect many miserable people, who either cannot get medical/travel insurance at all, or who discover that their retirement plans are thrown askew because they are forecast to outlive their savings.
Barristers are advised to ask, in cross-examination, only those questions to which they already know the answers. Before you ask for your genome to be sequenced, assure yourself that you are fully prepared for what you might learn. Think of all the stages in life: a printout of your child’s genome at birth may have devastating news; when contemplating marriage, should you and your betrothed take steps to learn the likelihood of any children being severely afflicted? Should your employer have the right to deny you promotion because you have an increased risk of some illness? Should candidates for high public office, say president or prime minister, have to disclose their genome, so that voters are more aware of any genetic propensity to become unstable?
A randomly selected UK female has a 12% chance of contracting breast cancer. But if she has inherited certain mutations of genes known as BRCA1 or BRCA2, the chance increases to 60%. Should a mother of three daughters, whose sister has this mutation, be tested herself? And if she is tested and gets bad news, at what ages (if ever) should her daughters be told that they each have a 50% chance of having inherited this mutation?
Whatever you feel in such uncomfortable circumstances, recall that it is ‘probability’, and not ‘certainty’. If the probability of having the mutation is 10% for Emma, and 60% for Fiona, it may well turn out that Emma develops breast cancer, while Fiona does not. If they know their chances of having this mutation, they have to deal with this knowledge in their own way. To repeat the central dogma of decision theory: the rational action is the one that maximizes the mean utility of the consequences. You can never be sure you have taken the action that would have worked out best, but you have made optimal use of the information you have. You cannot ask for more.