# Tails I win

### 2018-06-21

Here’s a problem from Alexander Bogomolny on Twitter:

a counterintuitive property of coin-tossing: If Alice tosses a coin until she sees a head followed by a tail, and Bob tosses a coin until he sees two heads in a row, then on average, Alice will require four tosses while Bob will require six tosses

Can I demonstrate this in R?

Let’s make Alice and Bob flip coins:

```
# returns the number of flips needed till the start of the pattern
flips_till_pattern <- function(pattern) {
a <- rbinom(n=50,size=1,prob = 0.5) # simulate tossing a coin 50 times.
# make a string of the form "Heads-Tails..." showing the sequence of random results
str <- paste(stringr::str_replace_all(stringr::str_replace_all(a,"1","Heads"),"0","Tails"), collapse = '-')
# a list of all the indices in the string where pattern holds
inds <- c(1, unlist(gregexpr(pattern, str)))
m <- substring(str, head(inds, -1), tail(inds, -1))
ln <- lengths(strsplit(m, '-'))
cumsum(c(ln[1], ln[-1] - 1))[1]+1
}
```

Playing this game 1000 times, here’s the average number of flips needed to get “Heads-Tails”

`mean(replicate(1000,flips_till_pattern("Heads-Tails")))`

`## [1] 3.957`

and the average number needed for “Heads-Heads”

`mean(replicate(1000,flips_till_pattern("Heads-Heads")))`

`## [1] 6.056`

## More general solution

Now, you could say the above solution is terribly inelegant. There’s gotta be a way to implement this in a single line. But one advantage of my crude, string-based approach is that I can easily test other versions of the game.

For example, here’s the number of flips needed to get the pattern “Heads-Heads-Heads”

`mean(replicate(1000,flips_till_pattern("Heads-Heads-Heads")))`

`## [1] 12.236`

Or how about “Heads-Tails-Heads”

`mean(replicate(1000,flips_till_pattern("Heads-Tails-Heads")))`

`## [1] 8.527`

Interesting that a sequence where you throw the same result multiple times in a row is rarer than one in which you throw a pattern.

## Simpler solutions

In this solution, 1 = Heads, 0 = Tails, and I can compute all the locations in a vector of 50 tosses where I find the pattern 1,1:

```
v = rbinom(n=50, size = 1, prob = 0.5)
x = c(1,1)
idx <- which(v == x[1]) # two heads in a row
idx[sapply(idx, function(i) all(v[i:(i+(length(x)-1))] == x))]
```

`## [1] 6 7 8 18 38 39 40 48 49 NA`

That’s a little shorter than the string-based solution above, but it’s not the one-line (or close) that I’d like. Any hints?

**Update** Yes, there are much better solutions.

from David Robinson tweet:

```
library(tidyverse)
crossing(trial = 1:10000,
flips = 1:100) %>%
mutate(heads = rbinom(n(), 1, 0.5)) %>%
group_by(trial) %>%
mutate(next_flip = lead(heads),
hh = heads & next_flip,
ht = heads & !next_flip) %>%
summarize(first_hh = which(hh)[1] + 1,
first_ht = which(ht)[1] + 1) %>%
summarize(first_hh = mean(first_hh),
first_ht = mean(first_ht))
```

```
## # A tibble: 1 x 2
## first_hh first_ht
## <dbl> <dbl>
## 1 5.99 4.01
```

and, if you’re so inclined, math.stackexchange.com has a mathematical solution (with fancy equations)

Thanks to R4ds and Twitter for the excellent solutions!

*edit* Now Correlaid has posted another great solution, with Markov diagrams.