Suppose you want to pick the best item out of a collection, but after seeing each item you must decide whether to keep it and walk away or discard it and keep looking, losing that item forever. An apartment search in a competitive city is an example of this kind of decision process. The optimal stopping rule is to examine the first 37% of the items without committing, then pick the first item that is better than all of those seen so far; incredibly, following this strategy one selects the absolute best item in 37% of the cases. But what happens when the best item is not selected?

This problem is also known as the secretary problem (where a boss has to pick the best candidate interviewing for a secretary position), and it is not hard to show that the optimal cutoff is $n/e\approx 0.3679\cdot n$ and that the probability of selecting the best is also $1/e$. What is amazing is that this probability does not decrease with the number of items to choose from, so even if one could, for example, date the entire world population, one would still end up choosing the best partner with a probability of 37%. Conveniently, when one does not have too many items to choose from, the probability of picking the best is even a bit higher; for example, it is about 40% with ten items.
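For concreteness, with a cutoff of $k$ items out of $n$ the success probability is $P = (k/n)\sum_{i=k+1}^{n} 1/(i-1)$, which can be checked with a few lines of R (the helper name p_best is mine):

p_best <- function(n, k) {
  # probability of ending up with the overall best item when skipping
  # the first k items and then taking the first one that beats them all
  if (k < 1) return(1 / n)   # no learning phase: we simply keep the first item
  (k / n) * sum(1 / (k:(n - 1)))
}

p_best(10, 3)                       # roughly 0.399
p_best(1000, round(1000 / exp(1)))  # roughly 0.368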

As I tend to be rather risk-averse, I could not help but wonder what happens when the best is not selected. In other words, what does an “average” choice look like? Fortunately, this decision process is quite easy to simulate, for example in R:

choose37 <- function(ranks) {
  # ranks: a permutation of 1..n where the best item has rank 1
  m <- as.integer(length(ranks) * exp(-1))  # length of the learning phase (~37% of n)
  best_of_m <- length(ranks)                # best rank seen during the learning phase
  for (i in 1:length(ranks)) {
    r <- ranks[i]
    if (r < best_of_m) {
      if (i <= m) {
        best_of_m <- r   # still learning: just remember the best rank so far
      } else {
        break            # first item that beats the learning phase: take it
      }
    }
  }
  # if nothing beat the learning phase, we are stuck with the last item
  c(i, r)  # return the position and the rank of the chosen item
}

This function takes as input the ranks of the items, so that the best item has rank one. We can now draw a random set of items and choose according to this rule simply via choose37(sample(n)). I then simulated 100,000 draws of 1,000 objects each and computed statistics on the rank of the chosen item.
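As a rough sketch of that simulation (the 100,000 draws and 1,000 objects are from the text; the seed and variable names are mine):

set.seed(1)       # arbitrary seed, mine
n_items <- 1000
n_sims  <- 100000
# keep only the second element returned by choose37, i.e. the rank of the chosen item
chosen <- replicate(n_sims, choose37(sample(n_items))[2])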

These are the empirical probabilities for the first nine ranks:

Rank   Prob.
1      0.368
2      0.137
3      0.0634
4      0.0311
5      0.0167
6      0.00832
7      0.0054
8      0.00324
9      0.00185
Cum.   0.63501
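A table like this one can be obtained from the simulated draws along these lines (using the chosen vector from the sketch above):

probs <- prop.table(table(factor(chosen, levels = 1:n_items)))
round(probs[1:9], 5)   # probabilities of ranks 1 to 9
sum(probs[1:9])        # cumulative probability, about 0.635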

Therefore, the 37% rule picks one of the nine best items out of 1,000, i.e. something within the top 1%, with 63.5% probability. The distribution of the chosen ranks looks like this:

Min.  1st Qu.  Median  Mean   3rd Qu.  Max.
1     1        2       184.5  316      1,000

Even though in some cases we end up with the absolute worst item, on average we pick an item in the top 18%, and with 75% probability we get one within the top 31.6%. That is not bad at all!
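These figures are just the standard summary statistics of the simulated ranks, which can be reproduced for example with:

summary(chosen)                   # the six-number summary shown above
quantile(chosen, 0.75) / n_items  # about 0.316, i.e. the top ~32% of the ranking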

For a comprehensive picture, the (empirical) probability distribution looks like this:

[Plot: empirical probability distribution of the chosen rank]

Quite interestingly, beyond roughly the top 1.5% of ranks the distribution seems to be rather uniform, with a density that likely depends on the number of items.
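One quick way to eyeball that flat tail (the cutoff at rank 15, i.e. 1.5% of 1,000, and the plot settings are mine):

hist(chosen[chosen > 15], breaks = 50, freq = FALSE,
     main = "Chosen rank beyond the top 1.5%", xlab = "rank")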

In conclusion, the 37% rule works extremely well for such a simple strategy!