This is an exploration of a problem I first started thinking about in my senior year of high school. It was proposed by my friend Cooper: suppose you have a shuffled standard deck of cards; what is the expected value of the number of cards that are in the correct position? More broadly, what is the expected value for any set of \(n\) objects (you can use any arbitrary ordering as the “correct” ordering)? At the time, I explored this problem by empirically computing the values with code. I had some very surprising results, but I didn’t have the math to make much sense of them back then. I’ll briefly go over my early approaches first, then revisit the problem and the results with some new techniques.

A Bit of Python and Some Shocking Results

I dug up my original code for this, and seeing it again makes me feel much better about my current code quality.

My algorithm relied on one concept: the number of ways to reorder \(n\) objects s.t. none of the objects are in the correct position. Let’s call this value \(a_n\). Then the number of ways to put exactly \(k\) out of \(n\) objects in the right position is \[{n\choose k}a_{n-k} = \frac{n!}{k!(n-k)!}a_{n-k}.\] i.e. it’s the number of ways to choose \(k\) out of \(n\) objects (to be in the right position) times the number of ways to place all the remaining \(n-k\) objects in the wrong position. This means that if we can compute \(a_n\), then we can just use a summation to compute our desired expected value: \[\frac{1}{n!}\Bigg(n + \sum_{k=0}^{n-1}{k}{n\choose k}a_{n-k}\Bigg).\] It looks like there’s an extra term here (the \(n\)), and our summation stops early at \(k=n-1\). This ugly duckling represents the case where all objects are in the correct position (\(k = n\)); it’s different because I don’t think that \(a_{n-n} = a_0\) is well defined. What does it mean for all 0 objects to be in the wrong position? There is a clear value for \(a_0\) that makes all the summations work (namely \(a_0=1\)), but I can’t bring myself to accept a definition, so we have this extra term: it is what it is.
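Here is a small Python sketch of that summation (not my original code; the helper names are made up, and it assumes the \(a_m\) values are handed in as a precomputed list `a_counts`):

```python
from math import comb, factorial

def exactly_k_correct(n, k, a_counts):
    # Number of orderings of n objects with exactly k in the right position:
    # choose which k are correct, then place the remaining n - k all wrongly.
    # a_counts[m] holds the number of ways to put all m objects in the wrong position.
    return comb(n, k) * a_counts[n - k]

def expected_correct(n, a_counts):
    # The summation above: the lone all-correct permutation contributes n,
    # and k = 0, ..., n-1 each contribute k * C(n, k) * a_{n-k}.
    # Note that a_counts[0] never gets read, conveniently dodging the a_0 question.
    total = n + sum(k * exactly_k_correct(n, k, a_counts) for k in range(n))
    return total / factorial(n)
```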

The second insight was that \(a_n\) can be defined recursively, i.e. the number of ways to have all \(n\) objects in the wrong position is equal to the total number of ways to order \(n\) objects minus the number of ways to have exactly 1, 2, \(\ldots\), or \(n\) objects in the right position: \[a_n = n! - \sum_{k=1}^{n-1}{n\choose k}a_{n-k} - 1.\] Again, the extra 1 is to handle the case where all objects are in the right position (because of our \(a_0\) conundrum).
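That recursion translates almost directly into code. Here's a cleaned-up sketch (again, not the original; the `lru_cache` memoization is just a nicety):

```python
from functools import lru_cache
from math import comb, factorial

@lru_cache(maxsize=None)
def a(n):
    # Ways to put all n objects in the wrong position: start from n! and subtract
    # the orderings with exactly 1, ..., n-1 objects correct, plus the single
    # all-correct ordering (the trailing "- 1").
    if n == 1:
        return 0
    return factorial(n) - sum(comb(n, k) * a(n - k) for k in range(1, n)) - 1
```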

Computing \(a_n\) with this recursion yields the sequence \[0, 1, 2, 9, 44, 265, 1854, 14833, 133496, 1334961, \ldots\] Note that the base case here is \(a_1=0\) since there’s only one place to put the one object. At the time, I also verified that this lines up with the brute force counting solution where you just enumerate all possible permutations and count up the ones where no elements are in the right position. Armed with the \(a_n\) values, I computed the expected values and got the following sequence: \[1, 1, 1, 1, 1, 1, \ldots.\]
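Gluing the two sketches together reproduces that table of expected values; this is just a usage example of `a` and `expected_correct` from above rather than standalone code:

```python
# Reuses a(n) and expected_correct(n, a_counts) from the sketches above.
# The arithmetic stays in exact integers until the final division, so every
# line prints 1.0 on the nose.
for n in range(1, 16):
    a_counts = [0] + [a(m) for m in range(1, n + 1)]  # index 0 is a dummy, never read
    print(n, expected_correct(n, a_counts))
```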

Ok well, technically I got a sequence of around 15 1’s, then the sequence started exhibiting tiny deviations (on the order of 1e-16 or less), but it was clear that this was just the imprecision that comes with floating point math and not an actual difference. This result was quite surprising, so I tried estimating the expected value experimentally by randomly sampling permutations and computing an empirical average. These values were (as expected) a bit noisier but still within 1e-4 of 1.
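A sampling check along those lines might look like the following sketch (not the original experiment, and the trial count is arbitrary):

```python
import random

def empirical_expected_correct(n, trials=100_000):
    # Shuffle 0, ..., n-1 repeatedly and average how many entries land on
    # their own index.
    identity = list(range(n))
    total = 0
    for _ in range(trials):
        perm = identity[:]
        random.shuffle(perm)
        total += sum(perm[i] == i for i in range(n))
    return total / trials

print(empirical_expected_correct(52))  # hovers near 1; the noise shrinks like 1/sqrt(trials)
```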

The next result was even more surprising. I was curious how the values of \(a_n\) changed over time, so I computed a new value: the fraction of permutations that are completely wrong. Let’s call this \[\alpha_n = \frac{a_n}{n!}.\] This produced the sequence \[0,\ 0.5,\ 0.33\overline{3},\ 0.375,\ 0.36\overline{6},\ 0.3680\overline{5},\ 0.367857\ldots,\ 0.3678819\overline{4},\ 0.3678792\ldots,\ 0.3678795\ldots\] As you can see, this sequence seems to converge — and fairly quickly — to a seemingly random decimal: around 0.367879. It turns out that the value of the limit is actually \[\frac{1}{e}\approx 0.367879441171.\] Obviously this blew the mind of 18-year-old Christy, but I couldn’t come up with an explanation for this bizarre phenomenon. I was doing competition math at the time and felt pretty handy with combinatorics, but combinatorics alone was not quite enough to understand this.

The Probability Perspective

In my sophomore year of college, I took Stat 110: Intro to Probability taught by the incredible Professor Joe Blitzstein. This class taught me a whole lot, and it’s the class where I got to truly appreciate the potential of math to be so elegant, beautiful, and simple. Here’s how a Stat 110 student would approach the problem: let \(S\) be a random variable denoting the number of items that are in the correct position. Then we can say that \(S = \sum_{i=1}^{n}{X_i}\) where \(X_i\) is an indicator variable representing whether the \(i\)th object is in the correct position. Note that an indicator variable is a binary variable that evaluates to 1 when a condition is true and 0 when it is false. An important property of expectations is linearity, which implies that \(E[X+Y] = E[X] + E[Y]\), i.e. the expected value of the sum of random variables is equal to the sum of the expected values of each variable. Leveraging this property, we get \[E[S] = E\bigg[\sum_{i=1}^{n}{X_i}\bigg] = \sum_{i=1}^{n}{E[X_i]}.\] However, there’s nothing special about any given \(X_i\) — they all behave the same way regardless of the value of \(i\). This property is called symmetry, and by symmetry, \(E[S] = n{E[X_i]}\). Lastly, we can just compute \(E[X_i]\). There are \(n\) possible positions, and only one is right, so there’s a \(\frac{1}{n}\) chance that the \(i\)th object is in the right position and a \(\frac{n - 1}{n}\) chance that it’s not. Thus \[E[X_i] = 1\cdot P(X_i = 1) + 0\cdot P(X_i = 0) = 1\cdot \frac{1}{n} + 0\cdot \frac{n-1}{n} = \frac{1}{n},\] which of course implies that \[E[S] = n{E[X_i]} = n\cdot \frac{1}{n} = 1.\] Ok, so we proved the first surprising result. Now let’s move on to the second with real analysis.

Wish I Had Some Real Analysis

Recall that \(\alpha_n\) is the fraction of permutations that are completely wrong, so we can substitute \[a_n = \alpha_n{n!}\] into our recurrence: \[a_n = n! - \sum_{k=1}^{n-1}{n\choose k}a_{n-k} - 1 \implies \alpha_n{n!} = n! - \sum_{k=1}^{n-1}{n\choose k}\alpha_{n-k}{(n-k)!} - 1.\] Expanding the choose, canceling out like terms, and dividing through by \(n!\) yields \[\alpha_n{n!} = n! - \sum_{k=1}^{n-1}\frac{n!}{k!(n-k)!}\alpha_{n-k}{(n-k)!} - 1 \implies \alpha_n = 1 - \sum_{k=1}^{n-1}\frac{\alpha_{n-k}}{k!} - \frac{1}{n!}.\]

Ok, now I’m going to do something that I was avoiding at all costs: we’re gonna define \(a_0 = \alpha_0 = 1\). I stand by my statement that \(a_0\) is not well defined: instead, we’re just going to treat this as an arbitrary sequence with an initial term and a recurrence formula. Since this new sequence clearly agrees with the original definition of \(\alpha_n\) for \(n \ge 1\), it still shares all of the same properties. I’m doing this purely to make the notation simpler.

Ok, so now our glorious sequence is: \[\alpha_0 = 1,\quad \alpha_n = 1 - \sum_{k=1}^{n}{\frac{\alpha_{n-k}}{k!}}.\]
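Here is that recurrence as a quick sketch; the name `alpha_sequence` and the exposed `alpha0` parameter are my own additions (the starting value will matter again in a moment):

```python
from math import e, factorial

def alpha_sequence(n_max, alpha0=1.0):
    # alpha_0 = alpha0, and alpha_n = 1 - sum_{k=1}^{n} alpha_{n-k} / k! for n >= 1.
    alphas = [alpha0]
    for n in range(1, n_max + 1):
        alphas.append(1 - sum(alphas[n - k] / factorial(k) for k in range(1, n + 1)))
    return alphas

print(alpha_sequence(10))                   # 1.0, 0.0, 0.5, 0.333..., 0.375, ... settling near 0.367879
print(abs(alpha_sequence(20)[-1] - 1 / e))  # essentially zero (floating-point noise)
```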

First, note that if we assume that the sequence \(\alpha_n\) converges to some value \(\alpha\) then we could rewrite this recurrence as (roughly): \[\alpha = 1 - \alpha\sum_{k=1}^{\infty}{\frac{1}{k!}} \implies \alpha \Bigg(1 + \sum_{k=1}^{\infty}{\frac{1}{k!}}\Bigg) = 1 \implies \alpha \Bigg(\sum_{k=0}^{\infty}{\frac{1}{k!}}\Bigg) = 1\implies \alpha \cdot e = 1.\]

Note that the starting index shifts from \(k=1\) to \(k=0\) in the second step. And regarding the final step, Leonhard Euler himself showed that the infinite series of inverse factorials sums to \(e\).
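A quick numeric nod to that fact:

```python
from math import e, factorial

# Partial sums of 1/k! close in on e very quickly.
print(sum(1 / factorial(k) for k in range(20)), e)
```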

We can also rewrite the recurrence as \[\alpha_n = 1 - \sum_{k=1}^{n}{\frac{\alpha_{n-k}}{k!}} = 1 - \sum_{k=1}^{n}{\frac{(\alpha_{n-k} - \alpha) + \alpha}{k!}} = 1 - \alpha\sum_{k=1}^{n}{\frac{1}{k!}} - \sum_{k=1}^{n}{\frac{(\alpha_{n-k} - \alpha)}{k!}}.\]

As we saw above, the first two terms will converge to \(\alpha = \frac{1}{e}\), and the third term is a sort of “factorially” weighted error-correction term (that converges to 0). To see this, note that an early error like \(\alpha_0 - \alpha\) gets weighted by \(\frac{1}{n!}\), while a recent error like \(\alpha_{n-1} - \alpha\) gets weighted by 1. This is reminiscent of exponential moving averages and lag polynomials. In fact, (I’m nearly certain that) this sequence will converge to \(\alpha = \frac{1}{e}\) for any initial condition, and I tested this out by computing the sequence for different values of \(\alpha_0\) — including values as large as 5e10 and even negative ones.
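Reusing the `alpha_sequence` sketch from above, that experiment looks roughly like this (the particular starting values are just examples):

```python
# Reuses alpha_sequence from the sketch above.
for alpha0 in (1.0, -3.0, 5e10):
    print(alpha0, alpha_sequence(80, alpha0=alpha0)[-1])  # each ends up within float noise of 1/e
```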

This is where I get stuck. As far as I could find online, there are only three ways to prove that this sequence converges. The first is the traditional epsilon-N proof, i.e. showing that terms of the sequence get arbitrarily close to the limit. The second is proving that the sequence is a Cauchy sequence, i.e. that terms of the sequence get arbitrarily close to each other. The third is the monotone convergence theorem, i.e. showing that the sequence always moves in the same direction (up or down) and always stays within some bounds. The third does not apply because the sequence is not monotonic: \(\alpha_{n} - \alpha_{n-1}\) seems to oscillate between positive and negative values — which makes sense for an error-correcting formula. The alternative application of the theorem is to find two monotonic subsequences, but I’m not sure this is possible/provable. The epsilon-N proof is particularly hard because we don’t have an explicit formula for \(\alpha_n\). My guess is that both this approach and the Cauchy sequence approach would require some sort of inductive argument, i.e. you’d rely on knowing/showing that the property holds for larger \(\epsilon\) to show that it also holds for smaller \(\epsilon\). Alas, my real analysis knowledge is either too rusty or too patchy to actually prove this convergence. In some weird way, this does make me feel a little vindicated for thinking that learning real analysis was kinda useless.

Possible Intuition

There’s only one explanation I can think of for why this probability converges to \(\frac{1}{e}\) of all values. We know that the probability of putting any one object in the wrong position is \(1 - \frac{1}{n}\). If these objects didn’t interfere with each other, i.e. if their placements were independent events, then the probability of every object being in the wrong position would just be the product of the individual probabilities, i.e. \[\bigg(1 - \frac{1}{n}\bigg) ^ n.\] This might effectively be the case in the limit (as \(n\to \infty\)). If you think about placing the objects one at a time, then when \(n\) is large, earlier placements have little effect on later placements since the number of available (but wrong) slots is still close to \(n - 1\).

Ok, so supposing that this expression is useful in the limit, we can note that: \[1 - \frac{1}{n} = \frac{n - 1}{n} = \bigg(\frac{n}{n - 1}\bigg)^{-1} = \bigg(1 + \frac{1}{n - 1}\bigg) ^ {-1} = \frac{1}{1 + \frac{1}{n - 1}}.\] Thus \[\bigg(1 - \frac{1}{n}\bigg) ^ n = \bigg(1 - \frac{1}{n}\bigg)\bigg(1 - \frac{1}{n}\bigg) ^ {n-1} = \frac{1 - \frac{1}{n}}{\big(1 + \frac{1}{n - 1}\big) ^ {n-1}}.\]

Both the numerator and the denominator converge (to non-zero values) as \(n\to \infty\), so the limit of the quotient is the quotient of the limits: \[\lim_{n\to\infty} \bigg(1 - \frac{1}{n}\bigg) ^ n = \lim_{n\to\infty} \frac{1 - \frac{1}{n}}{\big(1 + \frac{1}{n - 1}\big) ^ {n-1}} = \frac{\lim_{n\to\infty} \big(1 - \frac{1}{n}\big)}{\lim_{n\to\infty} \big(1 + \frac{1}{n - 1}\big) ^ {n-1}} = \frac{1}{e}.\]
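A few plugged-in values, just to watch it happen:

```python
from math import e

# (1 - 1/n)^n creeps up toward 1/e ≈ 0.367879... as n grows.
for n in (10, 100, 10_000, 1_000_000):
    print(n, (1 - 1 / n) ** n, 1 / e)
```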

Alright well, this was fun. Even though I couldn’t find a convergence proof, it was nice to go back to this problem and spend some time with it (I spent a few days thinking about convergence, and I’m cutting my losses here). Thanks for reading.