## Voting and the Harmonic series

As noted in a previous post my choice of reading material at the moment is a little restricted. As a result at the weekend I finished reading the cryptically titled Gamma – a book about the fourth most famous constant in mathematics.

Overall the book is relatively dry, but there was one bit that caught my eye – a strategy for deciding how to vote (in many ways a topical subject), the analysis of which uses the harmonic series. Now, I’m certainly not the first person to write a blog post applying a simple mathematical model to the British electoral system, but this still seems worth bring to the attention of anyone who cares to read this. We define $H_n:=\sum_{r=1}^{n}\frac{1}{r}$ (the ‘harmonic series’) It is a common undergraduate exercise to show that $H_n\rightarrow\infty$ as $n\rightarrow\infty$, the easiest way of seeing this probably being this demonstration due to Oresme. Don’t even think about trying to see this by adding up a few terms – it would take much longer than the age of the universe for this sum to exceed 100. The reason for the painfully slow divergence begin that the sum grows like the natural logarithm, indeed the constant gamma is defined by the somewhat surprising formula $\gamma:=lim_{n\rightarrow\infty}(H_n-ln(n)).$ These concepts play an important role in many areas of mathematics, are closely related to several open problems and have many extremely interesting properties.

Anyway – how to vote! When confronted with n possible choices, how do you choose the best one? You could read every manifesto cover to cover, weigh up all the pros and cons of every possibility and make a perfectly informed decision – a process far too time consuming for anyone to actually go through (the average constituency in the last UK general election had more than 6 candidates in it, besides would you expect an employer to interview EVERY applicant for a job?) Alternatively you could simply choose randomly – an approach that is very unlikely to arrive at the best choice.

An obvious strategy most voters (at least I think) adopt is to simply ignore some of the candidates and focus on just a few of them – most people will pay no attention to, say, any candidates for the BNP, the Communist Party of Britain or the Monster Raving Looney Party, for example. But how many should we ignore? Ignore too few and we’re still confronted with a dauntingly large array of choices, too many and we’re likely to overlook the best person for the job.

Enter the harmonic series!

[On rereading, my paraphrasing isn’t very clear and certainly doesn’t do the mathematics any justice, but here goes.]

As with any model, several clearly false simiplifying assumptions have to be made and so there are numerous holes you can find in the below discussion, though for an innocent bit of fun there’s no real harm done by approximating reality in this way.

Suppose B is the best candidate and we ignore the first r candidates. If B happens to be in the (r+1)th position we are guaranteed to choose them. The probability of this is 1/n. If B happens to be in the (r+2)th position and the candidate in the (r+1)th position is the best one we have considered so far then we will not successfully choose B. We will thus vote for B if the best yet among the first r+1 choices happens to be among the first r of them – an event that occurs with probability r/(r+1). The total probability of successfully choosing B in this case is then r/n(r+1) (remember the probability of B occupying any given position is 1/n). Similarly the corresponding probabilities for the (r+3)rd position etc are

$\frac{r}{n(r+2)},\mbox{ }\frac{r}{n(r+3)},\ldots,\frac{r}{n(n-1)}.$

The total probability that we will successfully choose B is thus

$P(n,r):=\frac{1}{n}(1+\frac{r}{r+1}+\frac{r}{r+2}+\cdots+\frac{r}{n-1}).$

Which value of r maximizes this? Notice we can rewrite this as

$P(n,r)=\frac{1}{n}(1+r(H_{n-1}-H_r)).$

Given the definition of $H_n$, we can approximate $H_n$ with $ln(n)+\gamma$, enabling us (after half a line of school-level algebra) to rewrite P(n,r) as

$P(n,r)\approx\frac{1}{n}(1+rln(\frac{n-1}{r}))$

and differentiating this gives

$P'(n,r)\approx \frac{1}{n}(ln(\frac{n-1}{r})-1)$

leading us to the conclusion the optimal r (that is, the value of r for which this last formula is 0) is achieved when (n-1)/r=e (Napier’s constant). Certainly if n is large we may as well take r to be n/e. In plain english, if ignore roughly the top third of the ballot paper, we’re still very likely to vote for the best candidate (I wonder which parties tend to be near the top of the ballot paper).

Well, I found that mildly amusing, even if nobody else did.