Posting this question here since y'all are math geniuses:
Suppose I have a set of 100 elements, k of which are black and the rest white. I randomly pick an element from the set. If it is black, I stop; otherwise, I put the element back into the set and repeat.
Question: what's the expected number of elements I will pick from the set until I find a black one?
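In case it's clearer in code, here's the process I mean (a Python sketch just to pin down the setup; the function name and the use of the standard random module are mine, purely for illustration):

    import random

    def picks_until_black(k, size=100):
        # Pick uniformly with replacement until a black element comes up;
        # return how many picks that took, counting the successful one.
        picks = 0
        while True:
            picks += 1
            if random.randrange(size) < k:  # treat the first k of the 100 elements as black
                return picks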
Here's my calculation, which is giving me some odd results, so I'm not sure if I did something wrong:
- Since k out of 100 elements are black, the probability that my first pick is black is p=k/100.
- That means the probability that my first pick is unsuccessful is (1-p) = 1 - k/100 = (100 - k)/100.
- If my first pick fails, my second pick again has a probability of success of p = k/100, since I put the element back.
- So the chance that I will need exactly two picks to get a black element is (1-p)*p.
- Similarly, the chance that I will need exactly 3 picks to get a black element is (1-p)*(1-p)*p.
- And so on, for my subsequent picks.
- So, according to the definition of expected value, the average number of elements I have to pick until I find a black one should be:
E[n] = 1*p + 2*(1-p)*p + 3*(1-p)*(1-p)*p + ... + n*(1-p)^(n-1)*p,
where p = k/100.
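In code, the sum I'm computing looks roughly like this (a Python sketch, not my actual program; note that I cut the series off at n = 100 terms, since the set only has 100 elements):

    def expected_picks(k, size=100, max_terms=100):
        p = k / size                           # probability that a single pick is black
        total = 0.0
        for n in range(1, max_terms + 1):      # the sum stops after max_terms terms
            total += n * (1 - p) ** (n - 1) * p
        return total

    for k in (1, 2, 3, 4, 5):
        print(k, round(expected_picks(k), 1))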
Now here's the strange thing: I wrote a program to calculate the expected number of steps according to the above formula for various values of k, and here's what I got:
k=1: E[n] = 26.8
k=2: E[n] = 30.1
k=3: E[n] = 27.0
k=4: E[n] = 22.9
k=5: E[n] = 19.3
...
Why does the average number of steps required to find a black element increase between k=1 and k=2??? That makes no sense at all to me. When k=1, I have only a 1% chance of picking a black element on any given try, but when k=2, I have a 2% chance. Surely that means I should be able to find one in fewer steps, on average?? Why are the values for k=2 and k=3 greater than the value for k=1? What did I do wrong?