Confused about probability and expected values

Other scientific, philosophical, mathematical etc. topics go here.

Confused about probability and expected values

Postby quickfur » Sat Feb 02, 2013 12:39 am

Posting this question here since y'all are math geniuses:

Suppose I have a set of 100 elements, out of which k of them are black, and the rest are white. Suppose I randomly pick an element from the set. If it is black, then I stop, otherwise, I put the element back into the set and repeat.

Question: what's the expected number of elements I will pick from the set before I successfully find a black element?

Here's my calculation, which is giving me some odd results, so I'm not sure if I did something wrong:
- Since k out of 100 elements are black, the probability that my first pick is black is p=k/100.
- Which means the probability that my first pick is not successful is (1-p) = 1 - k/100 = (100 - k)/100.
- My second pick, again, has a probability of success of p=k/100.
- So the chances that I will need two picks to get a black element is (1-p)*p.
- Similarly, the chances that I will need 3 picks to get a black element is (1-p)*(1-p)*p.
- And so on, for my subsequent picks.
- So, according to the definition of expected value, the average number of elements I have to pick before I find a black one should be:
E[n] = 1*p + 2*(1-p)*p + 3*(1-p)*(1-p)*p + ... + n*(1-p)(n-1)*p.

where p = k/100.

Now here's the strange thing: I wrote a program to calculate the expected number of steps according to the above formula for various values of k, and here's what I got:
k=1: E[n] = 26.8
k=2: E[n] = 30.1
k=3: E[n] = 27.0
k=4: E[n] = 22.9
k=5: E[n] = 19.3
...

Why does the average number of steps required to find a black element increase between k=1 and k=2??? That makes no sense at all to me. When k=1, I have only a 1% chance of finding a black element, but when k=2, I have 2% chance of finding a black element. Surely that means I should be able to find one in a fewer number of steps?? Why are the values for k=2 and k=3 greater than the value for k=1? What did I do wrong?
quickfur
Pentonian
 
Posts: 2935
Joined: Thu Sep 02, 2004 11:20 pm
Location: The Great White North

Re: Confused about probability and expected values

Postby Klitzing » Sat Feb 02, 2013 6:32 pm

Hy Quickfur,

it is not that your formula is wrong.
It is that you obviously just calculated E[100].

But you wrote that you will put your selection back into the bowl. Thus you could select for ever on.
Accordingly you will have to calculate E[infinity] instead!
Doing so, you'll see that your result just is

E[infinity] = 100/k for k black elements within a total of 100 elements.

--- rk
Klitzing
Pentonian
 
Posts: 1637
Joined: Sun Aug 19, 2012 11:16 am
Location: Heidenheim, Germany

Re: Confused about probability and expected values

Postby quickfur » Sun Feb 03, 2013 12:53 am

Klitzing wrote:Hy Quickfur,

it is not that your formula is wrong.
It is that you obviously just calculated E[100].

But you wrote that you will put your selection back into the bowl. Thus you could select for ever on.
Accordingly you will have to calculate E[infinity] instead!
Doing so, you'll see that your result just is

E[infinity] = 100/k for k black elements within a total of 100 elements.

--- rk

Ahhh I see my mistake now. Thanks!!

Next question: how do I go about calculating the expected deviation (i'm not sure if that's the right term -- basically I want to know how "wide" the actual value of number of steps will deviate from E[infinity])? As you can tell, my grasp of probability is rather weak. :\
quickfur
Pentonian
 
Posts: 2935
Joined: Thu Sep 02, 2004 11:20 pm
Location: The Great White North

Re: Confused about probability and expected values

Postby Klitzing » Mon Feb 04, 2013 10:27 am

Your probability (to get a black element from a set of a total of 100 elements, which contains k black ones) is p = k/100.
But it would be useful to consider the probability of the complement (to get a non-black element) additionally, amounting to q = 1-p.

First consider the random variable X of your case, where you ask for the count of selections (with putting the selection back) until you finally get a black one.
It will amount here in a discrete function, which associates to all values m (m selections) the corresponding probability
p_m = (1-p)^(m-1)*p = q^(m-1)*(1-q)

Then the expected value would be (in its more usual notation, cf. http://en.wikipedia.org/wiki/Expected_value):
E(X) = 1*p_1 + 2*p_2 + 3*p_3 + 4*p_4 + ...
= 1*p + 2*(1-p)*p + 3*(1-p)^2*p + 4*(1-p)^3*p + ...
= 1*(1-q) + 2*q*(1-q) + 3*q^2*(1-q) + 4*q^3*(1-q) + ...
= (1-q)*sum(m=1, infinity, m*q^(m-1))
= ... = 1/p
= 100/k

So far we were already. (I just put the terminology a bit more consistent to literature.)

Now the squared deviation (cf. http://en.wikipedia.org/wiki/Squared_deviations) or variance (cf. http://en.wikipedia.org/wiki/Variance) generally is defined as:
sigma^2 = Var(X) = E(X^2) - (E(X))^2

E(X^2) then here amounts to
= 1*p_1 + 4*p_2 + 9*p_3 + 16*p_4 + ...
= (1-q)*sum(m=1, infinity, m^2*q^(m-1))
= ... = 2*(1/p)^2 - 1/p

Thus the desired variance will result here in:
Var(X) = (2*(1/p)^2 - 1/p) - (1/p)^2
= (1/p)^2 - 1/p
= (1/p)^2*(1-p)
= q/(p^2)
= (1 - k/100)/(k/100)^2
= (100 - k)*100/(k^2)

--- rk
Klitzing
Pentonian
 
Posts: 1637
Joined: Sun Aug 19, 2012 11:16 am
Location: Heidenheim, Germany

Re: Confused about probability and expected values

Postby quickfur » Fri Feb 08, 2013 9:29 pm

Klitzing wrote:[...]
E(X^2) then here amounts to
= 1*p_1 + 4*p_2 + 9*p_3 + 16*p_4 + ...
= (1-q)*sum(m=1, infinity, m^2*q^(m-1))
= ... = 2*(1/p)^2 - 1/p
[...]

How did you derive the last line from the infinite sum? I'm having trouble solving it by hand.
quickfur
Pentonian
 
Posts: 2935
Joined: Thu Sep 02, 2004 11:20 pm
Location: The Great White North

Re: Confused about probability and expected values

Postby Klitzing » Fri Feb 08, 2013 11:07 pm

Easiest way would be to isolate the sum, moving the remainder on one side. I.e.
    sum = (2/p^2 - 1/p)/(1-q)
Then using q only (as the sum uses q only), i.e. replacing p = 1-q here, getting
    sum = (1+q)/(1-q)^3
Now consider the right side as a function in q and derive its Taylor sum of that function around the value q=0.
That should work: f(0) = 1, f'(0) = 4, ...

(But to be honest, I did not solve that sum by any algebraic transformations at all, neither did I for the one of E(X). I just did as you did: I used an excel spreadsheet, put in the formulas of the summands (using an absolute reference to a separate cell for q) respectively of the sequence of the finite sums, coppied that down till about m=5000 and recognized the result as being a quite close approximation to what I wrote, after changing the value of q some few times. ... - Experimental math, you could say :lol: )

--- rk
Klitzing
Pentonian
 
Posts: 1637
Joined: Sun Aug 19, 2012 11:16 am
Location: Heidenheim, Germany


Return to General

Who is online

Users browsing this forum: No registered users and 3 guests