Randomness may not be so random

  • 19596

    When repeatedly requesting a random wallpaper from a larger search (~6000 total), I often see the same result within a short-to-medium period of time (a few dozen requests in 5-10 minutes). I can accept that this might be mere chance, but I think the results may not be very random.

    With 6000 possible results, I would NOT expect to see the same ones as often as I'm seeing them. Is the distribution even? Is the random number generator being seeded correctly? Has this random option ever been tested numerically?

    I've also tried doing a website search for random images (say "blonde") and see the same lack of variety. Sometimes the same images will appear in the first 12 returned across just a few searches. I know this is subjective and lacks statistical proof, but I'm not sure how to test it easily. Apologies if this is somehow observer error.
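One possible way to test it: write down the image IDs on each page of results, then count how often consecutive pages share an image and compare that with what a fair draw would give. The sketch below is only an illustration. The function names, the sample IDs, and the assumption that each page is 12 distinct images drawn uniformly from the pool are all mine, not anything confirmed about the site.

```python
from math import comb

def count_overlaps(pages):
    """Return (hits, pairs): how many consecutive page pairs share an ID."""
    pairs = list(zip(pages, pages[1:]))
    hits = sum(1 for a, b in pairs if set(a) & set(b))
    return hits, len(pairs)

def binomial_tail(hits, trials, p):
    """P(X >= hits) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(hits, trials + 1)
    )

if __name__ == "__main__":
    # Hypothetical log: each inner list is one page of image IDs.
    pages = [[101, 102, 103], [103, 104, 105], [106, 107, 108], [108, 109, 101]]
    pool, page = 5000, 12  # assumed pool size and page size

    # Chance that two fair pages share at least one image (assumption: uniform,
    # distinct images per page).
    p_overlap = 1 - comb(pool - page, page) / comb(pool, page)

    hits, pairs = count_overlaps(pages)
    print(f"{hits} overlapping pairs out of {pairs}")
    print(f"chance of at least that many if fair: {binomial_tail(hits, pairs, p_overlap):.6f}")
```

A very small tail probability over a few dozen recorded pages would be decent evidence that the draw is not uniform; a value that is not small would suggest the repeats could be chance.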

    Btw: I noticed the same thing with the alpha site, but never reported it, because I couldn't be sure. Now I'm seeing it more, with larger result sets -- perhaps because I'm looking more carefully.

    
    Some actual statistics calculations --
    
    If you take 12 samples from a pool of 5000, what are the odds that you will NOT see any of them again when you take a second set of 12 samples?
    
    ( (n - r) / n ) ^ 12, with n = pool size and r = 12 images already seen
    
    ( ( 5000 - 12 ) / 5000 ) ^ 12 = 0.9976 ^ 12 = 0.9716 = ~97% chance of not seeing a duplicate = ~3% chance of a dup = about 1 in 35 tries.

    I'm seeing a single duplicate much more often. Something like 1 in 6. (Still not very scientific, but you get the point.)
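As a cross-check on that arithmetic, here is a small Python sketch that computes the no-duplicate chance two ways: with the power approximation used above, and exactly, assuming each page is 12 distinct images drawn uniformly from the pool (that is an assumption about the site, not a known fact). The function names are made up for illustration.

```python
from math import comb

def p_no_overlap_approx(pool: int, page: int) -> float:
    """Approximation from the post: ((pool - page) / pool) ** page."""
    return ((pool - page) / pool) ** page

def p_no_overlap_exact(pool: int, page: int) -> float:
    """Exact chance that a second page of distinct images avoids every image
    on the first page, assuming uniform draws without replacement."""
    return comb(pool - page, page) / comb(pool, page)

if __name__ == "__main__":
    pool, page = 5000, 12
    for name, p in [
        ("approx", p_no_overlap_approx(pool, page)),
        ("exact", p_no_overlap_exact(pool, page)),
    ]:
        print(f"{name}: no dup {p:.4f}, dup {1 - p:.4f}, about 1 in {1 / (1 - p):.0f}")
```

The two methods agree to about three decimal places for a pool this large, so the approximation is fine here; both put a duplicate at roughly 1 in 35 pages, nowhere near 1 in 6.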

    Added 35 minutes after

    OK.... ad hoc test number two. Pull images 12 at a time. Pick one image from the first group and save it in a window. Repeat the search until that one image comes up again. Since I picked "blonde" and got 5168 total images -- it should be a while before I see that same one again.

    I saw the same image again in 20 tries and a second time in 2 more tries and a third duplicate after 7 more tries. This is way more often than mathematically expected.
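For comparison, this is roughly what the same experiment looks like against a generator that is known to be fair. It is a minimal simulation sketch, assuming each request returns 12 distinct images drawn uniformly from 5168; the function names, trial count, and seed are arbitrary choices for illustration.

```python
import random
from statistics import mean, median

def tries_until_seen(pool, page, rng):
    """Count requests until one fixed target image shows up on a page."""
    target = 0  # any fixed ID is equivalent under a uniform draw
    tries = 0
    while True:
        tries += 1
        if target in rng.sample(range(pool), page):
            return tries

def simulate(pool=5168, page=12, trials=300, seed=1):
    """Repeat the wait-for-the-target experiment `trials` times."""
    rng = random.Random(seed)
    return [tries_until_seen(pool, page, rng) for _ in range(trials)]

if __name__ == "__main__":
    waits = simulate()
    short = sum(1 for w in waits if w <= 20)
    print(f"median wait: {median(waits):.0f} tries")
    print(f"mean wait:   {mean(waits):.0f} tries")
    print(f"waits of 20 tries or fewer: {short} of {len(waits)}")
```

With a fair draw, waits as short as 20 tries should be uncommon, and three short waits in a row (20, 2, and 7) should be rarer still.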

    Some more actual calculations --
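    If you pick 1 image from a pool of 5168, how many tries (at 12 images per try) before there is a 50% chance of having seen it again?
    ( ( n - r ) / n ) ^ ( 12 * N ) = 0.5, with n = pool size, r = 1, N = number of tries
    
    ( ( 5168 - 1 ) / 5168 ) ^ ( 12 * N ) = 0.5
    ( 5167 / 5168 ) ^ ( 12 * N ) = 0.5
    ( 0.9998065 ^ 12 ) ^ N = 0.5
    0.99768 ^ N = 0.5
    Log(0.99768) * N = Log(0.5)
    N = Log(0.5) / Log(0.99768) = 298.42 tries (@ 12 images per go)
    
    Strictly, that is the median wait (the 50% point), not the average. The average is about 1 / ( 1 - 0.99768 ) = 431 tries.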
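The same calculation as a short Python sketch, which also gives the chance of a wait as short as the ones observed (20, 2, and 7 tries). It uses the per-try formula from the post and assumes a fair, uniform draw; the function names are hypothetical.

```python
from math import log

def per_try_hit(pool: int, page: int) -> float:
    """Chance one request shows the target, using the post's formula."""
    return 1 - ((pool - 1) / pool) ** page

def median_tries(p: float) -> float:
    """Number of tries at which the chance of having seen the target is 50%."""
    return log(0.5) / log(1 - p)

def mean_tries(p: float) -> float:
    """Average number of tries (mean of a geometric distribution)."""
    return 1 / p

def prob_within(p: float, k: int) -> float:
    """Chance the target reappears within k tries."""
    return 1 - (1 - p) ** k

if __name__ == "__main__":
    p = per_try_hit(5168, 12)
    print(f"median: {median_tries(p):.1f} tries, mean: {mean_tries(p):.1f} tries")
    for gap in (20, 2, 7):  # the gaps seen in the ad hoc test
        print(f"chance of a wait of {gap} tries or fewer: {prob_within(p, gap):.4f}")
```

Each of the observed gaps is individually unlikely under a fair draw, and since successive waits are independent, the chance of all three together is the product of the three printed values.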