Minesweeper Identical Boards Problem, Part II
In the first part of this article I examined Windows Minesweeper code to come up with two hypotheses. The first hypothesis states that collisions (identical boards) are occurring as some people are starting Minesweeper right after booting their computers. I am going to call this hypothesis the boot hypothesis. The second hypothesis states that the collisions are sheer coincidence caused by the small number of initializations (65536) of the pseudo random number generator (PRNG.) I am going to call this hypothesis the coincidence hypothesis.
The populations for the two hypotheses are fairly different but not totally distinct. People who are leaving their computers running all the time can only be in the population for coincidence hypothesis. On the other hand people who are booting often can occur in the population of either hypothesis.
The key to evaluating these hypotheses is the observation that Damien Moore has seen the board with multiple collisions (board1) twice. I am going to call such an event a self-collision.
The probabilities for getting board1 on a particular session of Minesweeper are 1/363 and 1/65536 for the two hypotheses respectively. The player needs to start at least two Minesweeper sessions to get a self-collision. The probabilities of getting a self-collision in two particular sessions of Minesweeper are (1/363)^2 and (1/65536)^2. By 'particular' I mean that the player is not choosing sessions after she sees the results. An example of two particular sessions would be the next two sessions played by a Minesweeper player.
It is unlikely that a player will get a self-collision if she plays only two Minesweeper sessions. What if she plays two Minesweeper sessions every day for a whole year? What are her chances of getting a collision now?
We are looking for two or more favorable outcomes (board1 occurrences) in a total of n outcomes (Minesweeper sessions.) The problem is fairly similar to tossing coins, we are looking for two or more heads in n tosses of a coin. Keep in mind that the probability of heads is not 50 % for this particular problem.
The simplest way to answer this question is to compute the probabilities for board1 appearing 0 or 1 times in n outcomes. The Binomial probability distribution can be used to do this calculation. Adding the two computed numbers will give the probability for the event that the self-collision does not occur. Subtracting that number from 1 will give the probability of a self-collision in n sessions. The following table lists some results:
(Tosses) Boot Hypothesis Coincidence Hypothesis Sessions* Prob+ Prob+ 2 0.0000076 0.00000000023 100 0.031 0.0000015 500 0.401 0.000028 1000 0.762 0.00011
* Number of sessions started under the conditions of the appropriate
+ Probability for self-collision
According to these numbers if a player plays 1000 Minesweeper sessions their likelihood of witnessing a self-collision is excellent according to boot hypothesis and very poor according to coincidence hypothesis. The only problem is that these results are for a particular player, for instance me or Damien Moore. What if a 100 players played according to the conditions of these hypotheses? This is a reasonable question considering the potentially large pool of players who could have reported to Damien.
This problem again is very similar to tossing coins. This time we are looking for 1 or more heads (self-collisions) in n (number of players) tosses. The probabilities for heads are the numbers from the previous table. I am going to assume that players who play Minesweeper often are the only ones likely to report a self-collision. Therefore, I am only going to use the results from the scenario where a player has played 1000 Minesweeper sessions. This assumption is fairly reasonable. Slightly altering the number of sessions will not change the results drastically.
We can use the Binomial distribution again to arrive at the probabilities. The table below summarizes the results:
(Tosses) Boot Hypothesis Coincidence Hypothesis Num Players Prob Prob 10 0.999 0.0011 50 0.999 0.0057 100 0.999 0.0114 500 0.999 0.0559 1000 0.999 0.0108
The numbers above suggest that no matter how small the number of people in the boot hypothesis population, we are likely to see a self-collision. The boot hypothesis explains the observed outcome well, a little too well perhaps.
The coincidence hypothesis doesn't explain things well at all. It requires a large population to have a reasonable chance of observing the data. I do not believe the population of Minesweeper users who are submitting boards to Damien is extremely large. Even if the population did constitute 1000 people it still only has a 10 % chance of a witnessing a self-collision.
The problem with boot hypothesis is that the probability of witnessing a self-collision is very large (0.76.) It explains why Damien got a self-collision but it also predicts that many people should have reported a self-collision to Damien.
The reason for just one report of a self-collision is most likely a small population of players for boot hypothesis. Perhaps just 10 or 15 players constitute the entire population for boot hypothesis. Moreover, if a person gets a good score on board1 and doesn't get a better score in the next occurrence, they are likely to move on and not notice that they have witnessed a self-collision. The probability of board1 occurrence used in the calculations is a rather crude approximation and is also likely to be a factor.
Ultimately all these numbers are based on assumptions and approximations. The computations are mostly to guide intuition. At least to me there is enough in the numbers to suggest a very strong case for the boot hypothesis being the cause of collisions.
In part 3 of this article I will discuss Minesweeper collision avoidance.
by Usman Latif [Oct 05, 2003]