The hypothesis that achieving arc consistency using AC-3 is sufficient to solve the Zebra problem can be quickly disproved using the benchmark Zebra problem presented earlier in this paper. Once arc consistency has been achieved there still remain several variables with more than one possible value in its domain. Similar results can be seen for the Sherlock problem by applying AC-3 to a random case generated by the GenerateRandomProblem() procedure of the preceeding section.
One such sample problem appears in Appendix A
Kumar (in [Kumar92]) reports that arc consistency is not, in general, sufficient to find a solution (or that there is no solution). He then goes on to show that for a contraint graph having n nodes, it can be solved using only arc consistency, if the graph is n-consistent. He also shows that it is sometimes possible to solve a n-node bcsp with less than n-consistency.
Given that arc consistency by itself cannot solve all binary constraint problems it follows that an algorithm such as BM-CBJ2 which can is better at solving the bcsp's. Having said that, comparing the performance of AC-3 to that of BM-CBJ2 for problems that AC-3 can solve is still an interesting exercise. Therefore, using problems generated using GenerateStrongKProblem(), we obtain the following results over 1000 test cases for each of the Zebra and Sherlock problems:
Table 3. AC-3 vs. BM-CBJ2: Zebra
|Fewer Checks (number of cases)||25||975|
|Less Time (number of cases)||277||710|
|Average Number of Checks||23410||3755|
|Minimum Number of Checks||9876||137|
|Maximum Number of Checks||42086||134394|
|Average Time (CPU + System) (seconds)||1.77||1.72|
|Minimum Time (CPU + System) (seconds)||1.57||1.39|
|Maximum Time (CPU + System) (seconds)||1.98||5.24|
Table 4. AC-3 vs. BM-CBJ2: Sherlock
|Fewer Checks (number of cases)||124||876|
|Less Time (number of cases)||417||573|
|Average Number of Checks||89090||139170|
|Minimum Number of Checks||33529||294|
|Maximum Number of Checks||164391||61250399|
|Average Time (CPU + System) (seconds)||2.16||4.67|
|Minimum Time (CPU + System) (seconds)||1.83||1.50|
|Maximum Time (CPU + System) (seconds)||2.56||1145.12|
What is most interesting about the results is the fact that in a small number cases BM-CBJ2 is much worse than AC-3, so much so that for Sherlock most of the performance measures look worse for BM-CBJ2 than AC-3 despite the fact that it was better more than half the time. Exploring the properties of the problems that result in such bad performance is a useful future exercise.
The hypothesis that BM-CBJ2 is more efficient at solving most of a large random sample of of Zebra problems is true, as can be seen below. Further, we can see that the algorithm tested can be ordered, both in terms of constraint checks and in terms of time, as follows: BM-CBJ2 < DoubleX < Xover < Mutate although DoubleX is only slightly better than Xover. The change to Xover to get DoubleX is not a clear winner, but crossover plus mutation is without a doubt better than mutation alone. The numerical results follow.
Table 5. Times one algorithm (row) bettered another (column)
Table 6. Zebra, BM-CBJ2 vs Mutate vs Xover vs DoubleX
|Average Number of Checks||3554||12588855||3897429||3315161|
|Minimum Number of Checks||412||541110||379620||499423|
|Maximum Number of Checks||15093||132081622||323868611||212341720|
|Average Time (CPU + System) (seconds)||2.17||62.65||22.00||19.30|
|Minimum Time (CPU + System) (seconds)||1.82||5.44||4.66||5.14|
|Maximum Time (CPU + System) (seconds)||6.49||686.84||1603.49||986.12|