HiveBrain v1.2.0
Get Started
← Back to all entries
patternjavascriptMinor

My 15 puzzle solver is too slow

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
solverpuzzleslowtoo

Problem

It takes my code over 700 seconds to solve simple puzzle:

2  7 11  5
13  0  9  4
14  1  8  6
10  3 12 15


My function findSolution(puzzle, minH) takes puzzle as an array and minimal h parameter which has to be found.

I'm using A* with Manhattan Distance heuristic function. G is step number and h is Manhattan Distance.

I'm searching for a node with the smallest value of f in queue, where f = g + h, and expanding it (moving tiles to empty space to get another four or less nodes). If any of expanded nodes are already in queue or visited array and its g is smaller than current node, I replace it, or else I put it in queue and I move the current node to visited array.

Every state od puzzle is two 32-bit numbers called as currentHi and currentLo and it has two numbers showing which state was previous: previousHi and previousLo, and also g and h parameters.

These numbers are calculated by hashing function.

Previously I kept states in arrays, but changing to numbers gave me more memory and speed.

I can solve puzzle from example above with minH = 4 in 0.1 s, but I see if I want to get next state with lower h, I have to expand many wrong nodes, because they have lower g + h. That's why it takes so much time.

When I'm improving simple things in my code, it doesn't seem to work faster or I get a few milliseconds extra, but it's not enought. I think the problem is with the algorithm which I'm using:

`Create a node containing the goal state node_goal
Create a node containing the start state node_start
Put node_start on the open list
while the OPEN list is not empty
{
Get the node off the open list with the lowest f and call it node_current
if node_current is the same state as node_goal we have found the solution; break from the while loop
Generate each state node_successor that can come after node_current
for each node_successor of node_current
{
Set the cost of node_successor to be the cos

Solution

Non-optimal solutions

If you just need to compute a relatively fast solution quickly but it doesn't have to be the guaranteed fastest solution, then you can just greedily look for the fastest way to get the fringe row and column (the row and column furthest from the non-tile goal) right, and then solve the remaining 8 puzzle. Related paper. Plain A or IDA should be enough for each subtask.

There seem to be no algorithms that make use of known upper bounds.

Basic algorithm choice

From what I have read, a more space efficient algorithm such as IDA* is needed to make sure you can't get unlucky and run out of memory. It makes the algorithm significantly slower, though. A better heuristic function is needed to make the algorithm faster.

Improved Manhattan distance

A relatively simple tweak is to take the Manhattan distance heuristic function + the linear conflict correction. For the latter you look at every line of the puzzle. If you find two tiles there which are supposed to end up in this line, but which are currently in the wrong order, then you know that the Manhattan distance is too optimistic and you actually need at least 2 more moves to get the two tiles past each other. One can prove that the heuristic function remains admissible (in fact monotone) even if you add 2 for every pair with this problem in any row and also for every pair with the analogous problem in any column.

This paper describes an additional corner-tiles heuristic, but unfortunately it cannot tell us whether the additional moves required are horizontal or vertical. Therefore, for some of the arguments below (the tweak that consists of combining several lower bounds more efficiently by taking separate horizontal and vertical maxima) it can't be included in the improved Manhattan distance, or only with extra care to distinguish various special cases.

Inversion distance

A similarly good heuristic function that can also be computed on the fly is inversion distance. For horizontal inversion distance, we linearise the square (conceptually) by concatenating the rows and looking at the number of inversions, i.e. pairs of tiles that are out of order. (We do not count inversions with the empty place.) Horizontal moves cannot affect horizontal inversions. A vertical move means that from the point of view of our linearised problem, one tile jumps over three other tiles. This changes the number of inversions by ±1 or ±3. Therefore a lower bound for the number of vertical moves in a solution is given by ⌈n/3⌉, where n is the number of inversions in our long row.

Obviously there is a similar lower bound for the number of horizontal moves which we get by considering the vertical inversions. For its computation we have to renumber the tiles in a way that corresponds to a matrix transposition. The total inversion distance is the sum of the two lower bounds we get in this way. Clearly it is admissible.

The author doesn't mention this, but there is actually a second horizontal inversion number, which we get if we renumber the tiles in such a way that it corresponds to putting the rows in the opposite order. This is likely to result in a different lower bound, and we can then take the maximum of the two. Of course we can improve the vertical horizontal inversion number in the same way.

Combined heuristic functions

If you have two (or more) admissible heuristic functions, then their maximum is again admissible. It takes longer to compute, but apart from that it is at least as good as each of the individual functions. In fact, if the functions have different strengths and weaknesses, then their maximum can be significantly better.

This is the case for the combination of Manhattan distance and inversion distance. Presumably the same still holds to some extent for improved Manhattan distance and inversion distance.

One observation that I have not seen before is that since in both Manhattan distance and inversion distance we get separate lower bounds for horizontal and vertical moves, we can also take the maximum before adding. Theoretically the result could be better in up to a quarter of cases. In practice the improvement caused by this tweak will almost certainly be less, but probably still worth it as it comes practically for free.

Pattern database

The basic idea of a pattern database is to relax the original problem in such a way that it becomes feasible to create a complete database of exact solutions, using, for example, dynamic programming. The optimal number of moves for a relaxed problem is a lower bound for the entire problem.

A very general definition of pattern that should cover most practically occurring cases is characterised by the following data:

  • A set of tiles whose horizontal moves you want to count.



  • A set of tiles whose vertical moves you want to count.



  • A partitioning of all positions that says which positions you consider equivalent.



  • A partitioning of all tiles that says which tiles you consider equivalent. No

Context

StackExchange Code Review Q#108582, answer score: 9

Revisions (0)

No revisions yet.