Solving NRK’s game “Former”

NRK published a fun game on their website called “Former” (English: “Shapes”). The gameplay is straightforward: the player clicks on a cell on the board, and every cell of the same color connected to it (through up, down, left and right neighbors) vanishes. The remaining shapes fall down and the player is ready for the next move.

The goal is to make all shapes vanish in as few clicks as possible.

In this article we formulate the game as a graph search problem, show how to reduce the size of the graph and attempt to solve it using various graph search algorithms. The full code is on GitHub at github.com/tommyod/NRK-former-game.

Games as directed graphs

Games like these can be represented as directed graphs: nodes represent game states and directed edges represent moves (clicks). The initial node is the first board we see and the terminal node is the empty board.

  • Any path from the initial node to the terminal node is a solution.
  • The optimal solution is the shortest solution path.

The original game has 9 rows and 7 columns, which leads to a huge graph and therefore a huge search space. In the game instance shown in the animation above:

  • There are 39 possible first moves
  • There are 1,446 possible sequences of two moves
  • There are 51,162 possible sequences of three moves
  • There are 1,730,312 possible sequences of four moves
  • There are 55,950,299 possible sequences of five moves

We count one move as one click on any cell in a connected group of cells of the same color. The search space continues to grow as we search for sequences of moves. The branching factor is roughly 35, so at depth 10 we can expect there to be around \(35^{10} \approx 10^{15}\) nodes.

A simpler instance

Let’s examine a simpler instance on a board with 2 rows, 2 columns and 2 colors. Below is the full search space for such a problem instance.

The initial node is on the top and the terminal node is at the bottom of the figure. There are 24 nodes in total. The game can be solved in 3 or 4 moves, depending on which path we choose. The original game instance has billions of such nodes, and solving it comes down to searching the graph efficiently.

Reducing the search space

The first observation we make is that a click on any cell in a connected group of same-colored cells leads to the same node. It does not matter which specific cell in the group we click, so we might as well only consider the top-left cell of each connected group.

The second observation is that any permutation of colors represents the same state. If there are \(4\) colors, there are \(4! = 24\) ways to permute them. Applying this observation we can reduce the search space from 24 to 12 nodes in this specific problem instance.

The third observation is that horizontal flips represent the same state. If we’ve solved the game at some node \(n\), then we’ve also solved the node \(n'\) with flipped columns. There are two such states (flipped and not flipped), so this does not reduce the search space by that much. By applying this observation we can reduce the search space down to 8 nodes.

Based on these observations we can define a canonical representation of each game state (node) where we re-label the colors and flip the game. There is a tradeoff between:

  1. the computation time spent transforming a game state into canonical form
  2. the storage of game states as we search the graph

Re-coloring reduces the size of the search space by quite a bit, so we’ll apply this transformation. We’ll skip the flipping and full canonicalization.
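To make the re-coloring concrete, here is a minimal sketch. It assumes a grid is a list of lists of ints with 0 marking an empty cell (an assumption on our part; the repository's Board class handles this internally). We relabel colors by order of first appearance in reading order, so any two color-permuted boards map to the same grid:

```python
def relabel_colors(grid):
    """Relabel colors by first appearance in reading order (0 = empty)."""
    mapping = {0: 0}  # empty cells keep their label
    relabeled = []
    for row in grid:
        new_row = []
        for color in row:
            if color not in mapping:
                mapping[color] = len(mapping)  # next unused label
            new_row.append(mapping[color])
        relabeled.append(new_row)
    return relabeled
```

Because permuted boards now share one canonical grid, permuted duplicates are detected for free when game states are hashed during the search.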

Optimal solutions

The simplest algorithm that is guaranteed to produce an optimal solution is breadth-first search (BFS). It first explores all nodes one move away from the initial node, then all nodes two moves away, then all nodes three moves away, and so forth. When we reach an empty board we know that no shorter path exists.
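A generic sketch of BFS over game states (the `is_goal` and `children` callables are stand-ins for the actual Board API in the repository):

```python
from collections import deque


def bfs(initial, is_goal, children):
    """Breadth-first search: the first solution found uses the fewest moves."""
    queue = deque([(initial, [])])  # (state, path-of-moves)
    seen = {initial}
    while queue:
        state, path = queue.popleft()
        if is_goal(state):
            return path
        for move, child in children(state):
            if child not in seen:
                seen.add(child)
                queue.append((child, path + [move]))
    return None
```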

Unfortunately, BFS holds too many nodes in memory and explores every path. A smarter approach is to use A* search. All nodes are put in a priority queue, and we continually pop off the node with the smallest value of

\begin{equation*} f(n) = g(n) + h(n), \end{equation*}

where \(n\) is a node, \(g(n)\) is the length of the path so far and \(h(n)\) is an admissible heuristic that estimates the length of the remaining path.
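As a sketch, A* can be written generically like this (again, `is_goal`, `children` and `h` are hypothetical stand-ins for the Board API):

```python
import heapq


def a_star(initial, is_goal, children, h):
    """Generic A* search over hashable states.

    `children(state)` yields (move, child_state) pairs; `h` must be admissible.
    """
    counter = 0  # tie-breaker so heapq never compares states directly
    frontier = [(h(initial), counter, initial, 0, [])]
    best_g = {}  # cheapest known path length to each state
    while frontier:
        _, _, state, g, path = heapq.heappop(frontier)
        if is_goal(state):
            return path
        if best_g.get(state, float("inf")) <= g:
            continue  # already reached this state at least as cheaply
        best_g[state] = g
        for move, child in children(state):
            counter += 1
            heapq.heappush(frontier,
                           (g + 1 + h(child), counter, child, g + 1, path + [move]))
    return None
```

With an admissible \(h\), the first goal node popped off the priority queue is guaranteed to lie on an optimal path.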

Admissible heuristics

If the heuristic \(h\) never overestimates the length of the remaining path, then A* is guaranteed to return an optimal solution. Such a heuristic is called admissible.

  • An obvious admissible \(h\) is the number of unique colors left on the board. This never overestimates the number of moves needed, because at least one move is required to remove all cells of a given color.
  • An even better \(h\) is described in the code below. It’s based on the following idea: if any given color appears on both sides of a column not containing that color, then at least two moves are needed to get rid of that color.
import itertools


def consecutive_groups(iterable):
    """Count how many consecutive groups of True there are.

    Examples
    --------
    >>> consecutive_groups([False, True, True, False, False, True, True])
    2
    """
    return sum(1 for key, group in itertools.groupby(iterable) if key)


def a_star_heuristic(board: Board) -> int:
    """A lower bound on how many moves are needed to solve.

    This function examines each color in turn. For each color, count whether
    it appears in each column. For each group of columns, separated by
    columns where the color does not appear, we need at least one click.

    Examples
    --------
    >>> board = Board([[1, 2, 1],
    ...                [2, 2, 2],
    ...                [1, 2, 1]])
    >>> a_star_heuristic(board)
    3
    >>> board = Board([[1, 3, 1],
    ...                [2, 3, 1],
    ...                [1, 3, 1]])
    >>> a_star_heuristic(board)
    4
    """
    # Transpose the grid
    matrix = list(map(list, zip(*board.grid)))

    # Get the unique integers larger than zero in the matrix
    unique_integers = set(c for col in matrix for c in col if c > 0)

    # For every integer, see if it exists in each column
    integer_in_col = (((c in col) for col in matrix) for c in unique_integers)

    # Count groups of integers separated by other integers
    return sum(consecutive_groups(integer) for integer in integer_in_col)

Unfortunately, even with this heuristic A* is too slow for boards of decent size. The figure below shows solution times for A* and BFS as a function of board size. Solution times grow exponentially. Whether we re-label colors or not seems to have little practical impact on solution times.

Good-enough solutions in limited time

A heuristic graph search algorithm (not to be confused with the heuristic function \(h\) in A* search) is an algorithm that attempts to find a good solution. With heuristics, there are no guarantees that the optimal solution is found.

Some simple heuristics are:

  • Random search. Always choose a random move.
  • Greedy best-first search. Always choose the move that clears the most cells.
  • Randomized best-first search. Choose a move at random, but use weighted sampling and weigh the moves by the number of cells they remove. Setting weight = cells_removed**power for some power >= 0 is one way to parametrize this graph search. If power = 0 the search behaves like random search, and as power grows large it approaches greedy best-first search. Setting power=None means using no randomness at all.
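The weighted sampling can be sketched in a few lines (the `moves` and `cells_removed` arguments are hypothetical; in the repository they come from the Board API):

```python
import random


def choose_move(moves, cells_removed, power):
    """Sample a move with weight cells_removed**power; power=None is greedy."""
    if power is None:
        # No randomness: always pick the move clearing the most cells
        best = max(range(len(moves)), key=lambda i: cells_removed[i])
        return moves[best]
    weights = [c ** power for c in cells_removed]
    return random.choices(moves, weights=weights, k=1)[0]
```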

The figure below shows what happens when we solve the large \(9 \times 7\) board 100 times using different values of power.

Setting power=5 and power=10 both yield solution paths with 17 moves.

We implement three algorithms that yield good-enough solutions quickly. They are called anytime beam search, heuristic search and Monte Carlo search, and all are more clever than simply repeating randomized best-first search many times.

Beam search is a simple algorithm:

  1. Start at the root, expand all children. Keep only the beam_width children that have the largest values of cells_cleared / num_moves.
  2. From the beam_width nodes under consideration, expand all children. Filter them again and keep only the beam_width most promising ones.
  3. Repeat until a solution is found.

One disadvantage of beam search is that for large values of beam_width we have to wait a while until we get a solution. An anytime algorithm is one that yields good-enough solutions quickly, and improves on the solution if it runs for longer. To make beam search into an anytime algorithm we run it for beam_width=1, 2, 4, 8, ..., keeping track of the best solution seen so far and yielding whenever a better solution is found.
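A toy sketch of beam search and its anytime variant (`is_goal`, `children` and `score` are stand-ins for the repository's Board API; `score` plays the role of cells_cleared / num_moves):

```python
def beam_search(initial, is_goal, children, score, beam_width):
    """One beam-search pass: keep the beam_width best partial paths per depth."""
    beam = [(initial, [])]  # (state, path-of-moves)
    while beam:
        # Expand every node in the current beam
        expanded = [(child, path + [move])
                    for state, path in beam
                    for move, child in children(state)]
        for state, path in expanded:
            if is_goal(state):
                return path
        # Keep only the beam_width most promising candidates
        expanded.sort(key=lambda node: score(*node), reverse=True)
        beam = expanded[:beam_width]
    return None


def anytime_beam_search(initial, is_goal, children, score, max_width=64):
    """Re-run beam search with doubling widths, yielding each improvement."""
    best = None
    width = 1
    while width <= max_width:
        solution = beam_search(initial, is_goal, children, score, width)
        if solution is not None and (best is None or len(solution) < len(best)):
            best = solution
            yield best
        width *= 2
```

Doubling the width means the total work is dominated by the last (widest) pass, so the anytime wrapper costs at most roughly twice a single run at the final width.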

The heuristic search algorithm also yields better and better solutions as they are found. If it runs long enough, it will eventually find an optimal solution. In practice it will run out of memory before it is guaranteed to have explored every path, so there are no guarantees. It roughly works like this:

  1. Start by running a single deterministic best-first search. This gives a quick upper bound best_solution_length_so_far on the optimal solution, which we can later use for pruning.
  2. Sort the priority queue by a non-admissible heuristic function cells_cleared / num_moves. Always consider moves that maximize this function, hoping they lead to a good solution.
  3. Let h(n) be the admissible heuristic function discussed earlier. If num_moves + h(n) >= best_solution_length_so_far, then we can prune node n.

Monte Carlo tree search also yields better and better solutions as they are discovered. Like the algorithm above, it also eventually finds an optimal solution but provides no guarantees when run for a short time. It roughly works like this:

  1. Start with a deterministic best-first search to establish an upper bound used for pruning.
  2. Randomly sample nodes in the explored tree, going down guided by the so-called upper confidence bound, which trades off exploration and exploitation.
  3. When a new unexplored node is reached, run a randomized best-first search from the node down to the bottom of the tree to score the node. If this results in a new best solution, update the upper bound. Propagate the score cells_cleared / num_moves up in the explored tree to the root. This score affects the upper confidence bound used in the next iteration.
  4. Prune by using the admissible heuristic h(n) as the tree is explored.
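The upper confidence bound in step 2 is typically UCB1; a minimal sketch (the exploration constant `c` and its default value are an assumption on our part):

```python
import math


def ucb1(total_score, visits, parent_visits, c=math.sqrt(2)):
    """UCB1: average score (exploitation) plus an exploration bonus."""
    if visits == 0:
        return float("inf")  # always try unvisited children first
    exploitation = total_score / visits
    exploration = c * math.sqrt(math.log(parent_visits) / visits)
    return exploitation + exploration
```

During tree descent, the child maximizing this value is chosen; rarely-visited children get a large exploration bonus, while well-scoring children get a large exploitation term.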

Running all algorithms for around four minutes on the game instance of size \(9 \times 7\), we get the following results.

  • Anytime beam search finds the best-known solution from NRK in three minutes.
  • Heuristic search finds a solution of length 18 in around 20 seconds, but is unable to discover any better solutions in the remaining time.
  • Monte Carlo search finds a solution of length 14 in around 2 minutes.

Here is the solution sequence for the 12-move solution (click to enlarge).

The NRK record was 12 moves, and we managed to solve the board in 12 moves. On this particular instance, running for longer does not lead to further improvements. On other instances we are not always so lucky; sometimes we are unable to find the best known solution, at least in a few minutes of compute time.

Summary, notes and references

The NRK game “Former” can be formulated as a graph search problem, but the graph is huge. There are billions and billions of nodes in the graph, so complete enumeration is out of the question. While A* search is effective on similar puzzles (e.g. the 15-puzzle), here it fails because we were unable to design a great admissible heuristic function.

Heuristic search algorithms are still able to produce good-enough solutions in a short time. We implemented three such algorithms with different flavors: anytime beam search, heuristic search and Monte Carlo search. In the end we managed to find the record solution of 12 moves on this instance. These algorithms can be combined: heuristic search and Monte Carlo search perform better if an upper bound is established quickly, so running beam search as a first step could help.

The game is similar to SameGame, and the paper Single-Player Monte-Carlo Tree Search proposes Monte Carlo search for this game. An implementation by Odin at github.com/odinhg/nrk-former-sp-mcts implements this algorithm. Another paper is Solving SameGame and its Chessboard Variant, which also uses Monte Carlo search. The sgbust program is a SameGame solver written in C++ that uses beam search. A good introductory reference for game solving is Chapter 3 “Solving Problems by Searching” in Artificial Intelligence: A Modern Approach. Another avenue to go down is reinforcement learning, see for instance the book Reinforcement Learning: An Introduction.

Thanks to Andreas for introducing me to the game, and to Eivind and Gunvor who independently discovered the admissible heuristic.

Appendix: more boards

Above we considered the NRK game instance for the 16th of November. Below we show performance on more game instances from three other days.