Sudoku puzzles as optimization problems

Sudoku is a puzzle where the player assigns digits to squares (see Wikipedia for a complete description). The puzzle can be formulated as a mathematical optimization problem and solved in a few milliseconds on a decent laptop. We will show how to transform Sudoku to an optimization problem by incorporating appropriate constraints and optimizing a constant objective function.

While Sudoku is an innocent past-time activity, many real world business problems can also be formulated as optimization problems: budgeting, production planning, staff scheduling, transportation, facility location and supply chain problems, to name a few. Proper use of optimization and mathematical modeling can lead to large savings in many business problems. It’s unfortunate when hours or days of human labor is spent attempting to solve problems where an optimization solver could’ve found an optimal solution in seconds or minutes.

Today I picked up the Norwegian newspaper VG and found some variations on the classical Sudoku puzzle that I haven’t seen before – let’s solve them using optimization!

Table of contents

Inequality Sudoku

This is a straightforward variant of classical Sudoku. In this puzzle the board is smaller and instead of constraints on \(3 \times 3\) sub-squares, we are given inequality constraints on some entries. The Norwegian text roughly reads:

On the finished board, every row and column must contain all numbers from 1 to 7 exactly once. The inequalities must be obeyed by the solution.

This is more or less a freebie, since the standard Mixed Integer Program (MIP) formulation consists of linear inequalities – and here we are given just that; some additional inequalities. To solve this problem, we introduce binary variables \(x_{ijk}\) defined by

\begin{align} x_{ijk}= \begin{cases} 1 \quad \text{ if digit } k \text{ is assigned to square } (i, j)\\ 0 \quad \text{ otherwise}. \end{cases} \end{align}

Without the inequality constraints, the MIP formulation is

\begin{align} & \text{minimize} && 0 \\ & \text{subject to} && \sum_{k} x_{ijk} = 1 && \forall \, (i, j) \\ & && \sum_{i} x_{ijk} = 1 && \forall \, (j, k) \\ & && \sum_{j} x_{ijk} = 1 && \forall \, (i, k) \\ & && x_{ijk} \in \{0, 1\} && \forall \, (i, j, k).\\ \end{align}

The mathematical notation is a bit terse. In English, the first constraint means that “only one digit can be placed in each square”, the second constraint states “each digit may only be placed once in every row” and the third signifies that “each digit may only be placed once in every column.” We minimize the constant \(0\) instead of a meaningful objective function; after all we are interested in any solution obeying the constraints.

To correctly model inequalities we note that the numeric value of entry \((i, j)\) is given by \(\sum_k k x_{ijk}\). We iterate through the set of inequalities given by the problem, parsing all of them as smaller-than inequalities \(a < b\) (with indices \((a_i, a_j)\) and \((b_i, b_j)\)), we add constraints

\begin{align} \sum_k k x_{a_i a_j k} + 1 \leq \sum_k k x_{b_i b_j k} \end{align}

for every \(a < b\) in the set of smaller-than inequalities. This constrains the solution to obey the inequalities. Most solvers use strict inequalities, hence the plus one in the equation above.

Some values are already given and we constrain these. For instance, we add the constraint \(x_{0,3,6} = 1\) to signify that the value \(6\) occupies entry \((0, 3)\). We have formulated every constraint using mathematics, and coding this up will solve the problem.

Sum Sudoku

This variant introduces connected regions of the board that must sum to a certain number. According to the problem text this variant is not for beginners. Indeed the computer requires more time to solve this problem, it uses approximately \(1\) second. The text roughly reads:

Not for beginners! This variant does not have starting numbers. You must assign numbers \(1\) through \(9\) as usual. The board has dashed regions. In reach region, the sum of the numbers you enter should equal the given number.

The MIP formulation is straightforward. For every \(3 \times 3\) square \(S\), we add a constraint that each number must appear exactly once. For every dashed region \(R\), we add a constraint that the numbers must sum to \(\sigma_R\).

\begin{align} & \text{minimize} && 0 \\ & \text{subject to} && \sum_{k} x_{ijk} = 1 && \forall \, (i, j) \\ & && \sum_{i} x_{ijk} = 1 && \forall \, (j, k) \\ & && \sum_{j} x_{ijk} = 1 && \forall \, (i, k) \\ & && \sum_{(i, j) \in S} \sum_k x_{ijk} = 1 && \forall \, S\\ & && \sum_{(i, j) \in R} \sum_k k x_{ijk} = \sigma_R && \forall \, R\\ & && x_{ijk} \in \{0, 1\} && \forall \, (i, j, k)\\ \end{align}

The solution to this particular problem instance is given below, and was found in approximately \(1\) second. The problem instance has \(9^3 = 729\) variables and \(354\) constraints. Below the solution is given in plaintext.

841567932
275493861
396128547
134859276
582716394
967342158
619275483
453681729
728934615

Line Sudoku

This variant does not resemble the original Sudoku puzzle. However, it can still be formulated as a MIP, and therefore it can be solved quickly. The problem description given in the newspaper roughly reads:

Each black square contains a number. Every white square must be tiled using either a single horizontal line or a single vertical line. The number in each black square represents the number of tiles that must be arranged so that non-broken black lines connect to the black square.

I originally had trouble understanding the problem description as given in the newspaper text, and I only understood the problem after I saw a solved instance. Below I’ve included a small problem instance and accompanying solution. In the solution, the white squares are tiled so that the number of non-broken lines adjacent to the black squares equal the numbers in the black squares.

For every white square on the board, we introduce a binary variable \(x_{ij}\) defined by

\begin{align} x_{ij}= \begin{cases} 1 \quad \text{ if entry } (i, j) \text{ contains a horizontal line}\\ 0 \quad \text{ if entry } (i, j) \text{ contains a vertical line}. \end{cases} \end{align}

This defines a tiling of the board, and there are \(2^{\text{white squares}}\) admissible tilings.

Counting lines connected to a black square

To count lines connected to a black square, we will visit every black square and move in every direction \(d\) (up, down, right and left) until we meet the border of the board or another black square. In every direction we’ll add auxiliary variables counting the number of non-broken line segments.

Consider an arbitrary black square, and consider moving e.g. to the right. Let \(s=1, 2,\ldots\) represent the number of steps to the right of the black square. We must count the number of tiles with horizontal lines (\(x_{ij} = 1\)), and the counter must stop once \(x_{ij} = 0\).

We introduce auxiliary binary variables \(z_{i, j, d, s}\) defined by

\begin{align} z_{i, j, d, s} = \begin{cases} 1 \quad \text{ if all white squares from $(i, j)$ to $s$ steps in direction $d$ have horizontal tiles}\\ 0 \quad \text{ otherwise}. \end{cases} \end{align}

For directions \(d\) up/down, swap out “horizontal” with “vertical” in the definition of \(z_{i, j, d, s}\) above.

We will now assume the black square \((i, j)\) and the direction \(d\) is fixed, and hence remove subscripts to ease the notation. For every black square \((i, j)\) going in directions right (and left), we impose constraints

\begin{align} z_s &\leq z_{s-1} && \\ z_s &\leq x_{s} && \quad \forall s \geq 2 \\ z_s & \geq x_s + z_{s-1} - 1 && \end{align}

and a boundary condition \(z_1 = x_1\). Those in the know will recognize the above constraints as a linearization of the \(\text{AND}\) function, modeling the logical relationship \(z_s \Leftrightarrow z_{s-1} \wedge x_s\).

To see how this works, consider a concrete example:

Tiles “–“ “–“ “–“ | | “–“ |
\(s\) \(1\) \(2\) \(3\) \(4\) \(5\) \(6\) \(7\)
\(x_s\) \(1\) \(1\) \(1\) \(0\) \(0\) \(1\) \(0\)
\(z_s\) \(1\) \(1\) \(1\) \(0\) \(0\) \(0\) \(0\)

The constraints force \(z_s = 0\) the first time \(x_s = 0\), after which \(z_s = 0\) for every remaining value of \(s\) looping outwards from the black square (here to the right). This exactly represents the number of tiles arranged so that non-broken black lines connect to the black square when going left or right.

When moving up or down from a black square, we must count the number of tiles with vertical lines (\(x_{ij} = 0\)), and the counter must stop once \(x_{ij} = 1\). To do this we impose constraints

\begin{align} z_s &\leq z_{s-1} && \\ z_s &\leq 1 - x_{s} && \quad \forall s \geq 2 \\ z_s & \geq z_{s-1} - x_s && \end{align}

and a boundary condition \(z_1 = 1 - x_1\).

Mixed Integer Program

The MIP formulation becomes

\begin{align} & \text{minimize} && 0 \\ & \text{subject to} && \sum_d \sum_s z_{ijds} = \sigma_{ij} && \forall \, (i, j) \in \text{black squares} \\ & && \text{for every black square $(i, j)$ and direction $d$ left and right:} \\ & && \quad z_{ijd,1} = x_{ijd,1} \\ & && \quad z_{ijds} \leq z_{ijd, s-1} && \forall s \geq 2 \\ & && \quad z_{ijds} \leq x_{ijds} && \forall s \geq 2 \\ & && \quad z_{ijds} \geq x_{ijd, s} + z_{ijd, s-1} - 1 && \forall s \geq 2 \\ & && \text{for every black square $(i, j)$ and direction $d$ up and down:} \\ & && \quad z_{ijd,1} = 1 - x_{ijd,1} \\ & && \quad z_{ijd,s} \leq z_{ijd, s-1} && \forall s \geq 2 \\ & && \quad z_{ijds} \leq 1 - x_{ijds} && \forall s \geq 2 \\ & && \quad z_{ijd, s} \geq z_{ijd, s-1} - x_{ijd, s} && \forall s \geq 2 \\ & && z_{ijds} \in \{0, 1\} && \forall \, (i, j, d, s) \\ & && x_{ij} \in \{0, 1\} && \forall \, (i, j) \in \text{white squares}\\ \end{align}

In the model formulation I chose to use pseudocode for loops instead of relying even more on indices. The problem data \(\sigma_{ij}\) are the entries in the black squares given in the problem instance. The solution to this particular problem instance is given below, and was found in a few milliseconds. The problem instance has \(181\) variables and \(311\) constraints, and the solution is given in plaintext as:

4 - 5 - - 8 - -
| | | 1 | | | 1
| 1 | | | | | |
| - - - 5 | | 2
8 - - - - | | |
| | 3 - - - 8 -
5 - - - 7 - - -
| - 2 - | 2 - -

Summary

We have solved all three Sudoku variants by formulating them as MIPs. The optimization models will solve any problem instance, not just these particular problems.

The primary advantage of this approach is that no custom algorithms are required – we simply give the model and problem data to a solver, and the solver will make quick work of the problem. This is simple, correct and practical. It also illustrates why problem generalization (and in particular optimization solvers) is so valuable in applied mathematics – when a solver for an abstract problem is available, we can solve a wide range of concrete practical problems.