Optimal skill tree growth

I recently purchased the game “Marvel’s Spider-Man” for PlayStation 4. The game has an interesting skill tree feature. It works like this:

Each character level unlocks a skill point which may be used to unlock a skill in the skill tree. A skill may only be unlocked once its parent is unlocked, and at the end of the game the character has enough skill points to unlock every skill.

Below is a figure showing the in-game representation of the skill tree.

Some skills are considered better than others. For instance, the article Focus on Getting These 16 Best Skills in ‘Spider-Man’ PS4 Right Away urges us to unlock some essential skills as fast as possible. We assume that the values associated with skills are independent, e.g., the value of unlocking skill A and skill B together is the sum of the values, even if they mutually benefit each other in actual gameplay. Knowing that some skills are better than others, we can ask ourselves the following question:

In what order should we unlock skills to maximize utility?

In this article we formalize this problem, formulate it as an optimization problem and solve it as a Mixed Integer Program (MIP). We also sketch a graph search algorithm.

The figure below shows a graph of the in-game skill tree, with an added root vertex. The 16 best skills according to the article mentioned above are shaded in light blue. There are many ways to grow the skill tree as we level up the character. Although finding the best way to unlock skills is hardly a serious problem, completely analogous problems appear in project management and decision challenges. In these scenarios finding the best plan might have real economic value.

Table of contents

Utility function

In this section we formalize a natural utility function to optimize. Every skill is a vertex in the graph, and with every vertex \(i=1, 2, \ldots, N\) we associate a non-negative value \(v_i\). When a skill is unlocked we derive utility from its value for the rest of the game time, and eventually we level up to the maximum level and unlock every skill.

The utility of skill \(i\) is its value \(v_i\) multiplied by the time we enjoy the benefit of having unlocked the skill. More formally, we define the utility of unlocking skill \(i\) at time \(t_i\) as the product \(v_i (N - t_i)\), and the total utility becomes

\begin{align*} \sum_{i=1}^N v_i (N - t_i). \end{align*}

For instance, if a skill \(i\) is unlocked at the final time step \(t_i=N\) we derive no utility from it. However, if a skill \(i\) is unlocked at the first time step \(t_i = 1\) we enjoy its utility \(v_i\) for \((N-1)\) time steps. Since \(N\) and \(\sum_i v_i\) are constants, maximizing the utility function above is equivalent to minimizing

\begin{align*} \sum_{i=1}^N v_i t_i. \end{align*}

In order to solve a concrete problem, we assign values \(v_i = 1\) to all skills mentioned in the article Focus on Getting These 16 Best Skills in ‘Spider-Man’ PS4 Right Away, and values \(v_i = 0\) to all skills not mentioned.

A greedy approach does not work

Consider the skill trees shown above, where blue indicates a value of \(1\) and white indicates a value of \(0\). Two observations can immediately be made based on these figures:

  • Left subfigure. Starting from the root note, a greedy approach (searching for the closest vertex with value \(v_i = 1\)) does not work. Unlocking skills to get the closest value (right path) produces a sub-optimal solution. It’s better to traverse the left path, unlock both skills, then go down the right path.
  • Right subfigure. It might be necessary to trace out one path down the skill tree, abandon it temporarily for another path, and finally continue down the first path. In the right subfigure we go left, then right, then left again for an optimal solution.

In the right subfigure the optimal path down the tree yields a total utility of

\begin{align*} \sum_{i=1}^N v_i (N - t_i) = 1 (5) + 0 (4) + 1 (3) +0(2) + 0(1) + 1 (0) = 8. \end{align*}

Solving the equivalent minimization problem yields an optimal value of

\begin{align*} \sum_{i=1}^N v_i t_i = 1 (1) + 2 (0) + 3 (1) + 4 (0) + 5 (0) + 6 (1) = 10. \end{align*}

MIP model

In this section we formulate the problem as a Mixed Integer Program (MIP).

Growing a skill tree is equivalent to constrained sorting

Going down the skill tree in the optimal order is equivalent to constrained sorting. We want to permute the vertices in an optimal order (sorting) to get value as early as possible in the game, but an edge from vertex \(a\) to \(b\) constrains the solution: \(a\) must come before \(b\) in the final ordering since \(a\) must be unlocked before \(b\) in the game.

The left sub-figure above shows a skill tree with labeled vertices. The right sub-figure shows the skill tree as a Directed Acyclic Graph (DAG). Choosing skills and growing the skill tree is equivalent to constrained sorting, or a weighted topological sort of a DAG. I found no polynomial-time algorithms for this problem.

Permutation matrix and constraints

Sorting a column vector can be expressed as left-multiplication by a permutation matrix, and we use this fact to formulate the optimization problem. We arbitrarily index the vertices by \(i=1, 2, \ldots, n\) and introduce a binary permutation matrix \(X \in \{0, 1\}^{n \times n}\) with entries given by

\begin{align} x_{ij}= \begin{cases} 1 \quad \text{ if } i \text{ maps to } j\\ 0 \quad \text{ otherwise}. \end{cases} \end{align}

The constraints that ensure that a square binary matrix is a permutation matrix are given below. There must be one non-zero entry per column, and one non-zero entry per row.

\begin{align} \sum_{i} x_{ij} = 1 \quad \forall j \qquad \qquad \sum_{j} x_{ij} = 1 \quad \forall i. \end{align}

The new position of element \(i\) in the column vector will be \(\sum_j j x_{ij}\) after left-multiplication with the permutation matrix. For every directed edge \((a, b)\) in the set of all edges \(E\), we add a constraint

\begin{align} \sum_{j} j x_{aj} < \sum_{j} j x_{bj} \end{align}

to enforce the ordering imposed by the DAG structure. All we need to do now is express the objective function and our model will be finished.

MIP model

We previously defined an objective function \(\sum_{i=1}^N v_i t_i\) and we have seen that multiplying with a permutation matrix \(X\) permutes index \(i\) to \(\sum_j j x_{ij}\). Combining these facts, we obtain the following MIP model.

\begin{align} & \underset{x}{\text{minimize}} && \sum_{i} v_i \sum_j j \, x_{ij} \\ & \text{subject to} && \sum_{i} x_{ij} = 1 && \forall \, j \\ & && \sum_{j} x_{ij} = 1 && \forall \, i \\ & && \sum_{j} j x_{aj} < \sum_{j} j x_{bj} && \forall \, (a, b) \in E \\ & && x_{ij} \in \{0, 1\} && \forall \, i,j\\ \end{align}

Solution

The optimization problem has \(|V| \times |V| = 35 \times 35 = 1,225\) variables (entries in the permutation matrix) and \(|V| + |V| + |E| = 35 + 35 + 34 = 104\) constraints (rows and columns in the permutation matrix, plus one constraint per edge). An optimal solution is depicted below. The solution is not unique.

A graph search algorithm

An alternative to the MIP approach is to formulate the problem as a graph search in the state space of the problem.

  • Each state is a feasible selection of skill points.
  • Each transition is the act of unlocking a skill new skill, taking us to a new state.

The initial state would be an empty skill tree. The first transition would unlock one of three possible skills, putting us in a new state. The second transition would unlock one of five possible skills. From there on out, the number of possibilities depends on the current state.

The nice thing about searching the state space is that the objective function can be bounded above as well as below. This permits efficient pruning of the state space graph, as well as opportunities to use heuristics.

  • An upper bound on the objective function is found by sorting the remaining values \(v_i\) in descending order. Exploring the state space graph and unlocking the skill tree in this order is the best we can hope for.
  • A lower bound on the objective function is found by sorting the remaining values \(v_i\) in ascending order. Exploring the state space graph and unlocking the skill tree in this order is the worst that can happen.

As we search the state space graph, we use upper bounds to prune (remove) paths that cannot possibly be better than what we have seen so far in the search. Lower bounds can be used as heuristics for which paths to explore first. Several sub-problems are identical, and these can be cached/memoized. The objective function can be bounded and the algorithm can be tweaked to trade off runtime for solution quality.

Notes and references

  • A similar problem is choosing the best skill tree given a limited resource of skill points. This can also be formulated as a MIP, but a faster dynamic programming approach is also possible. This is more similar to the skill tree problem found in World of Warcraft – the number of skill points is limited and we care about final result, not growing the skill tree.
  • Using a permutation matrix, sorting can be reduced to a MIP. Of course there are much better ways to sort an array. Just because a problem can be formulated as a MIP does not mean that it should, especially if faster algorithms exist. MIPs do not reduce to sorting, but we can dream.