Pareto non-dominated front as a consumer strategy

Let’s go shopping for an apartment to rent! If we ignore location, the overall standard and other factors, the interesting data is primarily the number of square meters \(x\) and the price of rent per month \(y\).

In this article we answer the question:

Which apartments are best? In other words, which landlords should we call up first?

The data set above will be used throughout the article, but the ideas generalize to similar situations where we as consumers search for good options without much domain knowledge. The ideas also generalize to higher dimensionality (e.g. if we include distance to city center as an additional data attribute).

Table of contents

Regression and implicit optimization

An initial idea might be to compute the least squares regression line. We want many square meters \(x\) at a low price \(y\). After computing the regression line, we can move in the direction of positive \(x\) and negative \(y\) until we find the apartment the furthest away from the regression line. This is shown in the figure below.

It’s tempting to declare the resulting apartment as “best” and move on – but there are some serious issues:

  1. Implicit scalarization. The objective is to find good apartments \((x_i, y_i)\). Evaluating apartments \((x_i, y_i)\) by a linear combination of the objectives \(x_i\) and \(y_i\) is called scalarization, because we use a scalar to represent the utility of the apartment. Moving the regression line implicitly maximizes a scalar function \(f(x, y) = \lambda x - (1- \lambda)y\) for some \(0 \leq \lambda \leq 1\). We will see that this might not be what we want.
  2. Implicit weighting of objectives. The regression line is \(y = 5695 + 187x\). It shows that, in some sense, one additional square meter is worth \(187\) per month on average. By moving the regression line, we’re implicitly maximizing \(f(x, y) = 187x - y\). While this linear relationship holds on average, we might not agree that one additional square meter is worth \(187\) more per month.

In summary, the approach implicitly scalarizes the problem and sets the scalarization weights for us.

We can do much better – back to the drawing board!

Pareto domination

We will make use of the concept of Pareto domination from multiobjective optimization. In multiobjective optimization, a solution dominates another if it’s no worse in any objective, and better in at least one objective.

In our case, an apartment \(a = (x_a, y_a)\) dominates \(b = (x_b, y_b)\) if \(a\) is at least as cheap, has at least as many square meters, and is either cheaper or has more square meters, or both.

If apartment \(a\) dominates \(b\), there is no reason to ever choose \(b\).

The Pareto non-dominated front

The Pareto non-dominated front contains the set of apartments that are not dominated by any other apartment.

In the figure above, each of the non-dominated apartments have no other apartments to their right (more square meters \(x\)) and below (lower price \(y\)). In other words, for each of the non-dominated apartments, there are no other apartments that are univocally better.

The set of non-dominated apartments has intrinsic quality, even without an explicit weighting between square meters and price.

Convexity and scalarization

Recall that scalarizing the problem would mean choosing some \(\lambda \in [0, 1]\) and maximizing a weighted linear sum of the objectives \(x\) and \(y\), i.e. maximizing the function

\begin{equation*} f(x, y) = \lambda x - (1- \lambda)y. \end{equation*}

The crafty reader might notice that the Pareto non-dominated front contains an apartment that is impossible to find using scalarization. No matter which \(\lambda\) we choose, we would hit the apartments connected to the blue line before we could ever find the non-dominated apartment shown in red.

This phenomenon is amplified on a fictive data set below. The non-dominated front contains many interesting apartments, but the “concavity” of the set of apartments make them impossible to find using scalarization.

In general, scalarization finds the non-dominated Pareto front if the multiobjective optimization problem is convex.

Pareto ranks

Before any preference for square meters \(x\) or price \(y\) is set, the non-dominated front contains apartments that are maximal with respect to a partial ordering of the objectives. Scalarizing introduces a linear ordering, and with a linear ordering it’s possible to choose a “best” apartment. The “best” apartment is maximum with respect to the linear ordering.

  • The maximum apartment is better than every other apartment.
  • The maximal apartments are no worse than any other apartment.

Pareto ranks generalize the Pareto non-dominated set. Pareto ranks are found by the following algorithm:

  1. Given the data, compute the non-dominated set.
  2. Remove the non-dominated set from the data.
  3. Repeat until the data is empty.

The Pareto ranks of the apartment data set is shown below.

Summary

In multiobjective optimization, the Pareto non-dominated front yields solutions that have intrinsic quality – they are maximal with respect to the partial ordering of the objectives. If the optimization problem is convex, then scalarization can find the non-dominated front. If the optimization problem is not convex, then the non-dominated front can contain solutions that scalarization is unable to find.

As a consumer, this idea extends to any purchase where we have quantifiable measures of quality that we want to optimize. For instance, when buying a hard drive \(x\) could be storage capacity and \(y\) could be price. Computing the non-dominated front is especially useful for a consumer with little domain knowledge.

The most outstanding use-case might be when there are many attributes, e.g. price per month, square meters, distance from city center, overall standard, etc. Combining Pareto ranks with a subjective quality evaluation is a great way to pick the best apartments – or at the very least avoid making a stupid choice.

Notes and references

  • My Python package paretoset implements these algorithms.
  • Computing the Pareto non-dominated front efficiently is a non-trivial problem. With \(n\) data points, an \(\mathcal{O}(n^2)\) algorithm is easy to produce: loop over each data point, then check if any other data point dominates it. Various algorithms and approximation schemes have been devised for the general problem.
  • There are several possible ways to define Pareto ranks. For instance, we can define the rank of a solution as the number of other solutions that it dominates. This is shown on the apartments data set below.