Kidney allocation under uncertainty
- 6. February 2021
- #mathematics, #optimization
Here is an interesting (albeit slightly morbid) problem:
A set of patients are in need of kidney transplants. Each patient \(i\) has a life expectancy \(b_i\) before receiving a new kidney, as well as an extended life expectancy \(a_i\) after receiving a new kidney. New kidneys arrive randomly via a Poisson process, with a known rate parameter \(\lambda\). How should kidneys be allocated to maximize the total utility, i.e., the sum \(\sum_i a_i\) of extended life expectancies over the patients who receive a kidney while still alive?
The story above is just a colloquial way to dress up a mathematical problem. I know nothing about real-world kidney assignments.
We start by examining a simple case with two patients. Then we develop an efficient linear-time algorithm for a related problem, in which the exact arrival times of all future kidneys are known beforehand. We leverage the solution to that problem to create a simulation heuristic. Finally, we construct an exact algorithm that runs in exponential time.
The two-patient case
The two-patient case is relatively straightforward. Suppose we structure the problem data as a list \([ (b_i, a_i) \, \forall \, i ]\). Recall that for patient \(i\), \(b_i\) is the life expectancy before receiving a kidney and \(a_i\) is the extended life expectancy after receiving a kidney. In the figure below, the problem data is \([(2, 5), (1, 1)]\). Suppose half a unit of time elapses (\(t=1/2\)) and a kidney becomes available. The arrival of a kidney is shown with a dotted vertical line. Should patient \(1\) or patient \(2\) receive the kidney?
Looking at the figure above, the answer to the problem will clearly depend on \(\lambda\), the rate at which new kidneys arrive. If \(\lambda \approx 0\), then kidneys are rare and we should give the kidney to patient \(1\), capitalizing on the opportunity to gain the utility \(a_1 = 5\). If \(\lambda \gg 0\), then kidneys are common and we should give the kidney to patient \(2\), since a new kidney will almost certainly arrive and save patient \(1\) later on.
If the kidney goes to patient \(2\), we immediately get a utility of \(a_2 = 1\). If another kidney becomes available in the remaining \(2 - 1/2 = 3/2\) units of time, we get an additional utility of \(a_1 = 5\). The probability of another kidney appearing within \(3/2\) units of time is \(\operatorname{Exponential}(3/2|\lambda)\), where \(\operatorname{Exponential}\) denotes the cumulative distribution function of the exponential distribution. The expected utility if we assign the kidney to patient \(2\) is therefore
\begin{align*} a_2 + a_1 \operatorname{Exponential}(3/2|\lambda) = 1 + 5 \operatorname{Exponential}(3/2|\lambda). \end{align*}
Similarly, if patient \(1\) gets the kidney, we immediately get a utility of \(a_1 = 5\), plus \(a_2 = 1\) if another kidney arrives within the remaining \(1 - 1/2 = 1/2\) units of patient \(2\)'s life. The expected utility is then
\begin{align*} a_1 + a_2 \operatorname{Exponential}(1/2|\lambda) = 5 + 1 \operatorname{Exponential}(1/2|\lambda). \end{align*}
We know that \(\operatorname{Exponential}(t|\lambda) = 1 - e^{-\lambda t}\), so we can compare the expected future utilities directly:
\begin{align*} 1 + 5 \left(1 - e^{-3\lambda/2}\right) \quad \text{versus} \quad 5 + 1 \left(1 - e^{-\lambda/2}\right). \end{align*}
Giving the kidney to patient \(1\) is better exactly when \(5 e^{-3\lambda/2} > 1 \cdot e^{-\lambda/2}\), i.e., when \(a_1 e^{-\lambda(b_1 - t)} > a_2 e^{-\lambda(b_2 - t)}\).
The answer is straightforward: to maximize the expected utility, we give the kidney to the patient \(i\) with the largest \(a_i e^{-\lambda (b_i - t)}\), or equivalently the smallest \(e^{\lambda b_i} / a_i\). In the rest of the article, we’ll assume that \(\lambda = 1\).
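As a quick sanity check, here is a minimal sketch of this decision rule in Python. The rule is only derived for two patients, and the function name and interface are made up for illustration.

import math

def choose_recipient(patients, t, lam=1.0):
    """Index of the patient who should receive a kidney arriving at time t.

    patients -- list of (b_i, a_i) pairs, with b_i measured from time zero
    t        -- arrival time of the kidney
    lam      -- rate of the Poisson kidney arrival process
    """
    # Only patients who are still alive at time t can receive the kidney
    alive = [(i, b, a) for i, (b, a) in enumerate(patients) if b > t]
    # Give the kidney to the patient maximizing a_i * exp(-lam * (b_i - t))
    best_index, _, _ = max(alive, key=lambda iba: iba[2] * math.exp(-lam * (iba[1] - t)))
    return best_index

# Data [(2, 5), (1, 1)] and a kidney at t = 1/2: a low rate favors patient 1,
# a high rate favors patient 2 (indices are zero-based here)
assert choose_recipient([(2, 5), (1, 1)], t=0.5, lam=0.1) == 0
assert choose_recipient([(2, 5), (1, 1)], t=0.5, lam=5.0) == 1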
The perfect information case
Assume now that we have perfect information about when kidneys will arrive in the future. This removes all uncertainty from the problem and makes it easier to solve. The figure below shows a problem instance with three patients and perfect information. This problem is easily solved by hand. Patient \(2\) should get the first kidney, and patient \(3\) gets the second kidney. Patient \(1\) dies.
Reduction to linear sum assignment
We can reduce the perfect information problem to a well-known combinatorial optimization problem: linear sum assignment on a matrix \(M\) of utilities. In the matrix below, row \(i\) represents patient \(i\) and column \(j\) represents kidney \(j\). For instance, the top row corresponds to the top bar in the figure above for patient \(3\). If there are more kidneys than patients, it suffices to include only the first \(p\) kidneys, where \(p\) is the number of patients, since an earlier kidney can always be given to any patient a later kidney could save.
Let \(t_j\) be the arrival time of kidney \(j\). We set \(M_{ij}\) equal to \(a_i\) if \(t_j < b_i\), otherwise we set it equal to \(0\). Each kidney can be allocated to at most one patient, and each patient can receive at most one kidney. The solution to the problem above yields a total utility of \(5 + 8 = 13\).
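Here is a minimal sketch of this reduction using scipy's linear_sum_assignment. The two arrival times below are assumptions chosen to match the figure: the first kidney arrives after patient \(1\) has died, the second after patient \(2\) has died.

import numpy as np
from scipy.optimize import linear_sum_assignment

patients = [(1, 1), (2, 5), (3, 8)]  # (b_i, a_i) for the three patients
arrival_times = [1.5, 2.5]           # assumed kidney arrival times t_j

# M[i, j] = a_i if kidney j arrives before patient i dies, otherwise 0
M = np.array([[a if t < b else 0 for t in arrival_times] for (b, a) in patients])

rows, cols = linear_sum_assignment(M, maximize=True)
print(M[rows, cols].sum())  # 13: patient 2 gets the first kidney, patient 3 the second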
A linear algorithm
Linear sum assignment solves the problem in \(\mathcal{O}(n^3)\) time, but we can do better. If the patients and kidneys are sorted, we can rewind time and loop backwards through the kidneys. The result is a simple algorithm that is equivalent to exploiting the sparsity pattern in the matrix \(M\) above. This algorithm allows us to optimally assign millions of kidneys without computational difficulties.
We start with the last kidney and assign it to the patient who is still alive at its arrival time and has the largest \(a_i\). Then we move to the next-to-last kidney, assign it in the same way among the patients who have not yet been assigned a kidney, and continue looping in reverse order.
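A sketch of this backward pass in Python might look as follows. It keeps the eligible patients in a heap, which adds a logarithmic factor on top of the sorting, and the function name and interface are made up for illustration.

import heapq

def assign_perfect_information(patients, arrival_times):
    """Greedy backward assignment when all arrival times are known.

    patients      -- list of (b_i, a_i) pairs
    arrival_times -- kidney arrival times t_j, sorted in increasing order
    """
    # Patients sorted by b_i, so they can be pushed onto the heap as they become eligible
    by_lifetime = sorted(range(len(patients)), key=lambda i: patients[i][0])
    heap = []        # max-heap over a_i, implemented with negated keys
    assignment = {}  # kidney index -> patient index
    total_utility = 0
    idx = len(by_lifetime) - 1

    for j in reversed(range(len(arrival_times))):
        # Patients with b_i > t_j are still alive when kidney j arrives
        while idx >= 0 and patients[by_lifetime[idx]][0] > arrival_times[j]:
            i = by_lifetime[idx]
            heapq.heappush(heap, (-patients[i][1], i))
            idx -= 1
        if heap:  # Give kidney j to the most valuable eligible, unassigned patient
            neg_a, i = heapq.heappop(heap)
            assignment[j] = i
            total_utility += -neg_a
    return total_utility, assignment

# The three-patient instance above: patients 2 and 3 are saved, utility 5 + 8 = 13
assert assign_perfect_information([(1, 1), (2, 5), (3, 8)], [1.5, 2.5]) == (13, {0: 1, 1: 2})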
The figure below shows a problem instance with ten patients and ten kidneys, along with the optimal solution when perfect information about the future is available. Green bars represent patients saved, green vertical lines represent kidneys used, and green horizontal lines represent the optimal assignment. Notice that even in the optimal solution below, three patients die and three kidneys are unused.
A simulation heuristic
Consider again the simple case with three patients, but this time we only know that a kidney arrived at \(t=1/4\). Apart from the arrival time of the first kidney, the future is uncertain. To which patient should the newly arrived kidney be assigned?
We can create a simulation heuristic by making use of the solution to the perfect information problem. At \(t=1/4\) we tentatively assign the kidney to the first patient, remove that patient, and simulate possible futures that might occur. We repeat this procedure for the second and third patients. In the limit of infinitely many simulations, the observed frequency of any particular future matches the true probability of that future occurring in the real world.
The outcomes of such a simulation are shown in the left-most subfigure below.
Denote the optimal utility achieved in a simulation when patient \(i\) gets the first kidney by \(x_{i}\). Clearly \(x_{i}\) is a random variable, and we draw a sample when we simulate a full set of future events. We want to choose \(i\) to maximize \(\mathbb{E}[x_{i}]\), where the expectation is taken over all possible futures.
The random variable \(x_{i}\) itself cannot be assumed to be normally distributed. However, by the central limit theorem the sample mean of \(x_{i}\) over \(s\) simulations is approximately normally distributed, and its standard error quantifies our uncertainty about \(\mathbb{E}[x_{i}]\). The more simulations \(s\) we run, the smaller the standard error, and the better our estimate of \(\mathbb{E}[x_{i}]\) becomes. This is shown in the middle and right-most sub-figures above. Here, patient \(2\) gets the kidney since \(i=2\) maximizes the estimated \(\mathbb{E}[x_{i}]\).
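Here is a self-contained sketch of the simulation heuristic, assuming \(\lambda = 1\) and using linear sum assignment to solve each simulated perfect-information sub-problem. The function names, the random seed, and the number of simulations are arbitrary choices for illustration.

import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)

def sample_arrival_times(horizon, lam=1.0):
    """Sample kidney arrival times on (0, horizon] from a Poisson process."""
    times, t = [], 0.0
    while True:
        t += rng.exponential(1.0 / lam)  # exponential inter-arrival times
        if t > horizon:
            return times
        times.append(t)

def perfect_information_utility(patients, arrival_times):
    """Optimal utility when all arrival times are known (linear sum assignment)."""
    if not patients or not arrival_times:
        return 0.0
    M = np.array([[a if t < b else 0 for t in arrival_times] for (b, a) in patients])
    rows, cols = linear_sum_assignment(M, maximize=True)
    return M[rows, cols].sum()

def simulation_heuristic(patients, t, num_simulations=1000):
    """Estimate E[x_i] for each patient i tentatively given a kidney arriving at t."""
    estimates = []
    for i, (b_i, a_i) in enumerate(patients):
        if b_i <= t:  # patient i is already dead and cannot receive the kidney
            estimates.append(float("-inf"))
            continue
        # Remove patient i, shift the clock to t, and drop patients who have died
        others = [(b - t, a) for j, (b, a) in enumerate(patients) if j != i and b > t]
        horizon = max((b for b, _ in others), default=0.0)
        utilities = [
            a_i + perfect_information_utility(others, sample_arrival_times(horizon))
            for _ in range(num_simulations)
        ]
        estimates.append(float(np.mean(utilities)))
    return estimates

# Three patients and a kidney arriving at t = 1/4
print(simulation_heuristic([(1, 1), (2, 5), (3, 8)], t=0.25))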
An exact algorithm
Consider again the three-patient problem instance shown below, \([(1,1), (2, 5), (3, 8)]\). What is the expected utility \(\mathbb{E} [(1,1), (2, 5), (3, 8)]\)?
We can design a recursive, exact algorithm by evaluating the time intervals in order. First consider the time interval \(t \in [0, 1]\), shown in gray in the figure above. There are four cases to consider, depending on how many kidneys \(k\) arrive in this interval:
- \(k=0\) kidneys arrive. Patient \(1\) dies, we increment time by one unit, and solve the sub-problem \([(1, 5), (2, 8)]\), i.e., we compute \(\mathbb{E}[(1, 5), (2, 8)]\).
- \(k=1\) kidney arrives. For each patient \(i\) that can receive the kidney, we tentatively assign it to that person. This immediately gives us utility \(a_i\). Then we increment time by one unit and solve the sub-problem to obtain the expected future utility. The optimal expected utility in this case is:
\begin{align*} \max \left\{ 1 + \mathbb{E} [(1, 5), (2, 8)], 5 + \mathbb{E} [(2, 8)], 8 + \mathbb{E} [(1, 5)] \right\} \end{align*}
- \(k=2\) kidneys arrive. This is equivalent to the case above, but we try every assignment of two kidneys to three patients and compute the maximum.
- \(k \geq 3\) kidneys arrive. We save every patient and obtain a utility of \(1 + 5 + 8\).
We can use the Poisson distribution to compute the probability of each of the cases above: with rate \(\lambda\), the number of kidneys arriving in a unit-length interval is Poisson distributed with mean \(\mu = \lambda \cdot 1\). To compute the expected value, each case is weighted by its probability \(\operatorname{poisson}(k)\), with the last case receiving the tail probability \(P(k \geq 3)\). The final expected utility becomes
\begin{align*} \mathbb{E} [(1,1), (2, 5), (3, 8)] = \sum_{k=0}^{2} \operatorname{poisson}(k) \, u_k + P(k \geq 3) \left(1 + 5 + 8\right), \end{align*}
where \(u_k\) denotes the optimal expected utility in the case where exactly \(k\) kidneys arrive.
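For this instance, with \(\lambda = 1\) and a unit-length interval, the four case probabilities can be computed with scipy, just as in the appendix code below:

from scipy.stats import poisson

mu = 1.0  # lambda times the length of the time interval [0, 1]
case_probabilities = [poisson.pmf(k=k, mu=mu) for k in range(3)]  # k = 0, 1, 2
case_probabilities.append(1 - poisson.cdf(k=2, mu=mu))            # k >= 3
print(case_probabilities)  # roughly [0.368, 0.368, 0.184, 0.080]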
Let’s examine how many cases are evaluated:
- The first case above with \(k=0\) examines the allocation \((0, 0, 0)\).
- The second case \(k=1\) examines three allocations \((1, 0, 0)\), \((0, 1, 0)\) and \((0, 0, 1)\).
- The third case \(k=2\) examines three allocations \((0, 1, 1)\), \((1, 0, 1)\) and \((1, 1, 0)\).
- The fourth case \(k \geq 3\) examines the allocation \((1, 1, 1)\).
With \(p\) patients, we examine \(2^p\) sub-problems in total; in other words, the algorithm runs in exponential time. Two strategies can immediately be used to speed it up:
- Caching the results, so identical sub-problems are not solved more than once.
- Pruning the search tree. Consider a case \(k\) with an associated set of sub-problems. An upper bound on the utility of each sub-problem is \(\sum_i a_i\), taken over the patients remaining in that sub-problem. When applying the \(\max\) function over the sub-problems of a case \(k\), there is no need to recurse on a sub-problem if its immediate assignment utility plus this upper bound is lower than the best expected value found so far within the same case \(k\).
Summary
When a new kidney arrives at time \(t\), we tentatively assign the kidney to the first patient, remove that patient, and compute the immediate utility of this tentative assignment plus the expected future utility. We repeat this procedure for all patients. The patient whose assignment maximizes the immediate utility plus the expected future utility is given the kidney.
To compute the future expected utility we investigated two approaches:
- A simulation heuristic that simulates many possible futures and solves the perfect information case.
- An exact recursive algorithm that is slower, but computes the expected utility exactly.
The simulation heuristic is biased towards higher utilities than we can actually expect, since knowing the future lets us make more informed decisions than we can make under uncertainty. We now compare the algorithms on the problem instance shown below.
- The simulation heuristic yields expected utilities \([12.11, 12.87, 12.52]\).
- The exact algorithm yields expected utilities \([11.7, 12.56, 12.31]\).
Both algorithms suggest that the second patient should receive the kidney. However, the simulation heuristic yields higher expected utilities for all three assignments. This is because in a perfect information environment we can plan ahead, while in the original problem the future is uncertain and we must live with our choices.
Appendix: Code for the exact algorithm
The exact algorithm can look something like this in Python 3.7.
import functools
import itertools
from scipy.stats import poisson
import numpy as np
LAMBDA = 1
@functools.lru_cache(maxsize=2 ** 12)
def expected_utility(problem: tuple):
"""Compute expected value of a problem ((b_i, a_i), ...)."""
assert all(b_i > 0 for (b_i, a_i) in problem), "Every patient must be alive"
global LAMBDA # Rate at which kidneys arrive
# ========== SETUP AND BASE CASES ========================
# Base case: 0 patients are alive -> return utility 0
if not problem:
return 0
    # Base case: 1 patient is alive -> return expected utility
time_interval = min(b for (b, a) in problem)
mu = LAMBDA * time_interval # Scale 'mu' in Poisson
num_patients = len(problem)
if num_patients == 1:
prob_nonzero_kidneys = 1 - poisson.cdf(k=0, mu=mu)
return problem[0][1] * prob_nonzero_kidneys
# ========== SETUP VARIABLES FOR GENERAL COMPUTATION =====
mu = LAMBDA * time_interval
# Lookup for probabilities. The last entry is prob >= num_patients
probabilities = [poisson.pmf(k=p, mu=mu) for p in range(num_patients + 1)]
probabilities[num_patients] = 1 - poisson.cdf(k=num_patients - 1, mu=mu)
assert np.allclose(np.sum(probabilities), 1)
# Mapping: k -> utilities under allocations when 'k' kidneys arrive
subproblem_results = [[] for _ in range(num_patients + 1)]
best_expected_value = [0 for _ in range(num_patients + 1)]
# ========== LOOP OVER EVERY POSSIBLE ALLOCATION =========
for allocation in itertools.product([0, 1], repeat=num_patients): # count binary
k = sum(allocation) # Number of kidneys
# Utility of assignment in this time interval (not looking into the future)
assigned = (problem[i] for (i, alloc) in enumerate(allocation) if alloc)
assignment_utility = sum(a_i for (b_i, a_i) in assigned)
# Create a subproblem. First remove all allocated patients
sub_problem = (problem[i] for (i, a) in enumerate(allocation) if not a)
# Remove patients who die in this time interval, then increment time
sub_problem = ((b_i, a_i) for (b_i, a_i) in sub_problem if b_i > time_interval)
sub_problem = tuple((b_i - time_interval, a_i) for (b_i, a_i) in sub_problem)
# Compute the maximal utility attainable for the sub-problem. Used for pruning
subproblem_max_utility = sum(a_i for (b_i, a_i) in sub_problem)
# If the maximal attainable utility is lower than the max expected utility
# seen on a sub-problem of the same size, skip the computation.
if assignment_utility + subproblem_max_utility < best_expected_value[k]:
continue
# Pruning did not happen. Recurse on sub-problem and update values
subproblem_utility = expected_utility(sub_problem)
new_best_expected = assignment_utility + subproblem_utility
best_expected_value[k] = max(new_best_expected, best_expected_value[k])
        # Record the immediate assignment utility plus the expected subproblem utility
subproblem_results[k].append(assignment_utility + subproblem_utility)
# ========== WEIGHT ALLOCATIONS BY PROBABILITIES =========
# Return the utility = sum_i p(i) * max(subproblems with k=i)
return sum(
prob * max(sub_result)
for (sub_result, prob) in zip(subproblem_results, probabilities)
)
# Two examples
problem = ((1, 1), (1, 5), (2, 5), (3, 80), (4, 4), (5, 1))
assert expected_utility(problem) == 84.36426057278726
problem = ((1, 1), (2, 5), (3, 8))
assert expected_utility(problem) == 11.208735340059595