The dynamic lot size problem
 3. October 2021
 #algorithms, #optimization
This article describes my implementation of the fastest known algorithm for the dynamic lot size problem (also called the single-item uncapacitated lot-sizing problem).
First we describe the problem and show an example. Then we implement an algorithm from a 1958 paper^{1}, successively tuning the implementation from \(\mathcal{O}( T^4 )\) to \(\mathcal{O}( T^2 )\). Doing the math and implementing the algorithm properly takes the runtime from 2 minutes to 2 milliseconds on a problem of size \(T=500\), a speedup by a factor of 60 000.
For over thirty years, the \(\mathcal{O}( T^2 )\) approach was the fastest known algorithm. However, in 1992 a new paper^{2} described an algorithm that runs in \(\mathcal{O}( T )\) time. We implement this algorithm as well.
The full code, with comments and all, is published on GitHub Gist:
- wagner.py: The \(\mathcal{O}(T^2)\) algorithm (often scales linearly in practice)
- wagelmans.py: The \(\mathcal{O}(T)\) algorithm
Problem description
In the dynamic lot size problem we have \(T\) periods indexed by \(t=0, 1, 2, \ldots, T-1\). In each period we have a product demand \(d_t \geq 0\), an order cost \(o_t \geq 0\), and a holding cost \(h_t \geq 0\).
We must fill the demand \(d_t\) in each period by placing product orders \(x_t\) in each period. If we order products in period \(t\), we incur an order cost \(o_t\) (which is constant no matter how much or little we order, as long as \(x_t > 0\)). If we still have products going into the next period \(t+1\), we pay a holding cost \(h_t\) to hold products from period \(t\) to period \(t+1\).
Let’s look at a simple example:
Let us compute the cost of a proposed solution \(x = [12, 0, 0]\). This represents placing an order of \(12\) items in the first period, to cover all periods. We pay the order cost \(o_0 = 5\) in period \(0\). Going into period \(1\) we hold \(x_0 - d_0 = 12 - 1 = 11\) products, for a holding cost of \(h_0 = 2\) each, so a cost of \((x_0 - d_0) h_0 = 11 \cdot 2 = 22\) in total. In period \(t=1\) the demand is \(d_1 = 3\), so we are left with \(8\) products going into the last period. The holding cost going into the last period is \(8 \cdot 5 = 40\). The total cost of the solution \(x = [12, 0, 0]\) is \(5 + 22 + 40 = 67\).
However, the optimal solution is \(x^{\star} = [4, 0, 8]\). You can verify that the cost of \(x^{\star}\) is \(16\).
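To make the example concrete, here is a small sketch of a function that evaluates the cost of a candidate solution. The values \(o_1 = 10\) and \(h_2 = 1\) below are assumptions on my part; they are not pinned down by the example above, since neither solution orders in period \(1\) or holds stock out of the last period:

```python
def solution_cost(x, d, o, h):
    """Total cost of ordering x[t] units in each period t."""
    cost, inventory = 0, 0
    for t in range(len(d)):
        if x[t] > 0:
            cost += o[t]  # Fixed order cost, independent of quantity
        inventory += x[t] - d[t]  # Stock carried into period t + 1
        if t < len(d) - 1:
            cost += h[t] * inventory  # Holding cost into the next period
    return cost

d = [1, 3, 8]   # Demands
o = [5, 10, 5]  # Order costs (o[1] is an assumed value)
h = [2, 5, 1]   # Holding costs (h[2] is an assumed value)
print(solution_cost([12, 0, 0], d, o, h))  # 67
print(solution_cost([4, 0, 8], d, o, h))   # 16
```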
The road ahead
We’ll start by implementing a very naive recursion. Incrementally we’ll study the problem structure and improve the code. As a result the solution time for a problem with \(T=500\) periods will drop dramatically. This is shown below. Note that (1) the time includes generating a random problem and (2) the numbers are very hardware specific, so they should be taken with a grain of salt.
>>> %timeit time_algorithm(wagner_whitin_v1, periods=500)
1min 54s ± 2.96 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> %timeit time_algorithm(wagner_whitin_v2, periods=500)
6.75 s ± 223 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> %timeit time_algorithm(wagner_whitin_v3, periods=500)
55.6 ms ± 1.33 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
>>> %timeit time_algorithm(wagner_whitin_v4, periods=500)
2.02 ms ± 14.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> %timeit time_algorithm(wagelmans_v3, periods=500)
2.87 ms ± 47.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
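The helper time_algorithm is not shown in this article. A hypothetical sketch (the instance sizes and cost distributions below are my assumptions, not the actual benchmark code) might look like this:

```python
import random

def time_algorithm(algorithm, periods=500):
    """Generate a random problem instance, then solve it with `algorithm`.

    The demand and cost distributions are assumed for illustration.
    """
    rng = random.Random()
    demands = [rng.randint(0, 10) for _ in range(periods)]
    order_costs = [rng.randint(1, 100) for _ in range(periods)]
    holding_costs = [rng.randint(1, 10) for _ in range(periods)]
    return algorithm(demands, order_costs, holding_costs)
```

Timing this with %timeit then measures problem generation plus the solve, matching note (1) above.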
The Wagner-Whitin dynamic programming algorithm
In a paper titled “Dynamic version of the economic lot size model”^{1}, a dynamic programming (DP) approach is explained. The recursion is given by the following equation:

\[
F(t) = \begin{cases}
F(t-1) & \text{if } d_t = 0 \\[4pt]
\min\limits_{0 \leq j \leq t} \left\{ o_j + \sum_{i=j-1}^{t} d_i \sum_{k=j}^{i-1} h_k + F(j-1) \right\} & \text{if } d_t > 0
\end{cases}
\qquad F(-1) := 0
\]

The value \(F(t)\) is the optimal (lowest) cost for a subproblem consisting of periods \(0, 1, \ldots, t\). The lowest cost of the entire problem is given by \(F(T-1)\). This recursion yields the minimal cost of the optimal solution. The actual solution \(x^{\star}\) can be obtained by adding some extra logic, but we’ll omit that logic in this article.
In words, the equation above reads something like this:

case (1):
    if the demand in this period is 0,
    then the minimal cost F(t) equals the
    previous minimal cost F(t-1)

case (2):
    if the demand in this period is > 0,
    then the minimal cost is given by the minimum of:

    (j=t)   the cost of ordering at j=t to cover the demand in [t],
            plus no holding cost,
            plus the cost of the optimal solution for earlier
            periods, given by F(j-1)

    (j=t-1) the cost of ordering at j=t-1 to cover the demand in [t-1, t],
            plus the holding cost from [t-1],
            plus the cost of the optimal solution for earlier
            periods, given by F(j-1)

    (j=t-2) the cost of ordering at j=t-2 to cover the demand in [t-2, t-1, t],
            plus the holding cost from [t-2, t-1],
            plus the cost of the optimal solution for earlier
            periods, given by F(j-1)
    ...
Attempt 1: An \(\mathcal{O}( T^4 )\) algorithm
We implement the recursive equation above as-is. There are four loops:

- Loop over every \(0 \leq t < T\) and compute \(F(t)\).
- To compute \(F(t)\), loop over every argument in the \(\min\) function.
- In every argument of the \(\min\) function there’s a double sum.

The total runtime is therefore \(\mathcal{O}( T^4 )\). In Python 3.7, a simple implementation can look like this:
def wagner_whitin_v1(demands, order_costs, holding_costs):
    d, o, h = demands, order_costs, holding_costs
    F = {-1: 0}  # Base case for recursion
    for t in range(len(demands)):
        if d[t] == 0:
            F[t] = F[t - 1]
        else:
            F[t] = min((
                # Cost of ordering at time j
                o[j]
                # Cost of covering demand in periods
                # from j to t with product ordered at j
                + sum(d[i] * sum(h[k] for k in range(j, i))
                      for i in range(j - 1, t + 1))
                # Cost of optimal solution prior to period j
                + F[j - 1]) for j in range(t + 1))
    return F[len(demands) - 1]
Attempt 2: An \(\mathcal{O}( T^3 )\) algorithm
Recall again the recursion:

\[
F(t) = \min_{0 \leq j \leq t} \left\{ o_j + \sum_{i=j-1}^{t} d_i \sum_{k=j}^{i-1} h_k + F(j-1) \right\}
\quad \text{when } d_t > 0
\]

The innermost sum \(\sum_{k=j}^{i-1} h_k\) is computed time and time again. If we can cache the partial sums, we’ll achieve a good speedup and get rid of a loop. We define

\[
h_{ij} := \sum_{k=i}^{j} h_k.
\]

The equation now becomes:

\[
F(t) = \min_{0 \leq j \leq t} \left\{ o_j + \sum_{i=j-1}^{t} d_i \, h_{j, i-1} + F(j-1) \right\}
\]
So far we have only made a syntactic change. In our code we store cumulative sums of the \(h\) vector to compute \(h_{ij}\) in \(\mathcal{O}( 1 )\) time. This brings the complexity down to \(\mathcal{O}( T^3 )\), and the code can look something like this:
class CumSumList:
    def __init__(self, elements):
        self.cumsums = [0] * len(elements)
        partial_sum = 0
        for i, element in enumerate(elements):
            partial_sum += element
            self.cumsums[i] = partial_sum

    def sum_between(self, i, j):
        """Compute sum: elements[i] + elements[i+1] + ... + elements[j]"""
        if i > j:
            return 0
        top = self.cumsums[j] if j < len(self.cumsums) else self.cumsums[-1]
        low = self.cumsums[i - 1] if i >= 1 else 0
        return top - low
def wagner_whitin_v2(demands, order_costs, holding_costs):
    d, o, h = demands, order_costs, holding_costs
    h_cumsum = CumSumList(h)
    F = {-1: 0}  # Base case for recursion
    for t in range(len(demands)):
        if d[t] == 0:
            F[t] = F[t - 1]
        else:
            F[t] = min((o[j] + F[j - 1]
                        + sum(d[i] * h_cumsum.sum_between(j, i - 1)
                              for i in range(j - 1, t + 1))
                        ) for j in range(t + 1))
    return F[len(demands) - 1]
Attempt 3: An \(\mathcal{O}( T^2 )\) algorithm
Let’s examine the double sum \(S_t(j) := \sum_{i=j-1}^{t} d_i \sum_{k=j}^{i-1} h_k\) again. We’ll change the order of summation, which is analogous to changing the order of integration in a double integral. Then we’ll use the same trick as above with cumulative sums.
Let’s look at a concrete example. When \(t=3\), \(j\) goes from \(3\) down to \(0\). Writing out the sums \(S_3(j)\) we find

\[
\begin{aligned}
S_3(3) &= 0 \\
S_3(2) &= d_3 h_2 \\
S_3(1) &= d_2 h_1 + d_3 (h_1 + h_2) \\
S_3(0) &= d_1 h_0 + d_2 (h_0 + h_1) + d_3 (h_0 + h_1 + h_2)
\end{aligned}
\]

The terms can be rearranged by pulling out the \(h\)-factors instead of the \(d\)-factors. This is equivalent to changing the order of summation in the double sum. When we rearrange the terms, we find:

\[
\begin{aligned}
S_3(3) &= 0 \\
S_3(2) &= h_2 d_3 \\
S_3(1) &= h_1 (d_2 + d_3) + h_2 d_3 \\
S_3(0) &= h_0 (d_1 + d_2 + d_3) + h_1 (d_2 + d_3) + h_2 d_3
\end{aligned}
\]
Just like drawing a graph is enlightening when reasoning about changing the order of integration in a double integral, a good drawing will help you convince yourself that this works. If you want to understand the double sum manipulation below, I highly suggest drawing it out.
Algebraically, changing the order of summation means that:

\[
S_t(j) = \sum_{i=j-1}^{t} d_i \sum_{k=j}^{i-1} h_k
       = \sum_{k=j}^{t-1} h_k \sum_{i=k+1}^{t} d_i
       = \sum_{k=j}^{t-1} h_k \, d_{k+1, t}
\]
As seen in the equation above, and in the concrete example of \(S_3(j)\), we obtain a recursive formula \(S_t(j) = h_j d_{j+1, t} + S_t(j+1)\). The base cases are important here, so we must define \(S_t(j=t+1) := 0\) and \(d_{ij} := 0\) when \(i > j\).
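As a sanity check of the algebra, the following snippet (with demand and holding-cost vectors chosen arbitrarily by me) verifies numerically that the rearranged sum agrees with the naive double sum:

```python
# Arbitrary example data for the check
d = [1, 3, 8, 2]
h = [2, 5, 1, 4]
t = 3

for j in range(t + 1):
    # Naive double sum: S_t(j) = sum_i d_i * sum_{k=j}^{i-1} h_k
    naive = sum(d[i] * sum(h[k] for k in range(j, i))
                for i in range(j + 1, t + 1))
    # Rearranged: S_t(j) = sum_{k=j}^{t-1} h_k * d_{k+1, t}
    rearranged = sum(h[k] * sum(d[k + 1:t + 1]) for k in range(j, t))
    assert naive == rearranged
```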
We are now able to compute the double sum in constant time, bringing the overall complexity down to \(\mathcal{O}( T^2 )\). The recursion looks like this, and it’s important that \(j\) decreases and that previously computed values of \(S_t(j)\) are stored:

\[
F(t) = \min_{0 \leq j \leq t} \left\{ o_j + S_t(j) + F(j-1) \right\},
\qquad S_t(j) = h_j d_{j+1, t} + S_t(j+1), \quad S_t(t+1) := 0
\]
A Python implementation of this recursion is given below. Notice how we update S_t in each iteration of the loop.
def wagner_whitin_v3(demands, order_costs, holding_costs):
    d, o, h = demands, order_costs, holding_costs
    d_cumsum = CumSumList(d)
    F = {-1: 0}  # Base case for recursion
    for t in range(len(demands)):
        if d[t] == 0:
            F[t] = F[t - 1]
        else:
            S_t = 0
            min_args = []
            for j in reversed(range(t + 1)):
                S_t += h[j] * d_cumsum.sum_between(j + 1, t)
                min_args.append(o[j] + S_t + F[j - 1])
            F[t] = min(min_args)
    return F[len(demands) - 1]
Attempt 4: Using the Planning Horizon Theorem
There exists a small modification that won’t bring down the asymptotic complexity, but it’s still worth mentioning because it can drastically speed up the execution time. In practice the algorithm will scale linearly when we implement the Planning Horizon Theorem^{1}.
Let \(t^{\star \star} \leq t^{\star} < t\). The theorem says that if \(j = t^{\star \star}\) achieved the minimum value when computing the \(\min\) function in the iteration for \(F(t^{\star})\), then when computing \(F(t)\) we only need to compute the \(\min\) function over \(j \geq t^{\star \star}\).

In other words, there’s never any reason to look further back into the past as we go further into the future. We short-circuit the \(\min\) function to range over \(t^{\star \star} \leq j \leq t\) instead of \(0 \leq j \leq t\) like so:
def wagner_whitin_v4(demands, order_costs, holding_costs):
    d, o, h = demands, order_costs, holding_costs
    d_cumsum = CumSumList(d)
    F = {-1: 0}  # Base case for recursion
    t_star_star = 0  # Highest period where minimum was achieved
    for t in range(len(demands)):
        if d[t] == 0:
            F[t] = F[t - 1]
        else:
            S_t = 0
            min_args = []
            for j in reversed(range(t_star_star, t + 1)):
                S_t += h[j] * d_cumsum.sum_between(j + 1, t)
                min_args.append(o[j] + S_t + F[j - 1])
            # Index of minimum value in list `min_args`
            argmin = min_args.index(min(min_args))
            # Update rule, where (t - argmin) = previous period of ordering
            t_star_star = max(t_star_star, t - argmin)
            F[t] = min_args[argmin]
    return F[len(demands) - 1]
Attempt 5: The Wagelmans \(\mathcal{O}( T )\) algorithm
Going from \(\mathcal{O}( T^2 )\) to \(\mathcal{O}( T )\) requires several clever insights. There’s no point in repeating the literature^{2}^{3} here, so I’ll summarize:
- Above we solved the problem by forward recursion. We computed \(F(t)\) given \(F(0), F(1), \ldots, F(t-1)\). We can also go backwards, computing \(F(t)\) given \(F(t+1), F(t+2), \ldots, F(T-1)\).
- The objective function in the backward recursion can be reformulated. We solve for a different function \(G(t)\) instead of \(F(t)\). This alternative formulation is found by expressing the problem as a Mixed Integer Program and from there changing the objective function.
- The resulting structure is a minimization over a set of affine functions of the form \(p_t d_{t, j} + G(j+1)\), where the minimization is over the index \(j\). By creating a data structure holding the lower convex envelope of previously seen points \((d_{0, k-1}, G(k))\), we can solve the minimization in \(\mathcal{O}(T \ln T )\) time using binary search. Finally, when the slopes \(p_t\) are a decreasing function of \(t\) (which they are in this problem), we can determine the minimum in \(\mathcal{O}(T)\) time.
The first bullet point above says that we can do a backward recursion. Here is the formulation, as well as the code. The square brackets in the equation below are Iverson brackets.

\[
F(t) = \min_{t \leq j < T} \left\{ o_t \, [\, d_t > 0 \text{ or } j \neq t \,] + \sum_{i=t}^{j-1} h_i \, d_{i+1, j} + F(j+1) \right\},
\qquad F(T) := 0
\]
An implementation of this recursion can look like this:
def wagelmans_v1(demands, order_costs, holding_costs):
    d, o, h = demands, order_costs, holding_costs
    d_cumsum = CumSumList(d)
    T = len(demands)
    F = {T: 0}
    for t in reversed(range(T)):
        F[t] = min(
            # Cost of ordering at time t. Do not order if d_t = 0 and j = t
            o[t] * ((d[t] > 0) or (t != j)) +
            # Cost of covering periods t to j (inclusive) by ordering at t
            sum(h[i] * d_cumsum.sum_between(i + 1, j) for i in range(t, j)) +
            # Cost of optimal solution for j + 1 and to the end
            F[j + 1] for j in range(t, T))
    return F[0]
The algorithm above is \(\mathcal{O}(T^3)\), so no progress has been made yet.
The second bullet point above says that we can reformulate the objective. To do so, we define \(p_t = \sum_{j=t}^{T-1} h_j\). The equivalent objective is

\[
G(t) = \min_{t \leq j < T} \left\{ o_t \, [\, d_t > 0 \text{ or } j \neq t \,] + p_t \, d_{t, j} + G(j+1) \right\},
\qquad G(T) := 0,
\]

and the original objective is recovered as \(F(0) = G(0) - \sum_{t=0}^{T-1} h_t \, d_{0, t}\).
We’re now back to \(\mathcal{O}(T^2)\). An implementation of the recursion can look like this:
from itertools import accumulate

def wagelmans_v2(demands, order_costs, holding_costs):
    d, o, h = demands, order_costs, holding_costs
    d_cumsum = CumSumList(d)
    p = list(reversed(list(accumulate(reversed(h)))))
    T = len(demands)
    F = {T: 0}
    for t in reversed(range(T)):
        F[t] = min(o[t] * ((d[t] > 0) or (t != j)) +
                   p[t] * d_cumsum.sum_between(t, j) +
                   F[j + 1] for j in range(t, T))
    # Transform objective function back to original formulation
    return F[0] - sum(h[t] * d_cumsum.sum_between(0, t) for t in range(T))
To implement the \(\mathcal{O}(T)\) algorithm we’ll need two data structures: CumSumList and ConvexEnvelope.

The CumSumList lets us compute \(d_{ij} := \sum_{k=i}^{j} d_k\) in \(\mathcal{O}(1)\) time by storing cumulative sums. This data structure was introduced above.

The ConvexEnvelope holds a lower convex envelope of points \((x_i, y_i)\), where \(x_{t+1} \leq x_{t}\). In other words, the points are added in an order such that the \(x\)-values are non-increasing. Adding a new point will trigger the removal of all previous points that fall above the lower convex envelope when the new point is added. In a sequence of \(n\) additions at most \(n-1\) points are removed, so the amortized complexity of adding a point is \(\mathcal{O}(1)\).

Finally we need a function optimal_index that returns the index of the optimal point in the ConvexEnvelope. This also has amortized complexity \(\mathcal{O}(1)\). If the slopes \(p_t\) in the affine functions \(p_t d_{t, j} + G(j+1)\) were not an increasing or decreasing function of \(t\), then an \(\mathcal{O}(\ln T)\) binary search would be used here.
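The real implementations live in the Gist; below is a minimal sketch of what ConvexEnvelope and optimal_index could look like, consistent with the description above. The exact method names, the cross-product pruning test, and the forward-walking pointer are my own reconstruction, not necessarily the published code:

```python
class ConvexEnvelope:
    """Lower convex envelope of points added with non-increasing x-values.

    Each point remembers its insertion index, so reported indices stay
    stable even when points are pruned from the envelope.
    """

    def __init__(self, points):
        self.points = [(x, y, i) for i, (x, y) in enumerate(points)]
        self.counter = len(self.points)  # Next insertion index
        self.pointer = 0  # Position of last optimum (slopes are monotone)

    def add(self, x, y):
        # Pop previous points lying on or above the chord from the
        # second-to-last point to the new point: never optimal again.
        while len(self.points) >= 2:
            x1, y1, _ = self.points[-2]
            x2, y2, _ = self.points[-1]
            if (y - y1) * (x2 - x1) - (y2 - y1) * (x - x1) >= 0:
                self.points.pop()
            else:
                break
        self.points.append((x, y, self.counter))
        self.counter += 1
        self.pointer = min(self.pointer, len(self.points) - 1)


def optimal_index(envelope, slope):
    """Insertion index of the point minimizing slope * x + y.

    Assumes successive calls use non-decreasing slopes, so the optimum
    only ever moves forward along the envelope (amortized O(1)).
    """
    pts = envelope.points
    i = envelope.pointer
    while (i + 1 < len(pts)
           and slope * pts[i + 1][0] + pts[i + 1][1]
               <= slope * pts[i][0] + pts[i][1]):
        i += 1
    envelope.pointer = i
    return pts[i][2]
```

Queried with the non-decreasing slopes \(p_t\) (as \(t\) runs from \(T-1\) down to \(0\)), the pointer only moves forward, which gives the amortized \(\mathcal{O}(1)\) behaviour described above.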
With these three pieces in place, the final code can look something like this:
def wagelmans_v3(demands, order_costs, holding_costs):
    d, o, h = demands, order_costs, holding_costs
    d_cumsum = CumSumList(d)
    p = list(reversed(list(accumulate(reversed(h)))))
    T = len(demands)
    G = {T: 0}
    effpoints = ConvexEnvelope([(d_cumsum.sum_between(0, T - 1), 0)])
    for t in reversed(range(T)):
        # Get the optimal index (as added to the ConvexEnvelope)
        index_effpoints = optimal_index(effpoints, slope=p[t])
        j = T - 1 - index_effpoints  # Convert index in data structure to j
        # Update minimal cost
        G[t] = o[t] + p[t] * d_cumsum.sum_between(t, j) + G[j + 1]
        # If d[t] == 0, an option is to not order at time t.
        # If this has a lower cost than ordering (as computed above):
        if (d[t] == 0) and G[t] > G[t + 1]:
            G[t] = G[t + 1]
        # Add the new point (x_t, y_t) = (d_{0, t-1}, G[t])
        effpoints.add(d_cumsum.sum_between(0, t - 1), G[t])
    return G[0] - sum(h[t] * d_cumsum.sum_between(0, t) for t in range(T))
Summary
It’s been over sixty years since the dynamic lot size problem was solved by dynamic programming. Solving the DP recursion naively has value, since it gives us a simple and correct algorithm to benchmark and test against. It’s always a good idea to implement a simple and correct algorithm first.
Examining the structure of the double sum allowed us to go from \(\mathcal{O}(T^4)\) to \(\mathcal{O}(T^2)\). Investigating the problem structure more deeply and utilizing the Planning Horizon Theorem made the practical run time linear, but the worst case was still \(\mathcal{O}(T^2)\). This was the best result for several decades.
In the early nineties an \(\mathcal{O}(T)\) algorithm was discovered by Wagelmans et al. I could not find this algorithm implemented anywhere, so I implemented it here. On a typical test case the Wagner-Whitin algorithm is around twenty percent faster, since Wagelmans’ algorithm requires more custom data structures and bookkeeping. However, on “bad” test cases Wagner-Whitin runs in \(\mathcal{O}(T^2)\) time, and Wagelmans’ algorithm will be faster.
As a final comment, there are problems that are similar in structure which can be solved by modifying the approach and code somewhat. For instance, the order cost might be a function of the number of products ordered, or the costs might be negative in some applications.
Notes

1. Harvey M. Wagner and Thomson M. Whitin, “Dynamic version of the economic lot size model,” Management Science, Vol. 5, pp. 89-96, 1958
2. Albert Wagelmans, Stan Van Hoesel and Antoon Kolen, “Economic Lot Sizing: An O(n log n) Algorithm That Runs in Linear Time in the Wagner-Whitin Case,” Operations Research, Vol. 40, Supplement 1: Optimization (Jan.-Feb., 1992), pp. S145-S156
3. Chapter 7 (Single-Item Uncapacitated Lot-Sizing) in the book “Production Planning by Mixed Integer Programming” by Yves Pochet and Laurence A. Wolsey