
# Investment calculator

- 4. December 2020 (modified 4. June 2023)
- #mathematics

Simulate how your savings will grow, based on historical data from the stock index S&P500.

From \( 1871 \) to \( 2020 \) the value of the stock index S&P500 rose from \( 4.74 \) to \( 3645.87 \). Over those \( 149 \) years, some years had high returns, while others had relatively low returns. Each simulation in the calculator starts at some random point in time and follows the S&P500 forward in time. The user can change the expected value \( \mu \) of returns, as well as the relative scale of the volatility \( s \). Here \( s = 0 \) means no volatility, \( s = 100 \) means historic volatility, and \( s = 200 \) means twice the historical volatility. Play around with the inputs to get a feel for the calculator.

Ticking the box “*Monthly interest*” compounds the monthly factor \( \mu^{1/12} \) twelve times a year
and adds \( a / 12 \) each month, where \( a \) is the yearly addition.
By default, sequences of consecutive historical periods are sampled, but ticking the box “*Independent years*” will
sample independent yearly returns instead of entire sequences.
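To make the effect of the “*Monthly interest*” box concrete, here is a minimal sketch of one simulated year with and without monthly compounding. The function names are hypothetical, \( \mu \) is written as a growth factor (for example `1.08` for 8 %), and the order "grow first, then add" is an assumption on my part; the page's actual code may differ.

```javascript
// Sketch only: hypothetical helpers, not the calculator's actual source.
// `mu` is the yearly growth factor (e.g. 1.08), `a` is the yearly addition.

// Yearly compounding: grow once, then add the full yearly amount.
function growOneYearYearly(value, mu, a) {
  return value * mu + a;
}

// Monthly compounding: grow by mu^(1/12) and add a/12, twelve times.
// Assumption: growth is applied before the monthly addition.
function growOneYearMonthly(value, mu, a) {
  const monthlyFactor = Math.pow(mu, 1 / 12); // twelve of these multiply to mu
  for (let month = 0; month < 12; month++) {
    value = value * monthlyFactor + a / 12;
  }
  return value;
}

console.log(growOneYearYearly(1000, 1.08, 120));
console.log(growOneYearMonthly(1000, 1.08, 120));
```

With no additions the two functions agree up to rounding. With positive additions and \( \mu > 1 \) the monthly variant ends slightly higher, because the earlier additions earn interest during the year.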

## Concrete cases

The calculator is easily reduced to many concrete and interesting special cases:

- **Bank account.** Set the number of simulations to \( n = 1 \), set the relative scale of volatility to \( s = 0 \). Finally, set the expected value of returns \( \mu \) equal to the interest rate of the bank account. Subtract inflation from the interest rate \( \mu \) if you like.
- **Retirement planning.** Set the yearly addition \( a \) to a negative number.
- **Mutual fund.** Set the number of simulations to \( n \geq 1000 \), set the relative scale of volatility to \( s = 100 \). If you believe that the past \( 50 \) years are indicative of future returns, you can use the fact that the S&P500 went from \( 86 \) in \( 1970 \) to \( 3634 \) in \( 2020 \). This gives a geometric average of \( \left( 3634 / 86 \right)^{1 / 50} = 1.078 \), so you can set \( \mu = 8 \). If you want to account for inflation, or do not believe that the past \( 50 \) years are representative, you are free to use some other number for \( \mu \) instead.
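The geometric average in the mutual fund case can be checked with a few lines of JavaScript. The helper name `geometricAverageGrowth` is made up for this illustration; the index values are the ones quoted above.

```javascript
// Sketch: average yearly growth factor between two index values.
// `geometricAverageGrowth` is a hypothetical helper name.
function geometricAverageGrowth(startValue, endValue, years) {
  return Math.pow(endValue / startValue, 1 / years);
}

const factor = geometricAverageGrowth(86, 3634, 50);
console.log(factor.toFixed(3)); // "1.078"
console.log(Math.round((factor - 1) * 100)); // 8, the value to enter for mu
```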

## Methodology and caveats

**Basic computations.** Given a start value \( v \), a yearly addition \( a \) and a sequence of
interest rates expressed as growth factors (for example \( 1.08 \) for \( 8 \% \))
\( \mu_1, \mu_2, \dots \), the computations for a single simulation are:
\begin{align*}
\text{Year $0$:} &\qquad v \\
\text{Year $1$:} &\qquad v \mu_1 + a \\
\text{Year $2$:} &\qquad (v \mu_1 + a) \mu_2 + a \\
\text{Year $n$:} &\qquad v (\mu_1 \mu_2 \cdots \mu_n) + a (1 + \mu_n + \mu_n \mu_{n-1} + \dots + \mu_n \mu_{n-1} \cdots \mu_2)
\end{align*}
where the sequence \( \mu_1, \mu_2, \mu_3, \dots \) is based on historical data from the S&P500.
For instance, if \( \mu_1 \) is the return in March \( 1993 \) relative to one year earlier,
then \( \mu_2 \) is the return in March \( 1994 \) relative to one year earlier, and so forth.
For each simulation, the starting point (March \( 1993 \) in the previous example) is randomly drawn.
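The recursion and the closed form for year \( n \) can be sketched as follows. This is an illustration under assumptions, not the page's actual code: the function names are hypothetical, the history is a made-up array of yearly growth factors (not real S&P500 data), and the starting point is drawn uniformly among the windows that fit.

```javascript
// Sketch only: hypothetical names, made-up data.

// Iterative form: multiply by the year's factor, then add the yearly addition.
// Returns the value at year 0, 1, ..., n.
function simulate(v, a, factors) {
  const path = [v];
  let value = v;
  for (const mu of factors) {
    value = value * mu + a;
    path.push(value);
  }
  return path;
}

// Closed form for year n: v * (mu_1 ... mu_n) + a * (1 + mu_n + mu_n mu_{n-1} + ...).
function closedForm(v, a, factors) {
  const product = factors.reduce((p, mu) => p * mu, 1);
  let sum = 0;
  let tail = 1; // running product mu_n * mu_{n-1} * ...
  for (let i = factors.length - 1; i >= 0; i--) {
    sum += tail;
    tail *= factors[i];
  }
  return v * product + a * sum;
}

// Draw a random starting point and follow the history forward for `years` steps.
// Assumption: uniform choice among all windows that fit inside the history.
function sampleWindow(history, years, random = Math.random) {
  const start = Math.floor(random() * (history.length - years + 1));
  return history.slice(start, start + years);
}

// Made-up yearly growth factors, NOT real S&P500 data.
const exampleHistory = [1.1, 0.9, 1.2, 1.05, 0.95];
const exampleFactors = sampleWindow(exampleHistory, 3);
console.log(simulate(100, 10, exampleFactors));
```

The last entry of the path returned by `simulate` should agree with `closedForm` up to floating-point rounding, which is a handy sanity check on the year \( n \) formula.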

**Caveats.** The simulations are based on historical data and might not be representative of the
future.
If you set “years” to a high value, the number of effective samples decreases.
In the extreme case of \( 149 \) years there are only a few samples, and these are highly correlated.
The same phenomenon affects the \( 5 \% \) and \( 95 \% \) percentiles, even when the number of years is not
very high.
While there are many samples for e.g. \( 60 \) years, by definition only \( 5 \% \) lie above the \( 95 \% \)
percentile,
and these are often highly correlated. This leads to “jagged lines” in the figure.

**Shifting and scaling.** When you input a relative scale of volatility \( s \)
and an interest rate \( \mu \),
the historical data \( \mathbf{x} \) must be shifted and scaled.
This is done in log-space, since interest rates are multiplicative.
The geometric average is defined as
\begin{align*}
\operatorname{geom}\left( \mathbf{x} \right) = \exp \left( \frac{1}{n} \sum_{i=1}^n \ln x_i \right).
\end{align*}
The shifting and scaling is done using the identity:
\begin{align*}
\operatorname{geom}\left(
\exp \left(
\left(
\ln ( \mathbf{x} ) - \mathbb{E}\left [ \ln ( \mathbf{x} ) \right ] \right) s + \ln \mu
\right)
\right) = \mu \qquad \forall \, s
\end{align*}
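The identity can be checked numerically with a small sketch. The names `geom` and `shiftAndScale` are hypothetical, the data is made up, and two conventions are assumed here: `mu` is a growth factor (for example `1.08`) and `s` is a fraction, so that `s = 1` corresponds to \( s = 100 \) in the calculator.

```javascript
// Sketch only: hypothetical helpers, not the calculator's actual source.

// Geometric average: exponential of the mean of the logarithms.
function geom(xs) {
  const meanLog = xs.reduce((sum, x) => sum + Math.log(x), 0) / xs.length;
  return Math.exp(meanLog);
}

// Shift and scale in log-space: center the logs, scale the spread by `s`,
// then shift so that the geometric average becomes `mu`.
// Assumption: `s` is a fraction (1 = historic volatility), `mu` a growth factor.
function shiftAndScale(xs, mu, s) {
  const logs = xs.map((x) => Math.log(x));
  const meanLog = logs.reduce((sum, l) => sum + l, 0) / logs.length;
  return logs.map((l) => Math.exp((l - meanLog) * s + Math.log(mu)));
}

// Made-up yearly growth factors, NOT real S&P500 data.
const sampleFactors = [1.2, 0.85, 1.1, 1.3, 0.95];
for (const s of [0, 1, 2]) {
  console.log(s, geom(shiftAndScale(sampleFactors, 1.08, s)).toFixed(6));
}
```

For every `s` the printed geometric average should equal `mu` up to rounding. With `s = 0` every transformed factor equals `mu`, which matches the no-volatility case.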
Inspect the source of this page for more details and the full JavaScript code.