patternMinor
Precise algorithm for finding higher order derivatives
Viewed 0 times
orderderivativesalgorithmhigherforfindingprecise
Problem
I'm trying to make an algorithm that finds the first 10 or so terms of a function's Taylor series, which requires finding the nth derivative of the function for the nth term. It's easy to implement derivatives by following the definition of the derivative:
$$f'(x) = \lim_{h\to0}\dfrac{f(x+h)-f(x)}{h}$$
implemented here in Python:
The value seems to be even closer to the actual value of the derivative if we define it like this:
which just returns the average of
For higher order derivatives, I implemented a simple recursive function:
I tested the higher order derivatives of $f$ at $1$, where $f(x)=x^9$, and as can be proved by induction,
$$\dfrac{d^n}{dx^n}(x^k)=\dfrac{k!}{(k - n)!}x^{k-n}$$
Therefore, the nth derivative of $f$ at $1$ is $\dfrac{9!}{(9 - n)!}$.
Here are the values returned by the function for n ranging from 0 to 9:
As you can see, the values are waaaay off for $n$ greater than $5$.
What can I do to get closer to the actual values? And is there an algorithm for this that doesn't have $O(2^n)$ performance like mine?
$$f'(x) = \lim_{h\to0}\dfrac{f(x+h)-f(x)}{h}$$
implemented here in Python:
dx = 0.001
def derivative(f, x):
return (f(x + dx) - f(x)) / dxThe value seems to be even closer to the actual value of the derivative if we define it like this:
dx = 0.001
def derivative(f, x):
return (f(x + dx) - f(x - dx)) / (2 * dx)which just returns the average of
(f(x + dx) - f(x)) / dx and (f(x) - f(x - dx)) / dx.For higher order derivatives, I implemented a simple recursive function:
dx = 0.001
def nthDerivative(f, n, x):
if n == 0:
return f(x)
return (derivative(f, n - 1, x + dx) - derivative(f, n - 1, x - dx)) / (2 * dx)I tested the higher order derivatives of $f$ at $1$, where $f(x)=x^9$, and as can be proved by induction,
$$\dfrac{d^n}{dx^n}(x^k)=\dfrac{k!}{(k - n)!}x^{k-n}$$
Therefore, the nth derivative of $f$ at $1$ is $\dfrac{9!}{(9 - n)!}$.
Here are the values returned by the function for n ranging from 0 to 9:
n Value Intended value
-----------------------------------
0 1.000 1
1 9.000 9
2 72.001 72
3 504.008 504
4 3024.040 3024
5 15120.252 15120
6 60437.602 60480
7 82298.612 181440
8 32278187.177 362880
9 95496943657.736 362880As you can see, the values are waaaay off for $n$ greater than $5$.
What can I do to get closer to the actual values? And is there an algorithm for this that doesn't have $O(2^n)$ performance like mine?
Solution
The first thing you should understand is why central differencing gives you a more precise solution.
Consider the Taylor expansion of $f$ around $x$:
$$f(x + h) = f(x) + h f'(x) + \frac{1}{2} h^2 f''(x) + \frac{1}{3!} h^3 f'''(x) \cdots$$
Then:
$$\frac{f(x+h) - f(x)}{h} = f'(x) + \frac{1}{2} h f''(x) + \frac{1}{3!} h^2 f'''(x)\cdots$$
That is:
$$f'(x) = \frac{f(x+h) - f(x)}{h} + O(h)$$
However:
$$f(x - h) = f(x) - h f'(x) + \frac{1}{2} h^2 f''(x) - \frac{1}{3!} h^3 f'''(x) \cdots$$
Therefore:
$$f(x + h) - f(x-h) = 2 h f'(x) + \frac{2}{3!} h^3 f'''(x) \cdots$$
And so:
$$\frac{f(x + h) - f(x-h)}{2h} = f'(x) + O(h^2)$$
With central differencing, the even terms of the Taylor series cancel, and you get a second-order approximation instead of a first-order approximation.
(Note that in real-world problems, central differencing is not always possible for many reasons. In fluid dynamics, for example, you do not want to do a central differencing approximation across a shock, so you generally only approximate derivatives using values that are "upwind" of the point of interest. I digress.)
You can think of approximating high-order derivatives as solving for the coefficients of the Taylor expansion. To find the first derivative, you solve for the first two coefficients. Since there are two unknowns, you need two equations. To calculate the second derivative as well, you need a third equation at least.
This doesn't seem like a game that you can win. If you obtain more points by using smaller values of $h$, then numerical evaluation becomes more unstable; if $f(x)$ and $f(x+h)$ are close in value, then $f(x+h) - f(x)$ can easily suffer from catastrophic cancellation. If you use larger values of $h$, then the $O(h)$ or $O(h^2)$ error term is larger. For a given $h$, there are only two points at distance $h$ from $x$ on the real line.
However, there are as many points (at a distance $h$ from $x$) as you want in the complex plane. So if $f$ is holomorphic, and $f(x) = x^9$ is holomorphic, you can obtain some extremely high-precision derivative estimates by picking points on a circle around $x$.
Back to central differencing for a moment, this works by eliminating the even terms of the Taylor approximation. You may wonder if it's possible to extend this to eliminate higher-order terms.
Suppose that $A_k(h)$ is a $k$th-order approximation to some desired value $L$. That is, there are some constants $c_i$ such that:
$$A_k(h) = L + c_k h^k + c_{k+1} h^{k+1} + c_{k+2} h^{k+2} + \cdots$$
Let's look at what happens if you halved $h$:
$$A_k(\frac{h}{2}) = L + \frac{1}{2^k} c_k h^k + \frac{1}{2^{k+1}} c_{k+1} h^{k+1} + \frac{1}{2^{k+2}} c_{k+2} h^{k+2} + \cdots$$
Then:
$$2^k A_k(\frac{h}{2}) - A(h) = \left(2^k - 1\right) L + O(h^{k+1})$$
And so this:
$$A_{k+1}(h) = \frac{2^k A_k(\frac{h}{2}) - A_k(h)}{2^k - 1}$$
is a $k+1$th-order approximation. And, of course, you can iterate to find higher order approximations.
Even though this technique uses smaller and smaller step sizes, you can control the numeric issues by using a factor $t$ other than 2:
$$R(h,t) = \frac{t^k A(\frac{h}{t}) - A(h)}{t^k - 1}$$
This is known as Richardson extrapolation, and it turns out to be extremely useful in estimating derivatives.
Consider the Taylor expansion of $f$ around $x$:
$$f(x + h) = f(x) + h f'(x) + \frac{1}{2} h^2 f''(x) + \frac{1}{3!} h^3 f'''(x) \cdots$$
Then:
$$\frac{f(x+h) - f(x)}{h} = f'(x) + \frac{1}{2} h f''(x) + \frac{1}{3!} h^2 f'''(x)\cdots$$
That is:
$$f'(x) = \frac{f(x+h) - f(x)}{h} + O(h)$$
However:
$$f(x - h) = f(x) - h f'(x) + \frac{1}{2} h^2 f''(x) - \frac{1}{3!} h^3 f'''(x) \cdots$$
Therefore:
$$f(x + h) - f(x-h) = 2 h f'(x) + \frac{2}{3!} h^3 f'''(x) \cdots$$
And so:
$$\frac{f(x + h) - f(x-h)}{2h} = f'(x) + O(h^2)$$
With central differencing, the even terms of the Taylor series cancel, and you get a second-order approximation instead of a first-order approximation.
(Note that in real-world problems, central differencing is not always possible for many reasons. In fluid dynamics, for example, you do not want to do a central differencing approximation across a shock, so you generally only approximate derivatives using values that are "upwind" of the point of interest. I digress.)
You can think of approximating high-order derivatives as solving for the coefficients of the Taylor expansion. To find the first derivative, you solve for the first two coefficients. Since there are two unknowns, you need two equations. To calculate the second derivative as well, you need a third equation at least.
This doesn't seem like a game that you can win. If you obtain more points by using smaller values of $h$, then numerical evaluation becomes more unstable; if $f(x)$ and $f(x+h)$ are close in value, then $f(x+h) - f(x)$ can easily suffer from catastrophic cancellation. If you use larger values of $h$, then the $O(h)$ or $O(h^2)$ error term is larger. For a given $h$, there are only two points at distance $h$ from $x$ on the real line.
However, there are as many points (at a distance $h$ from $x$) as you want in the complex plane. So if $f$ is holomorphic, and $f(x) = x^9$ is holomorphic, you can obtain some extremely high-precision derivative estimates by picking points on a circle around $x$.
Back to central differencing for a moment, this works by eliminating the even terms of the Taylor approximation. You may wonder if it's possible to extend this to eliminate higher-order terms.
Suppose that $A_k(h)$ is a $k$th-order approximation to some desired value $L$. That is, there are some constants $c_i$ such that:
$$A_k(h) = L + c_k h^k + c_{k+1} h^{k+1} + c_{k+2} h^{k+2} + \cdots$$
Let's look at what happens if you halved $h$:
$$A_k(\frac{h}{2}) = L + \frac{1}{2^k} c_k h^k + \frac{1}{2^{k+1}} c_{k+1} h^{k+1} + \frac{1}{2^{k+2}} c_{k+2} h^{k+2} + \cdots$$
Then:
$$2^k A_k(\frac{h}{2}) - A(h) = \left(2^k - 1\right) L + O(h^{k+1})$$
And so this:
$$A_{k+1}(h) = \frac{2^k A_k(\frac{h}{2}) - A_k(h)}{2^k - 1}$$
is a $k+1$th-order approximation. And, of course, you can iterate to find higher order approximations.
Even though this technique uses smaller and smaller step sizes, you can control the numeric issues by using a factor $t$ other than 2:
$$R(h,t) = \frac{t^k A(\frac{h}{t}) - A(h)}{t^k - 1}$$
This is known as Richardson extrapolation, and it turns out to be extremely useful in estimating derivatives.
Context
StackExchange Computer Science Q#141184, answer score: 3
Revisions (0)
No revisions yet.