HiveBrain v1.2.0

Computing Von Neumann Entropy Efficiently

Submitted by: @import:stackexchange-cs

Problem

The Von Neumann entropy $S$ of a density matrix $\rho$ is defined to be $S(\rho)= -\text{tr}(\rho \lg \rho)$. Equivalently, $S$ is the classical entropy of the eigenvalues $\lambda_k$ treated as probabilities. So $S(\rho) = -\sum_k \lambda_k \lg \lambda_k$.

Clearly the Von Neumann entropy can be computed by first extracting the eigenvalues and then doing the sum. But extracting eigenvalues is quite expensive. Is there a faster way to compute the entropy?

(For example, computing the sum of the eigenvalues is much easier than computing the individual eigenvalues. You just return the trace of the matrix. The product of the eigenvalues, the determinant, is also easier to compute than the individual eigenvalues. Is there also a shortcut for the entropy?)
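Those two shortcuts are easy to confirm numerically (a quick check of mine, not part of the original question):

```python
import numpy

# Any Hermitian matrix will do; this one happens to be a valid
# 2x2 density matrix (trace 1, positive eigenvalues).
rho = numpy.array([[0.75, 0.25],
                   [0.25, 0.25]])
eigenvalues = numpy.linalg.eigvalsh(rho)

# Sum of eigenvalues equals the trace; product equals the determinant.
assert abs(eigenvalues.sum() - numpy.trace(rho)) < 1e-12
assert abs(eigenvalues.prod() - numpy.linalg.det(rho)) < 1e-12
```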

Solution

The paper *Computing the Entropy of a Large Matrix* by Thomas P. Wihler, Bänz Bessire, and André Stefanov suggests approximating $x \lg x$ with a polynomial. You can then use traces of powers of the matrix to sum the result of applying that polynomial to each of the eigenvalues.

(The polynomial-via-powers-and-traces trick works because density matrices are Hermitian, so they have an orthonormal eigenbasis. Raising the matrix to a power raises each eigenvalue to that power, and the trace then gives the sum of the eigenvalues at that power.)
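Here's a minimal numeric check of that identity, $\text{tr}(\rho^p) = \sum_k \lambda_k^p$, on a small density matrix of my choosing:

```python
import numpy

# A valid 3x3 density matrix: Hermitian, positive semidefinite, trace 1.
rho = numpy.array([[0.5, 0.2, 0.0],
                   [0.2, 0.3, 0.0],
                   [0.0, 0.0, 0.2]])
eigs = numpy.linalg.eigvalsh(rho)

# trace(rho^p) equals the sum of the eigenvalues raised to the p-th power.
for p in (2, 3, 4):
    lhs = numpy.trace(numpy.linalg.matrix_power(rho, p))
    rhs = numpy.sum(eigs ** p)
    assert abs(lhs - rhs) < 1e-12
```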

Sounds fun. Not going to let the paper spoil it.

The derivatives of $f(x) = x \ln x$ are:

$$\begin{align}
f^0(x) &= x \ln x\\
f^1(x) &= 1 + \ln x\\
f^2(x) &= x^{-1} \\
f^3(x) &= -x^{-2}\\
f^4(x) &= 2 x^{-3}\\
f^5(x) &= -6 x^{-4}\\
&\;\;\vdots\\
f^k(x) &= (-1)^k \cdot (k-2)! \cdot x^{1-k} \quad (k \geq 2)
\end{align}$$

The derivatives don't always exist at $x=0$, but they do at $x=1$, so we can make a Taylor series from there:

$$\begin{align}
x \ln x &= \sum_{k=0}^\infty \frac{(x-1)^k}{k!} f^k(1)
\\
&=
(1 \ln 1)\frac{(x-1)^0}{0!} + (1 + \ln 1)\frac{(x-1)^1}{1!} + \sum_{k=2}^\infty \frac{(x-1)^k}{k!} (-1)^k 1^{1-k}(k-2)!
\\
&=
x - 1 + \sum_{k=2}^\infty \frac{(1-x)^k}{k(k-1)}
\end{align}$$
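Plugging in a few concrete values is a cheap way to convince yourself the series is right (my check, not in the original):

```python
import math

def series(x, terms=200):
    # x ln x  ~=  (x - 1) + sum_{k=2}^{terms} (1 - x)^k / (k (k - 1))
    total = x - 1
    for k in range(2, terms + 1):
        total += (1 - x) ** k / (k * (k - 1))
    return total

# With 200 terms the truncation error is far below 1e-9 on these inputs.
for x in (0.1, 0.5, 0.9):
    assert abs(series(x) - x * math.log(x)) < 1e-9
```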

So now we have a nice series $\frac{(1-x)^1}{-1} + \frac{(1-x)^2}{2} + \frac{(1-x)^3}{6} + \frac{(1-x)^4}{12} + \frac{(1-x)^5}{20} + ...$ that we can approximate by cutting it off at some index $c$. Except that would converge like $O(c^{-1})$ when $x=0$, since there the numerators are all stuck at $1$, so we should probably try to account for the cut-off tail:

$$\begin{align} x \ln x
&=
x - 1 + \sum_{k=2}^\infty \frac{(1-x)^k}{k(k-1)}
\\
&=
x - 1 + \sum_{k=2}^{c-1} \frac{(1-x)^k}{k(k-1)} + \sum_{k=c}^\infty \frac{(1-x)^k}{k(k-1)}
\\
&\approx
x - 1 + \sum_{k=2}^{c-1} \frac{(1-x)^k}{k(k-1)} + \sum_{k=c}^\infty \frac{(1-x)^{\Huge c}}{k(k-1)}
\\
&=
x - 1 + \left( \sum_{k=2}^{c-1} \frac{(1-x)^k}{k(k-1)} \right) + \frac{(1-x)^c}{c-1}
\end{align}$$

Here's a graph of the difference between $x \ln x$ and the approximation when $c=10$:

(Figure not preserved in this copy.)

And voila, just write code to evaluate that polynomial over the eigenvalues by using traces of the matrix powers:

import math
import numpy

def von_neumann_entropy(density_matrix, cutoff=10):
    x = numpy.asarray(density_matrix)
    one = numpy.identity(x.shape[0])
    base = one - x  # eigenvalues of base are 1 - lambda
    power = base @ base
    result = numpy.trace(base)
    for k in range(2, cutoff):
        result -= numpy.trace(power) / (k*k - k)
        power = power @ base
    # Tail correction: (1-x)^c / (c - 1).
    result -= numpy.trace(power) / (cutoff - 1)
    return result / math.log(2)  # convert from nats to bits
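As a sanity check (mine, not part of the original answer), here's a self-contained comparison of the trace-based series against the exact eigenvalue computation for a small mixed state; at cutoff 10 the two should agree to within a few hundredths of a bit:

```python
import math
import numpy

# A mixed state: diag(0.7, 0.2, 0.1) conjugated by a fixed random rotation.
rng = numpy.random.default_rng(0)
q, _ = numpy.linalg.qr(rng.standard_normal((3, 3)))
rho = q @ numpy.diag([0.7, 0.2, 0.1]) @ q.T

# Exact entropy from the eigenvalues, in bits.
eigs = numpy.linalg.eigvalsh(rho)
exact = -sum(v * math.log2(v) for v in eigs)

# Series approximation using only traces of powers of (I - rho).
cutoff = 10
base = numpy.identity(3) - rho
power = base @ base
approx = numpy.trace(base)
for k in range(2, cutoff):
    approx -= numpy.trace(power) / (k * k - k)
    power = power @ base
approx -= numpy.trace(power) / (cutoff - 1)
approx /= math.log(2)  # nats -> bits

assert abs(approx - exact) < 0.1
```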


The convergence is still kinda terrible near $x \approx 0$, though. I tried replacing the last correction with some doubling steps like this:

import math
import numpy

def von_neumann_entropy(density_matrix, cutoff=10):
    x = numpy.asarray(density_matrix)
    one = numpy.identity(x.shape[0])
    base = one - x
    power = base @ base
    result = numpy.trace(base)
    for k in range(2, cutoff):
        result -= numpy.trace(power) / (k*k - k)
        power = power @ base

    # Twiddly hacky magic.
    a = cutoff
    for k in range(3):
        d = (a + 1) / (4 * a * (a - 1))
        result -= numpy.trace(power) * d
        power = power @ power
        result -= numpy.trace(power) * d
        a *= 2
    result -= numpy.trace(power) / (a - 1) * 0.75

    return result / math.log(2)  # convert from nats to bits


And that improved things a bit more: the maximum error vs $-x \ln x$ across the $[0, 1]$ range dropped from $2.9 \cdot 10^{-2}$ to $3.5 \cdot 10^{-3}$ (for the given cutoff of 10).

(The corresponding plot, not preserved here, was reversed.)
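Those numbers can be checked by evaluating both scalar approximations of $x \ln x$ on a fine grid (this is my reconstruction of the two polynomials; the exact maxima depend on the grid, so the assertions only bracket them):

```python
import math

def plain_approx(x, c=10):
    # x - 1 + sum_{k=2}^{c-1} (1-x)^k/(k(k-1)) + (1-x)^c/(c-1)
    y = 1 - x
    total = x - 1 + sum(y ** k / (k * (k - 1)) for k in range(2, c))
    return total + y ** c / (c - 1)

def doubled_approx(x, c=10):
    # Same head, then the "twiddly hacky magic" doubling corrections.
    y = 1 - x
    total = x - 1 + sum(y ** k / (k * (k - 1)) for k in range(2, c))
    a, p = c, y ** c
    for _ in range(3):
        d = (a + 1) / (4 * a * (a - 1))
        total += p * d
        p = p * p
        total += p * d
        a *= 2
    return total + p / (a - 1) * 0.75

grid = [i / 10000 for i in range(1, 10001)]

def max_err(f):
    return max(abs(f(x) - x * math.log(x)) for x in grid)

plain, doubled = max_err(plain_approx), max_err(doubled_approx)
assert doubled < plain   # the doubling steps do help
assert plain < 0.035     # consistent with ~2.9e-2
assert doubled < 0.006   # consistent with ~3.5e-3
```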

Of course, the paper I linked at the start of the answer probably has a much better polynomial.

For example, my error analysis should have been in terms of the total entropy instead of the individual value. And I should have used tricks like computing the trace of $AB$ from $A$ and $B$ in $O(n^2)$ time without doing the multiplication. And I should have thrown some simulated annealing at the coefficients.
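That last trick is just the identity $\text{tr}(AB) = \sum_{ij} A_{ij} B_{ji}$, which needs only an elementwise product instead of a full matrix multiplication:

```python
import numpy

rng = numpy.random.default_rng(42)
a = rng.standard_normal((50, 50))
b = rng.standard_normal((50, 50))

# O(n^3): multiply the matrices, then take the trace.
slow = numpy.trace(a @ b)
# O(n^2): tr(AB) = sum over i,j of A[i,j] * B[j,i].
fast = numpy.sum(a * b.T)

assert abs(slow - fast) < 1e-9
```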

Context

StackExchange Computer Science Q#56261, answer score: 5
