Sources of Numerical Errors

We’ll start our journey of numerical methods with an introduction to numerical error. It’s a most important topic you’ll always encounter while using numerical algorithms. There are various sources of numerical errors. Some of the notables are:

Cancellation error

It’s also called accumulation or estimated error.

Standard double-precision float (i.e. FP64 or float64) can handle only up to 15-17 significant digits.

$\text{Standard Python \texttt{float}:$15$significant digits} \\ \text{Numpy \texttt{float64}: 15 significant digits} \\ \text{Numpy \texttt{float32}: 6 significant digits}$

Standard Python float

import sys
print("Standard Python float:", sys.float_info.dig, "significant digits")

• Other name for numpy.float64: float in python, double in C/C++
• Other name for numpy.float32: float in C/C++
import numpy as np
print("Numpy float64:", -int(np.log10(np.finfo(np.float64).resolution)), "significant digits")
print("Numpy float32:", -int(np.log10(np.finfo(np.float32).resolution)), "significant digits")


Let’s see an useful example:

The accurate equation for relativistic kinetic energy is $$E_{k_1} = \frac{mc^2}{\sqrt{1-\frac{v^2}{c^2}}}-mc^{2}$$

If the ratio of $\frac{v^2}{c^2} < 10^{-15}$ then, we will observe an arithmetic underflow. Given $v = 2$ and $c = 3\times 10^{8}$, we get numerically

$\frac{v^2}{c^2} = 4.4444444444444444 \times 10^{-17}\\ 1 - (1 - \frac{v^2}{c^2}) = 0.0$

v, c = 2, 3e8

print("v^2/c^2 =", v**2/c**2)
print("1 - (1-v^2/c^2) =", (1-v**2/c**2) - 1)


We expect from $1 - (1 - \frac{v^2}{c^2}) = 4.\overline{44}\times 10^{-17}$. This means the dependence of $v$ is lost. In this example, if we say there is no dependency of $v$ (i.e. $v$ is automatically becomes 0) then, the equation reduces to $$E_{k_1}\to\frac{mc^2}{1}-mc^2=0$$ which is painfully wrong.

Mitigation: Due to the cancellation errors sometimes it is better to use approximation rather than the full formula $$E_{k_2} = mc^2\left(1-\frac{v^2}{c^2}\right)^{-1/2}-mc^2\approx mc^2\left(1-\frac{v^2}{2c^2}-1\right)=\frac12mv^2$$ But this approximation is wrong for large values of $v$. Is there are any ways to have one formula which works in both cases?

Solution: With a bit of magic (short multiplication formulas) $$E_{k_3} = mc^2\left(\frac{c}{\sqrt{c^2-v^2}}-1\right) = mc^2 \left[\frac{\left(\frac{c}{\sqrt{c^2-v^2}}\right)^{2}-1}{\frac{c}{\sqrt{c^2-v^2}}+1} \right]$$ $$= mc^2 \left(\frac{\frac{c^2}{{c^2-v^2}}-1}{\frac{c}{\sqrt{c^2-v^2}}+1} \right) = \frac{mv^2}{\sqrt{1-\frac{v^2}{c^2}} + 1-\frac{v^2}{c^2}}$$ The above formula is mathematically equivalent with the definition but cancellation does not produce errors for small values of $v$.

Let’s see how the above formula behaves numerically for different values of $v$:

$\text{For } (v, c, m) = (2, 3\times 10^{8}, 1), \text{ we get:}\\ E_{k_{1}} = {\color{red} 0}.0 \\ E_{k_{2}} = {\color{red} 2}.0\\ E_{k_{3}} = {\color{red} 2}.0\\ \quad \\ \text{For } (v, c, m) = (1\times 10^{5}, 3\times 10^{8}, 1), \text{ we get:}\\ E_{k_{1}} = 5000000{\color{red} 416.0} \\ E_{k_{2}} = 5000000{\color{red} 000.0} \\ E_{k_{3}} = 5000000{\color{red} 416.666705}$

import numpy as np

def E_k_1(m, c, v):
return m*c**2/np.sqrt(1-v**2/c**2)-m*c**2

def E_k_2(m, c, v):
return 0.5*m*v**2

def E_k_3(m, c, v):
g = 1-v**2/c**2
return m*v**2/(np.sqrt(g)+g)

print("For (v, c, m) = (2, 3x10^8, 1), we have:")
v, c, m = 2, 3e8, 1
print(r"E_k_1 = ", E_k_1(m, c, v), "\nE_k_2 = ", E_k_2(m, c, v), "\nE_k_3 = ", E_k_3(m, c, v), "\n")

print("For (v, c, m) = (1x10^5, 3x10^8, 1), we have:")
v, c, m = 1e5, 3e8, 1
print(r"E_k_1 = ", E_k_1(m, c, v), "\nE_k_2 = ", E_k_2(m, c, v), "\nE_k_3 = ", E_k_3(m, c, v), "\n")


Round of error

Let me show you a prime example:

$0.3+0.2-0.5 = 0.0 \text{. This is zero.}\\ 0.1+0.2-0.3 = 5.551115123125783\times 10^{-17} \text{. This is {\color{red}not} zero. Beware of round of errors!}$

print("0.3+0.2-0.5 =",  0.3+0.2-0.5, ". This is zero.")
print("0.1+0.2-0.3 =",  0.1+0.2-0.3, ". This is not zero. Beware of round of errors!")


If we subtract same values but represented in different floating point values then, they give stupid result. For example: Given a number 1.2345678 such that $n_1$ is in float32 and $n_2$ is in float64. Then, the difference between them is

$n_2 - n_1 = 0.000000038578796379695745599747$$1697628498077392578125000 import numpy as np n_1 = np.float32(1.2345678) n_2 = np.float64(1.2345678) print("n_2 - n_1 = {:.55f}".format(n_2 - n_1))  But, we were expected to get 0.0. So, we should not use two different floating point values. Instead use the same floating point value for all the calculations. Overflow error Let me show you a prime example: \text{For } x = 20, \frac{1 - \cosh(x)}{1 + \cosh(x)} = -0.9999999917553856\\ \text{This doesnot produce overflow error.}\\\quad\\ \text{For } x = 800, \frac{1 - \cosh(x)}{1 + \cosh(x)} = \texttt{nan}\\ \text{This {\color{red}produces} overflow error.} \\\texttt{RuntimeWarning: overflow encountered }$$ \texttt{in cosh} \\ \texttt{RuntimeWarning: invalid value }$$\texttt{encountered in double\_scalars} \\\quad\\ \text{For } x = 800, \frac{1 - \cosh(x)}{1 + \cosh(x)} \equiv \frac{\frac{1}{\cosh(x)} - 1}{\frac{1}{\cosh(x)} + 1} = -1.0\\ \text{This {\color{red}produces} overflow error.} \\ \texttt{RuntimeWarning: overflow encountered }$$ \texttt{in cosh}$

import numpy as np

x = 20
print("For x = 20, (1-cosh(x))/(1+cosh(x)) =", (1-np.cosh(x))/(1+np.cosh(x)), "\nThis doesnot produce overflow error.")

x = 800
print("For x = 800, (1-cosh(x))/(1+cosh(x)) =", (1-np.cosh(x))/(1+np.cosh(x)), "\nThis produces overflow error.")
print("For x = 800, (1/cosh(x)-1)/(1/cosh(x)+1) =", (1/np.cosh(x)-1)/(1/np.cosh(x)+1), "\nThis produces overflow error.")


Published on Feb 13, 2023

Last revised on Feb 17, 2023

References