not make any difference, while adding 2^-22 to 2^30 does, as we can see by typing the following statements into the MATLAB Command window.
Figure 1.6 Process of adding two numbers, 2^30 and 2^-22, in MATLAB.
>> x=2^30; x+2^-22==x, x+2^-23==x
ans = 0 (false)
ans = 1 (true)
(cf) Each range has a different minimum unit (LSB value), described by Eq. (1.2.5). This implies that the numbers are uniformly distributed within each range, and that the closer a range is to 0, the denser the numbers in it are. Such a number representation makes the absolute quantization error large/small for large/small numbers, decreasing the possibility of a large relative quantization error.
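To see this non-uniform spacing concretely, one can query the LSB value around numbers of different magnitudes with MATLAB's built-in eps function. The following short sketch is our own illustration (the script name epsdemo.m is ours, not from the text):

%epsdemo.m : LSB value (spacing) of doubles near various magnitudes
for e = [0 10 20 30]
   x = 2^e;
   fprintf('eps(2^%-2d)=%10.3e, eps(x)/x=%10.3e\n', e, eps(x), eps(x)/x);
end
% absolute spacing eps(x) grows with the magnitude of x,
% while the relative spacing eps(x)/x stays near 2^-52 in every range

Note that eps(2^30) = 2^-22, which is exactly the smallest addend that changed x in the experiment above.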
1.2.2 Various Kinds of Computing Errors
There are various kinds of errors that we encounter when using a computer for computation.
Truncation error: Caused by adding up only a finite number of terms, while infinitely many terms should be added, in theory, to get the exact answer.
Round-off error: Caused by representing/storing numeric data in a finite number of bits.
Overflow/underflow: Caused by numbers too large or too small to be represented/stored properly in finite bits; more specifically, numbers whose absolute values are larger/smaller than the maximum (f_max)/minimum (f_min) number that can be represented in MATLAB.
Negligible addition: Caused by adding two numbers whose magnitudes differ by over 52 bits, as seen in the previous section.
Loss of significance: Caused by a 'bad subtraction', that is, a subtraction of a number from another one that is almost equal in value.
Error magnification: Caused and magnified/propagated by multiplying/dividing a number containing a small error by a large/small number.
Errors depending on the numerical algorithm, step size, and so on. (A few of these error types are reproduced in the sketch below.)
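Several of these error types can be observed directly at the prompt. The following lines are our own sketch, not from the text:

%errkinds.m : a few of the computing-error types above (our illustration)
realmax*2          % overflow: too large to represent, yields Inf
realmin/2^53       % underflow: smaller than the tiniest denormal, yields 0
(1 + 2^-53) == 1   % negligible addition: 2^-53 falls below the LSB of 1 (true)
(1 + 1e-15) - 1    % bad subtraction: returns 1.1102e-15 rather than 1e-15,
                   %  so only about one significant digit of 1e-15 survives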
Although we cannot be entirely free from these kinds of inevitable errors, it is not the computer but we, human beings, who must be responsible for the computing errors. While our computer may insist on its innocence concerning an unintended lie, we programmers and users cannot escape the responsibility of taking measures against the errors, and we would have to pay for being careless enough to be deceived by a machine. We should, therefore, try to decrease the errors and minimize their impact on the final results. In order to do so, we must know the sources of computing errors and also grasp the computational properties of numerical algorithms.
For instance, consider the following two formulas:
$$f_1(x) = \sqrt{x}\left(\sqrt{x+1}-\sqrt{x}\right), \qquad f_2(x) = \frac{\sqrt{x}}{\sqrt{x+1}+\sqrt{x}} \tag{1.2.8}$$
These are theoretically equivalent, whence we expect them to give exactly the same value. However, running the following MATLAB script "nm122.m" to compute the values of the two formulas, we see a surprising result: as x increases, the value of f1(x) moves incoherently hither and thither, while f2(x) approaches 1/2 at a steady pace. We might feel betrayed by the computer and come to doubt its reliability. Why does such a flustering thing happen with f1(x)? It is because the number of significant bits abruptly decreases when the subtraction sqrt(x+1) - sqrt(x) is performed for a large value of x such as x = 10^16, where sqrt(x+1) and sqrt(x) are both approximately 10^8.
These two numbers have 52 significant bits, or equivalently 16 significant digits (2^52 ≈ 10^(52×3/10) ≈ 10^15.6), so that their significant digits range from 10^8 down to 10^-8. Accordingly, the least significant digit (LSD) of their sum and difference is also the eighth digit after the decimal point (the 10^-8 position).
Note that the number of significant digits of the difference has decreased from 16 to 1. Could you imagine that a single subtraction can kill most of the significant digits? This is the very 'loss of significance', which is often called 'catastrophic cancellation'.
%nm122.m
f1=@(x)sqrt(x)*(sqrt(x+1)-sqrt(x));
f2=@(x)sqrt(x)./(sqrt(x+1)+sqrt(x));
x=1; format long e
for k=1:16  % x = 1, 10, ..., 10^15
   fprintf('At x=%15.0f, f1(x)=%20.18f, f2(x)=%20.18f\n', x,f1(x),f2(x));
   x=10*x;
end
sx1=sqrt(x+1); sx=sqrt(x); d=sx1-sx; s=sx1+sx;  % now x = 10^16
fprintf('sqrt(x+1)=%25.13f, sqrt(x)=%25.13f\n',sx1,sx);
fprintf(' diff=%bx, sum=%bx\n',d,s)  % %bx prints the 64-bit hex pattern of a double
>> nm122
At x=              1, f1(x)=0.414213562373095150, f2(x)=0.414213562373095090
At x=             10, f1(x)=0.488088481701514750, f2(x)=0.488088481701515480
At x=            100, f1(x)=0.498756211208899460, f2(x)=0.498756211208902730
At x=           1000, f1(x)=0.499875062461021870, f2(x)=0.499875062460964860
At x=          10000, f1(x)=0.499987500624854420, f2(x)=0.499987500624960890
At x=         100000, f1(x)=0.499998750005928860, f2(x)=0.499998750006249940
At x=        1000000, f1(x)=0.499999875046341910, f2(x)=0.499999875000062490
At x=       10000000, f1(x)=0.499999987401150920, f2(x)=0.499999987500000580
At x=      100000000, f1(x)=0.500000005558831620, f2(x)=0.499999998749999950
At x=     1000000000, f1(x)=0.500000077997506340, f2(x)=0.499999999874999990
At x=    10000000000, f1(x)=0.499999441672116520, f2(x)=0.499999999987500050
At x=   100000000000, f1(x)=0.500004449631168080, f2(x)=0.499999999998750000
At x=  1000000000000, f1(x)=0.500003807246685030, f2(x)=0.499999999999874990
At x= 10000000000000, f1(x)=0.499194546973835970, f2(x)=0.499999999999987510
At x=100000000000000, f1(x)=0.502914190292358400, f2(x)=0.499999999999998720
At x=1000000000000000, f1(x)=0.589020114423405180, f2(x)=0.499999999999999830
sqrt(x+1)=      100000000.0000000000000, sqrt(x)=      100000000.0000000000000
 diff=0000000000000000, sum=41a7d78400000000
1.2.3 Absolute/Relative Computing Errors
The absolute/relative error of an approximate value x to the true value X of a real‐valued variable is defined as follows:
$$\varepsilon_x \equiv x - X \tag{1.2.9}$$
$$\rho_x \equiv \frac{x - X}{X} = \frac{\varepsilon_x}{X} \tag{1.2.10}$$
If the LSD is the dth digit after the decimal point, then the magnitude of the absolute error is not greater than half the value of LSD.
$$|\varepsilon_x| = |x - X| \le \frac{1}{2}\times 10^{-d} \tag{1.2.11}$$
If the number of significant digits is s, then the magnitude of the relative error is not greater than half the value of the LSD relative to the most significant digit (MSD).
$$|\rho_x| = \frac{|x - X|}{|X|} \le \frac{1}{2}\times 10^{-(s-1)} \tag{1.2.12}$$
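As a quick check of Eqs. (1.2.9)-(1.2.12), consider our own example (not from the text) of approximating X = π by x = 3.142, which has s = 4 significant digits and its LSD at the d = 3rd digit after the decimal point:

%abserr.m : absolute/relative error of x=3.142 to X=pi (our example)
X = pi; x = 3.142;       % s = 4 significant digits, LSD at d = 3 decimals
abs_err = x - X          % Eq.(1.2.9) : 4.0735e-04, within 0.5*10^-3 of Eq.(1.2.11)
rel_err = (x - X)/X      % Eq.(1.2.10): 1.2966e-04, within 0.5*10^-(4-1) of Eq.(1.2.12)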
1.2.4 Error Propagation
In this section, we will see how the errors of two numbers x and y are propagated through the four arithmetic operations. Error propagation means that the errors in the input numbers of a process or an operation cause errors in its output numbers.
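Before deriving the propagation formulas, the effect can be previewed numerically. The following sketch is our own illustration: it contaminates two numbers with known errors and measures the resulting errors of their sum and product.

%errprop.m : previewing error propagation through + and * (our sketch)
X = 4; Y = 7;                  % true values
ex = 1e-10; ey = -2e-10;       % known input errors
x = X + ex; y = Y + ey;        % contaminated values
err_sum  = (x + y) - (X + Y)   % about ex + ey     = -1.0e-10: errors add up
err_prod = x*y - X*Y           % about Y*ex + X*ey = -1.0e-10: each input
                               %  error is magnified by the other operand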
Let the absolute errors of x and y be ε_x and ε_y,