In a numerical solver I am working on in C, I need to invert a 2x2 matrix, and the inverse is then multiplied on the right side of another matrix:
C = B . inv(A)
I have been using the following definition of an inverted 2x2 matrix:
double a = A[0][0], b = A[0][1];
double c = A[1][0], d = A[1][1];
double det = a*d - b*c;     /* determinant of A */
invA[0][0] =  d / det;
invA[0][1] = -b / det;
invA[1][0] = -c / det;
invA[1][1] =  a / det;
In the first few iterations of my solver this seems to give the correct answers; after a few more steps, however, the values start to grow and eventually explode.
Now, comparing to an implementation using SciPy, I found that the same math does not explode. The only difference I can find is that the SciPy code uses scipy.linalg.inv(), which internally uses LAPACK to perform the inversion. When I replace the call to inv() with the above calculations, the Python version explodes as well, so I'm pretty sure the inversion is the problem. Small differences in the calculations are creeping in, which leads me to believe it is a numerical problem, not entirely surprising for an inversion operation.
I am using double-precision floats (64-bit), hoping that numerical issues would not be a problem, but apparently that is not the case.
But: I would like to solve this in my C code without needing to call out to a library like LAPACK, because the whole reason for porting it to pure C is to get it running on a target system. Moreover, I'd like to understand the problem, not just call out to a black box. Eventually I'd like it to run in single precision too, if possible.
So, my question is, for such a small matrix, is there a numerically more stable way to calculate the inverse of A?
Thanks.
Edit: Currently trying to figure out if I can just avoid the inversion by solving for C.
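For the record, the algebra behind that idea (just the reformulation, no code yet):

C = B . inv(A)   <=>   C . A = B   <=>   transpose(A) . transpose(C) = transpose(B)

so each row of C is the solution of a 2x2 linear system with matrix transpose(A) and the corresponding row of B as the right-hand side, and no explicit inverse is ever formed.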
Computing the determinant is not stable. A better way is to use Gauss-Jordan elimination with partial pivoting, which you can easily work out explicitly here.
Solving a 2x2 system
Let us solve the system (use c, f = 1, 0 and then c, f = 0, 1 to get the two columns of the inverse)
a * x + b * y = c
d * x + e * y = f
In pseudocode, this reads:
if a == 0 and d == 0 then "singular"
if abs(a) >= abs(d):
    alpha <- d / a
    beta  <- e - b * alpha
    if beta == 0 then "singular"
    gamma <- f - c * alpha
    y <- gamma / beta
    x <- (c - b * y) / a
else:
    swap((a, b, c), (d, e, f))
    restart
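For concreteness, here is one way the pseudocode might look in C (a minimal sketch; the function name solve2x2 and the flat argument layout are my own choices, not anything canonical):

#include <math.h>

/* Solve the 2x2 system
 *   a * x + b * y = c
 *   d * x + e * y = f
 * with partial (row) pivoting, as in the pseudocode above.
 * Returns 0 on success, -1 if the matrix is singular. */
int solve2x2(double a, double b, double c,
             double d, double e, double f,
             double *x, double *y)
{
    if (fabs(a) < fabs(d)) {
        /* Pivot: swap the rows so the divisions below use the
         * larger of |a| and |d| (the "restart" branch, unrolled). */
        double t;
        t = a; a = d; d = t;
        t = b; b = e; e = t;
        t = c; c = f; f = t;
    }
    if (a == 0.0)
        return -1;                    /* a == 0 and d == 0: singular */
    double alpha = d / a;
    double beta  = e - b * alpha;     /* equals det / a (up to sign after a swap) */
    if (beta == 0.0)
        return -1;
    double gamma = f - c * alpha;
    *y = gamma / beta;
    *x = (c - b * *y) / a;
    return 0;
}

To recover the inverse itself, as the parenthetical above suggests, solve twice with the columns of the identity as right-hand sides:

double i00, i10, i01, i11;
solve2x2(a, b, 1.0, d, e, 0.0, &i00, &i10);   /* first column of inv(A) */
solve2x2(a, b, 0.0, d, e, 1.0, &i01, &i11);   /* second column of inv(A) */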
This is stabler than determinant + comatrix (beta is the determinant times some constant, computed in a stable way). You can work out the full-pivoting equivalent (i.e. potentially swapping x and y, so that the first division by a is such that a is the largest number in magnitude amongst a, b, d, e), and this may be stabler in some circumstances, but the above method has been working well for me.
This is equivalent to performing an LU decomposition (store alpha, beta, a, b, plus whether you swapped, if you want to reuse this LU decomposition for further right-hand sides).
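Written out (assuming the no-swap branch was taken), the elimination above is exactly the factorization

[ a  b ]   [ 1      0 ] [ a  b    ]
[ d  e ] = [ alpha  1 ] [ 0  beta ]

which holds because d = alpha * a and e = alpha * b + beta, by the definitions of alpha and beta.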
Computing the QR decomposition can also be done explicitly (and is also very stable provided you do it correctly), but it is slower (and involves taking square roots). The choice is yours.
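For completeness, here is a sketch of what an explicit 2x2 QR solve might look like, using a single Givens rotation (the function name qr_solve2x2 is mine; hypot() from math.h provides the square root while avoiding overflow):

#include <math.h>

/* Solve the same 2x2 system via QR: rotate so the subdiagonal
 * entry vanishes, then back-substitute against the triangular R. */
int qr_solve2x2(double a, double b, double c,
                double d, double e, double f,
                double *x, double *y)
{
    double r = hypot(a, d);            /* length of the first column */
    if (r == 0.0)
        return -1;                     /* first column is zero: singular */
    double cs = a / r, sn = d / r;     /* Givens rotation zeroing A[1][0] */
    double r12 =  cs * b + sn * e;     /* second column of R = Q^T * A */
    double r22 = -sn * b + cs * e;
    double g1  =  cs * c + sn * f;     /* Q^T applied to the right-hand side */
    double g2  = -sn * c + cs * f;
    if (r22 == 0.0)
        return -1;
    *y = g2 / r22;                     /* back substitution */
    *x = (g1 - r12 * *y) / r;
    return 0;
}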
Improving accuracy
If you need better accuracy (the above method is stable, but there is some roundoff error, proportional to the ratio of the eigenvalues), you can "solve for the correction".
Indeed, suppose you solved A * x = b for x with the above method. You now compute A * x, and you find that it does not quite equal b; there is a slight error:

A * x - b = db

Now, if you solve for dx in A * dx = db, you have

A * (x - dx) = b + db - db - ddb = b - ddb

where ddb is the error induced by the numerical solving of A * dx = db, which is typically much smaller than db (since db is much smaller than b).
You can iterate the above procedure, but a single step is typically enough to restore full machine precision.
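As a sketch, one refinement step on top of the hypothetical solve2x2() from above might look like this (classically the residual is computed in higher precision; plain double is used here, which already helps in practice):

/* Solve the 2x2 system, then perform one step of iterative
 * refinement: compute the residual db = A * x - b, solve
 * A * dx = db, and apply the correction x <- x - dx. */
int solve2x2_refined(double a, double b, double c,
                     double d, double e, double f,
                     double *x, double *y)
{
    if (solve2x2(a, b, c, d, e, f, x, y) != 0)
        return -1;
    double dbc = a * *x + b * *y - c;          /* residual, first equation */
    double dbf = d * *x + e * *y - f;          /* residual, second equation */
    double dx, dy;
    if (solve2x2(a, b, dbc, d, e, dbf, &dx, &dy) != 0)
        return -1;
    *x -= dx;
    *y -= dy;
    return 0;
}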