Chapter 28: Matrix Operations: Linear Equations, Inversion, and Least Squares

0:00 / 0:00
Report an issue

Welcome to Last Minute Lecture.

This free chapter overview is designed to help students review and understand key concepts.

These summaries supplement not replaced the original textbook and may not be redistributed or resold.

For complete coverage, always consult the official text.

Welcome back to the Deep Dive.

I'm excited for today's topic, even if it sounds a bit scary at first.

We're diving into the world of linear equations and figuring out how to solve systems of these things.

And then there's this whole matrix inversion mystery that we're going to try and unlock.

Sounds complicated.

It can be, but that's why we're here.

We're going to break it down, look at the research, and see just how important these techniques are in scientific computing.

Like, how do the experts handle those massive calculations and do it both quickly and accurately?

Right, because you can't have a rocket going to the moon based on bad math.

Exactly.

And there are different ways to work with matrices, but we're going to keep our focus on the techniques that get the job done.

Because in the end, that's what matters.

Absolutely.

So let's jump in and start mastering those matrices.

All right, let's do it.

You know, it's pretty amazing when you think about it, but behind all these complex scientific and engineering things we have, it's the language of matrices that's doing a lot of the heavy lifting.

When you've got tons of equations and a bunch of unknowns, finding that one set of values that makes everything work can seem like an impossible puzzle.

A lot of moving parts, yeah.

Yeah, and matrices help us organize the chaos.

Like, you've got this standard form, xx equals b -doll.

xx is like a container holding all the coefficients from the equations, xx represents all the unknown variables, and dollars has the constants from each equation.

It's pretty elegant when you see how it all fits together.

Okay, so if our matrix dollars is well -behaved, meaning it has an inverse, then we can just find six dollars using the formula xx equals a1b0, right?

That seems simple enough.

Yeah, it's the textbook solution, but there's a catch.

There's always a catch.

Right.

Actually, calculating a11 and using it directly to solve for six dollars can be tricky.

It's kind of like imagine trying to thread a needle during a hurricane.

Tiny errors in the calculations, those rounding errors that happen in computers, can get blown way out of proportion.

And throw everything off.

Exactly, leading to a solution that's, well, just plain wrong.

And that's what we call numerical instability.

It's a real headache.

So, if finding the inverse is risky, how do we reliably solve for six dollars in that A -sol -gui -sol is a bad equation?

We use a much more stable and elegant method.

It's called LUP decomposition.

The cool thing is, any non -singular matrix, which is basically a matrix that has an inverse, can be broken down into three special matrices.

We call them Pa dollars and two dollars.

So it looks like this.

Pa equals Lu1.

Pa dollars is a permutation matrix.

It keeps track of any row swaps we have to make.

Dollars is a unit lower triangular matrix, which means it's got a bunch of numbers below the main diagonal and all ones on the diagonal itself.

And two dollars is an upper triangular matrix, so it has numbers above the main diagonal.

Okay, so we've got our original matrix transformed into this combination of a few dollars and two dollars.

How does that help us find the solution to xx equals b -doll?

It's actually pretty clever.

We start by rewriting our system as Pxx, and because we know Pa is the same as bbou, we get wall -y xx equals pbdoll.

Now, let's introduce a temporary variable, PaLc.

This makes our equation a bit simpler.

Liou equals a bb.

Now, here's where the triangular matrices come in handy.

Since Slower is lower triangular, we can solve for each component of YOLR one at a time, starting from the top equation and moving down.

Kind of like a step -by -step process.

Exactly, and we call this forward substitution.

Once we have YOLR YOLRs, we can plug it into hum dollars and solve for success.

Since c dollars are up a triangular, we start from the last equation and work our way up.

This is back substitution.

So instead of that one potentially unstable matrix inversion, we have two much simpler systems to solve.

And the substitutions are really efficient, right?

Right, they are.

Each of them only takes about 22 steps for an L times null system, and the entire process of solving xx equals bb using LUP decomposition, which we call LUP SLVE, is also really fast.

It's way faster and more stable than trying to invert the matrix directly.

Now, you mentioned LUD composition earlier, without the P.

How is that different?

When can we just use LU?

Yeah, good question.

LUD composition is a bit simpler.

We just factor little dollars to weld you directly without that permutation matrix.

There are a lot of matrices that can be factored this way, and the process of doing this is very similar to something you might have learned in algebra class, Gaussian elimination.

Where you eliminate variables one by one, right?

That's the one.

And in this world of matrices, something called the sure complement pops up.

It's basically what's left of the matrix after you've eliminated some variables.

The numbers we divide by during this elimination process are called pivots.

Now, this LUD composition algorithm is pretty efficient, too.

It takes about three, three, any three steps to do the factorization.

And the best part is that we can store a little less than two dollars in the same memory space as the original matrix.

So, LU seems simpler.

Why do we need LUP with that extra permutation matrix?

It goes back to that numerical instability issue.

Sometimes during LUD composition, we might have to divide by a pivot that's either zero or very close to zero.

And dividing by a really tiny number can cause those tiny rounding errors to explode.

And mess everything up.

Totally.

LUP decomposition solves this problem by using a technique called pivoting.

Pivoting, huh?

Like rearranging things.

You got it.

In each step of the decomposition, we look at the current column and find the entry with the largest absolute value.

Then we swap the current row with the row that has this largest value.

This makes sure that our pivot is as big as possible so we don't end up dividing by tiny numbers.

And the permutation matrix keeps track of all the row swaps we do.

Ah, so Pyallers is like a careful record keeper.

Exactly.

And just like LUD composition, LUP decomposition takes about 303 steps.

Okay, so LUP is like the reliable workhorse for solving lineal systems.

Now let's talk about actually finding the inverse of a matrix, that A1 we were talking about earlier.

You mentioned LUP is usually better for solving XX equals BZO, but sometimes we need the inverse directly.

Why is that?

You're right.

While solving systems is the main goal, there are times when you need to have the inverse explicitly.

It's helpful in some theoretical situations and some specific applications.

And what's really neat is that we can use LUP decomposition to calculate this inverse.

So we can use what we already know.

Yeah, exactly.

It's all connected.

You can think of finding the inverse as solving a bunch of linear systems at once.

Each column of LDO could be found by solving a system like ACD equals A, where the six dollars of the i -th column of the inverse and the a was a special vector with a one in the i -th position and zeros everywhere else.

So we can use LUP solvee to find each column of the inverse.

And since LUP solvee takes two dollars two steps and we need to do it for each of the not columns, the total time to find ADA is also three and three three.

The same as the time to do the LUP decomposition.

Wow, it's amazing how everything fits together.

The material also talks about the connection between how hard it is to multiply matrices and how hard it is to invert them.

Absolutely.

In theory, matrix multiplication and matrix inversion are basically equally difficult when you look at how the time it takes to do them as the matrices get bigger.

There's this cool theorem, theorem 28 .1, that says if you have a way to infer in one dollar times null matrix in some time, let's call it one and n, then you can use that same method to multiply two one dollar times matrices in roughly the same amount of time, like one dollar.

The proof uses a clever trick with a larger matrix that's made up of blocks.

It's pretty neat.

So if we find a faster way to do one, it speeds up the other.

Exactly.

And the other way around too, that's theorem 28 .2.

If you can multiply two one times in a dollar time, then you can invert any non -singular time matrix in about zollertime.

The proof for this one is a bit more involved, but it starts by looking at a special type of matrix called a symmetric positive definite matrix.

Symmetric positive definite.

What's that?

It's a mouthful, right?

A symmetric positive definite matrix is a matrix that's equal to its own transpose, which means it's symmetric along the diagonal.

And if you take any non -zero vector and multiply it by the matrix and its transpose, you always get a positive number.

There's some really nice properties.

So how do these special matrices help us with the general problem of inverting matrices?

The proof uses this divide and conquer approach to invert these special symmetric positive definite matrices, again, using the SURE complement.

It shows that the time to invert one of these matrices is basically the same as the time to multiply matrices.

Then, to extend this to any non -singular matrix, the proof uses this neat trick.

One love one, eight knees, one AT.

It turns out that if one all is as non -singular, then AT key is always symmetric positive definite.

And this whole process takes about the same time as matrix multiplication.

They even deal with cases where the dimension of the matrix isn't the power of two by basically putting the matrix inside a larger one.

The text also mentions using LU decomposition of ATE2 to solve AT, but it points out that LUP decomposition is still better practice.

Okay, so we've gone deep into these connections between different ideas in linear algebra.

The last part of the chapter we're looking at digs even deeper into these symmetric positive definite matrices and how they're used in least squares approximation.

Right, those matrices pop up everywhere.

And they have some really important properties, like there's lemma 28 .3, which says that any positive definite matrix always has an inverse.

And lemma 28 .4 tells us that if a matrix is symmetric positive definite, then any square block you take from the top left corner of that matrix is also symmetric positive definite.

Like little mini versions of the big matrix.

Exactly.

And remember that sure complement we talked about before.

Well, lemma 28 .5, the sure complement lemma, tells us that the sure complement of a symmetric positive definite matrix is also symmetric positive definite.

So all these special properties, what do they mean in practice?

One really useful consequence is what corollary 28 .6 tells us.

When you do a standard LU decomposition on a symmetric positive definite matrix, you never run into that dividing by zero problem and all the pivots you get are positive.

This means it's a very stable process.

Although, as we said before, LUP decomposition is often used in software just to be extra safe.

Okay, so these symmetric positive definite matrices have some really nice theoretical and practical advantages.

But how do they connect to this least squares approximation idea?

So least squares approximation is what we use when we have more equations than unknowns.

You know, like when you're trying to fit a model to some data.

In these situations, there's usually no exact solution that satisfies all the equations perfectly.

So what we do is try to find a solution that gets as close as possible, minimizing the errors.

So it's all about finding the best fit for the data.

Right.

And one way to measure the error is to take the differences between what our model predicts and the actual data, square those differences and add them all up.

This usually leads to a system of equations that we can write as Axie equals y is y, where Axie represents the unknowns we're trying to find.

And we're trying to find the values of Tabor dollars that make the model fit the data best.

Exactly.

The solution that minimizes that sum of squared errors comes from solving what's called the normal equation.

A T A T E equals A T E Y Y.

And if our matrix doubler has full column rank, meaning its columns are all independent, then this matrix A T A T is guaranteed to be symmetric positive definite and therefore non -singular.

This means the normal equation has a unique solution given by A T A doubles one A T Y O R.

And that A T A one is called the pseudo inverse of A T arrow, often written as a plus dollar.

The material give us any real world examples of how this works.

It does actually.

One example shows how you can fit a quadratic polynomial to five data points using this least method.

And another example uses it to fit a more complex curve to data about the amount of carbon dioxide in the atmosphere.

In practice, to solve this normal equation, you'd usually start by calculating A T Y walls and then find the L U or L U P decomposition of A T A I to efficiently solve for two months.

This has been an incredible deep dive.

We've covered so much ground from that basic S X of beta vol problem to the power of L U P decomposition and how it helps us invert matrices.

We even talked about how multiplying and inverting matrices are related and how these special symmetric positive definite matrices are used in least squares approximation.

It's clear that these methods are crucial for scientific computing.

I agree.

It's amazing how much we rely on these techniques.

We saw how L U P decomposition is a really efficient and reliable way to solve linear systems, taking about two and two steps after we do the decomposition, which takes about three, three steps.

Matrix inversion also takes about three, three, three steps.

These algorithms are the workhorses behind so many things we do in science, engineering and technology.

Yeah, we don't always realize it, but these methods are everywhere.

So as we wrap things up, I want to leave you with something to think about.

We've seen how these matrix methods can be used to solve complex problems and find the best solutions, even when there's no perfect answer.

Where else do you think these tools might be used?

Maybe in places you wouldn't expect.

It's fascinating to think about how these seemingly abstract mathematical ideas have such a huge impact on our world.

Thanks for joining us on the deep dive.

Thanks for having me.

ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.

Chapter SummaryWhat this audio overview covers
Linear systems of equations, matrix inversion, and least-squares fitting represent fundamental problems in numerical computing that require careful algorithmic treatment to maintain numerical stability and computational efficiency. While the theoretical solution to a system Ax = b is given by x = A⁻¹b when A is nonsingular, direct computation of matrix inverses introduces severe numerical instability and is rarely the correct approach in practice. Instead, LUP decomposition provides a robust alternative by factoring A into a permutation matrix P, a unit lower-triangular matrix L, and an upper-triangular matrix U such that PA = LU. This factorization reduces the problem to two fast substitution procedures: forward substitution solves Ly = Pb and back substitution solves Ux = y, each requiring only Θ(n²) arithmetic operations after the initial Θ(n³) decomposition cost. The permutation component is essential for numerical robustness, as it ensures the algorithm always selects the pivot with the largest absolute value in the current column, preventing division by small numbers that would amplify rounding errors. When matrix inversion becomes necessary, the LUP framework enables computation of each column of A⁻¹ by solving a distinct linear system using LUP-SOLVE, achieving complete inversion in Θ(n³) time. A striking theoretical result connects matrix multiplication and inversion asymptotically through divide-and-conquer strategies based on Schur complements, proving both operations require the same time complexity. Symmetric positive-definite matrices hold special significance in numerical linear algebra because their structure guarantees all pivots remain positive and their Schur complements preserve positive-definiteness, enabling specialized algorithms. Finally, least-squares approximation addresses overdetermined systems where more equations exist than unknowns by minimizing the sum of squared residuals. The normal equations AᵀAc = Aᵀy derive from this minimization problem, and the pseudoinverse A⁺ = (AᵀA)⁻¹Aᵀ yields the optimal fit vector. Applications of least-squares methods span polynomial regression, data modeling, and general curve fitting problems encountered throughout scientific and engineering domains.

Using this chapter to study? Last Minute Lecture is free and student-run. If it helped, consider supporting the project.

Support LML ♥