Projects: Linear Algebra
Role on Project: Instructor, Subject Matter Expert
Position Title: Professor, Mathematics
Department: Department of Mathematics
Institution: University of Toronto
Digital Object Types: Video Links
Lecture 38: Best Approximate Solution of a Linear System (Nicholson Section 5.6) | Link Linear Algebra Lecture Videos
Alternate Video Access via MyMedia | Video Duration: 47:49
Description: Started by introducing the Gobstoppers and Stopgobbers problem; this is a smaller version of the backpacker’s problem. Demonstrated that there’s no exact solution.
5:40 --- Introduced the concept of best approximate solution. Found the best approximate solution via projection onto the GS,SG subspace and then solving a related linear algebra problem. This is the “slow way” of finding the best approximate solution. Note that in this lecture I produced the orthogonal basis for the subspace via “magic”; this was because the example had been given before the Gram-Schmidt process had been discussed. If you’ve learnt about Gram-Schmidt, there’s nothing magical here; you know how to find that pair of vectors.
15:45 --- (note the change in clothing --- it’s a different day!) Reviewed the diet problem in which there are more equations than there are unknowns. There’s no solution. But is this the best we can do?
18:20 --- Reviewed the concept of “best approximate solution of Ax = b”.  Reviewed the “slow way” of finding the approximate solution.
21:50 --- What happens if you don’t first try to find the solution before proceeding to find the best approximate solution? What happens if there actually is a solution? Good news! The best approximate solution will turn out to be a solution in this case. So you don’t need to try to find a solution of Ax = b first. I guess the way to view this is that sometimes the best approximate solution involves no approximation at all; it’s actually the solution.
25:10 --- Introduced a faster and easier way to find the best approximate solution. Derived the linear system A^T A x = A^T b.
37:00 --- Applied this approach to the diet problem.
43:35 --- Will A^T A always be invertible?
44:15 --- Applied best approximate solution to a data fitting example using some data from the Spurious Correlations webpage. Trying to find best linear fit to the data.
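The normal equations derived at 25:10 are easy to try on a small example. Below is a minimal sketch in Python/NumPy (the small overdetermined system is made up for illustration, not the lecture's diet or Gobstopper numbers) showing that solving A^T A x = A^T b gives the same best approximate solution as a library least-squares routine.

```python
import numpy as np

# Hypothetical overdetermined system Ax = b (3 equations, 2 unknowns) with no exact solution.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

# Normal equations: A^T A x = A^T b gives the best approximate (least-squares) solution.
x_best = np.linalg.solve(A.T @ A, A.T @ b)

# NumPy's lstsq solves the same least-squares problem more robustly.
x_check, *_ = np.linalg.lstsq(A, b, rcond=None)
print(x_best, x_check)
```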
Lecture 39: Introduction to Orthogonal Subspaces (Section 8.1) | Link Linear Algebra Lecture Videos
Alternate Video Access via MyMedia | Video Duration: 50:36
Description: Started with a review of what it means for a vector to be orthogonal to a subspace. Defined S-perp: the subspace of all vectors in R^n that are orthogonal to S.
2:15 --- Did an example where S is the span of two vectors in R^3. Geometrically, we know what S is and what S-perp should be. But how, in general, can we tell if a specific vector is orthogonal to S? This would involve computing infinitely many dot products! Demonstrated how it’s sufficient to simply test the vector against the vectors in a spanning set for the subspace.
10:40 --- If v is orthogonal to S then so is t v for every scalar t.
13:30 --- Stated the theorem that if S is the span of k vectors in R^n then v is orthogonal to S if and only if v is orthogonal to each of the k vectors in the spanning set. Proved the theorem.
18:30 --- Started focusing on S-perp. Found S-perp for the previous example of S. We’ve already found a set of vectors in S-perp. To show that this is all of S-perp, we need to show that every vector in S-perp is in that set of vectors.  Showed how to formulate “please find S-perp” as a problem of the form Ax = 0. Used this to find S-perp.
27:55 --- Computed S-perp where S is the span of 4 vectors in R^4. This is a case where I don’t have geometric intuition about S or S-perp; I have to solve the problem algebraically and see what I learn from the process. Reviewed how to formulate “please find S-perp” as a problem of the form Ax = 0 and found S-perp. Note that in the process of finding S-perp, we found a basis for S, the dimension of S, a basis for S-perp, and the dimension of S-perp, and we found that dim(S)+dim(S-perp)=4. I made a mistake at 35:00!! To find a basis of S, we need a basis for Row(A) because we put the spanning set of S into the rows of A. I wrote down a basis for Col(A), not Row(A). Doh!
36:35 --- Presented a general approach to finding S-perp. Explained why dim(S-perp) = n-rank(A).
39:40 --- Did example where S is a line in R^3. Geometrically, expect S-perp to be the plane through the origin orthogonal to the line. Algebraically found S-perp. This is one of those examples that many students hate because you end up looking at a matrix with only one row and asking questions about its rank and so forth. Make sure you’re comfortable with this! Many students don’t like matrices with only one row.
43:30 --- How to project a vector onto a subspace? We know how to project onto a line, but… what if we have a vector x and we’d like to write x as the sum of two vectors, one in S and the other in S-perp? Explained why we’d like to do this and reminded students that we’ve already done this in R^2 and in R^3.
46:15 --- If I have a basis for S, can I write that the projection of x onto S is the sum of the projections of x onto each basis vector? NO! It fails if the basis isn’t an orthogonal set of vectors!  Now the challenge is: given a basis for S, is there some way to transform this into an orthogonal basis for S?
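As a quick check on the “formulate it as Ax = 0” idea from 18:30 and 27:55, here is a minimal sketch in Python/SymPy (the spanning vectors are made up, not the lecture's): put the spanning set of S into the rows of A; the null space of A is S-perp, and the row space of A gives a basis for S.

```python
import sympy as sp

# Hypothetical spanning set for S in R^3, placed in the rows of A.
A = sp.Matrix([[1, 2, 1],
               [0, 1, 1]])

# S-perp is the set of x with Ax = 0, i.e. the null space of A.
S_perp_basis = A.nullspace()

# A basis for S itself comes from the row space of A (not the column space).
S_basis = A.rowspace()
print(S_perp_basis, S_basis)
```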
Lecture 3: How to solve a system of linear equations (Nicholson, Section 1.2) | Link Linear Algebra Lecture Videos
Alternate Video Access via MyMedia | Video Duration: 50:57
Description: 
1:00 --- defined the elementary (equation) operations. 
3:00 --- defined what it means for two linear systems to be equivalent. 
6:30 --- stated the theorem that equivalent linear systems have the same solution set. 
8:50 --- introduced the augmented matrix as a short-hand way of representing a linear system of equations.  Introduced the coefficient matrix A, the right-hand-side vector b, and the augmented matrix (A|b). 
9:30 --- Considered a specific linear system, represented it using an augmented matrix, introduced (and used) elementary row operations on the augmented matrix (which are the same thing as elementary (equation) operations) until the augmented matrix was in a simple form.  Wrote down the linear system that corresponded to the simple augmented matrix and concluded that the original linear system had no solutions. 
19:20 --- did another example in which it turned out there was exactly one solution. 
30:00 --- introduced the language “leading ones” of a matrix that’s in reduced row echelon form.  At 31:30 I gave a wrong answer to a student --- I claimed that if every student reduced the augmented matrix so that the first nonzero entry in each row is 1 then every student would have the same matrix.  This isn’t true.  What’s true is that if every student reduced the augmented matrix until it was in reduced row echelon form (the first nonzero entry in each row is 1, every entry above and below a leading 1 is zero, the leading ones move to the right as you move down the rows, and any zero rows are at the bottom of the matrix) then all students would end up with the same RREF matrix.
32:00 --- How to use Matlab to find the RREF of a matrix. 
34:20 --- did another example, this time with infinitely many solutions, and showed how to write down the solution set.  In the example, there are 3 equations and 4 unknowns. One of the unknowns is set to be a free parameter. Does it really matter which unknown you choose as the free parameter?  In this example, you could set x1 = t and find x2 = some expression of t, x3 = some expression of t.  Alternatively, you could set x2 = t and find x1 = some expression of t, x3 = some expression of t. Alternatively, you could set x3 = t and find x1 = some expression of t, x2 = some expression of t.  It happens that the structure of the RREF of the augmented matrix is such that the x3 = t option is easiest.
41:20 --- introduced the concept of a “general solution” and then verified that it’s a solution, no matter what the value of the free parameter t. 
47:00 --- I went through the exercise of writing the solution when choosing to set x1 = t and finding x2 = some expression of t, x3 = some expression of t. Hopefully, this was a gross enough exercise to convince you of the charm of choosing x3 = t.
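For the Matlab rref mentioned at 32:00, a close Python analogue is SymPy's rref; below is a minimal sketch on a made-up augmented matrix (not the lecture's example) whose RREF has one free parameter.

```python
import sympy as sp

# Augmented matrix (A|b) for a hypothetical system with infinitely many solutions.
aug = sp.Matrix([[1, 2, -1, 3],
                 [2, 4, -2, 6],
                 [0, 1,  1, 1]])

# rref() plays the role of Matlab's rref: it returns the reduced matrix
# and the indices of the pivot (leading-one) columns.
R, pivots = aug.rref()
print(R)
print(pivots)   # non-pivot columns of the coefficient part correspond to free parameters
```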
Lecture 40: How to project onto a subspace (Nicholson section 8.1) | Link Linear Algebra Lecture Videos
Alternate Video Access via MyMedia | Video Duration: 48:50 
Description: Started by reminding students of something from the previous lecture: why finding S-perp is the same as solving a problem of the form Ax = 0. Also, what’s cool is that in the process of doing this you get a basis for S for free! Fixed a mistake I’d made in the previous lecture: one finds a basis for S by taking the row space of A, not the column space of A.
8:50 --- Did another example, this time where S is given as the span of two vectors in R^5. Found S-perp and also found a basis for S.
16:40 --- stated that if S is a subspace of R^n then dim(S)+dim(S-perp)=n.
17:35 --- Started discussion of projecting a vector onto a subspace.
18:45 --- reviewed how it works in R^2. Proj_S(x) is defined to be the vector s0 in S that is closest to x. In general, how do we find this vector s0?
23:10 --- Stated and proved the theorem “If s0 is in S and if x - s0 is orthogonal to S then s0 is the vector in S that is the closest to x.” This theorem gives us a way to find Proj_S(x)!
33:10 --- Did an example where S is the span of two vectors in R^4. Projected a specific vector onto S. Presented two different ways to find this projection: a fast way and a slow way. The slow way will turn out to be useful for something else, so don’t completely ignore it. The fast way involves finding an orthogonal basis for S, projecting onto each basis vector, and adding up the projections. Which is great, if you have a way of finding an orthogonal basis for S.
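The “fast way” from 33:10 --- project onto each vector of an orthogonal basis and add up the projections --- is easy to check numerically. Here's a minimal sketch in Python/NumPy with a made-up orthogonal basis for a 2-dimensional subspace of R^4 (not the lecture's example); note that x - Proj_S(x) comes out orthogonal to both basis vectors.

```python
import numpy as np

def proj_onto_subspace(x, orthogonal_basis):
    """Project x onto span(orthogonal_basis); the basis vectors must be mutually orthogonal."""
    return sum((x @ f) / (f @ f) * f for f in orthogonal_basis)

# Hypothetical orthogonal basis for a 2-dimensional subspace S of R^4.
f1 = np.array([1.0, 1.0, 0.0, 0.0])
f2 = np.array([1.0, -1.0, 2.0, 0.0])
x  = np.array([3.0, 1.0, 2.0, 5.0])

p = proj_onto_subspace(x, [f1, f2])
print(p)                              # the projection lies in S
print((x - p) @ f1, (x - p) @ f2)     # x - p is orthogonal to S (both dot products are 0)
```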
Lecture 41: Introduction to the Gram-Schmidt process (Nicholson Section 8.1) | Link Linear Algebra Lecture Videos
Alternate Video Access via MyMedia | Video Duration: 47:53
Description: Started class with a “why do we care about projections?” word problem --- how much powdered milk, Kraft Dinner, and Gatorade should you take backpacking so as to most-closely satisfy the daily required diet.
11:30 --- Reviewed that there were two different ways of projecting a vector onto a subspace and it’s super-easy if you happen to have an orthogonal basis for the subspace.
14:00 --- what if you have an orthonormal basis for the subspace? This makes the formula for the projection prettier but it makes the vectors in the orthonormal basis look uglier.
19:45 --- Given a basis, how can we create an orthogonal basis from it? I started with a simple example of two vectors in R^2.
25:00 --- what would have happened if I’d done that example but handled the vectors in a different order?
29:00 --- Considered S the span of three vectors in R^3. I’d like to find an orthogonal basis for S. But at this point I don’t even have a basis for S! The good news: applying the procedure to a spanning set for S will produce an orthogonal basis for S (and we can then determine the dimension of S). In this particular example, it turns out that the original spanning set was a basis for S.
42:15 --- I know that S = span of the given vectors because this is how I was given S. I then did a bunch of stuff to the spanning set. How do I know that the final set of vectors spans S? How do I know that I didn’t mess anything up? I raised this question but didn’t answer it.
43:15 --- Did another example where S is the span of three vectors and I sought an orthogonal set of vectors that spans S.
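Here is a minimal sketch, in Python/NumPy, of the procedure introduced in this lecture (classical Gram-Schmidt applied to a spanning set); the three vectors are made up, not the lecture's. Any vector that gets reduced to zero is discarded, so the output is an orthogonal basis for the span.

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Classical Gram-Schmidt: returns an orthogonal basis for span(vectors).
    Vectors whose remaining component is (numerically) zero are discarded."""
    basis = []
    for v in vectors:
        w = v - sum((v @ f) / (f @ f) * f for f in basis)
        if np.linalg.norm(w) > tol:
            basis.append(w)
    return basis

# Hypothetical spanning set for a subspace S of R^3.
vs = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]

for f in gram_schmidt(vs):
    print(f)
```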
Lecture 42: Finishing up the Gram-Schmidt process (Nicholson Section 8.1) | Link Linear Algebra Lecture Videos
Alternate Video Access via MyMedia | Video Duration: 35:56
Description: Reviewed the Gram-Schmidt process. This time I wrote the process in terms of projecting onto subspaces and subtracting off those projections.
10:20 --- Applied Gram-Schmidt to a set of six vectors in R^5 that span a subspace S. I’d like a basis for S and the dimension of S. What does it mean if you get zero vectors while applying the Gram-Schmidt process?
20:00 --- I made a mistake here. I should have asked students to accept that Span{v1,v2,v3,v4,v5,v6} = Span{w1,w2,w3,w4,w5,w6}. The v6 is missing on the blackboard. Doh! It’s corrected by a student eventually but still…
22:50 --- Gave a warning that Gram-Schmidt is super on paper but if you actually implement it on a computer you’ll find that it’s numerically unstable and roundoff error messes stuff up. If you want to do it on a computer you should use the Modified Gram-Schmidt method or something even more sophisticated.
24:50 --- I stated a theorem which will allow us to be confident about Gram-Schmidt not messing up the span at any step. Demonstrated how to use the theorem.
31:10 --- Proved the theorem for the special case of three vectors.
If you’re curious about the modified Gram-Schmidt method and how it compares to the vanilla Gram-Schmidt method, have a look at Solving the Normal Equations by QR and Gram-Schmid or The modified Gram-Schmidt procedure. Here's a nice document: Gram–Schmidt Orthogonalization: 100 Years and More, which includes some of the history behind Gram-Schmidt, modified Gram-Schmidt, least-squares approximation (another way of describing our best approximate solutions) and an interesting sci-fi connection. The notes are from an advanced course so don’t expect to understand all 50 slides.
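To illustrate the remark at 22:50, here is a minimal sketch in Python/NumPy of the modified Gram-Schmidt variant (the six random vectors in R^5 are made up for illustration); the only change from the classical version is that each projection is subtracted from the running vector as soon as it is computed.

```python
import numpy as np

def modified_gram_schmidt(vectors, tol=1e-12):
    """Modified Gram-Schmidt: subtract each projection as soon as a new
    orthogonal vector is available, which behaves better in floating point."""
    basis = []
    for v in vectors:
        w = v.astype(float)
        for f in basis:
            w -= (w @ f) / (f @ f) * f   # note: uses the *updated* w each time
        if np.linalg.norm(w) > tol:
            basis.append(w)
    return basis

# Six hypothetical vectors in R^5, so at least one gets discarded.
rng = np.random.default_rng(0)
vs = [rng.standard_normal(5) for _ in range(6)]
print(len(modified_gram_schmidt(vs)))   # dimension of the span (at most 5)
```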
Lecture 43: Non-diagonalizable matrices examples; all symmetric matrices are diagonalizable (set-up for Nicholson Section 8.2) | Link Linear Algebra Lecture Videos
Alternate Video Access via MyMedia | Video Duration: 37:20
Description: Started with a T/F question “Every (square) matrix can be diagonalized” and gave a counter-example. A better counter-example would have been the matrix [2 1; 0 2], so that you can see the eigenvalues 2, 2 as separate from the off-diagonal 1. It’s the off-diagonal 1 that stops the matrix from being diagonalizable. What went wrong? If there’s even one eigenvalue for which the algebraic multiplicity is larger than the geometric multiplicity, the matrix will not be diagonalizable.
10:50 --- None of the examples of non-diagonalizable matrices are symmetric. Is this a failure of imagination? Are there symmetric matrices that are non-diagonalizable? No. It’s a theorem --- all symmetric matrices are diagonalizable.
12:00 --- If A is upper triangular, lower triangular, or diagonal is it true that the eigenvalues are precisely on the diagonal? Yes. Did a 4x4 example to show why this is true --- you should verify that it’s true in general.  
15:45 --- Diagonalized a symmetric 3x3 matrix.
24:45 --- Note that none of those computations used the fact that A is symmetric. Here’s something interesting: the eigenvectors are mutually orthogonal! This is because A is symmetric. Demonstrated that P^T P is a diagonal matrix. Modified the eigenvectors (made them unit vectors) and this made P^T equal to P^{-1}. (We diagonalized the matrix without having to compute P^{-1} using the inverse matrix algorithm.)
34:30 --- Defined what it means for a square matrix to be orthogonally diagonalizable.
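A quick numerical check of the two main points of this lecture, in Python/NumPy: the [2 1; 0 2] example has an eigenvalue whose geometric multiplicity falls short of its algebraic multiplicity, while a symmetric matrix (the 2x2 one below is made up, not the lecture's 3x3 example) comes with orthonormal eigenvectors, so P^T = P^{-1}.

```python
import numpy as np

# The 2x2 example [2 1; 0 2]: eigenvalue 2 has algebraic multiplicity 2
# but only a 1-dimensional eigenspace, so the matrix is not diagonalizable.
A = np.array([[2.0, 1.0],
              [0.0, 2.0]])
vals, _ = np.linalg.eig(A)
print(vals)                                      # both eigenvalues equal 2
print(np.linalg.matrix_rank(A - 2 * np.eye(2)))  # rank 1, so geometric multiplicity = 2 - 1 = 1

# A symmetric matrix, by contrast, is orthogonally diagonalizable:
S = np.array([[2.0, 1.0],
              [1.0, 2.0]])
w, P = np.linalg.eigh(S)                 # eigh returns orthonormal eigenvectors for symmetric input
print(np.allclose(P.T @ P, np.eye(2)))   # True: P^T = P^{-1}
```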
Lecture 44: How to orthogonally diagonalize a symmetric matrix (Nicholson Section 8.2, optional) | Link Linear Algebra Lecture Videos
Alternate Video Access via MyMedia | Video Duration: 51:37 
Description: Started with an overview of the process for finding out if a matrix is diagonalizable.
4:40 --- Turned the discussion to symmetric matrices.
5:15 --- quick review of the 3x3 symmetric example from previous lecture.
8:50 --- review of orthogonal diagonalization for this example.
10:20 --- Why we care about orthogonal diagonalization at a physical level.
12:35 --- Diagonalized another symmetric 3x3 matrix. Found the eigenvalues and the eigenvectors that one gets via the usual process.
16:00 --- wrote down the 3 eigenvectors. Do they form an orthogonal set? No! That said, eigenvectors that have different eigenvalues are mutually orthogonal.
17:15 --- We have a basis for the eigenspace; can we transform it into an orthogonal basis? Time for Gram-Schmidt! Using this, I find a different pair of eigenvectors from the original pair of eigenvectors. Important thing for you to check: if I have vectors x and y and they’re both eigenvectors with eigenvalue λ then any linear combination of x and y will also be an eigenvector with eigenvalue λ. But if x is an eigenvector with eigenvalue λ and y is an eigenvector with eigenvalue μ where λ≠μ then linear combinations of x and y will not be eigenvectors unless they’re linear combinations where one (but not both) of the coefficients equals zero.
23:45 --- How the process presented at the beginning of class is modified for orthogonal diagonalization.
25:20 --- Gave students the challenge problem of finding a symmetric matrix that has specific eigenspaces. This is how the authors of textbooks generate the matrices in the exercises; how they have such nice eigenvalues and eigenvectors.
34:00 --- What happens to the eigenvalues of a matrix when you multiply the matrix by a scalar? The eigenvalues are multiplied by the same scalar. This is important to remember --- it’s a classic thing students get wrong.
37:00 --- Stated and proved the theorem that if A is symmetric and x is an eigenvector with eigenvalue λ and y is an eigenvector with eigenvalue μ where λ≠μ then x and y are orthogonal.
49:50 --- Finished class with three key T/F questions.
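A minimal sketch in Python/NumPy of orthogonally diagonalizing a symmetric matrix (the 3x3 matrix below is made up, not the lecture's example), plus a check of the fact from 34:00 that multiplying a matrix by a scalar multiplies its eigenvalues by the same scalar.

```python
import numpy as np

# Hypothetical symmetric 3x3 matrix (not the one from lecture).
A = np.array([[2.0, 1.0, 1.0],
              [1.0, 2.0, 1.0],
              [1.0, 1.0, 2.0]])

# eigh returns orthonormal eigenvectors, so P orthogonally diagonalizes A.
vals, P = np.linalg.eigh(A)
D = np.diag(vals)
print(np.allclose(P @ D @ P.T, A))   # True: A = P D P^T with P^T = P^{-1}

# Scaling the matrix by c scales every eigenvalue by c (34:00 in the lecture).
c = 3.0
print(np.linalg.eigvalsh(c * A), c * vals)   # same numbers
```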
Lecture 4: Reduced Row Echelon, Rank, Solutions | Link Linear Algebra Lecture Videos
Alternate Video Access via MyMedia | Video Duration: 46:04 |
Description: Started with a bird’s-eye view of how to solve a system of linear equations.
5:10 --- First example is 3 equations with 4 unknowns --- the augmented matrix is 3 x 5. If you had to make a bet, you’d guess that there’ll be infinitely many solutions. But don’t bet when you can find out the real answer.  The rank of the augmented matrix is 3 while the rank of the coefficient matrix is 2. Because 2 < 3, there are no solutions to the linear system. (This is a fancy way of saying that when you write down the linear system corresponding to the RREF of the augmented matrix, you get two equations that make sense but the third equation is 0 = 1, which has no solution. Whenever rank(CoeffMatrix) < rank(AugmentedMatrix), the linear system corresponding to the RREF of the augmented matrix has at least one equation of the form 0 = 1, and so there’s no solution.)
9:50 --- Another example with 3 equations with 4 unknowns. In this case, rank(CoeffMatrix) = rank(AugmentedMatrix) < number of unknowns. Because the two ranks are equal to one another, there’s at least one solution. Because the rank is less than the number of unknowns, there are infinitely many solutions, and because (number of unknowns) - rank(AugmentedMatrix) = 2, the general solution has two free parameters.
17:00 --- wrote the general solution in vector form and discussed the three vectors in the general solution.
23:55 --- Did another example, this one with 4 equations with 7 unknowns. The general solution has 4 free parameters.
31:00 --- Defined the rank of a matrix and discussed it more fully.
36:50 --- Stated the theorem about how the rank of the coefficient matrix and the rank of the augmented matrix determine whether there’s no, one, or infinitely many solutions.
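The rank test stated at 36:50 is easy to automate; here is a minimal sketch in Python/SymPy on a made-up 3x4 system (set up, like the 5:10 example, so that rank(CoeffMatrix) = 2 while rank(AugmentedMatrix) = 3).

```python
import sympy as sp

A = sp.Matrix([[1, 2, -1, 1],
               [2, 4, -2, 2],
               [1, 2,  0, 3]])      # hypothetical 3x4 coefficient matrix
b = sp.Matrix([1, 3, 2])

aug = A.row_join(b)
rA, rAug, n = A.rank(), aug.rank(), A.cols

if rA < rAug:
    print("no solution")            # an equation 0 = 1 appears in the RREF
elif rA == n:
    print("exactly one solution")
else:
    print(f"infinitely many solutions with {n - rA} free parameter(s)")
```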
Lecture 5: Vectors, dot products, solutions of Ax=b (Nicholson, Sections 4.1 & 4.2) | Link Linear Algebra Lecture Videos
Alternate Video Access via MyMedia | Video Duration: 27:45 |
Description: Missing lecture: The next thing in Nicholson’s book is Section 1.3 “homogeneous systems of linear equations; trivial and non-trivial solutions; linear combinations of solutions; basic solutions”. In the previous book and in the lectures from Fall 2016 and Fall 2017, these concepts were introduced but were interwoven with other material which you haven’t been introduced to yet. There’s no way to disentangle the material and so I have no lecture videos to offer you on the topic.
Started with a quick review of Cartesian coordinates, vectors, vector addition, and scalar multiplication of vectors. The most important thing to keep track of is the difference between a point P(p1,p2), which has coordinates p1 and p2, and the position vector of this point, which is a vector whose tail is at the origin and whose tip is at P(p1,p2). In the course, we often use position vectors and points interchangeably and this can be very confusing sometimes.
5:58-6:22 --- ignore this part.
6:22 --- introduced the dot product for vectors in R^2. First defined algebraically: the dot product of u = [u1;u2] and v = [v1;v2] is u1*v1+u2*v2. Second, defined it geometrically: if you know the length of the two vectors and the angle between them then you can construct the dot product. (This leads to the natural question: if I gave a pair of vectors to two students and asked student A to compute the dot product using the algebraic definition and asked student B to compute the dot product using the geometric definition, would students A and B always get the exact same answer?) The dot product can be defined in two ways and the two different ways of defining it give the same answer and are useful in different ways. This is a powerful and important aspect of the dot product, not discussed in the book.
10:00 --- generalized the dot product and length to vectors in R^n.
12:10 --- introduced what it means for two vectors to be orthogonal. Note! The zero vector is orthogonal to all vectors.
19:40 --- stated the theorem that tells us how the dot product interacts with scalar multiplication and with vector addition.
23:30 --- Multiplying a matrix and a vector to verify a solution of a system of linear equations.
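A minimal sketch in Python/NumPy of the two definitions of the dot product from 6:22 agreeing on a made-up pair of vectors, plus an orthogonality check and a matrix-vector product of the kind used at 23:30 to verify a solution (all the matrices and vectors here are illustrative only).

```python
import numpy as np

u = np.array([3.0, 4.0])
v = np.array([1.0, 2.0])

# Algebraic definition: u1*v1 + u2*v2.
algebraic = u @ v

# Geometric definition: ||u|| ||v|| cos(theta), with theta recovered from the vectors.
theta = np.arccos(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
geometric = np.linalg.norm(u) * np.linalg.norm(v) * np.cos(theta)
print(algebraic, geometric)      # the two definitions give the same number

# Orthogonality: dot product equal to zero.
w = np.array([4.0, -3.0])
print(u @ w)                     # 0.0, so u and w are orthogonal

# Verifying a proposed solution of Ax = b by a matrix-vector product (23:30):
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
x = np.array([1.0, 1.0])
print(A @ x)                     # compare against the right-hand side b
```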
Lecture 6: Introduction to Planes (Nicholson, Section 4.2) | Link Linear Algebra Lecture Videos
Alternate Video Access via MyMedia | Video Duration: 15:06 |
Description: Reminded students of Cartesian Coordinates. Language: given a point P(p1,p2) in the Cartesian plane, it will be represented using a position vector P (in italics) whose tail is at the origin O(0,0) and whose head is at the point P(p1,p2).
4:00 --- introduced the vector representation of a plane through the origin in R^3. I used the language of subspaces --- the span of two vectors --- if you don’t know this yet, ignore that and keep going…
6:28 --- introduced the vector representation of a plane through the point P(p1,p2,p3) in R^3. How the vector representation of a plane is related to the scalar representation of a plane.
8:30 --- Given a plane that’s not through the origin, consider a point P(p1,p2,p3) that’s in the plane. Does this mean that the point’s position vector p lies in the plane? No! But if P(p1,p2,p3) and Q(q1,q2,q3) lie in the plane then the vector q-p will be parallel to the plane.
10:10 --- how to find the scalar equation for a plane from a normal vector to the plane and a point in the plane.
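A minimal sketch in Python/NumPy of the step at 10:10 --- getting a scalar equation of a plane from a normal vector and a point --- using a made-up normal vector and point.

```python
import numpy as np

# Hypothetical normal vector and point in the plane.
n = np.array([1.0, 2.0, -1.0])
P = np.array([3.0, 0.0, 1.0])

# Scalar equation: n . (x - p) = 0, i.e. n . x = n . p.
d = n @ P
print(f"{n[0]}x + {n[1]}y + {n[2]}z = {d}")

# Any point Q in the plane satisfies n . q = d; check one:
Q = P + np.array([2.0, -1.0, 0.0])   # the direction [2,-1,0] is orthogonal to n, so Q stays in the plane
print(np.isclose(n @ Q, d))          # True
```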
Lecture 7: Projection onto a vector, Projection perpendicular to a vector (Nicholson, Section 4.2) | Link Linear Algebra Lecture Videos
Alternate Video Access via MyMedia | Video Duration: 47:51 |
Description: Review of definition of the dot product of two vectors.
4:02 --- Projection of the vector y onto the vector x. Presented a derivation of Proj_x (y) using SOHCAHTOA and the geometric definition of the dot product.
12:30 --- example using two specific vectors.
15:34 --- introduced Perp_x (y) and showed how to find it once you’ve found Proj_x (y).
22:45 --- verify that Proj_x (y) is parallel to x and that Perp_x (y) is orthogonal to x.  Verify that Proj_x (y) + Perp_x (y) = y.
25:30 --- presented a second derivation of Proj_x (y); this derivation is based on the algebraic definition of the dot product.
32:20 --- Find the distance from a point to a plane. This is where we really need to be careful about the difference between a point P(p1,p2,p3) and its position vector p.
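A minimal sketch in Python/NumPy of Proj_x (y) and Perp_x (y), plus the point-to-plane distance from 32:20; all the vectors here are made up for illustration.

```python
import numpy as np

def proj(y, x):
    """Projection of y onto the vector x."""
    return (x @ y) / (x @ x) * x

def perp(y, x):
    """Component of y orthogonal to x."""
    return y - proj(y, x)

x = np.array([1.0, 2.0, 2.0])
y = np.array([3.0, 0.0, 4.0])
print(proj(y, x), perp(y, x))
print(proj(y, x) + perp(y, x))      # recovers y
print(perp(y, x) @ x)               # 0: Perp_x(y) is orthogonal to x

# Distance from a point Q to the plane through P with normal n:
# it is the length of the projection of (q - p) onto n.
n, P, Q = np.array([1.0, 2.0, -1.0]), np.array([3.0, 0.0, 1.0]), np.array([1.0, 1.0, 1.0])
print(np.linalg.norm(proj(Q - P, n)))
```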
Lecture 8: Review of projections, Introduction to the cross product (Nicholson, Section 4.2) | Link Linear Algebra Lecture Videos
Alternate Video Access via MyMedia | Video Duration: 49:52 |
Description: Started with a correction to a mistake in previous lecture then did a review of Proj_x (y) and Perp_x (y).
7:00 --- showed that Proj_x (Perp_x (y)) = 0.
25:20 --- started with cross products. The dot product of two vectors works for any two vectors in R^n. The cross product of two vectors only works for vectors in R^3. The cross product of two vectors in R^3 is a vector in R^3.
27:45 --- it’s possible to generalize cross products in some sense --- for example, given 4 vectors in R^5 there’s a way of using them to create a 5th vector in R^5. This is analogous to a cross product on R^5.
30:00 --- Given a vector equation of a plane, find a scalar equation of the plane. This means that you need to find a normal vector to the plane. This can be done using the cross product of two vectors that are parallel to the plane. (Or it can be done by solving a system of 2 linear equations in 3 unknowns…) Verified that the cross product is orthogonal to the vectors that created it.
42:00 --- presented the formula for how to compute the cross product of two vectors in R^3.
47:00 --- the cross product of a vector with itself is the zero vector. Showed that u x v = -v x u. Proved that u ⋅ (u x v) = 0.
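A quick numerical check, in Python/NumPy, of the cross-product facts from this lecture (anti-symmetry, orthogonality to both inputs, u x u = 0), on a made-up pair of vectors.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 0.0, -1.0])

c = np.cross(u, v)
print(c)
print(np.allclose(np.cross(v, u), -c))   # u x v = -(v x u)
print(u @ c, v @ c)                      # both 0: the cross product is orthogonal to u and v
print(np.cross(u, u))                    # the cross product of a vector with itself is the zero vector
```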
Lecture 9: Properties and Uses of the Cross Product (Nicholson, Section 4.3) | Link Linear Algebra Lecture Videos
Alternate Video Access via MyMedia | Video Duration: 48:23 |
Description:
1:55 --- the properties of the cross product: scalar multiplication, vector addition, anti-symmetry.
5:00 --- find a scalar equation for the plane that contains three given points. (Apologies for the video – the camera person wasn’t following the blackboards as well as usual…)
16:30 --- introduced ||u x v|| = || u || ||v|| sin(theta) where theta is the angle between u and v.  Included a discussion of why it’s sin(theta) and not |sin(theta)|.
21:12 --- How the cross product is related to the area of a parallelogram.
25:08 --- how to use a dot product and a cross product to compute the volume of a parallelepiped.
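A minimal sketch in Python/NumPy of the area and volume facts from 16:30-25:08, using made-up vectors: ||u x v|| is the area of the parallelogram with sides u and v, and |u ⋅ (v x w)| is the volume of the parallelepiped with edges u, v, w.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 0.0, -1.0])
w = np.array([2.0, 1.0, 5.0])

# Area of the parallelogram with sides u and v.
area = np.linalg.norm(np.cross(u, v))

# Volume of the parallelepiped with edges u, v, w: |u . (v x w)|.
volume = abs(u @ np.cross(v, w))
print(area, volume)

# ||u x v|| = ||u|| ||v|| sin(theta), with theta the angle between u and v.
theta = np.arccos(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
print(np.isclose(area, np.linalg.norm(u) * np.linalg.norm(v) * np.sin(theta)))   # True
```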