4. Orthogonality
4.1. Orthogonality of the Four Subspaces
The other fundamental theorems of Linear Algebra are also collected at this link, so take a look.
Those four fundamental subspaces reveal what a matrix really does.
Def.
Orthogonal subspaces
Two subspaces V and W of a vector space are orthogonal if every vector v in V is perpendicular to every vector w in W.
v^t w = 0 for all v in V and all w in W.
Observations.
Orthogonality is impossible when dim V + dim W > dim whole space.
The zero vector is the only vector that lies in both the nullspace and the row space, and those two subspaces meet at 90°. The same holds for the left nullspace and the column space.
Proposition.
The nullspace N(A) and the row space C(A^t) are orthogonal subspaces of R^n.
The left nullspace N(A^t) and the column space C(A) are orthogonal in R^m.
Proof.
Practical: if x is in N(A), then Ax = 0 says that every row of A has zero dot product with x. So x is perpendicular to every row, and therefore to every combination of the rows, i.e. to the whole row space C(A^t).
Mathematical: for any x in N(A) and any A^t y in the row space, (A^t y)^t x = y^t (Ax) = y^t 0 = 0. The same argument applied to A^t gives the second statement.
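A quick numerical check of this proposition, as a NumPy sketch with an arbitrary 3x4 matrix (the matrix and the rank tolerance are my own choices, not from the book):

```python
import numpy as np

# An arbitrary 3x4 example matrix (rank 2), chosen only for illustration.
A = np.array([[1., 2., 3., 4.],
              [2., 4., 6., 8.],
              [1., 0., 1., 0.]])

# Nullspace basis from the SVD: right singular vectors for (near-)zero singular values.
U, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))
N = Vt[rank:].T                    # columns of N span N(A)

# The rows of A span the row space C(A^t); each row should be perpendicular
# to every nullspace vector, so A @ N should be the zero matrix.
print(np.allclose(A @ N, 0))       # True
```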
Orthogonal Complements.
Those fundamental subspaces are not only orthogonal, they are orthogonal complements.
Def.
The orthogonal complement of a subspace V contains every vector that is perpendicular to V. This orthogonal subspace is denoted by V perp.
In an orthogonal complement, the meaning carried by the word "complement" is very important.
Fundamental Theorem of Linear Algebra, Part 2.
N(A) is the orthogonal complement of the row space C(At) in Rn.
N(At) is the orthogonal complement of the column space C(A) in Rm.
Part 1 gave the dimensions of the four subspaces. Part 2 gives the 90° angles between them.
The point of "complements" is that every x can be split into a row space component xr and a nullspace component xn. When A multiplies x = xr + xn, the figure (Figure 4.3 in the book) shows what happens:
The nullspace component goes to zero: Axn=0
The row space component goes to the column space: Axr=Ax.
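A small NumPy sketch of this splitting (the matrix and x are arbitrary; pinv is used only as a convenient way to project onto the row space):

```python
import numpy as np

# Same illustrative A as before; x is an arbitrary vector in R^4.
A = np.array([[1., 2., 3., 4.],
              [2., 4., 6., 8.],
              [1., 0., 1., 0.]])
x = np.array([1., 1., 1., 1.])

# pinv(A) @ A is the orthogonal projector onto the row space C(A^t).
xr = np.linalg.pinv(A) @ A @ x     # row space component
xn = x - xr                        # nullspace component

print(np.allclose(A @ xn, 0))      # the nullspace component goes to zero
print(np.allclose(A @ xr, A @ x))  # the row space component carries all of Ax
```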
Combining Bases from Subspaces
Some valuable facts about bases.
Any n independent vectors in Rn must span Rn. So they are a basis.
Any n vectors that span Rn must be independent. So they are a basis.
When the vectors go into the columns of an n by n square matrix, here are the same two facts.
If the n columns of A are independent, they span Rn. So Ax = b is solvable.
If the n columns span Rn, they are independent. So Ax = b has only one solution.
Proof.
Uniqueness implies existence and existence implies uniqueness.
Proposition.
Given a matrix with bases for the fundamental subspaces,
the r basis vectors of the row space together with the n−r basis vectors of the nullspace form n independent vectors in R^n;
the r basis vectors of the column space together with the m−r basis vectors of the left nullspace form m independent vectors in R^m.
Proof.
If a combination of all n vectors gives xr + xn = 0, then xr = −xn is in both subspaces. A vector in both subspaces is perpendicular to itself, so xr = xn = 0, which proves independence of the n vectors together.
4.2. Projections
Two questions.
Show that projections are easy to visualize.
Projection matrices: symmetric matrices with P^2 = P. The projection of b is Pb.
Find the combination p = x^hat1 a1 + … + x^hatn an closest to a given vector b. ⇔ We are projecting each b in R^m onto the subspace spanned by the a's, to get p.
The problem of projecting a vector onto a subspace comes from the questions above. The key idea for solving it comes from geometry: use the fact that the error vector is orthogonal to the given subspace.
Projection Onto a Line
Projection Onto a Subspace
A^t A x^hat = A^t b
p = A x^hat = A(A^tA)^-1 A^t b = Pb
Since A^t (b − A x^hat) = 0, the error vector e = b − A x^hat lies in the left nullspace N(A^t).
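A minimal NumPy sketch of these projection formulas, with a small A and b chosen only for illustration:

```python
import numpy as np

# A has two independent columns; b is an arbitrary vector to project onto C(A).
A = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
b = np.array([1., 2., 3.])

x_hat = np.linalg.solve(A.T @ A, A.T @ b)    # normal equations A^t A x^hat = A^t b
p = A @ x_hat                                # projection p = A x^hat
P = A @ np.linalg.inv(A.T @ A) @ A.T         # projection matrix P = A (A^t A)^-1 A^t

e = b - p
print(np.allclose(A.T @ e, 0))                        # e lies in N(A^t)
print(np.allclose(P @ P, P), np.allclose(P, P.T))     # P^2 = P and P is symmetric
```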
Proposition.
A^tA is invertible if and only if A has linearly independent columns.
Proof.
We will show that A^tA has the same nullspace as A.
(⇒) The columns of A are independent → A^tA is invertible.
Suppose A^tAx = 0. Then x^t A^t A x = (Ax)^t (Ax) = norm(Ax)^2 = 0, so Ax = 0. Independent columns mean N(A) contains only the zero vector, so x = 0. Hence N(A^tA) = {0} and the square matrix A^tA is invertible.
(⇐) A^tA is invertible → the columns of A are independent.
Suppose Ax = 0. Then A^tAx = 0, and invertibility gives x = 0. So the only solution of Ax = 0 is x = 0: the columns of A are independent.
In both directions the key fact is N(A^tA) = N(A).
4.3. Least Squares Approximations
https://en.wikipedia.org/wiki/Least_squares
The method of least squares is a standard approach in regression analysis to approximate the solution of overdetermined systems (sets of equations in which there are more equations than unknowns) by minimizing the sum of the squares of the residuals made in the results of every single equation.
https://en.wikipedia.org/wiki/Regression_analysis
In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome variable') and one or more independent variables (often called 'predictors', 'covariates', or 'features'). The most common form of regression analysis is linear regression, in which a researcher finds the line (or a more complex linear combination) that most closely fits the data according to a specific mathematical criterion.
Ax = b can have no solution ⇔ too many equations ⇔ an overdetermined system ⇔ the error e = b − Ax cannot always be made zero.
When e is made as small as possible, x^hat is the least squares solution.
Here is a short unofficial way.
When Ax = b has no solution, multiply by A^t and solve A^tA x^hat = A^t b. We must connect projections to least squares.
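A sketch of this "unofficial way" in NumPy, checked against numpy.linalg.lstsq (the data are arbitrary):

```python
import numpy as np

# An overdetermined system: 4 equations, 2 unknowns (arbitrary example data).
A = np.array([[2., 1.],
              [1., 3.],
              [0., 1.],
              [4., 2.]])
b = np.array([1., 2., 3., 4.])

# "Multiply by A^t and solve A^t A x^hat = A^t b."
x_hat = np.linalg.solve(A.T @ A, A.T @ b)

# numpy's least squares routine minimizes norm(Ax - b)^2 directly; same answer.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x_hat, x_lstsq))           # True
```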
Minimizing the Error
By geometry
https://ccrma.stanford.edu/~jos/sasp/Geometric_Interpretation_Least_Squares.html
https://www.youtube.com/watch?v=oWuhZuLOEFY
The nearest point is the projection p.
By algebra
Every vector b = p (in the column space) + e (in the nullspace of A^t).
Ax = b = p + e is impossible; A x^hat = p is solvable.
E = norm(Ax − b)^2 = norm(Ax − p)^2 + norm(e)^2.
By calculus
Most functions are minimized by calculus: the graph bottoms out and the derivative in every direction is zero. ⇔ For a continuous multivariate function, the minimum occurs at a point where every partial derivative is zero (a locally optimal point).
In least squares approximation, the error is a function of the unknowns, E(x1, x2), so we solve by setting the partial derivatives with respect to x1 and x2 to zero. ⇔ A^tA x^hat = A^t b.
Partial derivatives ⇔ A^tA x^hat = A^t b? (not proven in the book)
The Big Picture
In Figure 4.3, x was split into xr + xn. There were many solutions to Ax = b. In this section, the situation is just the opposite: there are no solutions to Ax = b.
Instead of Ax = b, we solve A x^hat = p.
N(A) contains only the zero vector (A has independent columns), so A^tA is invertible. The equation A^tA x^hat = A^t b fully determines the best vector x^hat.
Chapter 7 will have the complete picture - all four subspaces included. x = xr + xn, b = p + e.
Fitting a Straight Line
Fitting a line is the clearest application of least squares.
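For example, a short NumPy sketch that fits b = C + Dt to three sample points (the numbers are illustrative):

```python
import numpy as np

# Three sample data points (t_i, b_i); fit the straight line b = C + D*t.
t = np.array([0., 1., 2.])
b = np.array([6., 0., 0.])

A = np.column_stack([np.ones_like(t), t])    # columns: [1, t]
C, D = np.linalg.solve(A.T @ A, A.T @ b)     # normal equations A^t A x^hat = A^t b
print(C, D)                                  # best line here: b = 5 - 3t
```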
Fitting a Parabola
The model b = C + Dt + Et^2 is nonlinear in t, but it is linear in the three unknowns C, D, E, so the same least squares approximation applies with three unknowns.
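A sketch with three unknowns, assuming five sample points chosen only for illustration:

```python
import numpy as np

# Fit the parabola b = C + D*t + E*t^2 to five sample points (arbitrary data).
t = np.array([0., 1., 2., 3., 4.])
b = np.array([1., 3., 4., 4., 6.])

A = np.column_stack([np.ones_like(t), t, t**2])   # columns: [1, t, t^2]
C, D, E = np.linalg.solve(A.T @ A, A.T @ b)       # A^t A x^hat = A^t b, three unknowns
print(C, D, E)
```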
4.4. Orthogonal Bases and Gram-Schmidt
https://en.wikipedia.org/wiki/QR_decomposition
http://www.seas.ucla.edu/~vandenbe/133A/lectures/qr.pdf
https://m.blog.naver.com/PostList.nhn?blogId=ldj1725&categoryNo=6&logCode=0
https://web.calpoly.edu/~brichert/teaching/oldclass/f2002217/solutions/solutions11.pdf
This section has two goals.
See how orthogonality makes it easy to find x hat and p and P.
Construct orthogonal vectors.
Def.
The vectors q1, …, qn are orthonormal if
q_i^t q_j = 0 when i ≠ j, and q_i^t q_j = 1 when i = j.
A matrix with orthonormal columns is assigned to the special letter Q.
The matrix Q is easy to work with because QtQ=I.
When Q is rectangular, Q^t is only a left-inverse: Q^tQ = I but QQ^t ≠ I in general.
When Q is square, Q^tQ = I means that Q^t = Q^-1: transpose = inverse.
Examples of orthogonal bases.
Matrices for rotation, permutation, reflection.
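Small concrete instances of these three, as a NumPy sketch (the rotation angle, permutation, and reflection direction are arbitrary choices):

```python
import numpy as np

# One rotation, one permutation, one reflection.
theta = np.pi / 3
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

permutation = np.eye(3)[[2, 0, 1]]                # rows of I reordered

u = np.array([[1.], [1.]]) / np.sqrt(2)           # unit vector
reflection = np.eye(2) - 2 * u @ u.T              # Householder reflection

for Q in (rotation, permutation, reflection):
    x = np.arange(1., Q.shape[1] + 1.)            # a test vector of matching size
    print(np.allclose(Q.T @ Q, np.eye(Q.shape[1])),              # Q^t Q = I
          np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))  # lengths preserved
```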
Proposition.
If Q has orthonormal columns (Q^tQ = I), it leaves lengths unchanged:
Same length: norm(Qx)=norm(x) for every vector x.
Q also preserves dot products (Qx)t(Qy)=xtQtQy=xty.
Projections Using Orthogonal Bases: Q Replaces A.
In the projection formulas, if the columns of A are orthonormal (so A is a Q), then A^tA = Q^tQ = I and the formulas simplify: x^hat = Q^t b, p = Q Q^t b, P = Q Q^t.
Important case: When Q is square and m=n, the subspace is the whole space. Then Qt=Q^-1 and x hat = Qtb.
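A NumPy sketch of this simplification; here Q is built from an arbitrary A with numpy.linalg.qr rather than by hand:

```python
import numpy as np

# Orthonormal basis Q for C(A), obtained from numpy's QR factorization.
A = np.array([[1., 0.],
              [1., 1.],
              [1., 2.]])
b = np.array([6., 0., 0.])

Q, _ = np.linalg.qr(A)          # columns of Q are an orthonormal basis for C(A)

x_hat = Q.T @ b                 # no inverse needed: Q^t Q = I
p = Q @ x_hat                   # projection of b onto the column space
P = Q @ Q.T                     # projection matrix

# Same projection as the general formula A (A^t A)^-1 A^t b.
print(np.allclose(P @ b, A @ np.linalg.solve(A.T @ A, A.T @ b)))
```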
The Gram-Schmidt Process
The projection and least squares formulas above involve A^tA, so there is a computational advantage if A can be replaced by a matrix Q whose columns form an orthonormal basis.
For this to be true, we had to say “If the vectors are orthonormal”. Now we find a way to create orthonormal vectors.
Gram-Schmidt formulas (a, b, c are the original vectors, A, B, C the orthogonalized ones, q1, q2, q3 the normalized ones):
A = a
B = b − (A^t b / A^t A) A
C = c − (A^t c / A^t A) A − (B^t c / B^t B) B
q1 = A / norm(A), q2 = B / norm(B), q3 = C / norm(C)
Subtract from every new vector its projections in the directions already set.
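A minimal classical Gram-Schmidt sketch following this rule (subtract the projections in the directions already set, then normalize); it assumes the input columns are independent and is not the book's modified version:

```python
import numpy as np

def gram_schmidt(A):
    """Return Q with orthonormal columns spanning the same space as A's columns."""
    m, n = A.shape
    Q = np.zeros((m, n))
    for j in range(n):
        v = A[:, j].astype(float).copy()
        for i in range(j):
            v -= (Q[:, i] @ A[:, j]) * Q[:, i]   # subtract projection onto q_i
        Q[:, j] = v / np.linalg.norm(v)          # normalize (fails if columns are dependent)
    return Q

A = np.array([[1., 1., 0.],
              [1., 0., 1.],
              [0., 1., 1.]])
Q = gram_schmidt(A)
print(np.allclose(Q.T @ Q, np.eye(3)))           # orthonormal columns
```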
The Factorization A=QR
Since the vectors a, b, c are combinations of the q’s (and vice versa), there must be a third matrix connecting A to Q. This third matrix is the triangular R in A=QR.
A = [a b c] = [q1 q2 q3] R, where
R = [ q1^t a   q1^t b   q1^t c ]
    [   0      q2^t b   q2^t c ]
    [   0        0      q3^t c ]
Also R=QtA.
This non-involvement of later vectors is the key point of Gram-Schmidt:
The vectors a and A and q1 are all along a single line.
The vectors a, b and A, B and q1, q2 are all in the same plane.
The vectors a, b, c and A, B, C and q1, q2, q3 are in one subspace (dimension 3).
Summary.
R is invertible, since diag(R) ≠ 0.
Finally, the factorization A = QR makes the least squares approximation simple: A^tA x^hat = A^t b becomes R^tR x^hat = R^t Q^t b, i.e. R x^hat = Q^t b. Also, the modified Gram-Schmidt algorithm introduced in the book computes Q and R in mn^2 operations.
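A closing sketch using numpy's built-in QR rather than the book's modified Gram-Schmidt; with A = QR, the normal equations reduce to R x^hat = Q^t b (the data below are arbitrary):

```python
import numpy as np

# Arbitrary overdetermined example: 2 unknowns, 4 equations.
A = np.array([[1., 0.],
              [1., 1.],
              [1., 2.],
              [1., 3.]])
b = np.array([0., 1., 2., 5.])

Q, R = np.linalg.qr(A)               # A = QR with Q^t Q = I, R upper triangular
x_hat = np.linalg.solve(R, Q.T @ b)  # R x^hat = Q^t b (back substitution, since R is triangular)

# Same answer as minimizing norm(Ax - b)^2 directly.
print(np.allclose(x_hat, np.linalg.lstsq(A, b, rcond=None)[0]))
```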