Sylvester's criterion

In mathematics, Sylvester’s criterion is a necessary and sufficient criterion to determine whether a Hermitian matrix is positive-definite. It is named after James Joseph Sylvester.

Sylvester's criterion states that an n × n Hermitian matrix M is positive-definite if and only if all the following matrices have a positive determinant:

the upper left 1-by-1 corner of M,
the upper left 2-by-2 corner of M,
the upper left 3-by-3 corner of M,
$\vdots$
M itself.

In other words, all of the leading principal minors must be positive. By using appropriate permutations of rows and columns of M, it can also be shown that the positivity of any nested sequence of n principal minors of M is equivalent to M being positive-definite.^[1]
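
The test described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes a small real symmetric matrix with integer or rational entries (the Hermitian case with complex entries is not covered), and the helper names `det`, `leading_principal_minors` and `is_positive_definite` are hypothetical names chosen for this sketch.

```python
def det(m):
    """Determinant by Laplace expansion along the first row.

    Exact for integer entries, but only sensible for small matrices.
    """
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def leading_principal_minors(m):
    """Determinants of the upper left 1-by-1, 2-by-2, ..., n-by-n corners."""
    return [det([row[:k] for row in m[:k]]) for k in range(1, len(m) + 1)]


def is_positive_definite(m):
    """Sylvester's criterion: every leading principal minor is positive."""
    return all(d > 0 for d in leading_principal_minors(m))


A = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]  # leading minors 2, 3, 4
B = [[1, 2], [2, 1]]                       # leading minors 1, -3

print(leading_principal_minors(A), is_positive_definite(A))
print(leading_principal_minors(B), is_positive_definite(B))
```

For floating-point input the strict comparison `d > 0` becomes sensitive to rounding, so a numerical implementation would more likely attempt a Cholesky factorization instead.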

An analogous theorem holds for characterizing positive-semidefinite Hermitian matrices, except that it is no longer sufficient to consider only the leading principal minors: a Hermitian matrix M is positive-semidefinite if and only if all principal minors of M are nonnegative.^[2]^[3]
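
A small counterexample shows why the leading principal minors alone do not suffice in the semidefinite case. The sketch below assumes a real symmetric matrix with integer entries; `det`, `principal_minors` and `is_positive_semidefinite` are hypothetical helper names. The matrix `M` has leading principal minors 0 and 0, yet its lower right entry is a negative principal minor, so it is not positive-semidefinite.

```python
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion (exact for integers, small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def principal_minors(m):
    """Map each nonempty index set to the determinant of that principal submatrix."""
    n = len(m)
    minors = {}
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            minors[idx] = det([[m[i][j] for j in idx] for i in idx])
    return minors


def is_positive_semidefinite(m):
    """All principal minors (not only the leading ones) must be nonnegative."""
    return all(value >= 0 for value in principal_minors(m).values())


M = [[0, 0], [0, -1]]
leading = [det([row[:k] for row in M[:k]]) for k in (1, 2)]

print(leading)                      # both leading minors are 0
print(principal_minors(M))          # the minor for index set (1,) is -1
print(is_positive_semidefinite(M))  # False
```

Note that an n × n matrix has 2^n − 1 principal minors, so this exhaustive check is practical only for small n.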

Elementary proof

Suppose $M_{n}$ is an $n\times n$ Hermitian matrix, $M_{n}^{\dagger }=M_{n}$. Let $M_{k},k=1,\ldots ,n$ be the leading principal submatrices, that is, the $k\times k$ upper left corner matrices. We first show that if $M_{n}$ is positive definite, then the leading principal minors are positive, that is, $\det M_{k}>0$. It is easy to see that $M_{k}$ is positive definite by choosing

x=\left({\begin{array}{c}x_{1}\\\vdots \\x_{k}\\0\\\vdots \\0\end{array}}\right)=\left({\begin{array}{c}{\vec {x}}\\0\\\vdots \\0\end{array}}\right)

and noticing that $0<x^{\dagger }M_{n}x={\vec {x}}^{\dagger }M_{k}{\vec {x}}$ for every nonzero ${\vec {x}}$. Equivalently, the eigenvalues of $M_{k}$ are positive. Since the determinant is the product of the eigenvalues, it follows that $\det M_{k}>0$. This completes the first part.

For the converse, we use induction on the size of the matrix; the $1\times 1$ case is immediate. The most general form of an $(n+1)\times (n+1)$ Hermitian matrix is

M_{n+1}=\left({\begin{array}{cc}M_{n}&{\vec {v}}\\{\vec {v}}^{\dagger }&d\end{array}}\right)\qquad (*)

where $M_{n}$ is an $n\times n$ Hermitian matrix, ${\vec {v}}$ is a vector and $d$ is a real constant. Suppose the criterion holds for $M_{n}$; we show that it holds for $M_{n+1}$. Assume that all the leading principal minors of $M_{n+1}$ are positive. This implies that $\det M_{n+1}>0$, $\det M_{n}>0$, and that $M_{n}$ is positive definite by the inductive hypothesis. Denote

x=\left({\begin{array}{c}{\vec {x}}\\x_{n+1}\end{array}}\right)

then

x^{\dagger }M_{n+1}x={\vec {x}}^{\dagger }M_{n}{\vec {x}}+x_{n+1}{\vec {x}}^{\dagger }{\vec {v}}+{\bar {x}}_{n+1}{\vec {v}}^{\dagger }{\vec {x}}+d|x_{n+1}|^{2}

Let's "complete the square":

=({\vec {x}}^{\dagger }+{\vec {v}}^{\dagger }M_{n}^{-1}{\bar {x}}_{n+1})M_{n}({\vec {x}}+x_{n+1}M_{n}^{-1}{\vec {v}})-|x_{n+1}|^{2}{\vec {v}}^{\dagger }M_{n}^{-1}{\vec {v}}+d|x_{n+1}|^{2}

=({\vec {x}}+{\vec {c}})^{\dagger }M_{n}({\vec {x}}+{\vec {c}})+|x_{n+1}|^{2}(d-{\vec {v}}^{\dagger }M_{n}^{-1}{\vec {v}})

where

{\vec {c}}=x_{n+1}M_{n}^{-1}{\vec {v}}

(We know that $M_{n}^{-1}$ exists because the eigenvalues of $M_{n}$ are all positive.) The first term is nonnegative because $M_{n}$ is positive definite by the inductive hypothesis; we now examine the sign of the second term. We use the block matrix determinant formula

\det \left({\begin{array}{cc}A&B\\C&D\end{array}}\right)=\det A\det(D-CA^{-1}B)

on $(*)$ to obtain

\det M_{n+1}=\det M_{n}(d-{\vec {v}}^{\dagger }M_{n}^{-1}{\vec {v}})>0

which implies

d-{\vec {v}}^{\dagger }M_{n}^{-1}{\vec {v}}>0

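Both terms are therefore nonnegative, and for $x\neq 0$ they cannot vanish together: if $x_{n+1}\neq 0$ the second term is positive, and if $x_{n+1}=0$ then ${\vec {c}}=0$ and ${\vec {x}}\neq 0$, so the first term is positive. Hence for every $x\neq 0$ we have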

x^{\dagger }M_{n+1}x>0.

\Box

The block matrix determinant formula follows from taking the determinant of both sides of the following identity:

\left({\begin{array}{cc}A&B\\C&D\end{array}}\right)=\left({\begin{array}{cc}I&0\\CA^{-1}&I\end{array}}\right)\left({\begin{array}{cc}A&0\\0&D-CA^{-1}B\end{array}}\right)\left({\begin{array}{cc}I&A^{-1}B\\0&I\end{array}}\right).
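
The determinant identity used in the inductive step can be checked on a concrete matrix. The following sketch assumes a real symmetric 3 × 3 example with a 2 × 2 leading block, and uses exact rational arithmetic; the names `det`, `inverse_2x2`, `M_n`, `v`, `d`, `schur` and `M_next` are illustrative choices for this sketch, with `schur` standing for the scalar $d-{\vec {v}}^{\dagger }M_{n}^{-1}{\vec {v}}$.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion (exact, small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def inverse_2x2(m):
    """Inverse of a 2x2 matrix via the adjugate, as exact fractions."""
    (a, b), (c, e) = m
    delta = Fraction(a * e - b * c)
    return [[e / delta, -b / delta], [-c / delta, a / delta]]


M_n = [[2, -1], [-1, 2]]  # positive definite leading block
v = [0, -1]               # last column without its bottom entry
d = 2                     # bottom right entry

M_inv = inverse_2x2(M_n)
w = [sum(M_inv[i][j] * v[j] for j in range(2)) for i in range(2)]  # M_n^{-1} v
schur = d - sum(v[i] * w[i] for i in range(2))                     # d - v^T M_n^{-1} v

# Assemble the bordered matrix M_{n+1} from the blocks M_n, v and d.
M_next = [M_n[0] + [v[0]], M_n[1] + [v[1]], v + [d]]

print(det(M_n), schur, det(M_next))
assert det(M_next) == det(M_n) * schur
```

Here both determinants are positive, so the scalar `schur` is positive as well, which is exactly the sign information the proof extracts from the formula.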

Proof

The following proof applies only to nonsingular Hermitian matrices with entries in $\mathbb {R}$, that is, to nonsingular real-symmetric matrices.

Positive definite or semidefinite matrix: A symmetric matrix A whose eigenvalues are positive (λ > 0) is called positive definite, and when the eigenvalues are just nonnegative (λ ≥ 0), A is said to be positive semidefinite.

Theorem I: A real-symmetric matrix A has nonnegative eigenvalues if and only if A can be factored as A = B^TB, and all eigenvalues are positive if and only if B is nonsingular.^[4]

Proof:

Forward implication: If $A\in \mathbb {R} ^{n\times n}$ is symmetric, then, by the spectral theorem, there is an orthogonal matrix P such that A = PDP^T, where D = diag(λ₁, λ₂, . . . , λ_n) is a real diagonal matrix whose entries are the eigenvalues of A, and the columns of P are the corresponding eigenvectors of A. If λ_i ≥ 0 for each i, then $D^{1/2}$ exists, so $A=PDP^{T}=PD^{1/2}D^{1/2}P^{T}=B^{T}B$ for $B=D^{1/2}P^{T}$, and λ_i > 0 for each i if and only if B is nonsingular.

Reverse implication: Conversely, if A can be factored as A = B^TB, then all eigenvalues of A are nonnegative because for any eigenpair (λ, x):

\lambda ={\frac {x^{T}Ax}{x^{T}x}}={\frac {x^{T}B^{T}Bx}{x^{T}x}}={\frac {\|Bx\|^{2}}{\|x\|^{2}}}\geq 0.
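
The identity $x^{T}Ax=\|Bx\|^{2}$ behind the reverse implication can be illustrated numerically. The sketch below uses small integer matrices chosen for this example; `gram`, `quadratic_form` and `norm_squared_of_product` are hypothetical helper names.

```python
def gram(b):
    """Return B^T B for a square matrix B given as a list of rows."""
    n = len(b)
    return [[sum(b[k][i] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def quadratic_form(a, x):
    """Return x^T A x."""
    n = len(a)
    return sum(x[i] * a[i][j] * x[j] for i in range(n) for j in range(n))


def norm_squared_of_product(b, x):
    """Return ||Bx||^2."""
    return sum(sum(row[j] * x[j] for j in range(len(x))) ** 2 for row in b)


B = [[1, 2], [0, 3]]  # nonsingular, so A = B^T B is positive definite
A = gram(B)
x = [3, -1]
print(A, quadratic_form(A, x), norm_squared_of_product(B, x))

B_sing = [[1, 1], [1, 1]]  # singular, so A_sing is only positive semidefinite
A_sing = gram(B_sing)
y = [1, -1]                # B_sing maps y to zero
print(A_sing, quadratic_form(A_sing, y))
```

In the singular case the quadratic form vanishes on the nonzero vector `y`, which is why nonsingularity of B is needed for strict positivity.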

Theorem II (The Cholesky decomposition): The symmetric matrix A possesses positive pivots if and only if A can be uniquely factored as A = R^TR, where R is an upper-triangular matrix with positive diagonal entries. This is known as the Cholesky decomposition of A, and R is called the Cholesky factor of A.^[5]

Proof:

Forward implication: If A possesses positive pivots (therefore A possesses an LU factorization: A = L·U'), then, it has an LDU factorization A = LDU = LDL^T in which D = diag(u₁₁, u₂₂, . . . , u_nn) is the diagonal matrix containing the pivots u_ii > 0.

{\begin{aligned}\mathbf {A} &=LU'={\begin{bmatrix}1&0&\cdots &0\\\ell _{12}&1&\cdots &0\\\vdots &\vdots &&\vdots \\\ell _{1n}&\ell _{2n}&\cdots &1\end{bmatrix}}{\begin{bmatrix}u_{11}&u_{12}&\cdots &u_{1n}\\0&u_{22}&\cdots &u_{2n}\\\vdots &\vdots &&\vdots \\0&0&\cdots &u_{nn}\end{bmatrix}}\\[8pt]&=LDU={\begin{bmatrix}1&0&\cdots &0\\\ell _{12}&1&\cdots &0\\\vdots &\vdots &&\vdots \\\ell _{1n}&\ell _{2n}&\cdots &1\end{bmatrix}}{\begin{bmatrix}u_{11}&0&\cdots &0\\0&u_{22}&\cdots &0\\\vdots &\vdots &&\vdots \\0&0&\cdots &u_{nn}\end{bmatrix}}{\begin{bmatrix}1&u_{12}/u_{11}&\cdots &u_{1n}/u_{11}\\0&1&\cdots &u_{2n}/u_{22}\\\vdots &\vdots &&\vdots \\0&0&\cdots &1\end{bmatrix}}\end{aligned}}

By the uniqueness of the LDU decomposition, the symmetry of A yields U = L^T, and consequently A = LDU = LDL^T. Setting $R=D^{1/2}L^{T}$, where $D^{1/2}=\operatorname {diag} ({\sqrt {u_{11}}},{\sqrt {u_{22}}},\ldots ,{\sqrt {u_{nn}}})$, yields the desired factorization, because $A=LD^{1/2}D^{1/2}L^{T}=R^{T}R$, and R is upper triangular with positive diagonal entries.

Reverse implication: Conversely, suppose A = RR^T, where R is lower triangular with a positive diagonal (that is, R here denotes the transpose of the upper-triangular factor in the statement of the theorem). Factoring the diagonal entries out of R gives:

\mathbf {R} =LD={\begin{bmatrix}1&0&\cdots &0\\r_{12}/r_{11}&1&\cdots &0\\\vdots &\vdots &&\vdots \\r_{1n}/r_{11}&r_{2n}/r_{22}&\cdots &1\end{bmatrix}}{\begin{bmatrix}r_{11}&0&\cdots &0\\0&r_{22}&\cdots &0\\\vdots &\vdots &&\vdots \\0&0&\cdots &r_{nn}\end{bmatrix}}.

R = LD, where L is a lower triangular matrix with a unit diagonal and D is the diagonal matrix whose diagonal entries are the r_ii ’s. Hence D has a positive diagonal and hence D is non-singular. Hence D² is a non-singular diagonal matrix. Also, L^T is an upper triangular matrix with a unit diagonal. Consequently, A = LD²L^T is the LDU factorization for A, and thus the pivots must be positive because they are the diagonal entries in D².

Uniqueness of the Cholesky decomposition: If we have another Cholesky decomposition A = R₁R₁^T of A, where R₁ is lower triangular with a positive diagonal, then similar to the above we may write R₁ = L₁D₁, where L₁ is a lower triangular matrix with a unit diagonal and D₁ is a diagonal matrix whose diagonal entries are the same as the corresponding diagonal entries of R₁. Consequently, A = L₁D₁²L₁^T is an LDU factorization for A. By the uniqueness of the LDU factorization of A, we have L₁ = L and D₁² = D². As both D₁ and D are diagonal matrices with positive diagonal entries, we have D₁ = D. Hence R₁ = L₁D₁ = LD = R. Hence A has a unique Cholesky decomposition.
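
As a rough illustration of Theorem II, the factor R can be computed row by row directly from the equation A = R^TR. This is a textbook-style sketch in floating point, assuming a real symmetric input given as a list of rows; `cholesky_upper` is a hypothetical name, and the function reports failure when a pivot is not positive, which by the theorem means that no such factorization exists.

```python
import math


def cholesky_upper(a):
    """Return upper-triangular R with positive diagonal such that A = R^T R.

    Raises ValueError if a nonpositive pivot appears, in which case the
    symmetric matrix A is not positive definite.
    """
    n = len(a)
    r = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Diagonal entry: what remains of a[i][i] after removing earlier rows.
        s = a[i][i] - sum(r[k][i] ** 2 for k in range(i))
        if s <= 0:
            raise ValueError("nonpositive pivot: matrix is not positive definite")
        r[i][i] = math.sqrt(s)
        # Remaining entries of row i follow from A[i][j] = sum_k R[k][i] R[k][j].
        for j in range(i + 1, n):
            r[i][j] = (a[i][j] - sum(r[k][i] * r[k][j] for k in range(i))) / r[i][i]
    return r


R = cholesky_upper([[4, 2], [2, 5]])
print(R)

try:
    cholesky_upper([[1, 2], [2, 1]])
    failed = False
except ValueError:
    failed = True
print(failed)
```

The squares of the diagonal entries of R are the pivots of A, in line with the factorization A = LD²L^T used in the proof.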

Theorem III: Let A_k be the k × k leading principal submatrix of the n × n matrix A. If A has an LU factorization A = LU, where L is a lower triangular matrix with a unit diagonal, then det(A_k) = u₁₁u₂₂ · · · u_kk, and the k-th pivot is u₁₁ = det(A₁) = a₁₁ for k = 1 and u_kk = det(A_k)/det(A_{k−1}) for k = 2, 3, . . . , n, where u_kk is the (k, k)-th entry of U for all k = 1, 2, . . . , n.^[6]
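
The relation between pivots and leading principal minors in Theorem III can be checked with exact arithmetic. The sketch below assumes a small matrix that admits elimination without row exchanges; `det`, `pivots` and `minor_ratios` are hypothetical helper names.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion (exact, small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def pivots(m):
    """Diagonal of U from Gaussian elimination without row exchanges."""
    a = [[Fraction(x) for x in row] for row in m]
    n = len(a)
    for col in range(n):
        if a[col][col] == 0:
            raise ValueError("zero pivot encountered; not handled by this sketch")
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return [a[k][k] for k in range(n)]


def minor_ratios(m):
    """det(A_1), then det(A_k) / det(A_{k-1}) for k = 2, ..., n."""
    minors = [det([row[:k] for row in m[:k]]) for k in range(1, len(m) + 1)]
    return [Fraction(minors[0])] + [
        Fraction(minors[k], minors[k - 1]) for k in range(1, len(minors))
    ]


A = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
print(pivots(A))
print(minor_ratios(A))
assert pivots(A) == minor_ratios(A)
```

For this matrix the pivots are 2, 3/2 and 4/3, whose running products 2, 3 and 4 are the leading principal minors.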

Combining Theorem II with Theorem III yields:

Statement I: If the symmetric matrix A can be factored as A=R^TR where R is an upper-triangular matrix with positive diagonal entries, then all the pivots of A are positive (by Theorem II), therefore all the leading principal minors of A are positive (by Theorem III).

Statement II: If the nonsingular n × n symmetric matrix A can be factored as $A=B^{T}B$, then the QR decomposition (closely related to the Gram–Schmidt process) of B, B = QR, yields $A=B^{T}B=R^{T}Q^{T}QR=R^{T}R$, where Q is an orthogonal matrix and R is an upper-triangular matrix.

As A is non-singular and $A=R^{T}R$ , it follows that all the diagonal entries of R are non-zero. Let r_jj be the (j, j)-th entry of R for all j = 1, 2, . . . , n. Then r_jj ≠ 0 for all j = 1, 2, . . . , n.

Let F be a diagonal matrix, and let f_jj be the (j, j)-th entry of F for all j = 1, 2, . . . , n. For all j = 1, 2, . . . , n, we set f_jj = 1 if r_jj > 0, and we set f_jj = -1 if r_jj < 0. Then $F^{T}F=I_{n}$ , the n × n identity matrix.

Let S=FR. Then S is an upper-triangular matrix with all diagonal entries being positive. Hence we have $A=R^{T}R=R^{T}F^{T}FR=S^{T}S$ , for some upper-triangular matrix S with all diagonal entries being positive.

Note that Statement II requires the symmetric matrix A to be nonsingular.

Combining Theorem I with Statement I and Statement II yields:

Statement III: If the real-symmetric matrix A is positive definite, then A possesses a factorization of the form A = B^TB, where B is nonsingular (Theorem I). The expression A = B^TB implies that A possesses a factorization of the form A = R^TR, where R is an upper-triangular matrix with positive diagonal entries (Statement II). Therefore all the leading principal minors of A are positive (Statement I).

In other words, Statement III proves the "only if" part of Sylvester's Criterion for non-singular real-symmetric matrices.

Sylvester's Criterion: The real-symmetric matrix A is positive definite if and only if all the leading principal minors of A are positive.

Notes

^ Horn, Roger A.; Johnson, Charles R. (1985), Matrix Analysis, Cambridge University Press, ISBN 978-0-521-38632-6. See Theorem 7.2.5.
^ Carl D. Meyer, Matrix Analysis and Applied Linear Algebra. See section 7.6 Positive Definite Matrices, page 566
^ Prussing, John E. (1986), "The Principal Minor Test for Semidefinite Matrices" (PDF), Journal of Guidance, Control, and Dynamics, 9 (1): 121–122, Bibcode:1986JGCD....9..121P, doi:10.2514/3.20077, archived from the original (PDF) on 2017-01-07, retrieved 2017-09-28
^ Carl D. Meyer, Matrix Analysis and Applied Linear Algebra. See section 7.6 Positive Definite Matrices, page 558
^ Carl D. Meyer, Matrix Analysis and Applied Linear Algebra. See section 3.10 The LU Factorization, Example 3.10.7, page 154
^ Carl D. Meyer, Matrix Analysis and Applied Linear Algebra. See section 6.1 Determinants, Exercise 6.1.16, page 474

References

Gilbert, George T. (1991), "Positive definite matrices and Sylvester's criterion", The American Mathematical Monthly, Mathematical Association of America, 98 (1): 44–46, doi:10.2307/2324036, ISSN 0002-9890, JSTOR 2324036.
Horn, Roger A.; Johnson, Charles R. (1985), Matrix Analysis, Cambridge University Press, ISBN 978-0-521-38632-6. Theorem 7.2.5.
Carl D. Meyer (June 2000), Matrix Analysis and Applied Linear Algebra, SIAM, ISBN 0-89871-454-0.
