Min-max theorem

In linear algebra and functional analysis, the min-max theorem, or variational theorem, or Courant–Fischer–Weyl min-max principle, is a result that gives a variational characterization of eigenvalues of compact Hermitian operators on Hilbert spaces. It can be viewed as the starting point of many results of similar nature.

This article first discusses the finite-dimensional case and its applications before considering compact operators on infinite-dimensional Hilbert spaces. We will see that for compact operators, the proof of the main theorem uses essentially the same idea from the finite-dimensional argument.

In the case that the operator is non-Hermitian, the theorem provides an equivalent characterization of the associated singular values. The min-max theorem can be extended to self-adjoint operators that are bounded below.

Matrices

Let $A$ be an $n \times n$ Hermitian matrix. As with many other variational results on eigenvalues, one considers the Rayleigh–Ritz quotient $R_A : \mathbf{C}^n \setminus \{0\} \to \mathbf{R}$ defined by

R_{A}(x)={\frac {(Ax,x)}{(x,x)}}

where $(\cdot, \cdot)$ denotes the Euclidean inner product on $\mathbf{C}^n$. Clearly, the Rayleigh quotient of an eigenvector is its associated eigenvalue. Equivalently, the Rayleigh–Ritz quotient can be replaced by

f(x)=(Ax,x),\;\|x\|=1.

For Hermitian matrices $A$, the range of the continuous function $R_A(x)$, or $f(x)$, is a compact interval $[a, b]$ of the real line. The maximum $b$ and the minimum $a$ are the largest and smallest eigenvalues of $A$, respectively. The min-max theorem is a refinement of this fact.
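The bounds on the Rayleigh quotient can be checked numerically. The following is a minimal sketch, not a reference implementation: the matrix `A` below is a hypothetical 2 × 2 Hermitian example chosen so that its eigenvalues are 1 and 3, and the helper name `rayleigh_quotient` is introduced here only for illustration.

```python
import random

def rayleigh_quotient(A, x):
    """Return (Ax, x) / (x, x) for a Hermitian matrix A (list of rows) and x != 0."""
    n = len(x)
    Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    # Inner product linear in the first argument: (u, v) = sum_i u_i * conj(v_i).
    numerator = sum(Ax[i] * complex(x[i]).conjugate() for i in range(n))
    denominator = sum(abs(x[i]) ** 2 for i in range(n))
    # For Hermitian A the numerator is real up to rounding, so keep the real part.
    return (numerator / denominator).real

# Hypothetical example matrix (an assumption of this sketch); its eigenvalues are 1 and 3.
A = [[2, 1j], [-1j, 2]]

rng = random.Random(0)
samples = [
    rayleigh_quotient(A, [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)])
    for _ in range(1000)
]

print(min(samples), max(samples))      # every sample lies in [1, 3] up to rounding
print(rayleigh_quotient(A, [-1j, 1]))  # eigenvector for the eigenvalue 1
print(rayleigh_quotient(A, [1j, 1]))   # eigenvector for the eigenvalue 3
```

Random sampling only illustrates the inequality; the extreme values themselves are attained at the eigenvectors, which is why the last two calls are evaluated separately.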

Min-max theorem

Let $A$ be an $n \times n$ Hermitian matrix with eigenvalues $\lambda_1 \leq \cdots \leq \lambda_k \leq \cdots \leq \lambda_n$. Then

\lambda _{k}=\min _{U}\{\max _{x}\{R_{A}(x)\mid x\in U{\text{ and }}x\neq 0\}\mid \dim(U)=k\}

and

\lambda _{k}=\max _{U}\{\min _{x}\{R_{A}(x)\mid x\in U{\text{ and }}x\neq 0\}\mid \dim(U)=n-k+1\}

in particular,

\lambda _{1}\leq R_{A}(x)\leq \lambda _{n}\quad \forall x\in \mathbf {C} ^{n}\backslash \{0\}

and these bounds are attained when $x$ is an eigenvector for the corresponding eigenvalue.

Also the simpler formulation for the maximal eigenvalue λ_n is given by:

\lambda _{n}=\max\{R_{A}(x):x\neq 0\}.

Similarly, the minimal eigenvalue λ₁ is given by:

\lambda _{1}=\min\{R_{A}(x):x\neq 0\}.

Proof —

Since the matrix $A$ is Hermitian, it is diagonalizable and we can choose an orthonormal basis of eigenvectors $\{u_1, \ldots, u_n\}$; that is, $u_i$ is an eigenvector for the eigenvalue $\lambda_i$, with $(u_i, u_i) = 1$ and $(u_i, u_j) = 0$ for all $i \neq j$.

If $U$ is a subspace of dimension $k$, then its intersection with the subspace $\operatorname{span}\{u_k, \ldots, u_n\}$ is nonzero, because the dimensions $k$ and $n - k + 1$ sum to more than $n$. Hence there exists a vector $v \neq 0$ in this intersection that we can write as

v=\sum _{i=k}^{n}\alpha _{i}u_{i}

and whose Rayleigh quotient is

R_{A}(v)={\frac {\sum _{i=k}^{n}\lambda _{i}|\alpha _{i}|^{2}}{\sum _{i=k}^{n}|\alpha _{i}|^{2}}}\geq \lambda _{k}

(as all $\lambda_i \geq \lambda_k$ for $i = k, \ldots, n$) and hence

\max\{R_{A}(x)\mid x\in U{\text{ and }}x\neq 0\}\geq \lambda _{k}

Since this is true for all U, we can conclude that

\min\{\max\{R_{A}(x)\mid x\in U{\text{ and }}x\neq 0\}\mid \dim(U)=k\}\geq \lambda _{k}

This is one inequality. To establish the other inequality, choose the specific $k$-dimensional space $V = \operatorname{span}\{u_1, \ldots, u_k\}$, for which

\max\{R_{A}(x)\mid x\in V{\text{ and }}x\neq 0\}\leq \lambda _{k}

because $\lambda _{k}$ is the largest eigenvalue in V. Therefore, also

\min\{\max\{R_{A}(x)\mid x\in U{\text{ and }}x\neq 0\}\mid \dim(U)=k\}\leq \lambda _{k}

In the case where $U$ is a subspace of dimension $n-k+1$, we proceed in a similar fashion: consider the $k$-dimensional subspace $\operatorname{span}\{u_1, \ldots, u_k\}$. Its intersection with the subspace $U$ is nonzero (again by comparing dimensions), and hence there exists a vector $v \neq 0$ in this intersection that we can write as

v=\sum _{i=1}^{k}\alpha _{i}u_{i}

and whose Rayleigh quotient is

R_{A}(v)={\frac {\sum _{i=1}^{k}\lambda _{i}|\alpha _{i}|^{2}}{\sum _{i=1}^{k}|\alpha _{i}|^{2}}}\leq \lambda _{k}

and hence

\min\{R_{A}(x)\mid x\in U{\text{ and }}x\neq 0\}\leq \lambda _{k}

Since this is true for all U, we can conclude that

\max\{\min\{R_{A}(x)\mid x\in U{\text{ and }}x\neq 0\}\mid \dim(U)=n-k+1\}\leq \lambda _{k}

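Again, this is only one inequality. For the other, take the $(n-k+1)$-dimensional subspace $U = \operatorname{span}\{u_k, \ldots, u_n\}$: every nonzero vector in it has Rayleigh quotient at least $\lambda_k$, and the eigenvector $u_k$ attains this value, so the minimum over this $U$ equals $\lambda_k$ and equality follows.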
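Both halves of the theorem can be observed numerically in the smallest interesting case, n = 3 and k = 2, where both formulas range over 2-dimensional subspaces. The sketch below assumes the hypothetical matrix A = diag(1, 2, 4), so that λ₂ = 2 and the eigenvectors are the standard basis vectors. It relies on the fact that the extreme values of the Rayleigh quotient on span{q₁, q₂}, for orthonormal q₁ and q₂, are the eigenvalues of the 2 × 2 matrix with entries (A q_i, q_j). All function names are illustrative.

```python
import math
import random

DIAG = (1.0, 2.0, 4.0)  # hypothetical A = diag(1, 2, 4), so lambda_2 = 2

def random_orthonormal_pair(rng):
    """Orthonormal basis (q1, q2) of a random 2-dimensional subspace of R^3."""
    while True:
        u = [rng.gauss(0, 1) for _ in range(3)]
        v = [rng.gauss(0, 1) for _ in range(3)]
        norm_u = math.sqrt(sum(c * c for c in u))
        if norm_u < 1e-6:
            continue
        q1 = [c / norm_u for c in u]
        # Gram-Schmidt step: remove the q1 component from v.
        dot = sum(a * b for a, b in zip(v, q1))
        w = [a - dot * b for a, b in zip(v, q1)]
        norm_w = math.sqrt(sum(c * c for c in w))
        if norm_w < 1e-6:
            continue
        return q1, [c / norm_w for c in w]

def rayleigh_range_on_subspace(q1, q2):
    """(min, max) of R_A over span{q1, q2}, via the 2x2 matrix [(A q_i, q_j)]."""
    a = sum(d * x * x for d, x in zip(DIAG, q1))
    b = sum(d * x * y for d, x, y in zip(DIAG, q1, q2))
    c = sum(d * y * y for d, y in zip(DIAG, q2))
    mean, radius = (a + c) / 2, math.hypot((a - c) / 2, b)
    return mean - radius, mean + radius

rng = random.Random(1)
ranges = [rayleigh_range_on_subspace(*random_orthonormal_pair(rng)) for _ in range(2000)]

smallest_max = min(hi for _, hi in ranges)  # never drops below lambda_2 = 2
largest_min = max(lo for lo, _ in ranges)   # never rises above lambda_2 = 2
print(smallest_max, largest_min)

# The subspaces used in the proof attain lambda_2 exactly.
print(rayleigh_range_on_subspace([1, 0, 0], [0, 1, 0]))  # max over span{u_1, u_2} is 2
print(rayleigh_range_on_subspace([0, 1, 0], [0, 0, 1]))  # min over span{u_2, u_3} is 2
```

A diagonal matrix is used only to keep the sketch free of an eigenvalue solver; the theorem itself makes no such assumption.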

Counterexample in the non-Hermitian case

Let N be the nilpotent matrix

{\begin{bmatrix}0&1\\0&0\end{bmatrix}}.

Define the Rayleigh quotient $R_{N}(x)$ exactly as above in the Hermitian case. Then it is easy to see that the only eigenvalue of N is zero, while the maximum value of the Rayleigh quotient is $\tfrac{1}{2}$. That is, the maximum value of the Rayleigh quotient is larger than the maximum eigenvalue.
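The value 1/2 can be verified by a short computation. Writing $x = (a, b)$ (notation introduced here only for this check) and using the inner product that is linear in its first argument:

```latex
\begin{aligned}
Nx &= \begin{bmatrix} b \\ 0 \end{bmatrix},
\qquad
R_{N}(x) = \frac{(Nx, x)}{(x, x)} = \frac{b\,\overline{a}}{|a|^{2} + |b|^{2}}, \\
|b\,\overline{a}| &\leq \tfrac{1}{2}\left(|a|^{2} + |b|^{2}\right),
\qquad
R_{N}\big((1, 1)\big) = \tfrac{1}{2}.
\end{aligned}
```

The inequality is the arithmetic-geometric mean inequality, with equality when $|a| = |b|$, so the bound 1/2 is attained at $x = (1, 1)$.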

Applications

Min-max principle for singular values

The singular values $\{\sigma_k\}$ of a square matrix $M$ are the square roots of the eigenvalues of $M^{*}M$ (equivalently $MM^{*}$). An immediate consequence of the first equality in the min-max theorem is:

\sigma _{k}^{\uparrow }=\min _{S:\dim(S)=k}\max _{x\in S,\|x\|=1}(M^{*}Mx,x)^{\frac {1}{2}}=\min _{S:\dim(S)=k}\max _{x\in S,\|x\|=1}\|Mx\|.

Similarly,

\sigma _{k}^{\uparrow }=\max _{S:\dim(S)=n-k+1}\min _{x\in S,\|x\|=1}\|Mx\|.

Here $\sigma _{k}=\sigma _{k}^{\uparrow }$ denotes the $k$-th entry in the increasing sequence of σ's, so that $\sigma _{1}\leq \sigma _{2}\leq \cdots$.
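For a 2 × 2 real matrix the two extreme cases of these formulas reduce to the minimum and maximum of ‖Mx‖ over the unit circle, which can be sampled directly. The sketch below assumes a hypothetical matrix `M` chosen so that MᵀM = [[25, 20], [20, 25]] has eigenvalues 5 and 45; the function name `norm_Mx` is illustrative.

```python
import math

# Hypothetical example matrix (an assumption of this sketch).
# M^T M = [[25, 20], [20, 25]] has eigenvalues 5 and 45.
M = [[3.0, 0.0], [4.0, 5.0]]

def norm_Mx(theta):
    """Return ||M x|| for the unit vector x = (cos(theta), sin(theta))."""
    x0, x1 = math.cos(theta), math.sin(theta)
    y0 = M[0][0] * x0 + M[0][1] * x1
    y1 = M[1][0] * x0 + M[1][1] * x1
    return math.hypot(y0, y1)

# Sample the unit circle; a half turn suffices because ||M(-x)|| = ||Mx||.
values = [norm_Mx(math.pi * i / 10000) for i in range(10000)]

# Singular values in increasing order, from the eigenvalues of M^T M.
sigma_1, sigma_2 = math.sqrt(5), math.sqrt(45)

print(min(values), sigma_1)  # k = 1: minimum of ||Mx|| over unit vectors
print(max(values), sigma_2)  # k = 2: maximum of ||Mx|| over unit vectors
```

With k = 1 the outer minimum runs over lines through the origin, and with k = n = 2 the only admissible subspace is the whole plane, which is why plain sampling of unit vectors covers both cases.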

Cauchy interlacing theorem

Let $A$ be a symmetric n × n matrix. The m × m matrix B, where m ≤ n, is called a compression of $A$ if there exists an orthogonal projection P onto a subspace of dimension m such that PAP* = B. The Cauchy interlacing theorem states:

Theorem. If the eigenvalues of $A$ are $\alpha_1 \leq \cdots \leq \alpha_n$, and those of $B$ are $\beta_1 \leq \cdots \leq \beta_j \leq \cdots \leq \beta_m$, then for all $j \leq m$,

\alpha _{j}\leq \beta _{j}\leq \alpha _{n-m+j}.

This can be proven using the min-max principle. Let $\beta_i$ have corresponding eigenvector $b_i$ and let $S_j$ be the $j$-dimensional subspace $S_j = \operatorname{span}\{b_1, \ldots, b_j\}$; then

\beta _{j}=\max _{x\in S_{j},\|x\|=1}(Bx,x)=\max _{x\in S_{j},\|x\|=1}(PAP^{*}x,x)\geq \min _{S_{j}}\max _{x\in S_{j},\|x\|=1}(A(P^{*}x),P^{*}x)=\alpha _{j}.

According to the first part of min-max, $\alpha_j \leq \beta_j$. On the other hand, if we define $S_{m-j+1} = \operatorname{span}\{b_j, \ldots, b_m\}$, then

\beta _{j}=\min _{x\in S_{m-j+1},\|x\|=1}(Bx,x)=\min _{x\in S_{m-j+1},\|x\|=1}(PAP^{*}x,x)=\min _{x\in S_{m-j+1},\|x\|=1}(A(P^{*}x),P^{*}x)\leq \alpha _{n-m+j},

where the last inequality is given by the second part of min-max.

When $n - m = 1$, we have $\alpha_j \leq \beta_j \leq \alpha_{j+1}$, hence the name interlacing theorem.
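A small worked instance with $n = 3$ and $m = 2$ may help. The matrices below are an illustrative choice, not taken from the text: $B$ is the leading principal submatrix of $A$, regarded as its compression onto the span of the first two standard basis vectors.

```latex
\begin{aligned}
A &= \begin{bmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{bmatrix},
\qquad
B = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}, \\
\alpha_{1} &= 2 - \sqrt{2},\quad \alpha_{2} = 2,\quad \alpha_{3} = 2 + \sqrt{2},
\qquad
\beta_{1} = 1,\quad \beta_{2} = 3, \\
2 - \sqrt{2} &\leq 1 \leq 2 \leq 3 \leq 2 + \sqrt{2}.
\end{aligned}
```

The eigenvalues of $A$ follow from $\det(A - \lambda I) = t(t^{2} - 2)$ with $t = 2 - \lambda$, and the last line is exactly $\alpha_1 \leq \beta_1 \leq \alpha_2 \leq \beta_2 \leq \alpha_3$.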

Compact operators

Let $A$ be a compact, Hermitian operator on a Hilbert space H. Recall that the spectrum of such an operator (the set of eigenvalues) is a set of real numbers whose only possible cluster point is zero. It is thus convenient to list the positive eigenvalues of $A$ as

\cdots \leq \lambda _{k}\leq \cdots \leq \lambda _{1},

where entries are repeated with multiplicity, as in the matrix case. (To emphasize that the sequence is decreasing, we may write $\lambda _{k}=\lambda _{k}^{\downarrow }$ .) When H is infinite-dimensional, the above sequence of eigenvalues is necessarily infinite. We now apply the same reasoning as in the matrix case. Letting S_k ⊂ H be a k dimensional subspace, we can obtain the following theorem.

Theorem (Min-Max). Let $A$ be a compact, self-adjoint operator on a Hilbert space $H$, whose positive eigenvalues are listed in decreasing order $\cdots \leq \lambda_k \leq \cdots \leq \lambda_1$. Then:

{\begin{aligned}\max _{S_{k}}\min _{x\in S_{k},\|x\|=1}(Ax,x)&=\lambda _{k}^{\downarrow },\\\min _{S_{k-1}}\max _{x\in S_{k-1}^{\perp },\|x\|=1}(Ax,x)&=\lambda _{k}^{\downarrow }.\end{aligned}}

A similar pair of equalities hold for negative eigenvalues.
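As a simple illustration, consider a diagonal operator on $\ell^{2}$. This example is an assumption added here for concreteness: $e_1, e_2, \ldots$ denotes the standard orthonormal basis and $A e_j = \tfrac{1}{j} e_j$, which is compact and self-adjoint with $\lambda_k^{\downarrow} = \tfrac{1}{k}$.

```latex
\begin{aligned}
S_{k} = \operatorname{span}\{e_{1}, \ldots, e_{k}\}:
\quad
\min_{x \in S_{k},\, \|x\| = 1} (Ax, x)
&= \min_{\sum_{j \leq k} |c_{j}|^{2} = 1} \sum_{j=1}^{k} \frac{|c_{j}|^{2}}{j}
= \frac{1}{k}, \\
S_{k-1} = \operatorname{span}\{e_{1}, \ldots, e_{k-1}\}:
\quad
\max_{x \in S_{k-1}^{\perp},\, \|x\| = 1} (Ax, x)
&= \max_{\sum_{j \geq k} |c_{j}|^{2} = 1} \sum_{j=k}^{\infty} \frac{|c_{j}|^{2}}{j}
= \frac{1}{k}.
\end{aligned}
```

Both extrema are attained at $x = e_k$, and these are the subspaces at which the outer maximum and minimum in the theorem are reached.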

Proof —

Let S' be the closure of the linear span $S'=\operatorname {span} \{u_{k},u_{k+1},\ldots \}$. The subspace S' has codimension k − 1. By the same dimension count argument as in the matrix case, S' ∩ S_k contains a nonzero vector. So there exists x ∈ S' ∩ S_k with $\|x\|=1$. Since it is an element of S', such an x necessarily satisfies

(Ax,x)\leq \lambda _{k}.

Therefore, for all S_k

\inf _{x\in S_{k},\|x\|=1}(Ax,x)\leq \lambda _{k}

But $A$ is compact, therefore the function f(x) = (Ax, x) is weakly continuous. Furthermore, any bounded set in H is weakly compact. This lets us replace the infimum by minimum:

\min _{x\in S_{k},\|x\|=1}(Ax,x)\leq \lambda _{k}.

So

\sup _{S_{k}}\min _{x\in S_{k},\|x\|=1}(Ax,x)\leq \lambda _{k}.

Because equality is achieved when $S_{k}=\operatorname {span} \{u_{1},\ldots ,u_{k}\}$ ,

\max _{S_{k}}\min _{x\in S_{k},\|x\|=1}(Ax,x)=\lambda _{k}.

This is the first part of min-max theorem for compact self-adjoint operators.

Analogously, consider now a $(k-1)$-dimensional subspace $S_{k-1}$, whose orthogonal complement is denoted by $S_{k-1}^{\perp}$. If $S' = \operatorname{span}\{u_1, \ldots, u_k\}$, then

S'\cap S_{k-1}^{\perp }\neq \{0\}.

So

\exists x\in S_{k-1}^{\perp }\,\|x\|=1,(Ax,x)\geq \lambda _{k}.

This implies

\max _{x\in S_{k-1}^{\perp },\|x\|=1}(Ax,x)\geq \lambda _{k}

where the compactness of $A$ was applied. Taking the infimum over all $(k-1)$-dimensional subspaces $S_{k-1}$ gives

\inf _{S_{k-1}}\max _{x\in S_{k-1}^{\perp },\|x\|=1}(Ax,x)\geq \lambda _{k}.

Pick $S_{k-1} = \operatorname{span}\{u_1, \ldots, u_{k-1}\}$ and we deduce

\min _{S_{k-1}}\max _{x\in S_{k-1}^{\perp },\|x\|=1}(Ax,x)=\lambda _{k}.

Self-adjoint operators

The min-max theorem also applies to (possibly unbounded) self-adjoint operators.^[1]^[2] Recall the essential spectrum is the spectrum without isolated eigenvalues of finite multiplicity. Sometimes we have some eigenvalues below the essential spectrum, and we would like to approximate the eigenvalues and eigenfunctions.

Theorem (Min-Max). Let A be self-adjoint, and let

E_{1}\leq E_{2}\leq E_{3}\leq \cdots

be the eigenvalues of A below the essential spectrum. Then

$E_{n}=\min _{\psi _{1},\ldots ,\psi _{n}}\max\{\langle \psi ,A\psi \rangle :\psi \in \operatorname {span} (\psi _{1},\ldots ,\psi _{n}),\,\|\psi \|=1\}$ .

If we only have N eigenvalues and hence run out of eigenvalues, then we let $E_{n}:=\inf \sigma _{ess}(A)$ (the bottom of the essential spectrum) for n>N, and the above statement holds after replacing min-max with inf-sup.
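One direct consequence is an upper bound from any set of trial vectors. The following is a sketch of that step, assuming $\psi_1, \ldots, \psi_n$ are orthonormal vectors in the domain of $A$; the matrix $M$ and the symbol $\mu_{\max}$ are notation introduced here only for illustration.

```latex
\begin{aligned}
E_{n}
&\leq \max\{\langle \psi, A\psi \rangle : \psi \in \operatorname{span}(\psi_{1}, \ldots, \psi_{n}),\ \|\psi\| = 1\} \\
&= \mu_{\max}(M),
\qquad
M_{ij} = \langle \psi_{i}, A\psi_{j} \rangle, \quad 1 \leq i, j \leq n,
\end{aligned}
```

where $\mu_{\max}(M)$ denotes the largest eigenvalue of the $n \times n$ Hermitian matrix $M$. The first line holds because $E_n$ is the minimum (or infimum) over all such choices, and the second is the matrix case applied to $M$.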

Theorem (Max-Min). Let A be self-adjoint, and let

E_{1}\leq E_{2}\leq E_{3}\leq \cdots

be the eigenvalues of A below the essential spectrum. Then

$E_{n}=\max _{\psi _{1},\ldots ,\psi _{n-1}}\min\{\langle \psi ,A\psi \rangle :\psi \perp \psi _{1},\ldots ,\psi _{n-1},\,\|\psi \|=1\}$ .

If we only have N eigenvalues and hence run out of eigenvalues, then we let $E_{n}:=\inf \sigma _{ess}(A)$ (the bottom of the essential spectrum) for n > N, and the above statement holds after replacing max-min with sup-inf.

The proofs^[1]^[2] use the following results about self-adjoint operators:

Theorem. Let A be self-adjoint. Then $(A-E)\geq 0$ for $E\in \mathbb{R}$ if and only if $\sigma (A)\subseteq [E,\infty )$.^[1]^: 77

Theorem. If A is self-adjoint, then

$\inf \sigma (A)=\inf _{\psi \in {\mathfrak {D}}(A),\|\psi \|=1}\langle \psi ,A\psi \rangle$

and

$\sup \sigma (A)=\sup _{\psi \in {\mathfrak {D}}(A),\|\psi \|=1}\langle \psi ,A\psi \rangle$ .^[1]^: 77

References

[1] G. Teschl, Mathematical Methods in Quantum Mechanics (GSM 99) https://www.mat.univie.ac.at/~gerald/ftp/book-schroe/schroe.pdf
[2] Lieb; Loss (2001). Analysis. GSM. 14 (2nd ed.). Providence: American Mathematical Society. ISBN 0-8218-2783-9.

M. Reed and B. Simon, Methods of Modern Mathematical Physics IV: Analysis of Operators, Academic Press, 1978.
