Interlacing of Eigenvalues

by potatio, Feb 10, 2020, 10:05 AM

Cauchy's Interlacing Theorem says that the eigenvalues of a principal submatrix of a real symmetric matrix interlace the eigenvalues of the original matrix.
This is a pretty famous result.

Another interesting interlacing result that we did in class a few weeks ago was:
Quote:
Let $B$ be a real symmetric matrix, and define:
\[
A = B + xx^T
\]If the $n$ eigenvalues of $A$ are $\lambda_1 \geq \ldots \geq \lambda_n$, and the $n$ eigenvalues of $B$ are $\mu_1 \geq \ldots \geq \mu_n$, then these are interlaced as:
\[
\lambda_1 \geq \mu_1 \geq \lambda_2 \geq \ldots \geq \lambda_i \geq \mu_i \geq \lambda_{i + 1} \geq \ldots \geq \lambda_n \geq \mu_n.
\]

The proof (which we rushed through in class, and hence this post) consists of two distinct parts:
  • $\lambda_i \geq \mu_i$
For any $y \in \mathbb{R}^n$, we have:
\[ y^TAy = y^TBy + y^Txx^Ty = y^TBy + (x^Ty)^2 \geq y^TBy. \]Now, note the Rayleigh quotient characterization of eigenvalues:
\[
\lambda_i = \max_{\substack{U \leq \mathbb{R}^n \\ \dim U = i}} \min_{\substack{y \in U \\ |y| = 1}} (y^TAy)
\]\[
\mu_i = \max_{\substack{U \leq \mathbb{R}^n \\ \dim U =  i}} \min_{\substack{y \in U \\ |y| = 1}} (y^TBy)
\]There is also a minimax formulation:
\[
\lambda_i = \min_{\substack{U \leq \mathbb{R}^n \\ \dim U = n - i + 1}} \max_{\substack{y \in U \\ |y| = 1}} (y^TAy)
\]\[
\mu_i = \min_{\substack{U \leq \mathbb{R}^n \\ \dim U = n - i + 1}} \max_{\substack{y \in U \\ |y| = 1}} (y^TBy)
\]We can use either, but we will only use the first here.
We are now done by noting that if $f \geq g$ everywhere on a common domain $X$, then both $\min f \geq \min g$ and $\max f \geq \max g$; applying this twice (once for the inner $\min$, once for the outer $\max$) gives $\lambda_i \geq \mu_i$.

  • $\mu_i \geq \lambda_{i + 1}$
This is the harder part. The proof, though, uses an idea similar to the one behind Cauchy's interlacing theorem!
Let us define the subspace $V$ of dimension $n - 1$:
\[ V = \{ y \in \mathbb{R}^n \mid x^Ty = 0 \} \]Consider a subspace $U$ of $\mathbb{R}^n$ with dimension $i + 1$. From this, create the subspace $U' = U \cap V$.

Note that $\dim(U')$ can only be $i$ or $i + 1$. How? Use the general fact that:
\[ n \geq \dim(U + V) = \dim(U) + \dim(V) - \dim(U \cap V), \]which gives $\dim(U') \geq (i + 1) + (n - 1) - n = i$, together with $\dim(U') \leq \dim(U) = i + 1$.

Now, for every $y \in U',$
\[ y^TAy = y^TBy + y^Txx^Ty = y^TBy + (x^Ty)^2 = y^TBy. \]
Then, as the minimum over a subset cannot be smaller than the minimum over a superset,
\[ \min_{\substack{y \in U' \\ |y| = 1}} y^TBy = \min_{\substack{y \in U' \\ |y| = 1}} y^TAy \geq \min_{\substack{y \in U \\ |y| = 1}} y^TAy. \]But we also have, for every such $U$:
\[ \mu_i  = \max_{\substack{S \leq \mathbb{R}^n \\ \dim S \geq  i}} \min_{\substack{y \in S \\ |y| = 1}} y^TBy \geq \min_{\substack{y \in U' \\ |y| = 1}} y^TBy \geq \min_{\substack{y \in U \\ |y| = 1}} y^TAy. \]
(Replacing $\dim S = i$ in the original formulation above by $\dim S \geq  i$ does not change anything!)

Thus,
\[ \mu_i  \geq \max_{\substack{U \leq \mathbb{R}^n \\ \dim U = i + 1}}\min_{\substack{y \in U \\ |y| = 1}} y^TAy = \lambda_{i + 1}, \]and we're done!
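As a quick numerical sanity check (not part of the proof; a sketch assuming numpy is available), we can compare the sorted eigenvalues of a random symmetric $B$ and of $A = B + xx^T$:

```python
# Sanity check: eigenvalues of A = B + x x^T interlace those of B.
import numpy as np

rng = np.random.default_rng(0)
n = 6
B = rng.standard_normal((n, n))
B = (B + B.T) / 2                    # make B real symmetric
x = rng.standard_normal((n, 1))
A = B + x @ x.T                      # rank-one positive semidefinite update

mu = np.sort(np.linalg.eigvalsh(B))[::-1]    # descending: mu_1 >= ... >= mu_n
lam = np.sort(np.linalg.eigvalsh(A))[::-1]   # descending: lambda_1 >= ... >= lambda_n

assert np.all(lam >= mu - 1e-10)             # lambda_i >= mu_i
assert np.all(mu[:-1] >= lam[1:] - 1e-10)    # mu_i >= lambda_{i+1}
print("interlacing verified")
```
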

Solving a Puzzle with Rank-Nullity

by potatio, Jan 28, 2020, 11:15 AM

This post is in a different flavour altogether. Here is an interesting problem someone posted online:
Quote:
You have a set of $2n+1$ stones with positive real masses. Suppose that every subset of size $2n$ of these stones can be split into $2$ sets of equal mass, each containing $n$ stones. Prove that all stones have the same mass.

The solution is quite nice, hence the post.
Consider the matrix $A$ whose $i^{\text{th}}$ row corresponds to removing the $i^{\text{th}}$ stone: set $a_{ij} = \pm 1$ according to which group the $j^{\text{th}}$ stone belongs to (and $a_{ii} = 0$, as this stone does not take part).

Note that the condition that each group has $n$ stones gives us that each row of $A$ sums up to $0$.
Therefore, the all-ones vector $\vec{1}$ of size $2n + 1$ is in the null-space of $A$.
\[ A \vec{1} = \vec{0} \]As the dimension of its null-space is at least $1$, rank-nullity tells us that $A$ can have rank at most $2n$. We will now show that $A$ has rank exactly $2n$. Why does this prove the required result? Because the mass vector $\vec{w}$, by the equal-split condition, must also be in the null-space of $A$. But the null-space of $A$ is spanned by $\vec{1}$! This means all the masses are equal, as $\vec{w} = w \cdot \vec{1}$!

Consider the submatrix $B$ obtained by deleting the last column and last row of $A$. Thus, $B$ has size $2n \times 2n$. The crucial property of $B$ that we will use is that $B$ is invertible!

Proof of invertibility

Then, $B$ has rank $2n$ as it is full-rank. In general, the rank of a matrix is $\geq$ the rank of any submatrix, so on adding the deleted row and column back to get $A$, we get $\text{rank}(A) \geq 2n$. Combined with the upper bound above, the rank of $A$ is exactly $2n$, and following the argument above, all the masses are equal.
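A concrete instance (a sketch, not from the problem statement): for $n = 1$ and three equal stones, each row removes one stone and splits the other two into singleton groups. We can check the two key properties of $A$ numerically, assuming numpy:

```python
# n = 1, three stones of equal mass: each row removes one stone and
# puts the remaining two into opposite groups (+1 / -1).
import numpy as np

A = np.array([
    [0,  1, -1],   # remove stone 0: stone 1 vs stone 2
    [1,  0, -1],   # remove stone 1: stone 0 vs stone 2
    [1, -1,  0],   # remove stone 2: stone 0 vs stone 1
])
ones = np.ones(3)

assert np.allclose(A @ ones, 0)        # each row sums to zero, so 1 is in the null-space
assert np.linalg.matrix_rank(A) == 2   # rank 2n = 2, so the nullity is exactly 1
print("rank =", np.linalg.matrix_rank(A))
```
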

Making Sense of Finite Fields

by potatio, Jan 27, 2020, 4:41 PM

This is going to be a series of questions (that I had) with answers (that I found out/looked up) regarding Galois fields.
Our last post discussed that every Galois Field has $p^n$ elements for some prime $p$ and $n \in \mathbb{N}$.

Q. Consider the Galois Field $GF(p)$. This is isomorphic to (i.e., can be thought of as) $\mathbb{Z}_p$. What about $GF(p^n)?$ Is that isomorphic to $\mathbb{Z}_{p^n}?$
A. No! This is because $\mathbb{Z}_{p^n}$ has zero divisors, but $GF(p^n)$ cannot, as it is a field.

Q. Okay, so what are the elements of $GF(p^n)?$ How would you describe them?
A. They can be thought of as polynomials, each of whose coefficients come from $\mathbb{Z}_p$.

Q. What are these polynomials? Where do they come from?
A. Actually, there is no 'canonical' representation of the elements of $GF(p^n)$ as polynomials. This is because $GF(p^n)$ is isomorphic to $GF(p)[x]/{\langle f(x) \rangle}$ where $f(x)$ is an irreducible polynomial of degree $n$. Depending on what $f$ is, you get different representations of each element!

Q. Interesting, but what is $f$ here? Why does it have to be irreducible? Why does it have to be of degree $n?$
A. Well, we can't work with $GF(p)[x]$ directly, because it is not a field! For example, the polynomial $x$ has no multiplicative inverse. (Evaluate at $0$ to show impossibility.)
Note that the elements of $GF(p)[x]/{\langle f(x) \rangle}$ are, in fact, sets of polynomials. Polynomials $a(x)$ and $b(x)$ are in the same set iff the difference $a(x) - b(x)$ is a multiple of $f(x)$. As $f$ has degree $n$, we can characterize each set by a representative element $c_{n - 1}x^{n - 1} + c_{n - 2}x^{n - 2} + \cdots + c_0$. There are $p$ choices for each coefficient $c_i$ (as these come from $\mathbb{Z}_p$) and $n$ coefficients in total, so we have $p^n$ elements, as required. Each choice of the $c_i$ gives rise to a distinct set.
We also need no zero divisors in this construction. This follows exactly because $f$ is irreducible!

The choice of $f$ actually gives us a basis to represent $GF(p^n)$: the coefficients $c_i$ above. As we know, changing the basis only changes the way we view elements, not the actual elements themselves.
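To make the quotient construction concrete, here is a small sketch (my own helper, not from the post) of multiplication in $GF(4) = GF(2)[x]/\langle x^2 + x + 1 \rangle$, representing each element by the coefficient list of its representative:

```python
def poly_mul_mod(a, b, f, p):
    """Multiply polynomials a, b (coefficient lists, lowest degree first)
    over Z_p, then reduce modulo the monic polynomial f."""
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    # repeatedly cancel the leading term with a shifted multiple of f
    while len(prod) >= len(f):
        c, shift = prod[-1], len(prod) - len(f)
        for k, fk in enumerate(f):
            prod[shift + k] = (prod[shift + k] - c * fk) % p
        while len(prod) > 1 and prod[-1] == 0:
            prod.pop()
    return prod

f = [1, 1, 1]  # f(x) = 1 + x + x^2, irreducible over Z_2
print(poly_mul_mod([0, 1], [0, 1], f, 2))  # x * x = 1 + x, i.e. [1, 1]
print(poly_mul_mod([0, 1], [1, 1], f, 2))  # x * (1 + x) = 1, so they are inverses
```

So in this representation $x^2 = 1 + x$, and $x$ and $1 + x$ are multiplicative inverses: exactly the kind of structure $\mathbb{Z}_4$ cannot have.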

Finite Fields and Cauchy's Theorem

by potatio, Jan 24, 2020, 1:24 PM

Today, we were studying Galois Fields, and I've realized I've forgotten almost everything from Modern Algebra that I took almost a year ago.

I've always known that fields are quite restricted because of the many constraints on the operations on the field elements. (Compare with a group, whose operation can be almost anything.) But I've never studied fields in depth. Today I learned that a finite field (a Galois field) has order $p^n$ for some prime $p$ and $n \in \mathbb{N}$. This is extremely restrictive!

One of the easiest ways to prove this is by Cauchy's Theorem. (I had forgotten all about this.)
Cauchy's Theorem states that if a prime $p$ divides the order (size) of a group $G$, then $G$ must have an element of order $p$.

Recall the order of an element is the smallest number of times we must repeatedly apply the group operation on it to get the identity of the group. For a finite group, every element has finite order, because the group is closed under the group operation. In fact, Lagrange's theorem states that the order of an element always divides the order of the group!

Anyway, back to Cauchy's Theorem. I will only provide here the proof for when $G$ is an abelian group (when the group operation is commutative). Why?
  • The proof is simpler. The proof that I know of the non-abelian case requires knowledge of the Class Equation for Groups.
  • We won't need the more general case for the proof of the above theorem on finite fields.
But you can easily find a full proof of Cauchy's Theorem online.

Proof of Cauchy's Theorem for abelian groups

The theorem on finite fields follows now from the following facts:
  • A finite field $(F, +, \times)$ has prime characteristic $p$. This follows from the fact that a field has no zero divisors.
  • Every nonzero element in the additive group $(F, +)$ has order $p$. This follows from the definition of the characteristic and distributivity of multiplication over addition.
Why? Note that as $(F, +, \times)$ is a field, $(F, +)$ is an abelian group by the field axioms. If a prime $q \neq p$ divided the order of $(F, +)$, then by Cauchy's Theorem there would be an element in $(F, +)$ of order $q$, contradicting the second fact above.

Thus, $|F| = |(F, +)| = p^n$ for some $n$!

Menger's Theorem with Max-Flow Min-Cut

by potatio, Jan 23, 2020, 8:41 AM

Once we have proved Max-Flow Min-Cut, we can use this to prove a variety of results. The work below can be thought of as a fleshing out of the ideas in Flows in Graphs.

Menger's Theorem states that the minimum number of edges whose removal is required to separate vertices $s$ and $t$ in an undirected graph $G$ is equal to the maximum number of edge-disjoint paths from $s$ to $t$.

A preliminary note: given a set of edge-disjoint paths from $s$ to $t$, we need to remove at least one edge from each path in the set; otherwise, we could just use an untouched path to go from $s$ to $t$. So the minimum number of edges to remove is at least the maximum number of edge-disjoint paths. Menger's Theorem says that this bound is tight.

Let us create a flow network $G_F$ from $G$ by splitting each undirected edge $\{x, y\}$ into directed edges $(x, y)$ and $(y, x)$, and assigning each directed edge a capacity of $1$.

We prove this by two lemmas:
  • $G_F$ has a $0- 1$ flow of value $k$ iff there are $k$ edge-disjoint paths from $s$ to $t$ in $G$.
    Proof
  • If an $s-t$ cut in $G_F$ has capacity $k$, then removing $k$ edges is enough to disconnect $s$ and $t$.
    Proof
We will need the contrapositive of the second lemma: if $k$ edges are required to be removed to disconnect $s$ and $t$, every cut must have capacity $\geq k$.

These two lemmas along with Max-Flow Min-Cut are enough to prove Menger's Theorem!
As the capacities are all integers, we know that an optimal flow exists with only integer values. Let the value of the optimal integral flow $f^*$ be $k$. By the first lemma, we know that this is the maximal number of edge-disjoint paths from $s$ to $t$.
By the contrapositive of the second lemma, the capacity of the minimum cut in $G_F$ is $\geq$ the minimum number of edges whose removal disconnects $s$ and $t$, which is $\geq$ the maximum number of edge-disjoint paths from $s$ to $t$ (by our very first observation), which $=$ the maximum $0-1$ flow in $G_F$ (by the first lemma).

But now, Max-Flow Min-Cut says that we must have equality everywhere! This proves the theorem.
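The reduction above can be sketched in code (my own helper names, not from the post): build the unit-capacity network $G_F$ and count edge-disjoint paths by repeatedly finding BFS augmenting paths, each of which pushes exactly one unit of flow:

```python
# Count edge-disjoint s-t paths via max-flow on the unit-capacity network G_F.
from collections import deque, defaultdict

def max_edge_disjoint_paths(edges, s, t):
    # each undirected edge {x, y} becomes (x, y) and (y, x), capacity 1 each
    cap, adj = defaultdict(int), defaultdict(set)
    for x, y in edges:
        cap[(x, y)] += 1
        cap[(y, x)] += 1
        adj[x].add(y)
        adj[y].add(x)
    flow = 0
    while True:
        # BFS for an augmenting path in the residual network
        parent, q = {s: None}, deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow
        # unit capacities: each augmenting path carries exactly 1 unit
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

# two edge-disjoint s-t paths: s-a-t and s-b-t
print(max_edge_disjoint_paths([("s", "a"), ("a", "t"), ("s", "b"), ("b", "t")], "s", "t"))
```
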

Max-Flow Min-Cut with Linear Programming

by potatio, Jan 22, 2020, 7:23 PM

This work is based on Totally Unimodular Matrices. However, they use a different notation there, so I just wanted to rewrite their proof in my own words.

Define the incidence matrix $A$ of a directed graph $G$ as:
\[
    A_{ij} =  \begin{cases}
        1, & \text{if edge } j  \text{ originates at vertex } i \\
        -1, & \text{if edge } j \text{ ends at vertex } i \\
        0, & \text{otherwise} \\
        \end{cases}
 \]Thus, the dimensions of $A$ are $|V|  \times |E|$.

Now, let's move on to flows. Let $c$ be the capacity vector with all non-negative integer entries.
Let the row in $A$ corresponding to the source vertex $s$ be $w$. Let $A'$ be $A$ with rows corresponding to $s$ and $t$ removed.
Then, maximum flow can be written as the primal linear program:
\begin{align*}
       \max w^Tf \\
       \text{such that }  f &\leq c, \\
        f &\geq 0, \\
       A'f &= 0.
\end{align*}Then, the dual linear program corresponds to:
\begin{align*}
       \min c^Td \\
       \text{such that } d &\geq 0, \\
        z &\in \mathbb{R}, \\
        A'^Tz + d &\geq w.
\end{align*}
$z$ is actually a vector of size $|V| - 2$. It has one variable for each vertex that is neither $s$ nor $t$. This is inconvenient, so we extend it with $z_s = -1$ and $z_t = 0$.

This turns the last dual constraint into $A^Tz + d \geq 0$.
Now, we need some facts (that will not be proven here but these proofs are available in the link above):
  • A matrix $M$ is said to be totally unimodular if every square submatrix of $M$ has determinant $-1, 0$ or $1$.
  • A non-singular submatrix of a totally unimodular matrix $M$ has an integer inverse.
  • The vertices of the convex polytope $\{x \mid Mx \leq b\}$, where $M$ is totally unimodular and $b$ has integer entries, have all integer entries too.
  • $A$ is totally unimodular for any graph $G$.
Note that both the primal and dual are feasible; hence, both have optimal solutions. Strong LP duality tells us that the two optimal values are actually the same.

Note that for an LP with a bounded optimum, some vertex of the feasible region attains the optimal value (think of the gradient of the objective function and how it 'pulls'). This shows that an optimal flow exists with integer values if all capacities are integers!

Thus, we can consider the optimal solutions of the primal and dual to be occurring at vertices of the respective feasible regions. Let $f^*$ and $(z^*, d^*)$ be the integer-valued optimal solutions respectively.
We have, looking at the constraint in the dual for edge $(u, v)$, and the feasibility condition,
\[ 
d^*_{uv} \geq \max(0, z^*_v - z^*_u).
\]As the dual is a minimization problem, and $c \geq 0$, it will always be optimal to choose equality in the above inequality. However, we will not need this fact directly.

We need to relate these variables in the dual to a $s-t$ cut separating $s$ and $t$.
Define $S = \{ u \in V \mid z^*_u \leq -1\}$. Then, $s \in S$ and $t \in V - S$, so this is a valid $s-t$ cut.
Consider an edge $(u, v)$ crossing this cut. As $z^*$ are all integers, the above gives (with the definition of $S$ and $V - S$):
\[
d^*_{uv} \geq 1.
\]Thus, the optimal value of the dual satisfies
\[
c^Td^* \geq \sum_{e \text{ crossing}} c_e = \text{ capacity of the } s-t \text{ cut} \geq \text{ maximum flow out of } S = w^Tf^*,
\]which is the optimal value of the primal.

But, we know that these optimal values are the same! Hence, we must have equality everywhere. This means:
\[
    d^*_{ij} =  \begin{cases}
        1, & \text{if  }  i \in S \text{ and } j \in V - S \\
        0, & \text{otherwise} \\
        \end{cases}
\]\[
 z^*_{i} =  \begin{cases}
        -1, & \text{if  }  i \in S \\
        0, & \text{otherwise} \\
        \end{cases}
\]
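The primal LP above can be solved numerically. A sketch on a hypothetical small graph, assuming scipy is available (the graph, its capacities, and all variable names here are my own, not from the post):

```python
# Max flow as an LP: maximize w^T f subject to 0 <= f <= c and A' f = 0.
# Vertices: s=0, a=1, b=2, t=3.
import numpy as np
from scipy.optimize import linprog

edges = [(0, 1), (0, 2), (1, 3), (2, 3), (1, 2)]
caps = [3, 2, 2, 3, 1]
n_v = 4

# incidence matrix: +1 where an edge originates, -1 where it ends
A = np.zeros((n_v, len(edges)))
for j, (u, v) in enumerate(edges):
    A[u, j] = 1
    A[v, j] = -1

w = A[0]          # row of the source s
A_prime = A[1:3]  # conservation constraints for the internal vertices a, b

# linprog minimizes, so negate the objective; bounds encode 0 <= f <= c
res = linprog(-w, A_eq=A_prime, b_eq=np.zeros(2),
              bounds=list(zip([0] * len(edges), caps)))
print(-res.fun)  # maximum flow value; here the min cut {(a,t),(b,t)} has capacity 5
```

Note that the returned vertex solution is integral, as the integrality discussion above predicts for integer capacities.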

Introduction

by potatio, Jan 22, 2020, 6:12 PM

Hey! This is the first post here.
I want to keep this blog as a resource to not forget what I've already learned.

Stuff I've found and derived that I think will be useful later.
