Difference between revisions of "Cauchy-Schwarz Inequality"

m (spacing, caps)
(rewrote)
Line 1: Line 1:
The '''Cauchy-Schwarz Inequality''' (which is known by other names, including Cauchy's Inequality) states that, for two sets of real numbers <math>a_1,a_2,\ldots,a_n</math> and <math>b_1,b_2,\ldots,b_n</math>, the following [[inequality]] is always true:
+
The '''Cauchy-Schwarz Inequality''' (which is known by other names, including '''Cauchy's Inequality''', '''Schwarz's Inequality''', and the '''Cauchy-Bunyakovsky-Schwarz Inequality''') is a well-known [[inequality]] with many elegant applications.
  
<math> \displaystyle ({a_1}^2+{a_2}^2+ \cdots +{a_n}^2)({b_1}^2+{b_2}^2+ \cdots +{b_n}^2) \geq (a_1b_1+a_2b_2+ \cdots +a_nb_n)^2</math>
+
== Elementary Form ==
  
[[Equality condition|Equality]] holds if and only if <math>\frac{a_1}{b_1}=\frac{a_2}{b_2}= \cdots =\frac{a_n}{b_n}</math>.
+
For any real numbers <math> a_1, \ldots, a_n </math> and <math> b_1, \ldots, b_n </math>,
 +
<center>
 +
<math>
 +
\left( \sum_{i=1}^{n}a_ib_i \right)^2 \le \sum_{i=1}^{n}a_i^2 \sum_{i=1}^{n}b_i^2
 +
</math>,
 +
</center>
 +
with equality when there exist constants <math> \displaystyle \mu, \lambda </math> not both zero such that for all <math> 1 \le i \le n </math>, <math> \displaystyle \mu a_i = \lambda b_i </math>.
  
== Proof ==
+
=== Proof ===
There are many ways to prove this; one of the more well-known is to consider the equation
 
  
<math> \displaystyle (a_1x+b_1)^2+(a_2x+b_2)^2+ \cdots +(a_nx+b_n)^2=0</math>.
+
There are several proofs; we will present an elegant one that does not generalize.
  
Expanding, we find the equation to be of the form
+
Consider the vectors <math> \mathbf{a} = \langle a_1, \ldots a_n \rangle </math> and <math> {} \mathbf{b} = \langle b_1, \ldots b_n \rangle </math>.  If <math> \displaystyle \theta </math> is the [[angle]] formed by <math> \mathbf{a} </math> and <math> \mathbf{b} </math>, then the left-hand side of the inequality is equal to the square of the [[dot product]] of <math> \mathbf{a} </math> and <math> \mathbf{b} </math>, or <math> \left( ||\mathbf{a}|| \cdot ||\mathbf{b}|| \cos\theta \right)^2 </math>.  The right hand side of the inequality is equal to <math> \left( ||\mathbf{a}|| \cdot ||\mathbf{b}|| \right)^2 </math>.  The inequality then follows from <math> |\cos\theta | \le 1 </math>, with equality when one of <math> \mathbf{a,b} </math> is a multiple of the other, as desired.
  
<math> \displaystyle Ax^2 + Bx + C,</math>
+
=== Complex Form ===
  
where <math>A=\sum_{i=1}^n a_i^2</math>, <math>B=2\sum_{j=1}^n a_jb_j</math>, and <math>C=\sum_{k=1}^n b_k^2.</math>.  By the [[Trivial inequality | Trivial Inequality]], we know that the left-hand-side of the original equation is always at least 0, so either both roots are [[complex numbers]], or there is a double root at <math>x=0</math>.  Either way, the [[discriminant]] of the equation is nonpositive. Taking the [[discriminant]], <math>B^2-4AC \leq 0</math> and substituting the above values of A, B, and C leaves us with the Cauchy-Schwarz Inequality,
+
The inequality sometimes appears in the following form.
  
<math> \displaystyle (a_1b_1+a_2b_2+ \cdots +a_nb_n)^2 \leq (a_1^2+a_2^2+ \cdots +a_n^2)(b_1^2+b_2^2+ \cdots +b_n^2),</math>
+
Let <math> a_1, \ldots, a_n </math> and <math> b_1, \ldots, b_n </math> be [[complex numbers]].  Then
 +
<center>
 +
<math>
 +
\left| \sum_{i=1}^na_ib_i \right|^2 \le \sum_{i=1}^{n}|a_i^2| \sum_{i=1}^n |b_i^2|
 +
</math>.
 +
</center>
 +
This appears to be more powerful, but it follows immediately from
 +
<center>
 +
<math>
 +
\left| \sum_{i=1}^n a_ib_i \right| ^2 \le \left( \sum_{i=1}^n |a_i| \cdot |b_i| \right)^2 \le \sum_{i=1}^n |a_i^2| \sum_{i=1}^n |b_i^2|
 +
</math>.
 +
</center>
  
or, in the more compact [[sigma notation]],
+
== General Form ==
  
<math>{\left(\sum a_ib_i\right)}^2 \leq \left(\sum a_i^2\right)\left(\sum b_i^2\right).</math>  
+
Let <math> \displaystyle V </math> be a [[vector space]], and let <math> \langle \cdot, \cdot \rangle : V \times V \mapsto \mathbb{R} </math> be an [[inner product]].  Then for any <math> \mathbf{a,b} \in V </math>,
+
<center>
Note that this also gives us the equality case; equality holds if and only if the discriminant is equal to 0, which is true if and only if the equation has 0 as a double root, which is true if and only if <math>\frac{a_1}{b_1}=\frac{a_2}{b_2}= \cdots =\frac{a_n}{b_n}</math>.
+
<math>
 +
\langle \mathbf{a,b} \rangle^2 \le \langle \mathbf{a,a} \rangle \langle \mathbf{b,b} \rangle
 +
</math>,
 +
</center>
 +
with equality if and only if there exist constants <math> \displaystyle \mu, \lambda </math> not both zero such that <math> \mu\mathbf{a} = \lambda\mathbf{b} </math>.
  
== Contest Problem Solving ==
+
=== Proof 1 ===
This inequality is used very frequently to solve Olympiad-level Inequality problems, such as those on the [[United States of America Mathematics Olympiad | USAMO]] and [[International Mathematics Olympiad | IMO]].
+
 
 +
Consider the polynomial of <math> \displaystyle t </math>
 +
<center>
 +
<math>
 +
\langle t\mathbf{a + b}, t\mathbf{a + b} \rangle = t^2\langle \mathbf{a,a} \rangle + 2t\langle \mathbf{a,b} \rangle + \langle \mathbf{b,b} \rangle
 +
</math>.
 +
</center>
 +
This must always be greater than or equal to zero, so it must have a non-positive discriminant, i.e., <math> \langle \mathbf{a,b} \rangle^2 </math> must be less than or equal to <math> \langle \mathbf{a,a} \rangle \langle \mathbf{b,b} \rangle </math>, with equality when <math> \mathbf{a = 0} </math> or when there exists some scalar <math> \displaystyle -t </math> such that <math> -t\mathbf{a} = \mathbf{b} </math>, as desired.
 +
 
 +
=== Proof 2 ===
 +
 
 +
We consider
 +
<center>
 +
<math>
 +
\langle \mathbf{a-b, a-b} \rangle = \langle \mathbf{a,a} \rangle + \langle \mathbf{b,b} \rangle - 2 \langle \mathbf{a,b} \rangle
 +
</math>.
 +
</center>
 +
Since this is always greater than or equal to zero, we have
 +
<center>
 +
<math>
 +
\langle \mathbf{a,b} \rangle \le \frac{1}{2} \langle \mathbf{a,a} \rangle + \frac{1}{2} \langle \mathbf{b,b} \rangle
 +
</math>.
 +
</center>
 +
Now, if either <math> \mathbf{a} </math> or <math> \mathbf{b} </math> is equal to <math> \mathbf{0} </math>, then <math> \langle \mathbf{a,b} \rangle^2 = \langle \mathbf{a,a} \rangle \langle \mathbf{b,b} \rangle = 0 </math>.  Otherwise, we may [[normalize]] so that <math> \langle \mathbf {a,a} \rangle = \langle \mathbf{b,b} \rangle = 1 </math>, and we have
 +
<center>
 +
<math>
 +
\langle \mathbf{a,b} \rangle \le 1 = \langle \mathbf{a,a} \rangle^{1/2} \langle \mathbf{b,b} \rangle^{1/2}
 +
</math>,
 +
</center>
 +
with equality when <math>\mathbf{a} </math> and <math> \mathbf{b} </math> may be scaled to each other, as desired.
 +
 
 +
=== Examples ===
 +
 
 +
The elementary form of the Cauchy-Schwarz inequality is a special case of the general form, as is the '''Cauchy-Schwarz Inequality for Integrals''': for integrable functions <math> f,g : [a,b] \mapsto \mathbb{R} </math>,
 +
<center>
 +
<math>
 +
\left( \int_{a}^b f(x)g(x)dx \right)^2 \le \int_{a}^b [f(x)]^2dx \cdot \int_a^b [g(x)]^2 dx
 +
</math>,
 +
</center>
 +
with equality when there exist constants <math> \displaystyle \mu, \lambda </math> not both equal to zero such that for <math> t \in [a,b] </math>,
 +
<center>
 +
<math>
 +
\mu \int_a^t f(x)dx = \lambda \int_a^t g(x)dx
 +
</math>.
 +
</center>
  
 
== Other Resources ==
 
== Other Resources ==
* [http://en.wikipedia.org/wiki/Cauchy-Schwartz_inequality Wikipedia entry]
+
* [http://en.wikipedia.org/wiki/Cauchy-Schwarz_inequality Wikipedia entry]
  
 
===Books===
 
===Books===
 
* [http://www.amazon.com/exec/obidos/ASIN/052154677X/artofproblems-20 The Cauchy-Schwarz Master Class: An Introduction to the Art of Mathematical Inequalities] by J. Michael Steele.
 
* [http://www.amazon.com/exec/obidos/ASIN/052154677X/artofproblems-20 The Cauchy-Schwarz Master Class: An Introduction to the Art of Mathematical Inequalities] by J. Michael Steele.
 
* [http://www.amazon.com/exec/obidos/ASIN/0387982191/artofproblems-20 Problem Solving Strategies] by Arthur Engel contains significant material on inequalities.
 
* [http://www.amazon.com/exec/obidos/ASIN/0387982191/artofproblems-20 Problem Solving Strategies] by Arthur Engel contains significant material on inequalities.

Revision as of 17:21, 15 April 2007

The Cauchy-Schwarz Inequality (which is known by other names, including Cauchy's Inequality, Schwarz's Inequality, and the Cauchy-Bunyakovsky-Schwarz Inequality) is a well-known inequality with many elegant applications.

Elementary Form

For any real numbers $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$,

$\left( \sum_{i=1}^{n}a_ib_i \right)^2 \le \sum_{i=1}^{n}a_i^2 \sum_{i=1}^{n}b_i^2$,

with equality when there exist constants $\displaystyle \mu, \lambda$ not both zero such that for all $1 \le i \le n$, $\displaystyle \mu a_i = \lambda b_i$.

Proof

There are several proofs; we will present an elegant one that does not generalize.

Consider the vectors $\mathbf{a} = \langle a_1, \ldots a_n \rangle$ and ${} \mathbf{b} = \langle b_1, \ldots b_n \rangle$. If $\displaystyle \theta$ is the angle formed by $\mathbf{a}$ and $\mathbf{b}$, then the left-hand side of the inequality is equal to the square of the dot product of $\mathbf{a}$ and $\mathbf{b}$, or $\left( ||\mathbf{a}|| \cdot ||\mathbf{b}|| \cos\theta \right)^2$. The right hand side of the inequality is equal to $\left( ||\mathbf{a}|| \cdot ||\mathbf{b}|| \right)^2$. The inequality then follows from $|\cos\theta | \le 1$, with equality when one of $\mathbf{a,b}$ is a multiple of the other, as desired.

Complex Form

The inequality sometimes appears in the following form.

Let $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$ be complex numbers. Then

$\left| \sum_{i=1}^na_ib_i \right|^2 \le \sum_{i=1}^{n}|a_i^2| \sum_{i=1}^n |b_i^2|$.

This appears to be more powerful, but it follows immediately from

$\left| \sum_{i=1}^n a_ib_i \right| ^2 \le \left( \sum_{i=1}^n |a_i| \cdot |b_i| \right)^2 \le \sum_{i=1}^n |a_i^2| \sum_{i=1}^n |b_i^2|$.

General Form

Let $\displaystyle V$ be a vector space, and let $\langle \cdot, \cdot \rangle : V \times V \mapsto \mathbb{R}$ be an inner product. Then for any $\mathbf{a,b} \in V$,

$\langle \mathbf{a,b} \rangle^2 \le \langle \mathbf{a,a} \rangle \langle \mathbf{b,b} \rangle$,

with equality if and only if there exist constants $\displaystyle \mu, \lambda$ not both zero such that $\mu\mathbf{a} = \lambda\mathbf{b}$.

Proof 1

Consider the polynomial of $\displaystyle t$

$\langle t\mathbf{a + b}, t\mathbf{a + b} \rangle = t^2\langle \mathbf{a,a} \rangle + 2t\langle \mathbf{a,b} \rangle + \langle \mathbf{b,b} \rangle$.

This must always be greater than or equal to zero, so it must have a non-positive discriminant, i.e., $\langle \mathbf{a,b} \rangle^2$ must be less than or equal to $\langle \mathbf{a,a} \rangle \langle \mathbf{b,b} \rangle$, with equality when $\mathbf{a = 0}$ or when there exists some scalar $\displaystyle -t$ such that $-t\mathbf{a} = \mathbf{b}$, as desired.

Proof 2

We consider

$\langle \mathbf{a-b, a-b} \rangle = \langle \mathbf{a,a} \rangle + \langle \mathbf{b,b} \rangle - 2 \langle \mathbf{a,b} \rangle$.

Since this is always greater than or equal to zero, we have

$\langle \mathbf{a,b} \rangle \le \frac{1}{2} \langle \mathbf{a,a} \rangle + \frac{1}{2} \langle \mathbf{b,b} \rangle$.

Now, if either $\mathbf{a}$ or $\mathbf{b}$ is equal to $\mathbf{0}$, then $\langle \mathbf{a,b} \rangle^2 = \langle \mathbf{a,a} \rangle \langle \mathbf{b,b} \rangle = 0$. Otherwise, we may normalize so that $\langle \mathbf {a,a} \rangle = \langle \mathbf{b,b} \rangle = 1$, and we have

$\langle \mathbf{a,b} \rangle \le 1 = \langle \mathbf{a,a} \rangle^{1/2} \langle \mathbf{b,b} \rangle^{1/2}$,

with equality when $\mathbf{a}$ and $\mathbf{b}$ may be scaled to each other, as desired.

Examples

The elementary form of the Cauchy-Schwarz inequality is a special case of the general form, as is the Cauchy-Schwarz Inequality for Integrals: for integrable functions $f,g : [a,b] \mapsto \mathbb{R}$,

$\left( \int_{a}^b f(x)g(x)dx \right)^2 \le \int_{a}^b [f(x)]^2dx \cdot \int_a^b [g(x)]^2 dx$,

with equality when there exist constants $\displaystyle \mu, \lambda$ not both equal to zero such that for $t \in [a,b]$,

$\mu \int_a^t f(x)dx = \lambda \int_a^t g(x)dx$.

Other Resources

Books