Convex Functions and Jensen's Inequality
by always_correct, Nov 20, 2016, 3:21 AM
What does it mean for a function to be convex? Concave? Intuitively we see that all convex functions on an interval
look like a smooth valley, and all concave functions look like smooth hills. We want a rigorous definition of convex and concave functions, we first tackle this by realizing a concave function is just the negative of some convex function.
We can notice by drawing a convex function
on an interval
, that for any line
through any two points on the function,
is above the graph, more precisely,
for
.

For a non-convex function, we see that this line may be below the graph.

Let's define a convex function
this way. First, we need a concise way to say all points on this line segment are above or on the graph. Let
and
be the endpoints of a line in the interval where the function is convex, let this interval be
. The midpoint of these two points is the average of the
and
coordinates. This is in the line segment, as the average of two numbers is between those numbers. Generally, we can say any for any
,(Letting
gives us the midpoint case.)
To prove this we can write
and note this is the equation of a line
, and use the Intermediate Value Theorem. We can equally say that for any
,
Note that if
is the equation of the line segment from
to
, by linearity*:
The last equality comes from the fact, looking at the picture, that the line is equal to the function at
and
, in other words,
and
. Thus we come to the final conclusion that all points on the line segment connecting any two points of
can be written as:
By our definition, this point has to be above or on
for
to be convex, where
. Thus, if a function is convex in the interval
, for all
and
,
*Actually, the function
is affine not linear, meaning
and NOT
. However we can still note that if
, then:
So "linearity" sometimes applies if the coefficients
add up to one... make sure you understand this!
Great! We just rigorously defined convex functions, now I need to make sure you aren't lying when you say you understand.
Try proving the inequality in the definition of convexity fails for
.
Moving onward, we can now discuss Jensen's Inequality. Here is the statement of Jensen's Inequality.
Let
be a real convex function defined in the interval
, and let
and
be positive real numbers such that
. Then

See if you can prove this. Here I will discuss an outline to think about
Now that we understand Jensen's inequality(hopefully), we can prove powerful inequalities, and find maximums where no other methods provide help. For example look at this problem:
In
, find the maximum of
.
Solution
Thank you for reading, and comment if you have any questions or feedback.

We can notice by drawing a convex function







For a non-convex function, we see that this line may be below the graph.

Let's define a convex function






![$\mu \in [0,1]$](http://latex.artofproblemsolving.com/5/1/f/51fe8bdefd8e38f102dcbabea80e15292ab8bbfe.png)

![$$\mu a + [1-\mu]b \in [a,b]$$](http://latex.artofproblemsolving.com/9/8/e/98e577e370caaa752be78c64cd0a57c6c417c375.png)
![$\mu a + [1-\mu]b = b+\mu(a-b)$](http://latex.artofproblemsolving.com/a/6/8/a6845019c8ae60418892d8319c8513832ad68e30.png)

![$\mu \in [0,1]$](http://latex.artofproblemsolving.com/5/1/f/51fe8bdefd8e38f102dcbabea80e15292ab8bbfe.png)
![$$\mu f(a) + [1-\mu]f(b) \in [f(a),f(b)]$$](http://latex.artofproblemsolving.com/7/6/a/76af6f1832076e889fcb7031321accab85a174e3.png)



![$$L(\mu a + [1-\mu]b)=\mu L(a) + [1-\mu]L(b) =\mu f(a) + [1-\mu]f(b)$$](http://latex.artofproblemsolving.com/f/a/2/fa251069e1fc9465a1e8ebc63bdeb22a042a9498.png)





![$$(h,k)=(\mu a + [1-\mu]b, \mu f(a) + [1-\mu]f(b))$$](http://latex.artofproblemsolving.com/e/8/1/e81bc6bfa6e7d107d79c2e415f146dfe7ff4cd40.png)





![$\mu \in [0,1]$](http://latex.artofproblemsolving.com/5/1/f/51fe8bdefd8e38f102dcbabea80e15292ab8bbfe.png)
![$$\mu f(a) + [1-\mu]f(b) \geq f(\mu a + [1-\mu]b)$$](http://latex.artofproblemsolving.com/c/3/3/c334e9918f0b0acd5fd1faf01084471ad282c8e6.png)






Great! We just rigorously defined convex functions, now I need to make sure you aren't lying when you say you understand.

![$\mu \in \mathbb{R}\setminus [0,1]$](http://latex.artofproblemsolving.com/2/3/3/23355579f8af443bc55fb1baa7e012e77df784ce.png)
Moving onward, we can now discuss Jensen's Inequality. Here is the statement of Jensen's Inequality.
Let






See if you can prove this. Here I will discuss an outline to think about
Let
. We see that the tangent line to
at
is always below or on
because it is convex. If we replace all occurrences of
in the inequality with the function of the tangent line
, what happens? Can we use a property of functions that are lines(one that I used before)? What happens if we replace one side back to
after simplifying/expanding? What do we note?
.







Now that we understand Jensen's inequality(hopefully), we can prove powerful inequalities, and find maximums where no other methods provide help. For example look at this problem:
In


Solution
We note from the first constraint that
. Note that
is concave, so
is convex. Let
and
. We see by Jensen's Inequality that:
This implies that:
Thus the minimum is
. This happens when
(Why? Look at the original proof of Jensen's.)









Thank you for reading, and comment if you have any questions or feedback.

This post has been edited 7 times. Last edited by always_correct, Nov 20, 2016, 10:35 PM