So far, we have only been concerned with local extrema of multivariate functions. Since every global maximum of a function on #\mathbb{R}^2# is also a local maximum, and similarly for minima, this local information is relevant to global optimization. For special functions, in particular convex and concave functions, we can even draw global conclusions.
Since global extrema of a function depend on the domain on which the function is defined, we also need to bring the domain into the picture. For the definition of convexity of a function, we require that the domain of the function itself is convex in the following sense:
Let #S# be a domain in the #x,y#-plane. Then #S# is called convex if every line segment between two points of #S# lies entirely within #S#.
Let #f# be a function defined on a convex domain #S#. Then #f# is called convex if the line segment connecting any two points of the graph of #f# has no points below the graph. In other words, if for all #u#, #v# in #S# and \(0\le t\le 1\), we have \[f(t \cdot u+(1-t)\cdot v) \le t\cdot f(u)+ (1-t)\cdot f(v)\]
If #-f# is convex on #S#, then #f# is called concave on #S#.
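As an illustration of the definition, consider the function #f(x,y)=x^2+y^2#, defined on the whole plane (a convex domain, see the examples below). Note that #f(u)=||u||^2# for #u=\rv{x,y}#. Writing #u# and #v# for points of the plane and using the dot product, we find for #0\le t\le 1#
\[\begin{array}{rcl} t\cdot f(u)+(1-t)\cdot f(v)-f(t\cdot u+(1-t)\cdot v)&=&t\cdot ||u||^2+(1-t)\cdot ||v||^2-||t\cdot u+(1-t)\cdot v||^2\\ &=&t\cdot(1-t)\cdot\left(||u||^2-2\, u\cdot v+||v||^2\right)\\ &=&t\cdot(1-t)\cdot ||u-v||^2\\ &\ge&0\end{array}\]
so the defining inequality is satisfied and #f# is convex. Consequently, the function #-f# given by #-f(x,y)=-x^2-y^2# is concave.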
Examples of convex domains are:
- the whole plane #\Bbb R^2#
- open quadrants, like the set of all #\rv{x,y}# with #x\gt0# and #y\gt 0#
- closed quadrants, like the set of all #\rv{x,y}# with #x\ge0# and #y\ge 0#
- open disks, like the set of all #\rv{x,y}# with #x^2+y^2\lt 1#
- closed disks, like the set of all #\rv{x,y}# with #x^2+y^2\le 1#
- the epigraph of a convex univariate function #f# (such as #f(x) = x^2# or #f(x)=2^x#), that is, the set of all #\rv{x,y}# with #y\ge f(x)#
- a line or a line segment
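To verify, for instance, that the closed disk is convex, take two points #u# and #v# with #||u||\le 1# and #||v||\le 1#, and let #0\le t\le 1#. The triangle inequality gives \[||t\cdot u+(1-t)\cdot v||\le t\cdot ||u||+(1-t)\cdot ||v||\le t+(1-t)=1\] so every point of the line segment between #u# and #v# lies in the disk again. The other examples can be verified by similar direct arguments.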
In the theory on Concavity and convexity, a definition of convexity was given for functions on the whole plane #\mathbb{R}^2#. In that particular case, the current definition coincides with the earlier one.
We are now ready to formulate a sufficient condition for a local extreme point to be a global extremum.
Let #S# be a convex domain.
- If #f# is a convex function on #S#, then every local minimum of #f# is a global minimum of #f# on #S#.
- If #f# is a concave function on #S#, then every local maximum of #f# is a global maximum of #f# on #S#.
The second statement follows from the first by applying it to the function #-f#. Therefore, we focus on proving the first statement.
Let #u# be a local minimum of #f# on #S#. Then, by definition, there is #\epsilon\gt0# such that for all #v\in S# with \(||u-v||\le \epsilon\), we have #f(v)\ge f(u)#.
Assume, for a contradiction, that there exists #v# in #S# such that #f(v)\lt f(u)#. For #0\lt t\le 1#, convexity of #f# gives
\[ f((1-t) u+t v) \le (1-t)f(u)+ t f(v) \lt (1-t)f(u)+t f(u)=f(u) \]
Choose #t=\min\left(1,\frac{\epsilon}{||u-v||}\right)#, so \[||\left((1-t)u+t v\right)-u|| = t\cdot ||u-v||\le \epsilon\] Then, by the above, we have \(f((1-t)u+t v)\lt f(u)\), but by the definition of local minimality and the choice of #\epsilon#, we have #f((1-t)u+t v)\ge f(u)#, a contradiction.
We conclude that, for each #v# in #S#, we have #f(v)\ge f(u)#. This means that #u# is a global minimum of #f# on #S#.
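For example, the function #f(x,y)=|x|+|y|# is convex on #\Bbb R^2#: by the triangle inequality for absolute values, both #|x|# and #|y|# satisfy the defining inequality, and hence so does their sum. Since #f(x,y)\ge 0=f(0,0)# for all #\rv{x,y}#, the origin is a local minimum, and the statement above confirms that it is a global minimum. Note that #f# is not differentiable at the origin, so no derivative information is needed for this conclusion.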
In the case of a differentiable convex function on a convex domain, we can even conclude that stationary points are global extrema.
Suppose that #S# is a convex domain and #f# a differentiable function on #S# with continuous partial derivatives.
- If #f# is convex, then every stationary point of #f# in #S# is a global minimum.
- If #f# is concave, then every stationary point of #f# in #S# is a global maximum.
The second statement is an immediate consequence of the first, since #-f# is convex if and only if #f# is concave. Therefore, we focus on a proof of the first statement.
Let #u=\rv{u_1,u_2}# and #v=\rv{v_1,v_2}# be arbitrary points of #S# and let #t\in\ivoc{0}{1}#. From convexity of #f# we draw the following conclusion: \[\begin{array}{rcl}f(t\cdot u+(1-t)\cdot v) &\le& t\cdot f(u)+ (1-t)\cdot f(v) \\ &&\phantom{xx}\color{blue}{\text{convexity of }f}\\ t\cdot\left( f(u)-f(v)\right) &\ge& f(v+t\cdot (u-v))-f(v) \\ &&\phantom{xx}\color{blue}{\text{terms rearranged, using }t\cdot u+(1-t)\cdot v=v+t\cdot (u-v)}\\ f(u)-f(v)&\ge&\dfrac{f(v+t\cdot (u-v))-f(v)}{t} \\ &&\phantom{xx}\color{blue}{\text{both sides divided by }t}\\ f(u)-f(v)&\ge&\left.\dfrac{\dd}{\dd t}\left(f(v+t\cdot (u-v))-f(v)\right) \right|_{t=0}\\ &&\phantom{xx}\color{blue}{\text{limit taken for }t\downarrow 0}\\ f(u)-f(v)&\ge&f_x(v)\cdot (u_1-v_1) + f_y(v)\cdot(u_2-v_2)\\ &&\phantom{xx}\color{blue}{\text{chain rule for partial differentiation}}\\\end{array}\]
In particular, if #v# is a stationary point of #f# in #S#, then #f_x(v) = f_y(v) = 0#, so the right hand side equals #0# and we find #f(u)\ge f(v)# for all #u# in #S#. This means that #v# is a global minimum.
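As an example, take #f(x,y)=x^2+y^2# on #S=\Bbb R^2#, which was shown to be convex above. The partial derivatives are #f_x(x,y)=2x# and #f_y(x,y)=2y#, so the only stationary point is #\rv{0,0}#. By the first statement, #\rv{0,0}# is a global minimum of #f#, with minimum value #f(0,0)=0#.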