### Optimization: Extreme points

### Criteria for extrema and saddle points

As with functions of a single variable, we want workable criteria for determining the nature of a stationary point. To this end we look at the second-order partial derivatives of \(f(x,y)\) at a stationary point \(\rv{a,b}\). In the case of one variable, a stationary point is a local minimum if the second derivative is positive at the point, and a local maximum if the second derivative is negative there. If #f# is sufficiently often differentiable at #\rv{a,b}#, this criterion also applies to the restrictions of #f# to the lines through #\rv{a,b}# parallel to the #x#-axis and to the #y#-axis. So you might expect that the stationary point #\rv{a,b}# is a local minimum of #f# if #f_{xx}(a,b)\gt0# and #f_{yy}(a,b)\gt0# hold. But this condition is not sufficient, as the example #f(x,y)= x^2 -3x\cdot y +y^2# at the stationary point #\rv{0,0}# shows: both #f_{xx}(0,0)\gt0# and #f_{yy}(0,0)\gt0# hold, but the restriction of #f# to the line #y=x# is the function #f(x,x)=-x^2#, which has a maximum at #\rv{0,0}#.
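The counterexample can be checked numerically. The following Python sketch evaluates #f(x,y)=x^2-3x\cdot y+y^2# near #\rv{0,0}# along the two axes and along the line #y=x#. The variable names and the sample values of #t# are chosen here purely for illustration.

```python
from fractions import Fraction

def f(x, y):
    """The counterexample f(x, y) = x^2 - 3xy + y^2."""
    return x**2 - 3*x*y + y**2

# Illustrative sample values close to the stationary point (0, 0).
samples = [Fraction(-1, 10), Fraction(1, 10)]

on_x_axis = [f(t, 0) for t in samples]    # f(t, 0) = t^2
on_y_axis = [f(0, t) for t in samples]    # f(0, t) = t^2
on_diagonal = [f(t, t) for t in samples]  # f(t, t) = -t^2

print(on_x_axis, on_y_axis, on_diagonal)
```

The values on the axes lie above #f(0,0)=0#, while the values on the line #y=x# lie below it, so #\rv{0,0}# is neither a local minimum nor a local maximum of #f#.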

In order to formulate a sufficient criterion for a local **extremum** (i.e., a local minimum or maximum), we need the following notion.

Hessian

Let \(f(x,y)\) be a bivariate function all of whose partial derivatives of the first and second order exist. The **Hessian of** \(f(x,y)\) is the bivariate function \[H(x,y)=f_{xx}(x,y)\cdot f_{yy}(x,y)-f_{xy}(x,y)^2\]

Partial derivatives test for a local extremum or a saddle point

Let \(f(x,y)\) be a bivariate function all of whose partial derivatives of first and second order exist and are continuous. Assume that \(\rv{a,b}\) is a stationary point of #f# lying in an open disk that is contained in the domain of #f#. Denote the Hessian of #f# by #H#.

- If \(H(a,b)\gt0\) and \(f_{xx}(a,b)\gt0\), then \(\rv{a,b}\) is a local minimum of \(f(x,y)\).
- If \(H(a,b)\gt0\) and \(f_{xx}(a,b)\lt0\), then \(\rv{a,b}\) is a local maximum of \(f(x,y)\).
- If \(H(a,b)\lt0\), then \(\rv{a,b}\) is a saddle point of \(f(x,y)\).
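The three cases of the test translate directly into a small decision procedure. The Python sketch below is one possible way to write it. The function names and the label `"inconclusive"` for the case #H(a,b)=0#, about which the test makes no statement, are assumptions made for this illustration.

```python
def hessian_value(fxx, fxy, fyy):
    """Value H = f_xx * f_yy - f_xy^2 at a stationary point."""
    return fxx * fyy - fxy**2

def classify_stationary_point(fxx, fxy, fyy):
    """Apply the partial derivatives test to second-order partials at (a, b)."""
    h = hessian_value(fxx, fxy, fyy)
    if h > 0:
        # H > 0 forces f_xx to be nonzero, so its sign decides the case.
        return "local minimum" if fxx > 0 else "local maximum"
    if h < 0:
        return "saddle point"
    return "inconclusive"  # H = 0: the test gives no information

# f(x, y) = x^2 - 3xy + y^2 at (0, 0): f_xx = 2, f_xy = -3, f_yy = 2.
print(classify_stationary_point(2, -3, 2))
```

The input is the triple of second-order partial derivatives evaluated at the stationary point, so the sketch does not check that the point is actually stationary.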

Proof sketch: For the sake of expressing the idea of the proof, there is no harm in choosing #\rv{a,b}=\rv{0,0}#. In order to ascertain that the stationary point #\rv{0,0}# is a local minimum or local maximum, we check whether this is the case for #f# as a function on every line #l# through #\rv{0,0}#. It is not obvious that this will suffice; it requires more mathematics than we treat here.

We first consider the line through #\rv{0,0}# with equation #x=0#. For the function #f# whose domain is this line, the fact that #\rv{0,0}# is a stationary point means that #f_y(0,0)=0#. But the *theory of one variable* also has criteria for local minima and maxima:

- If #f_{yy}(0,0)\gt0# then #\rv{0,0}# is a minimum of the restriction of #f# to the line #x=0#
- If #f_{yy}(0,0)\lt0# then #\rv{0,0}# is a maximum of the restriction of #f# to the line #x=0#

For the line #y=0#, the same criteria are valid with respect to the second order derivative with respect to #x#.

This is not enough for local minimality or maximality of #f# on a two-dimensional open disk around #\rv{0,0}#, because there are many more lines through #\rv{0,0}#: the stationary point #\rv{0,0}# is a local minimum if the second derivative at #\rv{0,0}# of the restriction of #f# to each line through #\rv{0,0}# is positive.

Each line through #\rv{0,0}# distinct from the #y#-axis is given by an equation of the form #y=\lambda x# for a certain number #\lambda#. An arbitrary point on that line has the form #\rv{u,\lambda u}# where #u# is a real number. If we restrict #f# to such a line, then the value of #f# at #\rv{u,\lambda u}# is #f(u,\lambda u)#. Thus, we can view #x# and #y# as functions of #u# with function rules #x=u# and #y=\lambda u#. The point #\rv{0,0}# is a local minimum of #f# on that line if #f_{uu}(0)\gt0#, where #f_{uu}# denotes the second derivative of #f(u,\lambda u)# with respect to #u#. This expression can be calculated by use of the *chain rules for partial differentiation* and *commuting mixed derivatives*: \[ \begin{array}{rcl}f_{uu}(0)&=&\left.\frac{\dd}{\dd u}\left(f_{x}\cdot x_u+f_{y}\cdot y_u\right)\right|_{u=0}\\&&\phantom{xxxuvw}\color{blue}{\text{chain rule}}\\&=&\left.\left(f_{xx} \cdot x_u+\lambda f_{yx}\cdot x_u + f_{xy}\cdot y_u+\lambda f_{yy}\cdot y_u\right)\right|_{u=0}\\&&\phantom{xxxuvw}\color{blue}{\text{chain rule}}\\&=& \left.\left( f_{xx} +\lambda f_{yx} +\lambda f_{xy}+\lambda^2f_{yy}\right)\right|_{u=0} \\&&\phantom{xxxuvw}\color{blue}{x_u=1\text{ and } y_u=\lambda}\\&=& \left.\left( f_{xx} +2\lambda f_{xy}+\lambda^2f_{yy}\right)\right|_{u=0} \\&&\phantom{xxxuvw}\color{blue}{f_{xy}=f_{yx}}\\&=&f_{xx}(0,0) +2\lambda f_{xy}(0,0)+\lambda^2f_{yy}(0,0) \\&&\phantom{xxxuvw}\color{blue}{u=0 \text{ substituted}}\end {array}\] If this expression is always (that is, for each value of #\lambda#) positive, then #\rv{0,0}# is a local minimum of #f# on any line through the stationary point #\rv{0,0}#. This means that, for each line #l# through the stationary point, a positive number #\varepsilon_l# exists such that #f(x,y)\ge f(0,0)# for all points #\rv{x,y}# on the line #l# at distance at most #\varepsilon_l# to #\rv{0,0}#.
It follows, as mentioned, with slightly more mathematics than we treat here, that there is a positive number #\varepsilon# such that, for each point #\rv{x,y}# in the open disk #S_{\rv{0,0},\varepsilon}# with radius #\varepsilon# and center #\rv{0,0}#, we have #f(x,y)\ge f(0,0)#. But then #\rv{0,0}# is a local minimum of #f#.
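The formula for the second derivative along the line #y=\lambda x# can be tested on a concrete function. The Python sketch below uses a hypothetical example, #f(x,y)=2x^2+x\cdot y+3y^2-x^3#, which is not taken from the text above and has a stationary point at #\rv{0,0}# with #f_{xx}(0,0)=4#, #f_{xy}(0,0)=1# and #f_{yy}(0,0)=6#. It compares a central second difference of #f(u,\lambda u)# at #u=0# with #f_{xx}(0,0)+2\lambda f_{xy}(0,0)+\lambda^2f_{yy}(0,0)#. The function names and the chosen slopes are likewise assumptions for illustration.

```python
from fractions import Fraction

def f(x, y):
    """Hypothetical example with a stationary point at (0, 0)."""
    return 2*x**2 + x*y + 3*y**2 - x**3

# Second-order partial derivatives of this f at (0, 0), computed by hand.
F_XX, F_XY, F_YY = 4, 1, 6

def second_derivative_on_line(lam, h=Fraction(1, 100)):
    """Central second difference of g(u) = f(u, lam*u) at u = 0.

    For this f, g is a cubic polynomial in u, so the central
    difference equals g''(0) exactly (the odd terms cancel).
    """
    def g(u):
        return f(u, lam * u)
    return (g(h) - 2 * g(0) + g(-h)) / h**2

def formula(lam):
    """f_xx(0,0) + 2*lam*f_xy(0,0) + lam^2*f_yy(0,0)."""
    return F_XX + 2 * lam * F_XY + lam**2 * F_YY

# Illustrative slopes of lines y = lam * x through (0, 0).
slopes = [Fraction(-2), Fraction(-1, 3), Fraction(0), Fraction(1), Fraction(5, 2)]
for lam in slopes:
    assert second_derivative_on_line(lam) == formula(lam)
```

Exact rational arithmetic is used so that the comparison is an equality rather than an approximation. For a general function the central difference would only approximate the second derivative.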

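So we need to ask ourselves under what condition the expression \(f_{xx}(0,0) +2\lambda f_{xy}(0,0)+\lambda^2f_{yy}(0,0)\) will always be positive (or, for the criterion of a local maximum, always be negative). Because the expression is a quadratic polynomial in #\lambda#, it has no zeros, and hence the same sign for every #\lambda#, if the discriminant is negative: \[4f_{xy}(0,0)^2-4f_{xx}(0,0)f_{yy}(0,0)\lt0\tiny.\] After division by #-4#, this reads #H(0,0)\gt0#. In that case the sign of the expression is the sign of its value at #\lambda=0#, which is #f_{xx}(0,0)#. Moreover, #H(0,0)\gt0# forces #f_{yy}(0,0)# to have the same sign as #f_{xx}(0,0)#, so the line #x=0# behaves in the same way. This implies the first two statements.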

If #H(0,0)\lt0#, then the discriminant is positive, so the expression changes sign: the second derivative at #\rv{0,0}# of the restriction of #f# to the line #y=\lambda x# is positive for one value of #\lambda# and negative for another. Therefore, the point #\rv{0,0}# is a local minimum of #f# on one line and a local maximum on the other. The conclusion is that #\rv{0,0}# is a saddle point. This completes the sketch of the proof.
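As a check on this reasoning, the criterion can be applied to the earlier example #f(x,y)=x^2-3x\cdot y+y^2# at #\rv{0,0}#, where #f_{xx}(0,0)=2#, #f_{xy}(0,0)=-3# and #f_{yy}(0,0)=2#. The second derivative along the line #y=\lambda x# is then \[f_{xx}(0,0) +2\lambda f_{xy}(0,0)+\lambda^2f_{yy}(0,0)=2-6\lambda+2\lambda^2\tiny,\] with discriminant \[4\cdot(-3)^2-4\cdot 2\cdot 2=20\gt0\tiny,\] which corresponds to #H(0,0)=2\cdot 2-(-3)^2=-5\lt0#. The expression equals #2# for #\lambda=0# (a minimum along the #x#-axis) and #-2# for #\lambda=1# (a maximum along the line #y=x#), so #\rv{0,0}# is a saddle point of this function.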

If #H(a,b)\gt0# then #f_{xx}(a,b) \cdot f_{yy}(a,b)\gt f_{xy}(a,b) ^2\ge0#. In particular, #f_{xx}(a,b)# and #f_{yy}(a,b)# are nonzero and have the same sign, so in this case the test always classifies #\rv{a,b}# as a local minimum or a local maximum. We conclude that only if #H(a,b)=0# does the test give no indication as to the nature of the stationary point #\rv{a,b}#.

As an example, consider a function #f# whose partial derivatives are \[f_x(x,y)=8\cdot y\cdot x+2\cdot y^2-5\cdot y\phantom{xxx}\text{and}\phantom{xxx}f_y(x,y)=4\cdot x^2+4\cdot y\cdot x-5\cdot x\tiny.\] Up to an additive constant, these are the partial derivatives of \(f(x,y)=4\cdot x^2\cdot y+2\cdot x\cdot y^2-5\cdot x\cdot y\). The stationary points are the solutions of the equations \[\left\{ \begin{array}{rll} 8\cdot y\cdot x+2\cdot y^2-5\cdot y&=0\\ 4\cdot x^2+4\cdot y\cdot x-5\cdot x&=0\end {array}\right.\] The solution set is \(\left\{{\rv{0,0},\rv{0,{{5}\over{2}} },\rv{{{5}\over{4}},0},\rv{{{5 }\over{12}},{{5}\over{6}}}}\right\}\). So there are four stationary points.

To determine the nature of these stationary points, we calculate the partial derivatives of the second order: \[f_{xx}(x,y)=8\cdot y,\phantom{xxx} f_{xy}(x,y)=8\cdot x+4\cdot y-5,\phantom{xxx}f_{yy}(x,y)=4\cdot x\tiny.\] In the following scheme, in which \(H(a,b)=f_{xx}(a,b)\cdot f_{yy}(a,b)-f_{xy}(a,b)^2\) is the value of the Hessian at #\rv{a,b}#, we draw conclusions about the stationary points. \[ \begin{array}{|c|c|c|c|c|c|} \hline \rv{a,b} & f_{xx}(a,b) & f_{xy}(a,b) & f_{yy}(a,b) & H(a,b) & \text{conclusion} \\ \hline \rv{ 0 , 0 } & 0 & -5 & 0 & -25 & \text{ saddle point } \\ \hline \rv{ 0 , {{5}\over{2}} } & 20 & 5 & 0 & -25 & \text{ saddle point }\\ \hline \rv{ {{5}\over{4}} , 0 } & 0 & 5 & 5 & -25 & \text{ saddle point }\\ \hline \rv{ {{5}\over{12}} , {{5}\over{6}} } & {{20}\over{3}} & {{5}\over{3}} & {{5}\over{3}} & {{25}\over{ 3}} & \text{ local minimum }\\ \hline \end {array}\]
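The entries of this scheme can be recomputed mechanically. The Python sketch below uses only the first- and second-order partial derivatives stated above, with exact rational arithmetic. The function names and the label `"inconclusive"` for #H(a,b)=0# are assumptions made for this illustration.

```python
from fractions import Fraction as Fr

# First- and second-order partial derivatives as stated in the text.
def f_x(x, y):
    return 8*y*x + 2*y**2 - 5*y

def f_y(x, y):
    return 4*x**2 + 4*y*x - 5*x

def f_xx(x, y):
    return 8*y

def f_xy(x, y):
    return 8*x + 4*y - 5

def f_yy(x, y):
    return 4*x

def hessian(x, y):
    """H(x, y) = f_xx * f_yy - f_xy^2."""
    return f_xx(x, y) * f_yy(x, y) - f_xy(x, y)**2

def classify(x, y):
    """Apply the partial derivatives test at a stationary point (x, y)."""
    h = hessian(x, y)
    if h > 0:
        return "local minimum" if f_xx(x, y) > 0 else "local maximum"
    if h < 0:
        return "saddle point"
    return "inconclusive"

stationary_points = [
    (Fr(0), Fr(0)),
    (Fr(0), Fr(5, 2)),
    (Fr(5, 4), Fr(0)),
    (Fr(5, 12), Fr(5, 6)),
]

for a, b in stationary_points:
    # Each point must make both first-order partial derivatives vanish.
    assert f_x(a, b) == 0 and f_y(a, b) == 0
    print((a, b), hessian(a, b), classify(a, b))
```

Because the coordinates are stored as fractions, the Hessian values are computed exactly and can be compared directly with the column #H(a,b)# of the scheme.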

The graph of #f# below illustrates these results.
