Symmetric maps can be used to define quadratic forms. A quadratic form on a vector space #V# is a second-degree homogeneous polynomial in the coordinates of a vector with respect to a fixed basis of #V#. We begin with a more intrinsic definition for the case of a real vector space. To this end we use the term polarization for the right hand side of the polarization formula of an inner product.
A quadratic form on a real vector space #V# is a function #q:V\to\mathbb{R}# with the following two properties:
- homogeneity: For each scalar #\lambda# and each vector #\vec{x}# of #V# we have \(q(\lambda\cdot \vec{x}) = \lambda^2\cdot q(\vec{x})\).
- bilinearity of polarization: The real-valued map #f_q# on pairs of vectors from #V# defined by \( f_q(\vec{x},\vec{y}) =\frac12\left( q(\vec{x}+\vec{y}) - q(\vec{x})-q(\vec{y})\right)\) is bilinear.
The bilinear map #f_q# is symmetric and uniquely determined by #q#; it is called the bilinear form of #q#.
If #g# is a symmetric bilinear form on #V#, then #r(\vec{x}) =g(\vec{x},\vec{x})# is a quadratic form. Each quadratic form can be obtained in this way. Moreover, #g# is the bilinear form of #r#. We call #r# the quadratic form defined by #g#.
Suppose that #q# is a quadratic form on a vector space #V#. Then the associated bilinear form #f_q# satisfies
\[\begin{array}{rcl}f_q( \vec{x} ,\vec{x} )&=&\frac12\left(q( 2\vec{x}) -2q(\vec{x} )\right)\\&&\phantom{xx}\color{blue}{\text{definition }f_q( \vec{x} ,\vec{y} )\text{ with }\vec{y}=\vec{x}}\\&=&\frac12\left(4q(\vec{x}) - 2q(\vec{x} )\right)\\&&\phantom{xx}\color{blue}{q(\lambda\vec{x}) = \lambda^2q(\vec{x})}\\&=& q(\vec{x} )\\&&\phantom{xx}\color{blue}{\text{simplified}}\end{array}\]
This shows that #q(\vec{x}) = f_q(\vec{x},\vec{x})#; in particular, #q# is uniquely determined by #f_q#. Note also that #f_q# is symmetric, since the defining expression is unchanged when #\vec{x}# and #\vec{y}# are interchanged.
Now let #g# be a symmetric bilinear form on #V#. Then #r(\vec{x}) =g(\vec{x},\vec{x})# is homogeneous because of the bilinearity of #g#:
\[r(\lambda\vec{x})= g(\lambda\vec{x},\lambda\vec{x})=\lambda^2 g(\vec{x},\vec{x})=\lambda^2 r(\vec{x})\]
The polarization of #r# is bilinear because of
\[\begin{array}{rcl}f_r(\vec{x},\vec{y}) &=&\frac12\left( r(\vec{x}+\vec{y}) - r(\vec{x})-r(\vec{y})\right)\\&&\phantom{xxx}\color{blue}{\text{definition }f_r}\\ &=&\frac12\left( g(\vec{x}+\vec{y},\vec{x}+\vec{y}) - g(\vec{x},\vec{x})-g(\vec{y},\vec{y})\right)\\&&\phantom{xxx}\color{blue}{\text{definition }r}\\&=&\frac12\left(g(\vec{x},\vec{x})+g(\vec{x},\vec{y})+g(\vec{y},\vec{x})+g(\vec{y},\vec{y})-g(\vec{x},\vec{x})-g(\vec{y},\vec{y})\right)\\&&\phantom{xxx}\color{blue}{\text{bilinearity }g}\\&=&\frac12\left(g(\vec{x},\vec{y})+g(\vec{y},\vec{x})\right)\\&&\phantom{xxx}\color{blue}{\text{simplified}}\\&=&g(\vec{x},\vec{y})\\&& \phantom{xxx}\color{blue}{\text{symmetry }g}\end{array}\] This shows that #r# is a quadratic form, and that the corresponding bilinear form is equal to #g#.
According to the first statement, every quadratic form #q# with bilinear form #f_q# can be obtained as the quadratic form determined by #f_q#.
For each quadratic form #q# we have #q(\vec{0}) = 0#. This follows from the homogeneity of #q# with #\lambda=0#.
Let #V# be the inner product space #\mathbb{R}#. Each bilinear form on #V# is of the form \[g(x,y) = a\cdot x\cdot y\] for a constant real number #a#. In order to see this, put #a = g(1,1)#. Then, by the bilinearity of the form, we have \(g(x,y) = x\cdot g(1,1)\cdot y = a\cdot x\cdot y\). In particular, each bilinear form on a #1#-dimensional vector space is symmetric.
Let #q# be a quadratic form on #V#. Then, there is a constant #a# such that # q(x)= a \cdot x^2# for every real number #x#. The corresponding bilinear form \[f_q(x,y) =\frac12\left(a\cdot (x+y)^2 -a\cdot x^2-a\cdot y^2 \right)= a\cdot x\cdot y\] is positive-definite if and only if #a\gt0#. Hence, there are quadratic forms whose bilinear forms are not inner products.
The function #q# on #\mathbb{R}^2# determined by \(q(\rv{x,y}) = 2x^2-4xy+5y^2\) is a quadratic form and the corresponding bilinear form is
\[\begin{array}{rcl}f(\rv{x,y},\rv{u,v}) &=& \frac12\left( q(\rv{x,y}+\rv{u,v}) - q(\rv{x,y})-q(\rv{u,v})\right)\\ &=&(x+u)^2-2(x+u)\cdot(y+v)+\frac52(y+v)^2\\&&\phantom{XXX}-(x^2-2xy+\frac52y^2)\\&&\phantom{XXX}-(u^2-2uv+\frac52v^2)\\&=&2xu-2xv-2uy+5yv\end{array}\]
Indeed, #f(\rv{x,y},\rv{x,y})=2x^2-2xy-2xy+5y^2# coincides with #q(\rv{x,y})#.
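As a quick numerical sanity check, the sketch below (Python with NumPy; the helper names `q`, `f_q`, and `f_formula` are our own) computes the polarization of #q# directly and compares it with the formula just derived.

```python
import numpy as np

# The quadratic form q(x, y) = 2x^2 - 4xy + 5y^2 from the example.
def q(v):
    x, y = v
    return 2 * x**2 - 4 * x * y + 5 * y**2

# Polarization: f_q(v, w) = 1/2 * (q(v + w) - q(v) - q(w)).
def f_q(v, w):
    v, w = np.asarray(v, dtype=float), np.asarray(w, dtype=float)
    return 0.5 * (q(v + w) - q(v) - q(w))

# The closed formula derived above: f((x, y), (u, v)) = 2xu - 2xv - 2uy + 5yv.
def f_formula(p, r):
    x, y = p
    u, v = r
    return 2 * x * u - 2 * x * v - 2 * u * y + 5 * y * v

rng = np.random.default_rng(0)
for _ in range(5):
    a, b = rng.normal(size=2), rng.normal(size=2)
    assert np.isclose(f_q(a, b), f_formula(a, b))  # polarization matches the formula
    assert np.isclose(f_q(a, a), q(a))             # f_q(a, a) recovers q(a)
```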
We will show that homogeneity is needed in the definition of a quadratic form. We have seen that, for any quadratic form #q# on #\mathbb{R}#, there is a constant #a# such that # q(x)= a \cdot x^2#. In particular, #r(x) = x^2 + x# is not a quadratic form. Yet #r# satisfies the second condition (bilinearity of polarization) for a quadratic form, since \[f_r(x,y) = \frac12\left(r(x+y)-r(x)-r(y)\right) = x\cdot y\] is a symmetric bilinear function in #x# and #y#. This shows that omitting the condition of homogeneity would substantially change the definition of a quadratic form.
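This failure of homogeneity is easy to observe numerically; a minimal sketch in plain Python, using the same polarization convention as above:

```python
# r(x) = x^2 + x satisfies the polarization condition but is not homogeneous.
def r(x):
    return x**2 + x

# Homogeneity fails: r(2 * 1) = 6, whereas 2**2 * r(1) = 8.
print(r(2 * 1), 2**2 * r(1))

# Yet the polarization of r is the symmetric bilinear map (x, y) -> x * y.
f_r = lambda x, y: 0.5 * (r(x + y) - r(x) - r(y))
print(f_r(3.0, 4.0))   # 12.0, equal to 3 * 4
```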
If #q# is a quadratic form on a vector space #V# with bilinear form #f_q#, then #f_q# is symmetric. But #f_q# is not necessarily positive-definite. The form #f_q# is positive-definite (and therefore also an inner product on #V#) if and only if #q(\vec{x})\ge 0# and the equality #q(\vec{x}) = 0# holds only for #\vec{x} = \vec{0}#.
If #V = \mathbb{R}^n#, for a vector #\vec{x} = \rv{x_1,\ldots,x_n}# we also write #q(x_1,\ldots,x_n)# instead of #q(\vec{x})#. For example, # q(x,y,z) = q(\rv{x,y,z}) # for #\rv{x,y,z}\in\mathbb{R}^3#.
The definition for complex vector spaces is the same with the understanding that the map #q# has range #\mathbb{C}#.
We will now show how the homogeneous polynomials of degree #2# appear after a basis has been fixed. Recall from Coordinatization that, if #\alpha# is a basis of #V#, the map #\alpha:V\to\mathbb{R}^n#, where #n=\dim{V}#, assigns to each vector of #V# its coordinate vector with respect to #\alpha#.
Let #V# be a vector space of finite dimension #n# with basis #\alpha# and #q# a quadratic form on #V#.
- If #f# is the bilinear form of #q#, then there is a unique symmetric matrix #A# such that, for all #\vec{u},\vec{v}\in V#, we have \[ f(\vec{u},\vec{v}) =\dotprod{\alpha( \vec{u}) }{( A\,\alpha( \vec{v}))}\] We call #A# the matrix of #q# with respect to #\alpha#. In particular, #q\circ\alpha^{-1}# is of the form \[q(\alpha^{-1}(\vec{x})) =\dotprod{ \vec{x} }{( A\,\vec{x})} = \sum_{i,j=1}^n a_{ij}x_i x_j\phantom{xx}\text{for }\vec{x} = \rv{x_1,\ldots,x_n}\in\mathbb{R}^n\] where #a_{ij}# is the #(i,j)#-entry of #A#.
- If #\beta# is another basis of #V#, then the matrix #B# of #q# with respect to #\beta# is given by \[ B={}_\alpha I_{\beta}^\top\, A\,\; {}_\alpha I_{\beta}\]
- There exists a basis #\beta# for #V# such that the transformation matrix #{}_\alpha I_\beta# is orthogonal and the matrix #B# of #q# with respect to #\beta# is diagonal. In particular, #q\circ\beta^{-1}# is of the form \[q(\beta^{-1}(\vec{x})) = \sum_{i=1}^n b_ix_i^2\phantom{xx}\text{for }\rv{x_1,\ldots,x_n}\in\mathbb{R}^n\] where #b_i# are the eigenvalues of #A#. We call such a form a diagonal form of #q#.
- The bilinear form #f# of #q# is an inner product on #V# if and only if all of the eigenvalues of #A# are positive.
The formula for \(q(\alpha^{-1}(\vec{x}))\) shows that the quadratic form is a polynomial in the coordinates of #\vec{x}\in\mathbb{R}^n#.
The formula for \(q(\beta^{-1}(\vec{x}))\) reveals that, relative to a suitable basis, the polynomial may be written as a sum of squared terms, that is to say, of the form #b_i\cdot x_i^2#.
By replacing #\vec{x}# by #\beta(\vec{x})# in the formula \(q(\beta^{-1}(\vec{x})) = \sum_{i=1}^n b_ix_i^2\), we get
\[q(\vec{x}) = \sum_{i=1}^n b_i\cdot (\beta (\vec{x})_i)^2\]
Thus, #q(\vec{x})# itself can also be written as a linear combination of squares.
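The change-of-basis rule in the second statement is easy to verify numerically. Below is a minimal sketch (Python with NumPy); the matrices `A` and `P` are sample data playing the roles of the matrix of #q# and the transition matrix #{}_\alpha I_\beta#.

```python
import numpy as np

rng = np.random.default_rng(1)

# Symmetric matrix A of q with respect to a basis alpha (sample data).
A = np.array([[2.0, -2.0],
              [-2.0, 5.0]])

# An arbitrary invertible matrix playing the role of the transition matrix alpha_I_beta.
P = rng.normal(size=(2, 2))
assert abs(np.linalg.det(P)) > 1e-12

# Matrix of q with respect to beta, as in the second statement.
B = P.T @ A @ P

# If x holds beta-coordinates, then P @ x holds the alpha-coordinates of the same
# vector, so both expressions below evaluate q at that vector.
x = rng.normal(size=2)
assert np.isclose(x @ B @ x, (P @ x) @ A @ (P @ x))
```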
We prove each of the statements individually.
1. Let #\vec{e}_1,\ldots,\vec{e}_n# be the standard basis of #\mathbb{R}^n# and put #a_{ij}= f(\alpha^{-1}(\vec{e}_i),\alpha^{-1}(\vec{e}_j))#. Further, let #A# be the #(n\times n)#-matrix with #(i,j)#-entry #a_{ij}#; since #f# is symmetric, #a_{ij}=a_{ji}#, so #A# is symmetric. Then, for #\vec{x} = \rv{x_1,\ldots,x_n}# in #\mathbb{R}^n#, we have
\[\begin{array}{rcl}q(\alpha^{-1}(\vec{x})) &=& f( \alpha^{-1}(\vec{x}),\alpha^{-1}(\vec{x}))\\ &&\phantom{xxx}\color{blue}{f \text{ is the bilinear form of }q}\\ &=&\displaystyle f\left( \alpha^{-1}\left(\sum_{i=1}^nx_i\vec{e}_i\right),\alpha^{-1}\left(\sum_{j=1}^nx_j\vec{e}_j\right)\right)\\ &&\phantom{xxx}\color{blue}{\text{definition }\vec{x}}\\ &=&\displaystyle \sum_{i,j=1}^nx_ix_j\cdot f\left( \alpha^{-1}(\vec{e}_i),\alpha^{-1}(\vec{e}_j)\right)\\ &&\phantom{xxx}\color{blue}{\text{bilinearity }f\text{ and linearity }\alpha^{-1}}\\ &=&\displaystyle \sum_{i,j=1}^nx_ix_j\cdot a_{ij}\\ &&\phantom{xxx}\color{blue}{\text{definition }a_{ij}}\\&=&\displaystyle \sum_{i,j=1}^nx_ix_j\cdot\dotprod{\vec{e}_i}{(A\,\vec{e}_j)}\\ &&\phantom{xxx}\color{blue}{\text{definition }A}\\&=&\displaystyle \dotprod{ \vec{x}}{(A\,\vec{x})}\\ &&\phantom{xxx}\color{blue}{\text{bilinearity inner product and linearity }A}\\\end{array}\]
With this, the expressions in the theorem for #q(\alpha^{-1}(\vec{x}))# have been derived. Next, polarization shows that, for vectors #\vec{x}# and #\vec{y}# of #\mathbb{R}^n#, \[f( \alpha^{-1}(\vec{x}),\alpha^{-1}(\vec{y})) = \dotprod{ \vec{x}}{(A\,\vec{y})}\] Substituting #\vec{x} =\alpha(\vec{u})# and #\vec{y} =\alpha(\vec{v})#, we find \( f(\vec{u},\vec{v}) =\dotprod{\alpha( \vec{u}) }{( A\,\alpha( \vec{v}))}\). The matrix #A# is unique, since any symmetric matrix with this property must have #(i,j)#-entry #f(\alpha^{-1}(\vec{e}_i),\alpha^{-1}(\vec{e}_j))=a_{ij}#.
2. The matrices #A# and #B# are both determined by values of #f#. This leads to the following relation between #A# and #B#:
\[\begin{array}{rcl} \dotprod{\beta(\vec{x})}{(B\, \beta(\vec{y}))} &=& f(\vec{x},\vec{y})\\ &&\phantom{xx}\color{blue}{\text{definition of }B}\\&=&\dotprod{\alpha(\vec{x})}{(A\, \alpha(\vec{y}))} \\ &&\phantom{xx}\color{blue}{\text{definition of }A}\\ &=&\dotprod{(\alpha\beta^{-1}(\beta(\vec{x})))}{(A\, (\alpha\beta^{-1})(\beta(\vec{y})))}\\&&\phantom{xx}\color{blue}{\text{rewritten}}\\&=&\dotprod{({}_\alpha I_\beta(\beta(\vec{x})))}{(A\, {}_\alpha I_\beta(\beta(\vec{y})))}\\&&\phantom{xx}\color{blue}{\text{notation for transition matrix}}\\&=&\dotprod{\beta(\vec{x})}{({}_\alpha I_\beta^\top\,A\, \;{}_\alpha I_\beta(\beta(\vec{y})))}\\&&\phantom{xx}\color{blue}{\text{definition transposed }}\\\end{array}\] Because #\beta# is surjective, both #\beta(\vec{x})# and #\beta(\vec{y})# run over all vectors of #\mathbb{R}^n#. Consequently, the matrices #B# and #{}_\alpha I_\beta^\top\,A\, \;{}_\alpha I_\beta# are identical. This proves that #B = {}_\alpha I_\beta^\top\,A\, \;{}_\alpha I_\beta#.
3. By the theorem Diagonalizability of symmetric matrices, there is an orthogonal matrix #Q# such that #B = Q^\top \,A Q# is a diagonal matrix. Let \[\beta=\basis{\alpha^{-1}(Q\,\vec{e}_1),\ldots,\alpha^{-1}(Q\,\vec{e}_n)}\]
Then, for all #i#, \[\alpha^{-1}(Q\,\vec{e}_i)=\beta^{-1}(\vec{e}_i)=\alpha^{-1}({}_\alpha I_\beta\,\vec{e}_i)\]
So #\beta# is a basis for which #Q={}_\alpha I_\beta#. From statement 2 it follows that #B# is the matrix of #q# with respect to #\beta #. Because #B# is a diagonal matrix, the #(i,j)#-entries #b_{ij}# of #B# satisfy #b_{ij}=0# if #i\ne j#, so
\[q(\beta^{-1}(\vec{x})) = \sum_{i,j=1}^n b_{ij}x_i x_j= \sum_{i=1}^n b_{ii}x_i^2\]
Because #Q# is orthogonal, #B = Q^\top \,A Q =Q^{-1} \,A Q# is conjugate to # A#. In particular, the diagonal entries #b_i = b_{ii}# of #B# are the eigenvalues of #A#. This settles the proof of 3.
4. Let #\vec{v}# be a vector of #V# distinct from the zero vector. The value of #f(\vec{v},\vec{v})# is equal to #q(\vec{v})=\sum_{i=1}^n b_{i}x_i^2# for #\rv{x_1,\ldots,x_n} = \beta(\vec{v})#. If all #b_i# are positive, this value is positive for every nonzero #\vec{v}#; conversely, if some #b_i\le 0#, then #q(\beta^{-1}(\vec{e}_i)) = b_i\le 0#. Since the #b_i# are the eigenvalues of #A#, the symmetric bilinear form #f# is positive-definite, and hence an inner product on #V#, if and only if all eigenvalues of #A# are positive.
By scaling the vectors of the basis #\beta#, we can even achieve that each diagonal entry of the matrix of #q# with respect to the new basis is equal to one of #0#, #1#, #-1#. To this end, we scale the #i#-th element of #\beta# by #\frac{1}{\sqrt{|b_i|}}# if #b_i\ne0#. The resulting transition matrix is in general no longer orthogonal, so distances in #V# are no longer preserved.
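A minimal sketch of this rescaling (Python with NumPy; the matrix `A` is sample data, chosen so that one eigenvalue is negative):

```python
import numpy as np

# A sample symmetric matrix with eigenvalues -1 and 4.
A = np.array([[0.0, 2.0],
              [2.0, 3.0]])

# Orthogonal diagonalization: the columns of Q form an orthonormal eigenvector basis.
b, Q = np.linalg.eigh(A)             # b = [-1., 4.]

# Scale the i-th basis vector by 1 / sqrt(|b_i|) whenever b_i is nonzero.
scale = np.where(np.abs(b) > 1e-12, 1.0 / np.sqrt(np.abs(b)), 1.0)
S = Q * scale                        # multiplies column i of Q by scale[i]

# Matrix of q with respect to the rescaled basis: diagonal entries 0, 1 or -1.
print(np.round(S.T @ A @ S, 10))     # diag(-1, 1) for this example
```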
Consider once more the quadratic form #q# on #\mathbb{R}^2# determined by \(q(\rv{x,y}) = 2x^2-4xy+5y^2\) with corresponding bilinear form
\[\begin{array}{rcl}f(\rv{x,y},\rv{u,v}) &=& 2xu-2xv-2uy+5yv\end{array}\]
The matrix #A# of #f# satisfies
\[\begin{array}{rcl} \matrix{x&y}\, A\, \matrix{u\\ v} &=& f(\rv{x,y},\rv{u,v})\\&=&2xu-2xv-2uy+5yv\end{array}\]
From this we deduce:
\[\begin{array}{rcl} a_{11}&=&\matrix{1&0}\, A\, \matrix{1\\ 0}= 2 \\&&\phantom{xx}\color{blue}{x=1,y=0,u=1,v=0}\\
a_{12}&=&\matrix{1&0}\, A\, \matrix{0\\ 1}= -2 \\ &&\phantom{xx}\color{blue}{x=1,y=0,u=0,v=1}\\
a_{22}&=&\matrix{0&1}\, A\, \matrix{0\\ 1} = 5\\ &&\phantom{xx}\color{blue}{x=0,y=1,u=0,v=1}\\
\end{array}\]
The #(2,1)#-entry #a_{21}# of #A# is equal to #a_{12}=-2# because #A# is symmetric. So
\[ A = \matrix{2 & -2\\ -2&5}\]
We bring #q# into diagonal form by calculating an orthonormal basis #\beta# of eigenvectors of #A#. The eigenvalues are solutions of the characteristic equation:
\[\begin{array}{rcl}p_A(x) &=& x^2-7x+6\\ &&\phantom{xx}\color{blue}{\text{the characteristic polynomial}}\\\rv{\lambda_1,\lambda_2} &=& \rv{1,6}\\ &&\phantom{xx}\color{blue}{\text{solutions of the equation }p_A(x) = 0}\\\beta &=& \basis{\frac{1}{\sqrt{5}}\rv{2,1},\frac{1}{\sqrt{5}}\rv{1,-2}}\\&&\phantom{xx}\color{blue}{\text{corresponding normalized eigenvectors}}\\{}_\varepsilon I_\beta&=&\frac{1}{\sqrt{5} }\matrix{2&1\\1&-2}\\ &&\phantom{xx}\color{blue}{\text{corresponding transformation matrix}}\\ q(\beta^{-1}\rv{x,y}) &=&\matrix{x&y}\,{}_\varepsilon I_\beta^\top\, A\,{}_\varepsilon I_\beta \,\matrix{x\\ y}\\&&\phantom{xx}\color{blue}{\text{quadratic form with respect to }\beta}\\&=&\matrix{x&y}\,\matrix{1&0\\ 0&6}\,\matrix{x\\ y}\\&&\phantom{xx}\color{blue}{\text{diagonal matrix substituted}}\\&=& x^2+6y^2\\&&\phantom{xx}\color{blue}{\text{matrix products worked out}}\end{array}\] Thus we have found the diagonal form #q(\beta^{-1}\rv{x,y})=x^2+6y^2# for #q#. By replacing #\rv{x,y}# by #\beta(\rv{x,y})# we find an expression of #q(x,y)# as a linear combination of two squares: \[ q(x,y) = \frac15(2x+y)^2+\frac65(x-2y)^2\] In particular, #q# turns out to be positive definite.
The diagonal form is immediately clear once the eigenvalues of #A# are known: these are the coefficients of the squares of the coordinates in #q(\beta^{-1}\rv{x,y})#. The majority of the calculation thus consists of finding an orthonormal basis with respect to which the diagonal form is attained.
The coordinate transformation #{}_\varepsilon I_\beta# made all mixed products (that is, products of two different variables) disappear: with respect to #\beta#, the quadratic form is a pure sum of squares.
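The same computation can be reproduced numerically. A minimal sketch (Python with NumPy; note that `eigh` returns the eigenvalues in ascending order, and its eigenvector columns may differ from the basis #\beta# above by sign):

```python
import numpy as np

A = np.array([[2.0, -2.0],
              [-2.0, 5.0]])
b, Q = np.linalg.eigh(A)              # eigenvalues in ascending order

print(b)                              # [1. 6.]  -> diagonal form x^2 + 6y^2
print(np.round(Q.T @ A @ Q, 10))      # the diagonal matrix diag(1, 6)

# Check the sum-of-squares expression q(x, y) = (1/5)(2x + y)^2 + (6/5)(x - 2y)^2.
rng = np.random.default_rng(2)
x, y = rng.normal(size=2)
q_direct = 2 * x**2 - 4 * x * y + 5 * y**2
q_squares = (2 * x + y)**2 / 5 + 6 * (x - 2 * y)**2 / 5
assert np.isclose(q_direct, q_squares)
```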
Often we will apply the theorem with #V = \mathbb{R}^n# and #\alpha = \varepsilon#, the standard basis, so that #\alpha# is orthonormal. If #\alpha# is orthonormal, then the transition matrix #{}_\alpha I_\beta# is orthogonal if and only if #\beta# is orthonormal. We recall from Orthogonality criteria for matrices and Transition matrices and orthonormal bases that this means \[{}_\alpha I_\beta^{-1}= {{}_\alpha I_\beta}^\top ={}_\beta I_\alpha\]
The theorem also holds for complex vector spaces.
If all of the eigenvalues of #A# are nonnegative, then #q# is positive semi-definite, which means that #q(\vec{x})\ge0# for all #\vec{x}# in #V#.
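Both conditions can thus be read off from the eigenvalues of the matrix of #q#; a minimal sketch (Python with NumPy; the helper names and the tolerance are our own choices):

```python
import numpy as np

def is_positive_definite(A, tol=1e-12):
    # q is positive-definite iff all eigenvalues of its symmetric matrix are > 0.
    return bool(np.all(np.linalg.eigvalsh(A) > tol))

def is_positive_semidefinite(A, tol=1e-12):
    # q is positive semi-definite iff all eigenvalues are >= 0.
    return bool(np.all(np.linalg.eigvalsh(A) >= -tol))

A = np.array([[2.0, -2.0], [-2.0, 5.0]])     # matrix of q(x, y) = 2x^2 - 4xy + 5y^2
print(is_positive_definite(A))               # True: the eigenvalues are 1 and 6
```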
The collection of vectors at which a quadratic form assumes a fixed chosen value is called a quadric. It is the set of solutions of a quadratic polynomial equation in several unknowns. In general, the equation of a quadric also involves linear terms in addition to a quadratic form. Later we will go into this further.
Let #q:\mathbb{R}^3\to \mathbb{R}# be the quadratic form defined by
\[\begin{array}{rcl}q(x,y,z) &=& \displaystyle 2 x^2+6 x y+y^2-4 y z+2 z^2\end{array}\] What is the matrix #A# of #q#?
#A= # #\matrix{2 & 3 & 0 \\ 3 & 1 & -2 \\ 0 & -2 & 2 \\ }#
The matrix #A# is determined by
\[\begin{array}{rcl}q(x,y,z) &=& f(\rv{x,y,z},\rv{x,y,z})\\
&&\phantom{xxxxwwwwwwwxxxx}\color{blue}{f\text{ is the bilinear form of }q}\\
&=&\dotprod{\rv{x,y,z}}{\left(A\, \rv{x, y, z} \right)}\\
&&\phantom{xxxxwwwwwwwxxxx}\color{blue}{\text{definition of }A}\\
&=& {\matrix{x&y&z}}\,A\, \matrix{x\\ y\\ z}\\
&&\phantom{xxxxwwwwwwwxxxx}\color{blue}{\text{inner product rewritten as matrix product}}\\
&=&a_{11}x^2+(a_{12}+a_{21})xy+(a_{13}+a_{31})xz+a_{22}y^2+(a_{23}+a_{32})yz+a_{33}z^2\\
&&\phantom{xxxxwwwwwwwxxxx}\color{blue}{\text{matrix product worked out}}\\
&=& a_{11}x^2+2a_{12}xy+2a_{13}xz+a_{22}y^2+2a_{23}yz+a_{33}z^2\\
&&\phantom{xxxxwwwwwwwxxxx}\color{blue}{A\text{ is symmetric}}
\end{array}\] Comparison with the function rule #q(x,y,z) =2 x^2+6 x y+y^2-4 y z+2 z^2# gives
\[\begin{array}{rclcr} a_{11}&=&\text{coefficient of } x^2 &=& 2 \\
a_{12}&=&\frac12(\text{coefficient of } x y) &=& 3 \\
a_{13}&=&\frac12(\text{coefficient of } x z )&=& 0 \\
a_{22}&=&\text{coefficient of } y^2 &=& 1 \\
a_{23}&=&\frac12(\text{coefficient of } y z) &=& -2 \\
a_{33}&=&\text{coefficient of } z^2 &=& 2
\end{array}\] The remaining elements of #A# now follow from the fact that #A# is symmetric. The conclusion is
\[\begin{array}{rcl} A &=& \matrix{2 & 3 & 0 \\ 3 & 1 & -2 \\ 0 & -2 & 2 \\ }\end{array}\]
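The answer can be double-checked numerically by comparing #\dotprod{\vec{x}}{(A\,\vec{x})}# with the function rule at a few sample points (a small sketch in Python with NumPy; the variable names are ours):

```python
import numpy as np

def q(x, y, z):
    return 2 * x**2 + 6 * x * y + y**2 - 4 * y * z + 2 * z**2

# Diagonal entries: coefficients of the squares; off-diagonal entries: half of the
# coefficients of the corresponding mixed products.
A = np.array([[2.0, 3.0, 0.0],
              [3.0, 1.0, -2.0],
              [0.0, -2.0, 2.0]])

rng = np.random.default_rng(3)
for _ in range(5):
    v = rng.normal(size=3)
    assert np.isclose(v @ A @ v, q(*v))   # the dot product of v with A v reproduces q
```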