A quadratic form on a vector space can be regarded as a homogeneous, second-degree polynomial in the coordinates of a vector relative to a fixed basis of the vector space. An ordinary quadratic polynomial is the sum of a quadratic form, a linear function, and a constant.
The quadratic polynomial #p# on #\mathbb{R}^3# determined by \[ p(x,y,z) = 2x^2 - 4xy+4xz-3y^2-3z^2+5x-7y+z-1\] can also be written as \[p(x,y,z)= \matrix{x & y & z}\, \matrix{2&-2&2\\-2&-3&0\\ 2&0&-3}\, \matrix{x\\ y\\ z} +\matrix{5&-7&1}\,\matrix{x\\ y\\ z} -1\]
In general, a quadratic polynomial function #p# can be written as follows as a function of a column vector #\vec{x}#:
\[ p(\vec{x}) =
\vec{x}^\top\, A\, \vec{x}+\vec{l}^\top\, \vec{x}+c
\] Here, #A# is a symmetric matrix, #\vec{l}# is a column vector, and #c# is a real number. We view the expression \( \vec{l}^\top\, \vec{x}\) as the matrix product of a #(1\times n)#-matrix with an #(n\times 1)#-matrix, which results in a real number; it is a different way of writing the inner product #\dotprod{\vec{l}}{\vec{x}}#. Similarly, \(\vec{x}^\top\, A\, \vec{x} \) is equal to the inner product #\dotprod{\vec{x}}{(A\,\vec{x})}#.
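As an illustrative numerical sanity check (not part of the exposition), the example polynomial above can be compared with its matrix form #\vec{x}^\top A\,\vec{x}+\vec{l}^\top\vec{x}+c# using NumPy; the function names are ours:

```python
import numpy as np

# Quadratic polynomial p(x, y, z) from the example above
def p(x, y, z):
    return 2*x**2 - 4*x*y + 4*x*z - 3*y**2 - 3*z**2 + 5*x - 7*y + z - 1

# Matrix form: p(v) = v^T A v + l^T v + c, with A symmetric
A = np.array([[2, -2, 2],
              [-2, -3, 0],
              [2, 0, -3]], dtype=float)
l = np.array([5, -7, 1], dtype=float)
c = -1.0

def p_matrix(v):
    return v @ A @ v + l @ v + c

# The two expressions agree on arbitrary input vectors
rng = np.random.default_rng(0)
for _ in range(5):
    v = rng.standard_normal(3)
    assert np.isclose(p(*v), p_matrix(v))
```

Note that the off-diagonal entries of #A# are half the mixed coefficients (e.g. the #xy#-coefficient #-4# is split as #-2# and #-2#), which is what makes #A# symmetric.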
Let #\alpha =\basis{\vec{a}_1,\ldots ,\vec{a}_n}# be an orthonormal basis of eigenvectors of #A# with eigenvalues #\lambda_1, \ldots,\lambda_n#. Here we choose the order so that the eigenvalues equal to #0# come last. As a consequence, there is an index #r\le n# such that #\lambda_i\ne0# for #i\le r# and #\lambda_i=0# for #i\gt r#. We indicate the coordinates of #\vec{x} = \rv{x_1, \ldots ,x_n}# with respect to #\alpha# by #\vec{x}' = \rv{x_1',\ldots, x_n'}#. The correspondence is
\[
\left(\,\begin{array}{c}
x_1\\ \vdots\\ x_n
\end{array}\,\right)=B\ \left(\,\begin{array}{c}
x_1'\\ \vdots\\ x_n'
\end{array}\,\right)
\] where the columns of #B # are the vectors #\vec{a}_1,\ldots ,\vec{a}_n#, so #B = {}_\varepsilon I_\alpha#. In terms of vectors:\[
\vec{x} =B\, \vec{x}'
\]Substitution in the function rule for #p(\vec{x})# gives\[\begin{array}{rcl}p(\vec{x}) &=& (\vec{x}' )^\top B^\top\,A\,B\,\vec{x}'
+\vec{l}^\top\,B\,\vec{x}'+c\\&=&( \vec{x}' )^\top\, D\,\vec{x}'
+(\vec{l}\,')^\top\,\vec{x}'+c\\ &=&
\lambda_1(x_1')^2+\cdots + \lambda_n(x_n')^2+l_1'x_1'+\cdots +l_n'x_n'+c\end{array}
\] Here, #D=B^\top\,A\,B# is the #(n\times n)#-diagonal matrix with the eigenvalues #\lambda_1,\ldots,\lambda_n# on the diagonal and #\vec{l}\,'# is given by
\[
\vec{l}\,' =\left(\vec{l}^\top\, B \right)^\top=B^\top\,\vec{l}
\] Thus, the components of #\vec{l}\,'# are the #\alpha#-coordinates of the vector #\vec{l}#.
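The diagonalization step can be checked numerically on the example from the beginning of this section; this is an illustrative sketch (using `np.linalg.eigh`, which returns an orthonormal eigenbasis), not part of the proof:

```python
import numpy as np

# Running example: p(v) = v^T A v + l^T v + c
A = np.array([[2, -2, 2],
              [-2, -3, 0],
              [2, 0, -3]], dtype=float)
l = np.array([5, -7, 1], dtype=float)
c = -1.0

# Columns of B are an orthonormal eigenbasis of A, so B is orthogonal
# and B^T A B = D is diagonal. (eigh orders eigenvalues ascending;
# for this A no eigenvalue is 0, so no reordering is needed.)
eigvals, B = np.linalg.eigh(A)
D = B.T @ A @ B
assert np.allclose(D, np.diag(eigvals))

# l' = B^T l holds the alpha-coordinates of l
l_prime = B.T @ l

# For any x, with x' = B^T x (so x = B x'):
# p(x) = x'^T D x' + l'^T x' + c
rng = np.random.default_rng(1)
x = rng.standard_normal(3)
x_prime = B.T @ x
lhs = x @ A @ x + l @ x + c
rhs = x_prime @ np.diag(eigvals) @ x_prime + l_prime @ x_prime + c
assert np.isclose(lhs, rhs)
```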
If #i\leq r#, then we can eliminate the linear term in #x_i'# by completing the square. We do this by means of a translation: let #\vec{a}# be a column vector in #\mathbb{R}^n#, and write \[\vec{x}\,'' = T_{-\vec{a}}(\vec{x}')\] so #\vec{x}\,' = \vec{x}\,''+\vec{a}#. Substituting this expression for #\vec{x}\,'# in #p(\vec{x})# gives \[\begin{array}{rcl}p(\vec{x}) &=&
\left(\vec{x}\,''+\vec{a}\right)^\top \,D\,(\vec{x}\,''+\vec{a})
+(\vec{l}\,')^\top\,(\vec{x}\,''+\vec{a})+c\\ &&\color{blue}{\vec{x}\,' = \vec{x}\,''+\vec{a}\text{ substituted in }p(\vec{x})=(\vec{x}' )^\top\, D\,\vec{x}'
+(\vec{l}\,')^\top\,\vec{x}'+c}\\&=&
(\vec{x}\,'')^\top \,D\,\vec{x}\,''+(\vec{x}\,'')^\top\,D\,\vec{a}+\vec{a}^\top\,D\,\vec{x}\,''+\vec{a}^\top\, D \, \vec{a}
+(\vec{l}\,')^\top\,\vec{x}\,''+(\vec{l}\,')^\top\,\vec{a}+c\\&&\color{blue}{\text{brackets expanded}}\\&=&
(\vec{x}\,'')^\top \,D\,\vec{x}\,''+\left(\vec{a}^\top\,D^\top\,\vec{x}\,''\right)^\top+\vec{a}^\top\,D\,\vec{x}\,''+\vec{a}^\top\, D \, \vec{a}
+(\vec{l}\,')^\top\,\vec{x}\,''+(\vec{l}\,')^\top\,\vec{a}+c\\&&\color{blue}{\text{computational rule }\left(\vec{a}^\top\,D^\top\,\vec{x}\,''\right)^\top=(\vec{x}\,'')^\top\,D\,\vec{a}}\\&=&
(\vec{x}\,'')^\top \,D\,\vec{x}\,'' +2\vec{a}^\top\,D\,\vec{x}\,''+\vec{a}^\top\, D \, \vec{a}
+(\vec{l}\,')^\top\,\vec{x}\,''+(\vec{l}\,')^\top\,\vec{a}+c\\&&\color{blue}{D^\top=D\text{ and }\left(\vec{a}^\top\,D\,\vec{x}\,''\right)^\top=\vec{a}^\top\,D\,\vec{x}\,''\text{ since this is a scalar}}\\ &=&
(\vec{x}\,'')^\top \,D\,\vec{x}\,''+\left(2D\,\vec{a}+\vec{l}\,'\right)^\top\,\vec{x}\,''+c'\\&&\color{blue}{\text{rewritten with }c' = \vec{a}^\top\, D \, \vec{a} +(\vec{l}\,')^\top\,\vec{a}+c}\end{array}
\]In order to eliminate the terms linear in #x_i# for #i\le r#, we choose \[\vec{a} = -\frac12\cdot D'\vec{l}\,'\] where #D'# is the diagonal matrix with #\lambda_i^{-1}# as its #i#-th diagonal entry for #i\le r# and zeros elsewhere. So, on the subspace perpendicular to the kernel of #D#, the matrix #D'# is the inverse of #D#, and \[\vec{a}^\top=-\frac12 \cdot \rv{\lambda_1^{-1}l_1',\ldots,\lambda_r^{-1}l_r',0,\ldots,0}\] The result is \[\begin{array}{rcl}p(\vec{x}) &=&
(\vec{x}\,'')^\top \,D\,\vec{x}\,'' +\left(\vec{l}\,''\right)^\top \,\vec{x}\,''+c'\end{array}
\] where #\vec{l}\,''# is the vector with zeros in the first #r# coordinates and #l_i''=l_i'# for #i=r+1,\ldots,n#.
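The choice #\vec{a} = -\frac12 D'\vec{l}\,'# can be illustrated numerically. The matrix #D#, vector #\vec{l}\,'#, and constant below are hypothetical values chosen for illustration (with #n=3# and #r=2#), not taken from the running example:

```python
import numpy as np

# Hypothetical example with r = 2 nonzero eigenvalues out of n = 3
D = np.diag([2.0, -1.0, 0.0])        # lambda_3 = 0
l_prime = np.array([4.0, 6.0, 5.0])
c = 1.0

# D' inverts D on the complement of its kernel
D_pseudo = np.diag([1/2, -1.0, 0.0])
a = -0.5 * D_pseudo @ l_prime        # a = (-1, 3, 0)

# After the substitution x' = x'' + a, the linear part becomes
# l'' = 2 D a + l', which vanishes in the first r coordinates
l_dd = 2 * D @ a + l_prime
assert np.allclose(l_dd[:2], 0)      # first r components eliminated
assert np.isclose(l_dd[2], 5.0)      # l_i'' = l_i' for i > r

# New constant c' = a^T D a + l'^T a + c
c_new = a @ D @ a + l_prime @ a + c

# Check p = x''^T D x'' + l''^T x'' + c' numerically
rng = np.random.default_rng(2)
x_dd = rng.standard_normal(3)
x_p = x_dd + a
lhs = x_p @ D @ x_p + l_prime @ x_p + c
rhs = x_dd @ D @ x_dd + l_dd @ x_dd + c_new
assert np.isclose(lhs, rhs)
```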
If #\vec{l}\,''=\vec{0}#, then the function rule for #p# is as required, that is, as in the second case with #l_{r+1} = c'#. To see this, we choose #L = B\,T_{\vec{a}}# so #\vec{x}=L\, (\vec{x}\,'')# and \[p(L(\vec{x}\,'')) = p(\vec{x}) = (\vec{x}\,'')^\top \,D\,\vec{x}\,'' +l_{r+1}\] If we replace the argument #\vec{x}\,''# on both sides by #\vec{x}#, we find the required formula.
Assume, therefore, that #\vec{l}\,''# is not equal to the zero vector. In the rest of the proof we will only consider isometries of #\mathbb{R}^n# which fix the first #r# coordinates. So they will leave the following linear subspace invariant: \[W = \linspan{\vec{e}_1,\ldots,\vec{e}_r}^\perp=\linspan{ \vec{e}_{r+1},\ldots,\vec{e}_n}\] Because the first #r# coordinates of #\vec{l}\,''# are equal to zero, this vector lies in #W#.
Put #l_{r+1} = \norm{\vec{l}\,''}#. There is an orthogonal map #S:\mathbb{R}^n\to\mathbb{R}^n# which fixes each vector in #W^\perp=\linspan{\vec{e}_1,\ldots,\vec{e}_r}# and maps #l_{r+1}\vec{e}_{r+1}# onto #\vec{l}\,''#. An example is the orthogonal reflection #S = S_{l_{r+1}\vec{e}_{r+1}-\vec{l}\,''}#. Because #S# is orthogonal, according to property 5 of orthogonal maps it also leaves #W# invariant, so that we find for an arbitrary vector #\vec{y}# in #\mathbb{R}^n#:\[\begin{array}{rcll}S\,D\,\vec{y}&=&S\,D\left(\vec{w}+\vec{m}\right)&\color{blue}{\text{direct sum decomposition with }\vec{w}\in W\text{ and }\vec{m}\in W^\perp}\\&=&S\,D\,\vec{m}&\color{blue}{\vec{w}\in W=\ker{D}}\\&=&D\,\vec{m}&\color{blue}{D\,\vec{m}\in W^\perp\text{ and }S\text{ fixes }W^\perp}\\&=&D\left(S\,\vec{w}+\vec{m}\right)&\color{blue}{S\,\vec{w}\in W=\ker{D}}\\&=&D\,S\left(\vec{w}+\vec{m}\right)&\color{blue}{\vec{m}\in W^\perp\text{ and }S\text{ fixes }W^\perp}\\&=&D\,S\,\vec{y}&\color{blue}{\vec{y}=\vec{w}+\vec{m}}\\\end{array}\] Below, we use this result in the form #S^{-1}\,D=D\,S^{-1}#. Write \[\vec{x}\,''' = S^{-1}\vec{x}\,''\] Then we have #\vec{x}\,''' = \vec{x}\,''# for all #\vec{x}\,''# in #W^\perp#, so that \[\begin{array}{rcl}p(\vec{x}) &=& (\vec{x}\,'')^\top \,D\,\vec{x}\,'' +\left(\vec{l}\,''\right)^\top \,\vec{x}\,''+c'\\&&\color{blue}{\text{previously found formula}}\\&=&(\vec{x}\,'')^\top\left(S^{-1}\right)^\top\,S^{-1}\,D\,\vec{x}\,'' +\left(\vec{l}\,''\right)^\top\left(S^{-1}\right)^\top\,S^{-1} \,\vec{x}\,''+c'\\&&\color{blue}{\left(S^{-1}\right)^\top\,S^{-1}=I_n\text{ because }S\text{ is orthogonal}}\\&=&
(S^{-1}\vec{x}\,'')^\top\,\,D\,\left(S^{-1}\vec{x}\,''\right)+\left(S^{-1}\vec{l}\,''\right)^\top \,(S^{-1}\vec{x}\,'')+c'\\&&\color{blue}{\text{computational rule }\left(X\,Y\right)^\top=Y^\top\,X^\top\text{ and }S^{-1}\,D=D\,S^{-1}}\\&=&
(\vec{x}\,''')^\top \,D\,\vec{x}\,''' +\left(l_{r+1}\vec{e}_{r+1}\right)^\top \,\vec{x}\,'''+c'\\&&\color{blue}{\vec{x}\,''' = S^{-1}\vec{x}\,''\text{ and }S\left(l_{r+1}\vec{e}_{r+1}\right)=\vec{l}\,''}\end{array}\] Finally, we choose the vector #\vec{b} = c'\cdot l_{r+1}^{-1}\,\vec{e}_{r+1}# and we write \[\vec{x}\,'''' = T_{\vec{b}}\,\vec{x}\,''' \] so # \vec{x}\,''' = \vec{x}\,'''' - \vec{b} # and \[\begin{array}{rcl}p(\vec{x}) &=&(\vec{x}\,''')^\top \,D\,\vec{x}\,''' +\left(l_{r+1}\vec{e}_{r+1}\right)^\top \,\vec{x}\,'''+c'\\&&\color{blue}{\text{formula from above}}\\&=&(\vec{x}\,''''- \vec{b})^\top \,D\,\left(\vec{x}\,''''- \vec{b}\right)+\left(l_{r+1}\vec{e}_{r+1}\right)^\top \,(\vec{x}\,''''-\vec{b})+c'\\&&\color{blue}{\vec{x}\,''' = \vec{x}\,'''' - \vec{b}}\\&=& (\vec{x}\,'''')^\top \,D\,\vec{x}\,'''' +\left(l_{r+1}\vec{e}_{r+1}\right)^\top \,\vec{x}\,''''-\left(l_{r+1}\vec{e}_{r+1}\right)^\top \,\vec{b}+c'\\&&\color{blue}{\vec{b}^\top\,D=\vec{0}^\top\text{ and }D\,\vec{b}=\vec{0}\text{ because }\vec{b}\in W=\ker{D}}\\&=& (\vec{x}\,'''')^\top \,D\,\vec{x}\,'''' +\left(l_{r+1}\vec{e}_{r+1}\right)^\top \,\vec{x}\,''''-c'\left(l_{r+1}\vec{e}_{r+1}\right)^\top \, l_{r+1}^{-1}\vec{e}_{r+1}+c'\\&&\color{blue}{\vec{b} = c'\cdot l_{r+1}^{-1}\,\vec{e}_{r+1}\text{ substituted}}\\&=& (\vec{x}\,'''')^\top \,D\,\vec{x}\,'''' +l_{r+1}\cdot \vec{e}_{r+1}^\top \,\vec{x}\,''''\\&&\color{blue}{\text{last two terms cancel thanks to our choice for }\vec{b}}\\&=& (\vec{x}\,'''')^\top \,D\,\vec{x}\,''''+l_{r+1}\cdot {x''''}_{r+1}\end{array}
\] This shows that we arrive at the function rule of the first case.
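The properties of the reflection #S# used above (it fixes #W^\perp# pointwise, maps #l_{r+1}\vec{e}_{r+1}# onto #\vec{l}\,''#, and commutes with #D#) can be verified numerically. The data below are hypothetical (#n=4#, #r=2#), chosen for illustration; the reflection is built as the Householder matrix #I - 2\,\vec{v}\vec{v}^\top/(\vec{v}^\top\vec{v})#:

```python
import numpy as np

# Hypothetical data with n = 4, r = 2: D vanishes on W = span(e3, e4)
D = np.diag([2.0, -1.0, 0.0, 0.0])
l_dd = np.array([0.0, 0.0, 3.0, 4.0])   # l'' lies in W
l_r1 = np.linalg.norm(l_dd)             # l_{r+1} = 5
e3 = np.array([0.0, 0.0, 1.0, 0.0])     # e_{r+1}

# Orthogonal reflection S = S_v with v = l_{r+1} e_{r+1} - l''
# (Householder matrix: S = I - 2 v v^T / (v^T v))
v = l_r1 * e3 - l_dd
S = np.eye(4) - 2 * np.outer(v, v) / (v @ v)

# S fixes W^perp = span(e1, e2) pointwise ...
assert np.allclose(S @ [1, 0, 0, 0], [1, 0, 0, 0])
assert np.allclose(S @ [0, 1, 0, 0], [0, 1, 0, 0])
# ... maps l_{r+1} e_{r+1} onto l'' ...
assert np.allclose(S @ (l_r1 * e3), l_dd)
# ... and commutes with D, as derived above
assert np.allclose(S @ D, D @ S)
```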
We verify that, also in the first case, the result can be formulated as indicated in the theorem. Composing all transformations involved, we see that #K= B\,T_{\vec{a}}\,S\,T_{-\vec{b}}# is an isometry that satisfies #\vec{x} = K(\vec{x}\,'''')#: \[\vec{x} = B\, \vec{x}\,' = B\,T_{\vec{a}}\vec{x}\,''=B\,T_{\vec{a}}\,S\,\vec{x}\,''' = B\,T_{\vec{a}}\,S\,T_{-\vec{b}}\,\vec{x}\,''''=K(\vec{x}\,'''')\] Using this in the function rule for #p#, we find \[p(K(\vec{x}\,'''')) = (\vec{x}\,'''')^\top \,D\,\vec{x}\,'''' +l_{r+1}\cdot {x''''}_{r+1}\] Finally, we replace #\vec{x}\,''''# by #\vec{x}#, and arrive at \[\begin{array}{rcl}p(K(\vec{x})) &=& \vec{x}^\top \,D\,\vec{x} +l_{r+1}\cdot {x}_{r+1}\\ & =&\displaystyle\sum_{i=1}^r l_i\cdot x_i^2 +l_{r+1}\cdot x_{r+1}\end{array}\] where #l_i = \lambda_i# for #i\leq r#.
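The whole reduction in the first case (translation over #\vec{a}#, reflection #S#, translation over #\vec{b}#) can be traced numerically on a hypothetical example, working directly in the #\alpha#-coordinates (#n=4#, #r=2#; the values of #D#, #\vec{l}\,'#, and #c# are chosen for illustration):

```python
import numpy as np

# Hypothetical example in alpha-coordinates (n = 4, r = 2):
# q(x') = x'^T D x' + l'^T x' + c with a two-dimensional kernel
D = np.diag([2.0, -1.0, 0.0, 0.0])
l_prime = np.array([4.0, 6.0, 3.0, 4.0])
c = 1.0

def q(x):
    return x @ D @ x + l_prime @ x + c

# Step 1: translation over a = -1/2 D' l' kills the first r linear terms
D_pseudo = np.diag([1/2, -1.0, 0.0, 0.0])
a = -0.5 * D_pseudo @ l_prime
l_dd = 2 * D @ a + l_prime                    # l'' = (0, 0, 3, 4)
c_p = a @ D @ a + l_prime @ a + c             # c'

# Step 2: reflection S maps l_{r+1} e_{r+1} onto l''
l_r1 = np.linalg.norm(l_dd)                   # l_{r+1}
e3 = np.array([0.0, 0.0, 1.0, 0.0])
v = l_r1 * e3 - l_dd
S = np.eye(4) - 2 * np.outer(v, v) / (v @ v)

# Step 3: translation over b absorbs the constant c'
b = (c_p / l_r1) * e3

# Composition: x' = a + S(x'''' - b), and the standard form
# Sum_i lambda_i x_i^2 + l_{r+1} x_{r+1} results
rng = np.random.default_rng(3)
x4 = rng.standard_normal(4)
assert np.isclose(q(S @ (x4 - b) + a),
                  x4 @ D @ x4 + l_r1 * x4[2])
```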
The standard form is not unique. For example, the order of the diagonal entries of #D# (the eigenvalues of #A#) is not fixed (although we often order them by decreasing absolute value), and the coefficient of the linear term in #x_{r+1}# can be multiplied by #-1#.