
Nonlinear Regression:

Suppose that we are interested in fitting a curve of the form

$ y=a_{0}\left(1-e^{-a_{1}x}\right)\qquad(1)$

to a given data set of observations from an experiment. Unlike a transcendental or exponential curve, $ (1)$ cannot be linearized by taking logarithms on both sides. However, as in linear regression analysis, nonlinear regression is based on determining the values of the parameters that minimize the sum of the squares of the residuals. In the nonlinear case this is achieved in an iterative fashion.

The Gauss-Newton method is used for minimizing the sum of the squares of the residuals between data and nonlinear equations. Here the Taylor series expansion is used to express the original nonlinear equation in an approximate, linear form. Then the principle of least square is used to obtain new estimates of the parameters that move in the direction of minimizing the residual.

Now, to illustrate the methodology, let us represent $ (1)$ as

$ y_{i}=f(x_{i},a_{0},a_{1})+e_{i}\qquad(2)$

where $ e_{i}$ denotes the random measurement error. Let us suppose that we are given a data set $ (x_{1},y_{1}),(x_{2},y_{2}),\ldots,(x_{n},y_{n})$ of observations from an experiment. At each $ (x_{i},y_{i})$, while $ y_{i}$ is the measured value, $ f(x_{i},a_{0},a_{1})$ is the estimated value. For simplicity we denote $ f(x_{i},a_{0},a_{1})$ by $ f(x_{i})$. Now, following the Gauss-Newton method, let us linearize the nonlinear model at the $ (j+1)^{th}$ iteration level using a Taylor series expansion as follows:

$ f(x_{i})_{j+1}\simeq f(x_{i})_{j}+\frac{\partial f(x_{i})_{j}}{\partial a_{0}}\,\Delta a_{0}+\frac{\partial f(x_{i})_{j}}{\partial a_{1}}\,\Delta a_{1}\qquad(3)$

where the subscripts $ j$ and $ (j+1)$ denote iteration levels, $ \Delta a_{0}=(a_{0})_{j+1}-(a_{0})_{j}$ and $ \Delta a_{1}=(a_{1})_{j+1}-(a_{1})_{j}$.
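
For concreteness, for a model of the form $ (1)$ the linearization $ (3)$ reads

$ f(x_{i})_{j+1}\simeq (a_{0})_{j}\left(1-e^{-(a_{1})_{j}x_{i}}\right)+\left(1-e^{-(a_{1})_{j}x_{i}}\right)\Delta a_{0}+(a_{0})_{j}\,x_{i}\,e^{-(a_{1})_{j}x_{i}}\,\Delta a_{1}$

with the partial derivatives acting as known coefficients once the $ j^{th}$ level parameter values are fixed.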

Now we may note that $ (3)$ represents the linearized version of the model with respect to the iteration level, i.e. at the $ (j+1)^{th}$ iteration we already know the $ j^{th}$ level values of the parameters, i.e. $ (a_{0})_{j}$ and $ (a_{1})_{j}$. Writing $ (2)$ at the $ (j+1)^{th}$ iteration level,

$ y_{i}=f(x_{i})_{j+1}+e_{i}\qquad(4)$

and substituting $ (3)$ into $ (4)$, we have the residuals given by

$ y_{i}-f(x_{i})_{j}\simeq\frac{\partial f(x_{i})_{j}}{\partial a_{0}}\,\Delta a_{0}+\frac{\partial f(x_{i})_{j}}{\partial a_{1}}\,\Delta a_{1}+e_{i},\qquad i=1,2,\ldots,n\qquad(5)$

which in matrix form may be written as

$ \{D\}=[Z_{j}]\{\Delta A\}\qquad(6)$

where

\begin{displaymath}
[Z_{j}]=\left[%
\begin{array}{cc}
\frac{\partial f(x_{1})_{j}}{\partial a_{0}} & \frac{\partial f(x_{1})_{j}}{\partial a_{1}} \\
\vdots & \vdots \\
\frac{\partial f(x_{n})_{j}}{\partial a_{0}} & \frac{\partial f(x_{n})_{j}}{\partial a_{1}} \\
\end{array}%
\right]\qquad(7)
\end{displaymath}

i.e. $ [Z_{j}]$ is the matrix of partial derivatives of the function evaluated with the $ j^{th}$ level iteration values, and

\begin{displaymath}
\{D\}=\left\{%
\begin{array}{c}
y_{1}-f(x_{1})_{j} \\
\vdots \\
y_{n}-f(x_{n})_{j} \\
\end{array}%
\right\},\qquad
\{\Delta A\}=\left\{%
\begin{array}{c}
\Delta a_{0} \\
\Delta a_{1} \\
\end{array}%
\right\}
\end{displaymath}

i.e. $ \{D\}$ contains the differences between the measurements and the function values and $ \{\Delta A\}$ contains the changes in the parameter values.
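
To make the assembly of $ [Z_{j}]$ and $ \{D\}$ concrete, here is a minimal Python/NumPy sketch; the helper name build_Z_D, and the assumption that the model f(x, a) and its partial derivatives partials[k](x, a) are supplied by the user, are illustrative and not taken from the text.

```python
import numpy as np

def build_Z_D(f, partials, x, y, a):
    """Assemble [Z_j] and {D} of equations (6)-(7) at the current parameter vector a.

    f(x, a)     -- model value at x for parameters a (user supplied)
    partials[k] -- function returning the partial derivative df/da_k at (x, a)
    """
    # Each column of Z holds one partial derivative evaluated at every x_i.
    Z = np.column_stack([[p(xi, a) for xi in x] for p in partials])
    # D holds the differences between the measurements and the current model values.
    D = np.array([yi - f(xi, a) for xi, yi in zip(x, y)])
    return Z, D
```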

Applying the principle of least squares to $ (6)$ we arrive at

$ \left[[Z_{j}]^{T}[Z_{j}]\right]\{\Delta A\}=[Z_{j}]^{T}\{D\}\qquad(8)$

Now, on solving $ (8)$ we obtain $ \{\Delta A\}$, which can be used to compute improved values of the parameters $ a_{0},a_{1}$, i.e.

$ \left\{%
\begin{array}{c}
a_{0} \\
a_{1} \\
\end{array}%
\right\}_{j+1}=\left\{%
\begin{array}{c}
a_{0} \\
a_{1} \\
\end{array}%
\right\}_{j}+\left\{%
\begin{array}{c}
\Delta a_{0} \\
\Delta a_{1} \\
\end{array}%
\right\}\qquad(9)$
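
In code, one pass through $ (8)$ and $ (9)$ amounts to a small linear solve; a sketch under the same assumptions as above:

```python
import numpy as np

def gauss_newton_step(Z, D, a):
    """One pass through (8)-(9): solve [Z^T Z]{dA} = [Z^T]{D}, then update a <- a + dA."""
    delta = np.linalg.solve(Z.T @ Z, Z.T @ D)
    return np.asarray(a, dtype=float) + delta
```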

We repeat the above procedure until the solution converges, i.e. until

$ \vert\epsilon_{a_{k}}\vert=\left\vert\frac{(a_{k})_{j+1}-(a_{k})_{j}}{(a_{k})_{j+1}}\right\vert\times 100\%<\epsilon_{s},\qquad k=0,1\qquad(10)$

where $ \epsilon_{a_{k}}$ denotes the relative error in the parameter $ a_{k}$ and $ \epsilon_{s}$ is some prescribed tolerance level for the relative error in the parameters.
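
Putting the pieces together, a minimal Gauss-Newton driver with the stopping test $ (10)$ might look as follows; the function and argument names are illustrative, and tol plays the role of $ \epsilon_{s}$ (in percent):

```python
import numpy as np

def gauss_newton(f, partials, x, y, a0, tol=0.01, max_iter=50):
    """Iterate (6)-(9) until every relative error (10) drops below tol (in percent)."""
    a = np.array(a0, dtype=float)
    for _ in range(max_iter):
        # Jacobian [Z_j] and residual vector {D} at the current parameters.
        Z = np.column_stack([[p(xi, a) for xi in x] for p in partials])
        D = np.array([yi - f(xi, a) for xi, yi in zip(x, y)])
        # Normal equations (8) and parameter update (9).
        delta = np.linalg.solve(Z.T @ Z, Z.T @ D)
        a = a + delta
        # Stopping test (10): relative change of every parameter, in percent.
        rel_err = np.abs(delta / a) * 100.0
        if np.all(rel_err < tol):
            break
    return a
```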

Example: Fit the function $ f(x,a_{0},a_{1})=a_{0}(1-e^{-a_{1}x})$ to the following data:

$ x$ :  0.25   0.75   1.25   1.75   2.25
$ y$ :  0.28   0.57   0.68   0.74   0.79

Using the initial guesses of $ (a_{0},a_{1})=(1,1)$ , find the solution to the prescribed accuracy $ \epsilon_{s}$.

Solution: The partial derivatives of the function w.r.t. $ a_{0},a_{1}$ are:

$ \frac{\partial f}{\partial a_{0}}=1-e^{-a_{1}x},\qquad\frac{\partial f}{\partial a_{1}}=a_{0}\,x\,e^{-a_{1}x}$

Given that

$ (a_{0})_{0}=1,\quad(a_{1})_{0}=1,\quad n=5\ (\textrm{data size})$

Using the above partial derivatives and the given data we get

\begin{displaymath}
[Z_{0}]=\left[%
\begin{array}{cc}
0.2212 & 0.1947 \\
0.5276 & 0.3543 \\
0.7135 & 0.3581 \\
0.8262 & 0.3041 \\
0.8946 & 0.2371 \\
\end{array}%
\right]\end{displaymath}

\begin{displaymath}
\therefore[Z^{T}_{0}][Z_{0}]=\left[%
\begin{array}{cc}
2.3193 & 0.9489 \\
0.9489 & 0.4404 \\
\end{array}%
\right]\end{displaymath}

\begin{displaymath}
\therefore\left[[Z_{0}^{T}][Z_{0}]\right]^{-1}=\left[%
\begin{array}{cc}
3.6397 & -7.8421\\
-7.8421 & 19.1678 \\
\end{array}%
\right]\end{displaymath}

Using the measured values and the model values at $ (a_{0})_{0}=1,\ (a_{1})_{0}=1$ we get

\begin{displaymath}
\{D\}=\left\{%
\begin{array}{c}
0.0588 \\
0.0424 \\
-0.0335\\
-0.0862 \\
-0.1046 \\
\end{array}%
\right\}\end{displaymath}

\begin{displaymath}
\therefore [Z_{0}]^{T}\{D\}=\left\{%
\begin{array}{c}
-0.1533 \\
-0.0365 \\
\end{array}%
\right\}\end{displaymath}

By $ (8)$ we get

\begin{displaymath}\{\Delta A\}=\left\{%
\begin{array}{c}
-0.2714 \\
0.5019 \\
\end{array}%
\right\}\end{displaymath}

\begin{displaymath}
\therefore\left\{%
\begin{array}{c}
a_{0} \\
a_{1} \\
\end{array}%
\right\}_{1}=\left\{%
\begin{array}{c}
a_{0} \\
a_{1} \\
\end{array}%
\right\}_{0}+\left\{%
\begin{array}{c}
\Delta a_{0} \\
\Delta a_{1} \\
\end{array}%
\right\}_{0}=\left\{%
\begin{array}{c}
1 \\
1 \\
\end{array}%
\right\}+\left\{%
\begin{array}{c}
-0.2714 \\
0.5019 \\
\end{array}%
\right\}=\left\{%
\begin{array}{c}
0.7286 \\
1.5019\\
\end{array}%
\right\}\end{displaymath}

Now we start with $ \{a_{0},a_{1}\}_{1}$ and repeat the procedure to obtain $ \{a_{0},a_{1}\}_{2}$, and so on, till the convergence criterion $ (10)$ is satisfied for the prescribed tolerance $ \epsilon_{s}$.
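
As a quick numerical check, the first iteration above can be reproduced with a few lines of Python/NumPy; this is only a sketch using the example's data and model, and the printed values should agree with those shown up to rounding:

```python
import numpy as np

x = np.array([0.25, 0.75, 1.25, 1.75, 2.25])
y = np.array([0.28, 0.57, 0.68, 0.74, 0.79])
a0, a1 = 1.0, 1.0                                   # initial guesses

# Jacobian [Z_0] and residual vector {D} at (a0, a1) = (1, 1)
Z = np.column_stack([1 - np.exp(-a1 * x), a0 * x * np.exp(-a1 * x)])
D = y - a0 * (1 - np.exp(-a1 * x))

delta = np.linalg.solve(Z.T @ Z, Z.T @ D)           # normal equations (8)

print(Z.T @ Z)                      # approx [[2.319, 0.949], [0.949, 0.440]]
print(Z.T @ D)                      # approx [-0.153, -0.037]
print(delta)                        # approx [-0.272,  0.502]
print(np.array([a0, a1]) + delta)   # approx [ 0.728,  1.502]
```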

Remark:

(1) The partial derivatives $ \frac{\partial f_{i}}{\partial a_{k}}$ may be calculated using difference equations, i.e.

$ \frac{\partial f_{i}}{\partial a_{k}}\simeq\frac{f(x_{i};a_{0},\ldots,a_{k}+\delta a_{k},\ldots,a_{m})-f(x_{i};a_{0},\ldots,a_{k},\ldots,a_{m})}{\delta a_{k}}$

where $ \delta a_{k}$ is a small perturbation in $ a_{k}$ and $ a_{0},a_{1},\ldots,a_{m}$ are the $ m+1$ parameters of the model.
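
A minimal Python sketch of this difference approximation (the step-size heuristic is an assumption, not prescribed by the text):

```python
import numpy as np

def numerical_partial(f, x, a, k, rel_step=1e-6):
    """Forward-difference estimate of df/da_k at x, as in Remark (1)."""
    a = np.array(a, dtype=float)
    da = rel_step * max(abs(a[k]), 1.0)   # small perturbation delta a_k (heuristic size)
    a_pert = a.copy()
    a_pert[k] += da
    return (f(x, a_pert) - f(x, a)) / da
```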

$ (2)$ The Gauss-Newton method may converge slowly, may oscillate widely, or may not converge at all. Modifications of the method have been suggested to overcome these shortcomings; a detailed treatment of them is beyond the scope of the present discussion.


