
Constrained Optimization

Consider next the problem


$\displaystyle \begin{array}{c}
\min_{\alpha} E(\alpha, U(\alpha)) \\
L(U(\alpha),\alpha) = 0
\end{array}$     (5)

where $U = (U_1, \dots, U_n)$, $\alpha = (\alpha _1 , \dots , \alpha _q)$, $L = (L_1, \dots, L_n )$. We now derive the optimality conditions for this case. Consider changes in $\alpha$, and correspondingly in $U$, of the form
$\displaystyle \begin{array}{c}
\alpha \rightarrow \alpha + \epsilon \tilde \alpha , \\
U \rightarrow U(\alpha + \epsilon \tilde \alpha ) = U + \epsilon \tilde U + O( \epsilon ^2),
\end{array}$     (6)

where $\tilde \alpha$ and $\tilde U$ are related through the equation
$\displaystyle L_U \tilde U + L_\alpha \tilde \alpha = 0,$     (7)

which is a linearization of the constraint equation in (5). The variation in the functional can be written as
$\displaystyle \delta E \equiv E(\alpha + \epsilon \tilde \alpha, U(\alpha + \epsilon \tilde \alpha )) - E(\alpha , U(\alpha) ) = \epsilon ( \tilde\alpha ^T E_\alpha + \tilde U^T E _U) + O(\epsilon ^2).$     (8)

From this formulation we see that a descent direction for the functional depends on $\tilde U$, which is not known before we have chosen the direction of change (since $\tilde U$ depends on $\tilde \alpha$). Computing $\tilde U$ for every candidate direction is not a viable approach, so we derive a different one. The idea is to eliminate the dependence of the variation of the functional on $\tilde U$. We carry out the derivation in detail, since later we will need a similar derivation for partial differential equations (PDEs), where things may look less obvious.
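As a simple illustration to keep in mind (and to refer back to below), consider the linear-quadratic case
$\displaystyle E(\alpha , U) = \tfrac{1}{2} \Vert U - d \Vert ^2 , \qquad L(U, \alpha ) = A U - B \alpha ,$
with $A$ an invertible $n \times n$ matrix, $B$ an $n \times q$ matrix and $d$ a given vector. Then $L_U = A$, $L_\alpha = -B$, $E_U = U - d$ and $E_\alpha = 0$.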

From equation (7) for $\tilde U$ we have, by taking the transpose,

$\displaystyle \tilde U^T L_U^T + \tilde \alpha ^T L _\alpha ^T = 0$     (9)

and therefore also
$\displaystyle (\tilde U^T L_U^T + \tilde \alpha ^T L _\alpha ^T ) \lambda = 0$     (10)

for an arbitrary vector $\lambda = ( \lambda _1 , \dots, \lambda _n)$. The plan is to add this term, which is zero, to our expression for the variation of the cost functional, and then to choose $\lambda$ so that the variation no longer depends on $\tilde U$. Clearly,
$\displaystyle \delta E = \epsilon ( \tilde\alpha ^T E_\alpha + \tilde U^T E _U) + \epsilon ( \tilde U^T L_U^T + \tilde \alpha ^T L _\alpha ^T ) \lambda + O(\epsilon ^2)$     (11)

and by regrouping terms
$\displaystyle \delta E = \epsilon \tilde\alpha ^T (E_\alpha + L _\alpha ^T \lambda ) + \epsilon \tilde U^T ( E _U + L_U^T \lambda ) + O(\epsilon ^2),$     (12)

and this is true for all $\lambda$. Now a proper choice for $\lambda$ that simplifies (12) is given by
$\displaystyle \mbox{\tt Adjoint Equation: } L_U^T \lambda + E _U = 0,$     (13)

which leads to
$\displaystyle \delta E = \epsilon \tilde \alpha ^T ( L _\alpha ^T \lambda + E_\alpha ) + O(\epsilon ^2).$     (14)

Equation (13) for $\lambda$ is called the adjoint (or costate) equation, and $\lambda$ is called the adjoint (costate) variable, or the Lagrange multiplier. Note that the last expression for the variation of $E$, given by (14), does not depend on $\tilde U$, but it does depend on $\lambda$, which satisfies the adjoint equation above. It is clear that the choice
$\displaystyle \tilde \alpha = -( L _\alpha ^T \lambda + E _\alpha )$     (15)

is a direction of descent for the functional $E$, since
$\displaystyle \delta E = - \epsilon \Vert L _\alpha ^T \lambda + E _\alpha \Vert ^2 + O(\epsilon ^2).$     (16)

This direction is called the steepest descent direction, and the method based on it is called the steepest descent method.
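For the linear-quadratic illustration introduced earlier, the whole procedure (solve the state equation, solve the adjoint equation (13), form the gradient, take a steepest descent step with the direction (15)) can be sketched in a few lines of code; the matrices, sizes, iteration count and step size $\epsilon$ below are arbitrary illustrative choices.

    import numpy as np

    # Illustrative linear-quadratic example: L(U, alpha) = A U - B alpha = 0,
    # E = 0.5 ||U - d||^2, so L_U = A, L_alpha = -B, E_U = U - d, E_alpha = 0.
    rng = np.random.default_rng(0)
    n, q = 5, 3
    A = np.eye(n) + 0.1 * rng.standard_normal((n, n))
    B = rng.standard_normal((n, q))
    d = rng.standard_normal(n)

    alpha = np.zeros(q)
    eps = 0.02                                # step size epsilon
    for it in range(500):
        U = np.linalg.solve(A, B @ alpha)     # state equation: L(U, alpha) = 0
        lam = np.linalg.solve(A.T, -(U - d))  # adjoint equation (13): L_U^T lam = -E_U
        grad = -B.T @ lam                     # gradient (18): L_alpha^T lam + E_alpha
        alpha -= eps * grad                   # steepest descent step along (15)
    print(np.linalg.norm(grad))               # small near the minimum

Each iteration requires one state solve and one adjoint solve, independently of the number of design variables $q$; this is the main practical advantage of the adjoint approach.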

At a minimum $L _\alpha ^T \lambda + E _\alpha = 0 $, giving us the optimality conditions (necessary conditions)


$\displaystyle \mbox{\tt Optimality Conditions: }
\begin{array}{l}
L( U(\alpha ), \alpha ) = 0 \\  \vspace{2mm}
L_U^T \lambda + E _U = 0 \\  \vspace{2mm}
L _\alpha ^T \lambda + E _\alpha = 0.
\end{array}$     (17)

The left-hand side of the last equation is the gradient of the functional subject to the constraints,
$\displaystyle \nabla E = L _\alpha ^T \lambda + E _\alpha.$     (18)
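As a check on (18), consider again the linear-quadratic illustration: the adjoint equation (13) becomes $A^T \lambda = -(U - d)$, so $\nabla E = -B^T \lambda = B^T A^{-T} ( U - d )$. Eliminating the constraint directly gives $U(\alpha) = A^{-1} B \alpha$ and $E(\alpha) = \tfrac{1}{2} \Vert A^{-1} B \alpha - d \Vert ^2$, whose gradient by the chain rule is $(A^{-1}B)^T ( A^{-1} B \alpha - d ) = B^T A^{-T} ( U - d )$, in agreement with the adjoint-based expression.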


Shlomo Ta'asan 2001-08-22