
Constrained Optimization

Consider next the problem


$\displaystyle \begin{array}{c}
\min_{\alpha} E(\alpha, U(\alpha)) \\
L(U(\alpha),\alpha) = 0
\end{array}$     (5)

where $U = (U_1, \dots, U_n)$, $\alpha = (\alpha _1 , \dots , \alpha _q)$, $L = (L_1, \dots, L_n )$. We now derive the optimality conditions for this case. Consider changes in $\alpha$, and correspondingly in $U$, of the form
$\displaystyle \begin{array}{c}
\alpha \rightarrow \alpha + \epsilon \tilde \alpha , \\
U \rightarrow U(\alpha + \epsilon \tilde \alpha ) = U + \epsilon \tilde U + O( \epsilon ^2),
\end{array}$     (6)

where $\tilde \alpha$ and $\tilde U$ are related through the equation
$\displaystyle L_U \tilde U + L_\alpha \tilde \alpha = 0,$     (7)

which is a linearization of the constraint equation in (5). The variation in the functional can be written as
$\displaystyle \delta E \equiv E(\alpha + \epsilon \tilde \alpha, U(\alpha + \epsilon \tilde \alpha )) - E(\alpha , U(\alpha) ) = \epsilon ( \tilde\alpha ^T E_\alpha + \tilde U^T E _U) + O(\epsilon ^2).$     (8)

From this formulation we see that a descent direction for the functional depends on $\tilde U$, which is not known before we have chosen the direction of change (since $\tilde U$ depends on $\tilde \alpha$). Computing $\tilde U$ for every candidate direction is not a viable approach, so we derive a different one. The idea is to eliminate the dependence of the variation of the functional on $\tilde U$. We carry out the derivation in detail, since later we will need a similar derivation for partial differential equations (PDEs), where things may look less obvious.
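As a simple illustration to keep in mind (and to refer back to below), consider the linear-quadratic case
$\displaystyle E(\alpha , U) = \tfrac{1}{2} \Vert U - d \Vert ^2 , \qquad L(U, \alpha ) = A U - B \alpha ,$
with $A$ an invertible $n \times n$ matrix, $B$ an $n \times q$ matrix and $d$ a given vector. Then $L_U = A$, $L_\alpha = -B$, $E_U = U - d$ and $E_\alpha = 0$.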

From equation (7) for $\tilde U$ we have, by taking the transpose,

$\displaystyle \tilde U^T L_U^T + \tilde \alpha ^T L _\alpha ^T = 0$     (9)

and therefore also
$\displaystyle (\tilde U^T L_U^T + \tilde \alpha ^T L _\alpha ^T ) \lambda = 0$     (10)

for an arbitrary vector $\lambda = ( \lambda _1 , \dots, \lambda _n)$. The plan is to add this term, which is zero, to our expression for the variation of the cost functional, and then to choose $\lambda$ so that the variation no longer depends on $\tilde U$. Clearly,
$\displaystyle \delta E = \epsilon ( \tilde\alpha ^T E_\alpha + \tilde U^T E _U) + \epsilon ( \tilde U^T L_U^T + \tilde \alpha ^T L _\alpha ^T ) \lambda + O(\epsilon ^2)$     (11)

and by regrouping terms
$\displaystyle \delta E = \epsilon \tilde\alpha ^T (E_\alpha + L _\alpha ^T \lambda ) + \epsilon \tilde U^T ( E _U + L_U^T \lambda ) + O(\epsilon ^2),$     (12)

and this is true for all $\lambda$. Now a proper choice for $\lambda$ that simplifies (12) is given by
$\displaystyle \mbox{\tt Adjoint Equation: } L_U^T \lambda + E _U = 0,$     (13)

which leads to
$\displaystyle \delta E = \epsilon \tilde \alpha ^T ( L _\alpha ^T \lambda + E_\alpha ) + O(\epsilon ^2).$     (14)

Equation (13) for $\lambda$ is called the adjoint (or costate) equation, and $\lambda$ is called the adjoint (costate) variable, or the Lagrange multiplier. Note that the last expression for the variation of $E$, given by (14), does not depend on $\tilde U$, but it does depend on $\lambda$, which satisfies the adjoint equation above. It is clear that the choice
$\displaystyle \tilde \alpha = -( L _\alpha ^T \lambda + E _\alpha )$     (15)

is a direction of descent for the functional $E$, since
$\displaystyle \delta E = - \epsilon \Vert L _\alpha ^T \lambda + E _\alpha \Vert ^2 + O(\epsilon ^2).$     (16)

This direction is called the steepest descent direction, and the method based on it is called the steepest descent method.
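For the linear-quadratic illustration introduced earlier, the whole procedure (solve the state equation, solve the adjoint equation (13), form the gradient, take a steepest descent step with the direction (15)) can be sketched in a few lines of code; the matrices, sizes, iteration count and step size $\epsilon$ below are arbitrary illustrative choices.

    import numpy as np

    # Illustrative linear-quadratic example: L(U, alpha) = A U - B alpha = 0,
    # E = 0.5 ||U - d||^2, so L_U = A, L_alpha = -B, E_U = U - d, E_alpha = 0.
    rng = np.random.default_rng(0)
    n, q = 5, 3
    A = np.eye(n) + 0.1 * rng.standard_normal((n, n))
    B = rng.standard_normal((n, q))
    d = rng.standard_normal(n)

    alpha = np.zeros(q)
    eps = 0.02                                # step size epsilon
    for it in range(500):
        U = np.linalg.solve(A, B @ alpha)     # state equation: L(U, alpha) = 0
        lam = np.linalg.solve(A.T, -(U - d))  # adjoint equation (13): L_U^T lam = -E_U
        grad = -B.T @ lam                     # gradient (18): L_alpha^T lam + E_alpha
        alpha -= eps * grad                   # steepest descent step along (15)
    print(np.linalg.norm(grad))               # small near the minimum

Each iteration requires one state solve and one adjoint solve, independently of the number of design variables $q$; this is the main practical advantage of the adjoint approach.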

At a minimum $L _\alpha ^T \lambda + E _\alpha = 0 $, giving us the optimality conditions (necessary conditions)


$\displaystyle \mbox{\tt Optimality Conditions: }
\begin{array}{l}
L( U(\alpha ), \alpha ) = 0 \\  \vspace{2mm}
L_U^T \lambda + E _U = 0 \\  \vspace{2mm}
L _\alpha ^T \lambda + E _\alpha = 0.
\end{array}$     (17)

The left-hand side of the last equation is the gradient of the functional subject to the constraints,
$\displaystyle \nabla E = L _\alpha ^T \lambda + E _\alpha.$     (18)
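As a check on (18), consider again the linear-quadratic illustration: the adjoint equation (13) becomes $A^T \lambda = -(U - d)$, so $\nabla E = -B^T \lambda = B^T A^{-T} ( U - d )$. Eliminating the constraint directly gives $U(\alpha) = A^{-1} B \alpha$ and $E(\alpha) = \tfrac{1}{2} \Vert A^{-1} B \alpha - d \Vert ^2$, whose gradient by the chain rule is $(A^{-1}B)^T ( A^{-1} B \alpha - d ) = B^T A^{-T} ( U - d )$, in agreement with the adjoint-based expression.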


Shlomo Ta'asan 2001-08-22