FOUNDATIONS OF OPTIMIZATION
Basics
Optimization problems
o An optimization problem is
minimise f (x ) subject to x Î
f is the objective (real) is the constraint set/feasible set/search space.
o x * is an optimal solution (global minimizer) if and only if
f (x * ) £ f (x ) "x Î
o Maximizing f(x) is equivalent to minimizing –f(x).
o We consider problems in the following form
minimize f (x )
subject to hi (x ) = 0 " 1£i £m
gi (x ) £ 0 " m £i £r
n
xÎ
o We consider the following subsets of the problem
In linear programming, all functions are linear.
In convex programming, the f and g are convex, and the h are linear.
o If is the feasible set of a problem, a point x Î is a local minimum if there
exists a neighborhood N r (x ) such that f (x ) £ f (y ) "y Î Ç N r (x ) . It is an
unconstrained local minimum if f (x ) £ f (y ) "y Î N r (x ) . (Strict equivalents
exist).
Topology
o An open ball around a point x Î n with radius r > 0 is the set
{ }
N r (x ) = y Î n : x - y < r , where x = å x i2 .
o A point x Î Ì n is an interior point if there exists an open ball such that
N r (x ) Ì . A set Ì n is open if = int .
Daniel Guetta
,Foundations of Optimization Notes Page 2
o A point x Î Ì n is a closure point if, for every open ball N r (x ) , there exists
y Î with y Î N r (x ) . A set Ì n is closed if = cl .
o The set of reals is both closed and open.
o Theorems:
The union of open sets is open. The intersection of a finite number of
open sets is open.
The intersection of closed sets is closed. The union of a finite number of
closed sets is closed.
Analysis
o A sequence of vectors {xn } Ì n converges to a limit x Î n if lim x - xk = 0 ,
k ¥
and we say that xk x .
o A set Ì n is (sequentially) compact if, given a sequence {xk } Ì , there is a
subsequence {xk } converging to an element x Î .
i
Theorem (Heine-Borel): A set Ì n is compact if and only if it is
closed and bounded.
Theorem: A closed subset of a compact set is compact.
Theorem: Suppose {n } are a sequence of non-empty, compact sets that
are nested (ie: n +1 Ì n ) – then their intersection is non-empty.
o A real-valued function f defined on a domain Ì n is continuous at the point
x Î if, for every sequence {xk } Ì with xk x , lim f (xk ) = f (x ) . f is
k ¥
continuous if it is continuous at all points in .
o A function f is coercive over a set Ì n if, for every sequence {xk } Ì with
xk ¥ , we have lim f (xk ) = ¥ .
k ¥
o The inverse image of the set Ì is defined by f -1() = {x Î : f (x ) Î } .
closed
Theorem: If f is continuous and is open and is open closed, then
f -1( ) is also open closed
. This is the standard way to prove that a set is
open/closed.
Daniel Guetta
,Foundations of Optimization Notes Page 3
o Definition: A function is convex if f (lx1 + (1 - l)x 2 ) £ l f (x1 ) + (1 - l)f (x 2 ) . It
is strictly convex if the inequality is strict for x1 ¹ x 2 . We say f is convex over
= dom f if it is convex when restricted to . f is (strictly )concave if –f is
(strictly) convex.
o Definition: If f is convex with a convex domain , we define the extended-
value extension f : n È {¥} by
ìïf (x ) if x Î
f(x ) = ïí
ïï ¥ otherwise
î
and we let
{
dom f = x Î n : f(x ) < ¥ }
An extended-value function is convex if
Its domain is convex.
The standard convexity property holds.
o Given Ì n , the indicator function I : n È {¥} as
ìï 0 if x Î
I (x ) = ïí
ï¥
ïî otherwise
If is a convex set, then I is a convex function.
o Theorem: If f is convex over a convex set Ì n , then every sublevel set
{x Î : f (x ) £ g } is a convex subset of n . The converse is not true (eg: log x
on (0, ¥) ). However, we define…
o …Definition: A extended real valued function f : n È {¥} is quasiconvex
if, every one of its sublevel sets (ie: for every g Î ) is convex.
Calculus
o A function f : with Ì n is differentiable at x Î int if there exists a
vector f (x ) Î n , known as the gradient, such that
f (x + d ) - f (x ) - f (x ) ⋅ d
lim =0
d 0 d
And
Daniel Guetta
, Foundations of Optimization Notes Page 4
T
é ¶f (x ) ¶f (x ) ùú ¶f (x ) f (x + hei ) - f (x )
f (x ) = êê , , ú Î
n
= lim
ëê ¶ x 1
¶ x n ûú
¶x i h 0 h
f is differentiable over an open set Î if it is differentiable at every point in
the set. If, in addition, the components of the gradient are continuous over ,
then f is continuously differentiable over .
o If, for a point x Î int , each component of the gradient is differentiable, we say
f is twice differentiable at x, and we define the Hessian Matrix 2 f (x ) Î n´n by
é ¶2 f (x ) ù
2 f (x ) = êê ú
ú
¶ x ¶ x
êë i j úûij
If f is twice continuously differentiable in a neighborhood of x, then the Hessian
is symmetric.
o Suppose at f is twice continuously differentiable over a neighborhood N r (x ) , then
for all d Î N r (0 )
æ 2
÷ö
f (x + d ) = f (x ) + f (x )T d + 12 dT 2 f (x )d + o çç d ÷ø
è
(Formally, this means that for every C > 0, there exists a neighborhood around 0
such that the estimate of f (x + d ) differs from the real value by no more than
2
C d .
o Consider a vector-valued function F : m , Ì n and a point x Î int .
We define the gradient to be the matrix F (x ) Î n´m with
¶Fj (x )
F(x ) = éëêF1(x ), , Fm (x )ùûú F (x )ij =
¶xi
o The chain rule states that for interior points, if h(x ) = g(f (x )) , then
h(x ) = f (x )g(f (x ))
Linear algebra – Kernels and Images
o Consider a matrix A Î m´n . Then
{
ker A = x Î n : Ax = 0 }
im A = {y Î m
: y = Ax, x Î n }
o {
Given a set Î n , ^ = x Î n : x ⋅ y = 0 "y Î }
Daniel Guetta