运筹学 - 理论

only for continuous optimization
Convex optimization
Theory
Algorithms

基础概念

Linear Space

\[\begin{align*} &A \in \mathbb{R}^{m} &\text{ m rows, n columns, } \\ &x \in \mathbb{R}^{n}&\\ &c \in \mathbb{R}^{m}&\\ &S = \{ x \vert Ax = c \} &\text{ can be seen as "linear space" with dimension } \\ &dim( S ) \le n - m & \end{align*}\]

例子加强记忆：

\[A \in \mathbb{R}^{4 \times 2} \\ x \in \mathbb{R}^{2}\\ A x = \left(\begin{array}{ccc} 1 & 5 \\ 2 & 6 \\ 3 & 7 \\ 4 & 8 \\ \end{array} \right) \left(\begin{array}{ccc} x \\ y \\ \end{array} \right) = \left(\begin{array}{ccc} x + 5y \\ 2x + 6y \\ 3x + 7y \\ 4x + 8y \\ \end{array} \right)\]

Convexity

Convex Sets & Hull

我们一般会要求这个假设。

A set \(S \subset \mathbb{R}^{n}\) is a convex set if \(\forall x_1, x_2 \in S, \, \lambda \in (0,1), \lambda x_1 + (1- \lambda) \in S\)

A convex hull of S, \(conv(S)\) is the set

\[\begin{align*} conv(S) &= \bigg\{ \sum_{j = 1}^{k} \lambda_j x_j : \sum_{j =1}^{k} \lambda_j = 1, \lambda_j \ge 0. x_j \in S, k = 1,2,... \bigg\} \\ \end{align*}\]

为什么这里需要一个 k, 我理解的是, 这个set S 有可能是个无穷集, k 只是一个arbitrarily large 的数。

如果 S 是有限集的话, 我觉得这个定义完全可以这么写：

\[conv(S) = \bigg\{ \sum_{j = 1}^{|S|} \lambda_j x_j : \sum_{j =1}^{|S|} \lambda_j = 1, \lambda_j \ge 0. x_j \in S \bigg\}\]

两个比较之下其实很好理解, Convex Set 就只是任意两个点之间的所有点, 在2维的图像里, 如果有三个点, 假设这三个顶点都在这个set, 那么这个set 就会覆一个三角形的三条边; 而 Convex Hull 就是里面任意几个点之间的所有点, 就会形成这个三角形的整个面积.

Lemma Convex Hull of S is the smallest convex set containing S.

第二个定义： \(conv(S)\) 是所以包含S 的convex set 的交集。 (不同的定义可以只是我们理解问题的不同角度。不要只关注技术性的等价关系)

点我展开证明

Proof: 其实只要3点：

\[conv(S) \supseteq S\]
\[conv(S) \text{ is a convex set}\]
\[conv(S) \subseteq H \, , \; \forall H \in C , \, \, C = \{ \text{ all convex set contains } S \}\]

这样这个 convex hull 就是最小的一个 convex set 了, 但是还不能证明唯一。

is trivial
简单就是, 根据 convex set 的定义, 任意两个点：

\[\forall x_1, x_2 \in conv(S) \to \\ x_1 = \sum_{ j = 1}^{k_1} \lambda_j x_j ,\, \text{where } \sum_{j = 1}^{k_1} \lambda_j = 1\\ x_2 = \sum_{j = 1}^{ k_2} \theta_j x_j ,\, \text{where } \sum_{j = 1}^{k_2} \theta_j = 1\\\]

要注意的是, 任意两个 \(x_1, x_2 \in conv(x)\) 我们都可以构造两个”长度”一样的sum, 只需令短的补一些 0 到长的就可以了。\(k_0 = max(k_1, k_2)\).

于是：

\[\lambda x_1 + (1 - \lambda) x_2 = \sum_{j = 1}^{k_0} (\lambda \lambda_j + (1 - \lambda) \theta_j) x_j \\ \sum_{j = 1}^{k_0} (\lambda \lambda_j + (1 - \lambda) \theta_j) = 1 \\\]

然后显然得我们就能证明第二点了。

(还有一个问题是, 比如椭圆, 这显然是一个not countable set 我们可以用 k 这个countable sum来证明吗？)

第三点也是毕竟显然的。

\(conv(S)\) is closed ?

if S is open then Conv(S) is open (未考证

if S is closed then Conv(S) is not necessary closed.

Caratheodory’s Theorem

\[\text{For } S \subset \mathbb{R}^{n} \text{ , all } x \in conv(S) \text{ can be written as a convex combination of at most } n + 1 \text{ points in } S.\]

用例子最简单理解了, 2维空间, 比如有3个点, 一个三角形, 那三角形内部的点我们需要这3个点的 convex combination 才能表示。只有两个点的话我们只能表示一个线段内部的点。即使是一个正方形内部, 我们也可以用3个点去代表她。

点我展开证明 proof

直觉告诉我们, 肯定和linear dependant 有关, 但是要证明我个人觉得不是那么容易的;

首先整体的思路是：

1) 假设我们一个 x 可被 k 个 convex combination 表示, 并且 k > n + 1
2) 然后我们只要给出一个能表示 x 的且只需要 k - 1 个convex combination, 证明就完成了。

这里的难点是, 我们要清楚, 在\(\{ x_1, x_2, ..., x_k \}\) 里, 一定存在不全为零的\(\{ \mu_1, \mu_2, ..., \mu_k \}\) 使得：

\[\sum_{j = 1}^{k} \mu_j x_j = 0 \\ \sum_{j = 1}^{k} \mu_j = 0\]

这个一般是通过构造一个新的集合：\(\{x_2 - x_1, x_3 - x_1, ..., x_k - x_1 \}\) 这个集合的长度为 \(k - 1 > n\) 所以在 \(\mathbb{R}^{n}\) 还是linear dependant, 我们写作

\[\sum_{j =2}^{k} ( \mu_k x_k - x_1) = 0 \, , \{ \mu_j \} \text{ not all zero} \\ \mu_1 = - \sum_{j = 2}^{k} \mu_j \\\]

这样我们有了这个结论之后, 就比较简单了：

首先我们这个 x

\[\begin{align*} x &= \sum_{j = 1}^{k} \lambda_j x_j - \boldsymbol{0}\\ &= \sum_{j = 1}^{k} \lambda_j x_j - \alpha \sum_{j = 1}^{k} \mu_j x_j \\ &= \sum_{j = 1}^{k} ( \lambda - \alpha \mu_j) x_j \end{align*}\]

然后我们要知道 \(\sum_{j = 1}^{k} ( \lambda_j - \alpha \mu_j ) = \sum_{j = 1}^{k} \lambda_j - \alpha \boxed{\sum_{j = 1}^{k} \mu_j} = 1\) 因为方块里的项恒为0.

然后我们只要构造 \(\alpha\) 达到两个条件, 那我们的证明就结束了

(1) \(( \lambda_j - \alpha \mu_j) \ge 0\) 总成立。
(2) \(( \lambda_j - \alpha \mu_j) = 0 \text{ for at least one j }\)

答案是

\[\alpha = \min_{1 \le j \le k} \bigg\{ {\frac{ \lambda_j}{ \mu_j} : \mu_j > 0} \bigg\} = {\frac{ \lambda_i }{ \mu_i}} \text{ (assume i is the index)}\]

因为
(1)
(a) 如果 \(\mu_j <= 0\) 那就是显然的.
(b) 如果 \(\mu > 0\) 我们可以假设 \(\lambda_j - \mu_j \cdot {\frac{ \lambda_i}{ \mu_i }} > 0 =\) 简单的化简一下等价于 \({\frac{ \lambda_j}{ \mu_j}} > {\frac{ \lambda_i}{ \mu_i}}\) 所以也是成立的。
(2) 这个也是显然的, 因为必然会有 \(i = j\) 的项, 自然的就是0了。

(发散一下, 构造的这个 \(\alpha\) 是唯一的吗？我个人认为是的)

Affine hull

The affine hull of \(C \subset \mathbb{R}^{n}\) is defined by: (Affine Combination)

\[\begin{align*} aff C &:= \bigg\{ \sum_{k = 1}^{m} \lambda_k x^k : x^k \in C, \, k = 1,...,m, \sum_{k = 1}^{m} \lambda_k = 1, \, m \ge 1 \bigg\} \\ \end{align*}\]

所以相比convex hull, 就不再要求 \(\lambda_k \ge 0\)

同样的, 我们也说 affine hull 是最小的 affine space (课纲没有) containing C.

Interior

Interior point 内部点

\(S \subset \mathbb{R}^{n}\) If exists an open ball centered at \(x\) which contained by \(S_{}^{}\) .

Interior set 内部(集)

Int S is the largest open subset of X contained in S}
Int S is the union of all open sets of X contained in S
Int S is the set of all interior points of S.

Relative Interior 相对内部点 of C is :

\[\begin{align*} ri \ C &:= \{ x \in C: \exists \epsilon > 0, (x + \epsilon B^1) \cap aff C \subset C \} \\ B^1 &:= \{ x \in \mathbb{R}^{n} : \| x \| \le 1 \} \end{align*}\]

比如, 在 \(\mathbb{R}^{2}\) 一个线段是没有 内部点 的, 但我们很方便的可以有 相对内部点。注意, 线段的两端的点不是相对内部点。

Closure

boundary point 就是一个 \(\epsilon\) ball, 是不是同时包括set外和set内的点。

close: (1) 所有的 boundary point is inside the set (2) 如果任意一个 sequence of points 趋于一个点的话, 然后这个点也是在这个set里,那么这个set 也可以是closed 。

教材 Nonlinear programming Theory and Algorithms 第45 页

Separations of Sets

Theorem:

存在一个 \(S \subset \mathbb{R}^{n}\) nonempty, closed, and convex.

只要 \(y \notin S\) 那么必存在 \(\boldsymbol{p} \neq \boldsymbol{0} \in \mathbb{R}^{n} \text{ and } \alpha \in \mathbb{R}\) such that

\[\begin{align*} \boldsymbol{p}^T y &> \alpha \\ \boldsymbol{p}^T y &\le \alpha \text{ for all } x \in S\\ \end{align*}\]

我们说 hyper plain \(H = \{ \boldsymbol{x} : \boldsymbol{p}^T \boldsymbol{x} = \alpha \}\) separate \(y_{}^{}\) and \(S_{}^{}\) .

要理解这个, 要对 vector space的运算非常熟悉, 要理解dot product 背后的真正含义, hyper plain 的含义, 这个结论其实是很容易得的。

思路是这样的：

证：唯一且存在一个点 \(x \in S\) 这个点是离点\(y \notin S\) 最近的点。\(\bar{x}\)
然后构造一个vector \(\boldsymbol{p} = y - \bar{x}\) 然后就很显然了。

第一点：点我展开证明proof

首先我们要先熟悉 极值定理 Extreme Value Theorem (Bolzano–Weierstrass theorem)

简单来说就是：如果 \(S\) is compact, 然后一个 continuous function \(f: S \to \mathbb{R}\) 一定存在最大值和最小值。

所以首先我们最开始的假设里没有 bounded, 只有closed, 所以构造一个 bounded的 set 在 \(\mathbb{R}^{n}\) (Euclidean space) 就已经是 compact 了。

(暂时不细究, 不是本门课重点)

然后定义一下 \(f(\boldsymbol{x}) = \| \boldsymbol{x} - \boldsymbol{y} \|\) 然后我们就能说最小值存在。

然后就是证明是唯一的。 (这就是要利用 convex 了)

其实也不难, 我们就假设不是唯一, 至少有两个点是 \(x_1 , x_2\, : \| x_1 - y \| = \| x_2 - y \| = m\) 是最小值, 那么 \(\| x_1 - x_2 \| > 0\) 然后我们可以定义一个在 \(x_1, x_2\) 之间的点 \(x_3\), 根据定义我们可以知道这个点也在convex set, 然后我们用两次 triangle inequality 证明 \(\| x_3 - y \| < m\)

\[\| x_1 - x_3 \| + m \ge \| x_3 -y \| \\ \| x_2 - x_3 \| + m \ge \| x_3 -y \| \\ \| x_1 - x_3 \| + \| x_2 - x_3 \| + 2 m \ge 2\| x_3 - y \|\]

然后就很显然了。

Farka’s Lemma

Let \(A\in \mathbb{R}^{m \times n}, \, c \in \mathbb{R}^{n}\), then exactly one of the following systems has a solution:

System 1: \(A x \le 0 ,\, c^T x > \mathbb{0} \, \text{ for some } x \in \mathbb{R}^{n}\)
System 2: \(A^T y = c ,\, y \ge \mathbb{0} \, \text{ for some } y \in \mathbb{R}^{m}\)

To prove this (1) 如果第二个有解, 第一个显然无解, 带入c 就看出来了

(2) 如果第二个无解, 我们得用我们前面的 separation theorem 了。构造 \(p^t y = \alpha\) 然后我们发现 \(A p \le 0\) 成立, \(p\) 恰好就是第一个的一个solution

(2a) 要先说明这个 set 是 closed & bounded: closed 我们要用定义, 任意sequence点, 然后用 A 可以看成是一个Linear function 所以任意sequence 我们都可以map 到点 \(y \ge 0\) 所以就显然是closed 了。

Important definition in Linear Algebra

\[A^T y = c , \,\, \text{ for some } y \ge 0\\ \Updownarrow\\ \text{ see every row of } A^T \text{ is a separate vector } : c \text{ is a convex combination of } \{A_1, A_2, A_3, ..\} \\ \text{ with positive coefficient }\]

离散运筹

Unimodular

Matrix B is unimodular if \(\vert Det(B) \vert = 1\) and all element is integer
Integer Matrix B is totally unimodular if every square, non-singular sub matrix is unimodular. \(\Leftrightarrow\) TU matrix is a \(\{ 0, \pm 1 \}\) matrix

Matrix Determinant 的本质

我一只很好奇, 这个 determinant为什么定义的这么奇怪, 一个非常奇怪的运算, 对角线的乘积然后在加上一些正负号, 现在终于找到一个非常让人满意的总结：

Determinant 本质

这个书确实很不错, 其他的内容也很值得看, 线性代数讲的很好。