snippetMinor
How can Lagrange Multipliers and Penalty Method be applied to optimize algorithms?
Viewed 0 times
canpenaltymethodmultipliersappliedalgorithmsoptimizehowandlagrange
Problem
I have a programming assignment which I was told that is solvable with some DP algorithm. The question is a variant of LIS problem where at most $k$ exceptions (restarts) are allowed.
But I know that there is a better solution. My professor mentioned Lagrange Multipliers and giving a penalty for each restart. But after googling these terms I wasn't able to find out anything related to algorithms. Every article is about Calculus and function optimization.
I also read about them on Wikipedia but I can't figure out how to use them.
Is there a keyword that can describe better what I want to read?
But I know that there is a better solution. My professor mentioned Lagrange Multipliers and giving a penalty for each restart. But after googling these terms I wasn't able to find out anything related to algorithms. Every article is about Calculus and function optimization.
I also read about them on Wikipedia but I can't figure out how to use them.
Is there a keyword that can describe better what I want to read?
Solution
This technique is called Lagrangian relaxation.
The regular $DP$ approach, where $DP[a][b]$ represents the length of the longest increasing subsequence that ends in the $a$'th number and restarts at most $b$ times, is $\mathcal{O}(nk \log n)$. For convenience we'll assume the last number is the largest, and therefore $DP[n][k]$ is the value we are looking for (if this isn't the case, append $\infty$ and after calculating the answer decrement it by $1$).
To optimize this, we'll select some $\lambda \in \mathbb{N}$ which represents the cost of every exception, and compute $DP'[a] = \max_{b} DP[a][b] - \lambda b$. This can be done in $\mathcal{O}(n \log n)$: first sort the values, and keep a range maximum data structure over them, with all positions $j$ initialised to $v_{j} = 0$. Assume the value at position $i$ is the $p_{i}$th in the sorted list of values. Then $DP'[i] = \max(1 + \max_{j p_{i}} v_{j})$, and we set $v_{p_{i}} = DP'[i]$.
What use is computing $\max_{b} DP[n][b] - \lambda b$ to us? Notice that as we increase $\lambda$, the optimal $b$ in the maximum cannot increase. Similarly as we decrease $\lambda$, the optimal $b$ cannot decrease. If $\lambda = 0$, it is optimal to take all elements to our subsequence, and if $\lambda = n$, it is optimal to have $0$ exceptions. If we can find $\lambda$ for which the optimal $b$ can be $k$, then $DP[n][b] = DP'[n] + \lambda b$. Further, if such $\lambda$ exists, we can do binary search for it, yielding a $\mathcal{O}(n \log^{2} n)$ algorithm.
Note that we can modify the $\mathcal{O}(n \log n)$ algorithm to calculate the minimum and maximum values of $b$ that achieve the maximum value with the specific $\lambda$ with no increase in complexity. We can always find a $\lambda$ for which $\min_{b} \leq k \leq \max_{b}$, but this doesn't guarantee that there exists some subsequence with $k$ exceptions achieving the maximum. However, if we can show that $DP[n][b]$ is concave, i.e. $DP[n][b+2] - DP[n][b+1] \leq DP[n][b+1] - DP[n][b]$, we get this result, as we know that $DP[n][\min_{b} + 1] \leq DP[n][\min_{b}] + \lambda$ (due to maximality), therefore $DP[n][\max_{b}] \leq DP[n][k] + (\max_{b} - k) \lambda$, hence $DP'[n] = DP[n][\max_{b}] - \lambda \max_{b} \leq DP[n][k] - \lambda k$ and there exists a subsequence with $k$ exceptions achieving the maximum.
EDIT: Here is a C++ program for finding a maximal subsequence in $\mathcal{O}(n \log^{2} n)$. I use a segment tree for the range maximum data structure.
EDIT2: Thanks to Jaehyun Koo over at Codeforces I now know how to show concavity. The following is a modified version of his proof.
Consider the array partitioning problem. In it we are given values $cost[A][B]$ representing the cost of interval $[a, b)$, and wish to partition the array into intervals $[0, x_{1}), [x_{1}, x_{2}), \dots, x_{k}, n)$. Let $DP[n][k]$ denote the maximum sum $\sum_{i = 0}^{k} cost[x_{i}][x_{i+1}]$, where $x_{0} = 0$, $x_{k+1} = n$. We claim that $DP[n][k]$ is concave if the cost
The regular $DP$ approach, where $DP[a][b]$ represents the length of the longest increasing subsequence that ends in the $a$'th number and restarts at most $b$ times, is $\mathcal{O}(nk \log n)$. For convenience we'll assume the last number is the largest, and therefore $DP[n][k]$ is the value we are looking for (if this isn't the case, append $\infty$ and after calculating the answer decrement it by $1$).
To optimize this, we'll select some $\lambda \in \mathbb{N}$ which represents the cost of every exception, and compute $DP'[a] = \max_{b} DP[a][b] - \lambda b$. This can be done in $\mathcal{O}(n \log n)$: first sort the values, and keep a range maximum data structure over them, with all positions $j$ initialised to $v_{j} = 0$. Assume the value at position $i$ is the $p_{i}$th in the sorted list of values. Then $DP'[i] = \max(1 + \max_{j p_{i}} v_{j})$, and we set $v_{p_{i}} = DP'[i]$.
What use is computing $\max_{b} DP[n][b] - \lambda b$ to us? Notice that as we increase $\lambda$, the optimal $b$ in the maximum cannot increase. Similarly as we decrease $\lambda$, the optimal $b$ cannot decrease. If $\lambda = 0$, it is optimal to take all elements to our subsequence, and if $\lambda = n$, it is optimal to have $0$ exceptions. If we can find $\lambda$ for which the optimal $b$ can be $k$, then $DP[n][b] = DP'[n] + \lambda b$. Further, if such $\lambda$ exists, we can do binary search for it, yielding a $\mathcal{O}(n \log^{2} n)$ algorithm.
Note that we can modify the $\mathcal{O}(n \log n)$ algorithm to calculate the minimum and maximum values of $b$ that achieve the maximum value with the specific $\lambda$ with no increase in complexity. We can always find a $\lambda$ for which $\min_{b} \leq k \leq \max_{b}$, but this doesn't guarantee that there exists some subsequence with $k$ exceptions achieving the maximum. However, if we can show that $DP[n][b]$ is concave, i.e. $DP[n][b+2] - DP[n][b+1] \leq DP[n][b+1] - DP[n][b]$, we get this result, as we know that $DP[n][\min_{b} + 1] \leq DP[n][\min_{b}] + \lambda$ (due to maximality), therefore $DP[n][\max_{b}] \leq DP[n][k] + (\max_{b} - k) \lambda$, hence $DP'[n] = DP[n][\max_{b}] - \lambda \max_{b} \leq DP[n][k] - \lambda k$ and there exists a subsequence with $k$ exceptions achieving the maximum.
EDIT: Here is a C++ program for finding a maximal subsequence in $\mathcal{O}(n \log^{2} n)$. I use a segment tree for the range maximum data structure.
#include
#include
#include
using namespace std;
using ll = long long;
const int INF = 2 * (int)1e9;
pair> combine(pair> le, pair> ri) {
if (le.first >> seg;
int h = 1;
pair> recGet(int a, int b, int i, int le, int ri) const {
if (ri > off) {
seg[i+h] = combine(seg[i+h], off);
for (i += h; i > 1; i /= 2) seg[i/2] = combine(seg[i], seg[i^1]);
}
pair> get(int a, int b) const {
return recGet(a, b+1, 1, 0, h);
}
};
// Binary searches index of v from sorted vector
int bins(const vector& vec, int v) {
int low = 0;
int high = (int)vec.size() - 1;
while(low != high) {
int mid = (low + high) / 2;
if (vec[mid] lisExc(int k, vector vec) {
// Compress values
vector ord = vec;
sort(ord.begin(), ord.end());
ord.erase(unique(ord.begin(), ord.end()), ord.end());
for (auto& v : vec) v = bins(ord, v) + 1;
// Binary search lambda
int n = vec.size();
int m = ord.size() + 1;
int lambda_0 = 0;
int lambda_1 = n;
while(true) {
int lambda = (lambda_0 + lambda_1) / 2;
SegTree seg(m);
if (lambda > 0) seg.set(0, {0, {0, 0}});
else seg.set(0, {0, {0, INF}});
// Calculate DP
vector>> dp(n);
for (int i = 0; i = this
off1.first += 1 - lambda;
off1.second.first += 1;
off1.second.second += 1;
dp[i] = combine(off0, off1);
seg.set(vec[i], dp[i]);
}
// Is min_b k) {
lambda_0 = lambda + 1;
} else {
// Construct solution
ll r = off.first + 1;
int v = m;
int b = k;
vector res;
for (int i = n-1; i >= 0; --i) {
if (vec[i] > n >> k;
vector vec(n);
for (int i = 0; i > vec[i];
vector ans = lisExc(k, vec);
for (auto i : ans) cout << i+1 << ' ';
cout << '\n';
}EDIT2: Thanks to Jaehyun Koo over at Codeforces I now know how to show concavity. The following is a modified version of his proof.
Consider the array partitioning problem. In it we are given values $cost[A][B]$ representing the cost of interval $[a, b)$, and wish to partition the array into intervals $[0, x_{1}), [x_{1}, x_{2}), \dots, x_{k}, n)$. Let $DP[n][k]$ denote the maximum sum $\sum_{i = 0}^{k} cost[x_{i}][x_{i+1}]$, where $x_{0} = 0$, $x_{k+1} = n$. We claim that $DP[n][k]$ is concave if the cost
Code Snippets
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
using ll = long long;
const int INF = 2 * (int)1e9;
pair<ll, pair<int, int>> combine(pair<ll, pair<int, int>> le, pair<ll, pair<int, int>> ri) {
if (le.first < ri.first) swap(le, ri);
if (ri.first == le.first) {
le.second.first = min(le.second.first, ri.second.first);
le.second.second = max(le.second.second, ri.second.second);
}
return le;
}
// Specialised range maximum segment tree
class SegTree {
private:
vector<pair<ll, pair<int, int>>> seg;
int h = 1;
pair<ll, pair<int, int>> recGet(int a, int b, int i, int le, int ri) const {
if (ri <= a || b <= le) return {-INF, {INF, -INF}};
else if (a <= le && ri <= b) return seg[i];
else return combine(recGet(a, b, 2*i, le, (le+ri)/2), recGet(a, b, 2*i+1, (le+ri)/2, ri));
}
public:
SegTree(int n) {
while(h < n) h *= 2;
seg.resize(2*h, {-INF, {INF, -INF}});
}
void set(int i, pair<ll, pair<int, int>> off) {
seg[i+h] = combine(seg[i+h], off);
for (i += h; i > 1; i /= 2) seg[i/2] = combine(seg[i], seg[i^1]);
}
pair<ll, pair<int, int>> get(int a, int b) const {
return recGet(a, b+1, 1, 0, h);
}
};
// Binary searches index of v from sorted vector
int bins(const vector<int>& vec, int v) {
int low = 0;
int high = (int)vec.size() - 1;
while(low != high) {
int mid = (low + high) / 2;
if (vec[mid] < v) low = mid + 1;
else high = mid;
}
return low;
}
// Finds longest strictly increasing subsequence with at most k exceptions in O(n log^2 n)
vector<int> lisExc(int k, vector<int> vec) {
// Compress values
vector<int> ord = vec;
sort(ord.begin(), ord.end());
ord.erase(unique(ord.begin(), ord.end()), ord.end());
for (auto& v : vec) v = bins(ord, v) + 1;
// Binary search lambda
int n = vec.size();
int m = ord.size() + 1;
int lambda_0 = 0;
int lambda_1 = n;
while(true) {
int lambda = (lambda_0 + lambda_1) / 2;
SegTree seg(m);
if (lambda > 0) seg.set(0, {0, {0, 0}});
else seg.set(0, {0, {0, INF}});
// Calculate DP
vector<pair<ll, pair<int, int>>> dp(n);
for (int i = 0; i < n; ++i) {
auto off0 = seg.get(0, vec[i]-1); // previous < this
off0.first += 1;
auto off1 = seg.get(vec[i], m-1); // previous >= this
off1.first += 1 - lambda;
off1.second.first += 1;
off1.second.second += 1;
dp[i] = combine(off0, off1);
seg.set(vec[i], dp[i]);
}
// Is min_b <= k <= max_b?
auto off = seg.get(0, m-1);
if (off.second.second < k) {
lambda_1 = lambda - 1;
} else if (off.second.first > k) {
lambda_0 = lambda + 1;
} else {
// Context
StackExchange Computer Science Q#118858, answer score: 5
Revisions (0)
No revisions yet.