Coursera

The knapsack problem

Problem definition

Input: $n$ items. Each has a value $v_i$ (nonnegative) and a size $w_i$ (nonnegative and integral). Also given: a knapsack capacity $W$ (a nonnegative integer).

Output: a subset $S \subseteq \{1, 2, 3, .., n\}$ that maximizes $\sum_{i\in S}v_i$ subject to $\sum_{i\in S}w_i \leq W$

Developing a DP Algorithm

Step 1: formulate recurrence (optimal solution as function of solutions to “smaller” subproblems) based on structure of an optimal solution

Let $S$ be the max-value solution to an instance of Knapsack.

Case 1: suppose item $n\notin S$. Then $S$ must be optimal with respect to the first $n - 1$ items (and the same capacity $W$). If some $S^*$ were better than $S$ with respect to the first $n - 1$ items, then it would be equally better with all $n$ of the items, contradicting the purported optimality of $S$.

Case 2: suppose item $n \in S$. Then $S - \{n\}$ is an optimal solution with respect to the first $n - 1$ items and capacity $W - w_n$.

Proof: if $S^*$ has higher value than $S - \{n\}$ and total size $\leq W - w_n$, then $S^*\cup\{n\}$ has size $\leq W$ and value more than $S$ (contradiction).

Notation: Let $V_{i, x}$ be the value of the best solution that

  1. uses only the first $i$ items
  2. has total size $\leq x$.

Upshot of last video: for $i\in \{1, 2, .., n\}$ and any $x$, $$ V_{i, x}=\max\{V_{(i - 1), x},\ v_i + V_{(i - 1), x - w_i}\} $$ Edge case: if $w_i > x$, must have $V_{i, x} = V_{(i - 1), x}$

Step 2: identify the subproblems

Step 3: use recurrence from Step 1 to systematically solve all subproblems

Let A be a 2-D array
Initialize A[0, x] = 0 for x = 0, 1, 2, .., W
for i = 1, 2, 3, .., n:
	for x = 0, 1, 2, 3, .., W:
		if w_i > x:
			A[i, x] = A[i - 1, x]   (case 2 is infeasible)
		else:
			A[i, x] = max(A[i - 1, x], A[i - 1, x - w_i] + v_i)
return A[n, W]

Previous results are always available for $\mathcal{O}(1)$-time lookup.

What is the running time of this algorithm? $\Theta(nW)$: there are $\Theta(nW)$ subproblems, each solved in $\Theta(1)$ time.

Correctness: straightforward induction (use step 1 argument to justify inductive step).
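The pseudocode above translates almost line-for-line into Python; here is a minimal sketch (the function name and the small test instance are my own):

```python
def knapsack(values, weights, W):
    """Value of a max-value subset of items with total weight <= W.

    values[i] and weights[i] describe item i (0-indexed here);
    W is a nonnegative integer capacity.
    """
    n = len(values)
    # A[i][x] = best value using only the first i items with capacity x.
    A = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for x in range(W + 1):
            A[i][x] = A[i - 1][x]          # case 1: item i excluded
            if weights[i - 1] <= x:        # case 2 feasible only if item fits
                A[i][x] = max(A[i][x],
                              A[i - 1][x - weights[i - 1]] + values[i - 1])
    return A[n][W]
```

For example, with values (3, 2, 4, 4), weights (4, 3, 2, 3), and $W = 6$, the best feasible subset takes the last two items for a total value of 8.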

Sequence Alignment

Problem definition

Recall: sequence alignment (Needleman-Wunsch score = similarity measure between strings)

Example: AGGGCT - AGG-CA

Total penalty = $\alpha_{gap} + \alpha_{AT}$

Input: strings $X = x_1, x_2, .., x_m$, $Y = y_1, y_2, .., y_n$ over some alphabet $\Sigma$ (for example $\{A, C, G, T\}$)

Feasible solutions: alignments - i.e. insert gaps to equalize lengths of the strings.

A dynamic programming approach

Key step: identify subproblems. As usual, we will look at the structure of an optimal solution for clues (i.e., develop a recurrence and then reverse-engineer the subproblems).

Structure of optimal solution: consider an optimal alignment of $X$, $Y$ and its final position:

Question: how many relevant possibilities are there for the contents of the final position? Answer: 3

Case 1: $x_m, y_n$ matched Case 2: $x_m$ matched with a gap Case 3: $y_n$ matched with a gap

Point: narrow optimal solution down to 3 candidates.

Optimal substructure: let $X’ = X - x_m, Y’ = Y - y_n$.

Proof (of case 1; other cases are similar): by contradiction, suppose the induced alignment of $X’, Y’$ has penalty $P$ while some other alignment has penalty $P^* < P$. Then appending $x_m, y_n$ (the contents of the final position) to the latter gives an alignment of $X$ and $Y$ with penalty $P^* + \alpha_{x_my_n} < P + \alpha_{x_my_n}$ = the penalty of the original alignment. This contradicts the optimality of the original alignment of $X$ and $Y$.

Relevant subproblems: have the form $(X_i, Y_j)$, where $X_i$ is the first $i$ letters of $X$ and $Y_j$ is the first $j$ letters of $Y$ (since we only peel letters off the right ends of the strings).

The recurrence

Notation: let $P_{ij}$ be the penalty of optimal alignment of $X_i$ and $Y_j$.

Recurrence: for all $i = 1, 2, 3, .., m$ and $j = 1, 2, 3, .., n$,

$$ P_{ij}= \min \begin{cases} \alpha_{x_iy_j} + P_{(i - 1), (j - 1)}\\ \alpha_{gap} + P_{(i - 1), j}\\ \alpha_{gap} + P_{i, (j - 1)} \end{cases} $$

Correctness: optimal solution is one of these 3 candidates, and recurrence selects the best of 3.

Base cases

Question: what is the value of $P_{i, 0}$ and $P_{0, i}$? Answer: $i\alpha_{gap}$

The algorithm

A = 2-D array
A[i, 0] = A[0, i] = i * alpha[gap] for all i

for i = 1 to m
	for j = 1 to n
		A[i, j] = min(A[i - 1, j - 1] + alpha[x_i][y_j], A[i - 1, j] + alpha[gap], A[i, j - 1] + alpha[gap])
		(all three subproblems available for O(1) lookup)

Correctness: (i.e. $A_{ij} = P_{ij}\ \forall i, j \geq 0$) follows from induction & correctness of recurrence

Running time: $\mathcal{O}(mn)$ - $\Theta(1)$ for each of $\Theta(mn)$ subproblems.
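A Python sketch of this table-filling (the function name is my own, and for simplicity I assume a single gap penalty and a single mismatch penalty, though in general $\alpha_{ab}$ can differ per symbol pair):

```python
def alignment_penalty(X, Y, gap, mismatch):
    """Minimum Needleman-Wunsch penalty for aligning strings X and Y.

    gap is the penalty alpha_gap per inserted gap; mismatch is the
    penalty for matching two unequal symbols (matching equal symbols
    is free).
    """
    m, n = len(X), len(Y)
    # A[i][j] = P_ij, the penalty of an optimal alignment of X[:i], Y[:j].
    A = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        A[i][0] = i * gap              # base case: X[:i] vs. empty string
    for j in range(1, n + 1):
        A[0][j] = j * gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            match = 0.0 if X[i - 1] == Y[j - 1] else mismatch
            A[i][j] = min(A[i - 1][j - 1] + match,  # case 1: x_i with y_j
                          A[i - 1][j] + gap,        # case 2: x_i with a gap
                          A[i][j - 1] + gap)        # case 3: y_j with a gap
    return A[m][n]
```

On the earlier example, aligning AGGGCT with AGGCA under unit gap and mismatch penalties costs 2 (one gap plus one A-T mismatch).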

Reconstructing a solution

Trace back through the filled-in table $A$, starting at the entry $A_{mn}$. Upon reaching subproblem $A_{ij}$:

  1. if $A_{ij} = A_{(i - 1), (j - 1)} + \alpha_{x_iy_j}$, match $x_i$ with $y_j$ and continue from $A_{(i - 1), (j - 1)}$
  2. else if $A_{ij} = A_{(i - 1), j} + \alpha_{gap}$, match $x_i$ with a gap and continue from $A_{(i - 1), j}$
  3. else match $y_j$ with a gap and continue from $A_{i, (j - 1)}$

If $i = 0$ or $j = 0$, match the remaining substring with gaps

Running time: $\mathcal{O}(m + n)$.
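The trace-back can be sketched in Python as follows (names are my own; as above I assume a single gap and mismatch penalty). It rebuilds the table and then repeatedly undoes whichever case achieved each minimum:

```python
def reconstruct_alignment(X, Y, gap, mismatch):
    """Recover an optimal alignment of X and Y as a pair of
    equal-length strings, with '-' marking gaps."""
    m, n = len(X), len(Y)
    # Forward pass: fill in the table A exactly as before.
    A = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        A[i][0] = i * gap
    for j in range(n + 1):
        A[0][j] = j * gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            match = 0.0 if X[i - 1] == Y[j - 1] else mismatch
            A[i][j] = min(A[i - 1][j - 1] + match,
                          A[i - 1][j] + gap,
                          A[i][j - 1] + gap)
    # Reverse pass: trace back from A[m][n].
    out_x, out_y = [], []
    i, j = m, n
    while i > 0 and j > 0:
        match = 0.0 if X[i - 1] == Y[j - 1] else mismatch
        if A[i][j] == A[i - 1][j - 1] + match:   # case 1: x_i with y_j
            out_x.append(X[i - 1]); out_y.append(Y[j - 1]); i -= 1; j -= 1
        elif A[i][j] == A[i - 1][j] + gap:       # case 2: x_i with a gap
            out_x.append(X[i - 1]); out_y.append('-'); i -= 1
        else:                                    # case 3: y_j with a gap
            out_x.append('-'); out_y.append(Y[j - 1]); j -= 1
    # i = 0 or j = 0: match the remaining substring with gaps.
    while i > 0:
        out_x.append(X[i - 1]); out_y.append('-'); i -= 1
    while j > 0:
        out_x.append('-'); out_y.append(Y[j - 1]); j -= 1
    return ''.join(reversed(out_x)), ''.join(reversed(out_y))
```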

Optimal BST

Problem definition

Recall: for a given set of keys, there are lots of valid search trees (say $x < y < z$).

Question: what is the “best” search tree for a given set of keys?

A good answer: a balanced search tree, like a red-black tree (worst-case search time = $\Theta(height) = \Theta(\log{n})$).

Question: suppose we have keys $x < y < z$ and we know that $x$ is accessed 80% of the time, and $y$ and $z$ 10% of the time each.

What is the average search time (i.e. the number of nodes looked at) in the trees with root $y$ (left child $x$ and right child $z$) and with root $x$ (right child $y$ and right grandchild $z$), respectively? Answer: 1.9 and 1.3
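A quick sanity check of these numbers in Python, assuming the access frequencies implied by the stated answers (80% for $x$, 10% each for $y$ and $z$):

```python
def avg_search_time(freqs, depths):
    """Weighted search time: sum of p_i * (depth of i + 1),
    where the root has depth 0."""
    return sum(p * (d + 1) for p, d in zip(freqs, depths))

p = [0.8, 0.1, 0.1]           # frequencies of x, y, z (assumed)
balanced = avg_search_time(p, [1, 0, 1])   # root y, children x and z
chain = avg_search_time(p, [0, 1, 2])      # root x, then y, then z
```

Here `balanced` evaluates to 1.9 and `chain` to 1.3: the unbalanced tree wins because the most frequent key sits at the root.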

Input: frequencies $p_1, p_2, .., p_n$ for items $1, 2, 3, .., n$ (assuming items in sorted order i.e. $1 < 2 < 3 < .. < n$).

Goal: compute a valid search tree that minimizes the weighted (average) search time

$$ C(T) = \sum_{\text{items } i}p_i\cdot[\text{search time for $i$ in $T$}] $$ (where the search time for $i$ is the number of nodes looked at, i.e., the depth of $i$ in $T$, plus 1)

(assuming $\sum_i p_i = 1$)

Example: if $T$ is a red-black tree, then $C(T) = \mathcal{O}(\log{n})$.

Comparison with Huffman codes:

Similar: the output is a binary tree, and the goal is (essentially) to minimize the average depth with respect to the given probabilities.

Difference: with Huffman codes the constraint was prefix-freeness (symbols appear only at the leaves); here, the constraint is the search tree property.

Intuition: want the most (respectively, least) frequently accessed items closest (respectively, furthest) from the root.

Ideas for greedy algorithms: bottom-up (populate the deepest levels with the least frequently accessed keys, a la Huffman) or top-down (put the most frequently accessed key at the root). Neither is correct in general.

Choosing the root

Issue: with the top-down approach, the choice of root has hard-to-predict repercussions further down the tree (stymies both greedy and naïve divide and conquer approaches)

Idea: what if we knew the root? (i.e., maybe we can try all possibilities within a dynamic programming algorithm!)

Question: suppose an optimal BST for keys $\{1, 2, .., n\}$ has root $r$, left subtree $T_1$, right subtree $T_2$. Pick the strongest statement that you suspect is true. Answer: $T_1$ is optimal for the keys $\{1, 2, .., r-1\}$ and $T_2$ is optimal for the keys $\{r + 1, r + 2, .., n\}$.

Proof of optimal substructure

Let $T$ be an optimal BST for keys $\{1, 2, .., n\}$ with frequencies $p_1, p_2, .., p_n$, and suppose $T$ has root $r$. Suppose for contradiction that $T_1$ is not optimal for $\{1, 2, .., r - 1\}$, i.e., there is a BST $T^*_1$ with $C(T^*_1) < C(T_1)$ (the other case is similar).

Obtain $T^*$ from $T$ by “cutting + pasting” $T^*_1$ in for $T_1$. Note: to complete the contradiction and the proof, we only need to show that $C(T^*) < C(T)$

A calculation:

$$ \begin{aligned} C(T) &= \sum_{i = 1}^n p_i\cdot [\text{search time for $i$ in $T$}]\\ &= p_r\cdot 1 + \sum_{i = 1}^{r - 1} p_i\cdot [\text{search time for $i$ in $T_1$} + 1]+\sum_{i = r + 1}^{n} p_i\cdot [\text{search time for $i$ in $T_2$} + 1]\\ &= \sum_{i = 1}^n p_i + \sum_{i = 1}^{r - 1} p_i\cdot [\text{search time for $i$ in $T_1$}]+\sum_{i = r + 1}^{n} p_i\cdot [\text{search time for $i$ in $T_2$}]\\ &= \sum_{i = 1}^n p_i + C(T_1) + C(T_2) \end{aligned} $$

Similarly:

$$ C(T^*) = \sum_{i = 1}^n p_i + C(T^*_1) + C(T_2) $$

Upshot: $C(T^*_1) < C(T_1)$ implies $C(T^*) < C(T)$, contradicting the optimality of $T$.

A Dynamic Programming Algorithm

Optimal substructure lemma: If $T$ is an optimal BST for the keys $\{1, 2, .., n\}$ with root $r$, then its subtrees $T_1$ and $T_2$ are optimal BSTs for the keys $\{1, 2, .., r - 1\}$ and $\{r + 1, .., n\}$, respectively.

Note: items in a subproblem are either a prefix or a suffix of the original problem.

Question: let $\{1, 2, .., n\}$ be the original items. For which subsets $S \subseteq \{1, 2, .., n\}$ might we need to compute the optimal binary search tree for $S$? Answer: contiguous intervals ($S = \{i, i + 1, .., j - 1, j\}$ for every $i \leq j$)

The recurrence

Notation: for $1\leq i \leq j \leq n$, let $C_{ij}$ be the weighted search cost of an optimal BST for the items $\{i, i + 1, .., j - 1, j\}$ (with probabilities $p_i, p_{i + 1}, .., p_j$)

Recurrence: for every $1 \leq i \leq j \leq n$:

$$ C_{ij} = \min_{r = i}^j\left\{\sum_{k = i}^jp_k + C_{i, r - 1} + C_{r + 1, j}\right\} $$

The algorithm

Important: solve smallest subproblems (with fewest number $j - i + 1$ of items) first.

Let A = 2-D array
for s = 0 to n - 1:   (s represents the subproblem size j - i)
	for i = 1 to n - s:
		A[i, i + s] = min over r in {i, .., i + s} of (p_i + .. + p_{i + s}) + A[i, r - 1] + A[r + 1, i + s]
		              (interpret A[i, r - 1] or A[r + 1, i + s] as 0 if 1st index > 2nd index)

return A[1, n]
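The pseudocode can be fleshed out in Python (0-indexed; the function name and test frequencies are my own):

```python
def optimal_bst_cost(p):
    """Weighted search cost of an optimal BST for keys 0..n-1 (in
    sorted order) with frequencies p[0..n-1], solving subproblems
    in order of increasing size."""
    n = len(p)
    # C[i][j] = cost of an optimal BST for keys i..j; empty ranges cost 0.
    C = [[0.0] * n for _ in range(n)]
    for s in range(n):                  # s = j - i, subproblem size minus 1
        for i in range(n - s):
            j = i + s
            weight = sum(p[i:j + 1])    # sum of p_k for k = i..j
            best = float('inf')
            for r in range(i, j + 1):   # try every key as the root
                left = C[i][r - 1] if r > i else 0.0
                right = C[r + 1][j] if r < j else 0.0
                best = min(best, weight + left + right)
            C[i][j] = best
    return C[0][n - 1] if n else 0.0
```

On the earlier three-key example with frequencies 0.8, 0.1, 0.1, this returns 1.3, the cost of the chain rooted at the most frequent key.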

Running time

$\Theta(n^2)$ subproblems, each solved in $\mathcal{O}(n)$ time (trying every candidate root $r$). Hence, $\Theta(n^3)$ time overall.

Fun fact: (Knuth 71, Yao 80) optimized version of this DP algorithm correctly fills up entire table in only $\Theta(n^2)$ time ($\Theta(1)$ on average per subproblem).

Idea: piggyback on work done in previous subproblems to avoid trying all possible roots.

Quiz

Question 1

Consider a variation of the Knapsack problem where we have two knapsacks, with integer capacities $W_1$ and $W_2$. As usual, we are given $n$ items with positive values and positive integer weights. We want to pick subsets $S_1, S_2$ with maximum total value (i.e., $\sum_{i \in S_1}v_i + \sum_{i \in S_2}v_i$) such that the total weights of $S_1$ and $S_2$ are at most $W_1$ and $W_2$, respectively. Assume that every item fits in either knapsack (i.e., $w_i\leq \min\{W_1, W_2\}$ for every item $i$). Consider the following two algorithmic approaches. (1) Use the algorithm from lecture to pick a max-value feasible solution $S_1$ for the first knapsack, and then run it again on the remaining items to pick a max-value feasible solution $S_2$ for the second knapsack. (2) Use the algorithm from lecture to pick a max-value feasible solution for a knapsack with capacity $W_1 + W_2$, and then split the chosen items into two sets $S_1, S_2$ that have size at most $W_1$ and $W_2$, respectively. Which of the following statements is true?

Question 2

Recall the dynamic programming algorithms from lecture for the Knapsack and sequence alignment problems. Both fill in a two-dimensional table using a double-for loop. Suppose we reverse the order of the two for loops. (I.e., cut and paste the second for loop in front of the first for loop, without otherwise changing the text in any way.) Are the resulting algorithms still well defined and correct?

Question 3

Consider an instance of the optimal binary search tree problem with 7 keys (say 1, 2, 3, 4, 5, 6, 7 in sorted order) and frequencies

| $i$ | $p_i$ |
| --- | ----- |
| 1   | 0.05  |
| 2   | 0.4   |
| 3   | 0.08  |
| 4   | 0.04  |
| 5   | 0.1   |
| 6   | 0.1   |
| 7   | 0.23  |

What is the minimum-possible average search time of a binary search tree with these keys?

Question 4

The following problems all take as input two strings $X$ and $Y$, of length $m$ and $n$, over some alphabet $\Sigma$. Which of them can be solved in $\mathcal{O}(mn)$ time? (Check all that apply.)

Question 5

Recall our dynamic programming algorithms for maximum-weight independent set, sequence alignment, and optimal binary search trees. The space requirements of these algorithms are proportional to the number of subproblems that get solved: $\Theta(n)$ (where $n$ is the number of vertices), $\Theta(mn)$ (where $m,n$ are the lengths of the two strings), and $\Theta(n^2)$ (where $n$ is the number of keys), respectively.

Suppose we only want to compute the value of an optimal solution (the final answer of the first, forward pass) and don’t care about actually reconstructing an optimal solution (i.e., we skip the second, reverse pass over the table). How much space do you then really need to run each of the three algorithms?