4 Recursion and Dynamic Programming 递归 & 动态规划¶

4.1 Longest Common Subsequence¶

QUESTION 1: Write a function to find the length of the longest common subsequence between two sequences. E.g. Given the strings “serendipitous” and “precipitation”, the longest common subsequence is “reipito” and its length is 7.

A “sequence” is a group of items with a deterministic ordering. Lists, tuples and ranges are some common sequence types in Python.

A “subsequence” is a sequence obtained by deleting zero or more elements from another sequence. For example, “edpt” is a subsequence of “serendipitous”.

General case¶

Test cases¶

General case (string)
General case (list)
No common subsequence
One is a subsequence of the other
One sequence is empty
Both sequences are empty
Multiple subsequences with same length
1. “abcdef” and “badcfe”

Longest common subsequence test cases:

T0 = {
    'input': {
        'seq1': 'serendipitous',
        'seq2': 'precipitation'
    },
    'output': 7
}

T1 = {
    'input': {
        'seq1': [1, 3, 5, 6, 7, 2, 5, 2, 3],
        'seq2': [6, 2, 4, 7, 1, 5, 6, 2, 3]
    },
    'output': 5
}

T2 = {
    'input': {
        'seq1': 'longest',
        'seq2': 'stone'
    },
    'output': 3
}

T3 = {
    'input': {
        'seq1': 'asdfwevad',
        'seq2': 'opkpoiklklj'
    },
    'output': 0
}

T4 = {
    'input': {
        'seq1': 'dense',
        'seq2': 'condensed'
    },
    'output': 5
}

T5 = {
    'input': {
        'seq1': '',
        'seq2': 'opkpoiklklj'
    },
    'output': 0
}

T6 = {
    'input': {
        'seq1': '',
        'seq2': ''
    },
    'output': 0
}

T7 = {
    'input': {
        'seq1': 'abcdef',
        'seq2': 'badcfe'
    },
    'output': 3
}

lcq_tests = [T0, T1, T2, T3, T4, T5, T6, T7]

4.1.1 Recursive Solution¶

Create two counters idx1 and idx2 starting at 0. Our recursive function will compute the LCS of seq1[idx1:] and seq2[idx2:]
If seq1[idx1] and seq2[idx2] are equal, then this character belongs to the LCS of seq1[idx1:] and seq2[idx2:] (why?). Further the length this is LCS is one more than LCS of seq1[idx1+1:] and seq2[idx2+1:]

If not, then the LCS of seq1[idx1:] and seq2[idx2:] is the longer one among the LCS of seq1[idx1+1:], seq2[idx2:] and the LCS of seq1[idx1:], seq2[idx2+1:]

If either seq1[idx1:] or seq2[idx2:] is empty, then their LCS is empty.

Here’s what the tree of recursive calls looks like:

def lcs_rec(seq1, seq2, idx1=0, idx2=0):
    if len(seq1)==idx1 or len(seq2)==idx2:
        return 0
    if seq1[idx1] == seq2[idx2]:
        return 1+lcs_rec(seq1, seq2, idx1+1, idx2+1)
    else:
        return max(lcs_rec(seq1, seq2, idx1+1, idx2), lcs_rec(seq1, seq2, idx1, idx2+1))

from jovian.pythondsa import evaluate_test_cases

result = evaluate_test_cases(lcs_rec, lcq_tests)

TEST CASE #0

Input:
{'seq1': 'serendipitous', 'seq2': 'precipitation'}

Expected Output:
7

Actual Output:
7

Execution Time:
273.414 ms

Test Result:
PASSED

TEST CASE #1

Input:
{'seq1': [1, 3, 5, 6, 7, 2, 5, 2, 3], 'seq2': [6, 2, 4, 7, 1, 5, 6, 2, 3]}

Expected Output:
5

Actual Output:
5

Execution Time:
3.762 ms

Test Result:
PASSED

TEST CASE #2

Input:
{'seq1': 'longest', 'seq2': 'stone'}

Expected Output:
3

Actual Output:
3

Execution Time:
0.165 ms

Test Result:
PASSED

TEST CASE #3

Input:
{'seq1': 'asdfwevad', 'seq2': 'opkpoiklklj'}

Expected Output:
0

Actual Output:
0

Execution Time:
71.515 ms

Test Result:
PASSED

TEST CASE #4

Input:
{'seq1': 'dense', 'seq2': 'condensed'}

Expected Output:
5

Actual Output:
5

Execution Time:
0.093 ms

Test Result:
PASSED

TEST CASE #5

Input:
{'seq1': '', 'seq2': 'opkpoiklklj'}

Expected Output:
0

Actual Output:
0

Execution Time:
0.001 ms

Test Result:
PASSED

TEST CASE #6

Input:
{'seq1': '', 'seq2': ''}

Expected Output:
0

Actual Output:
0

Execution Time:
0.001 ms

Test Result:
PASSED

TEST CASE #7

Input:
{'seq1': 'abcdef', 'seq2': 'badcfe'}

Expected Output:
3

Actual Output:
3

Execution Time:
0.031 ms

Test Result:
PASSED

SUMMARY

TOTAL: 8, PASSED: 8, FAILED: 0

Complexity Analysis

Worst case occurs when each time we have to try 2 subproblems i.e. when the sequences have no common elements.

Here’s what the tree looks like in such a case (source - Techie Delight):

All the leaf nodes are (0, 0). Can you count the number of leaf nodes?

HINT: Count the number of unique paths from root to leaf. The length of each path is m+n and at each level there are 2 choices.

Based on the above can you infer that the time complexity is \(O(2^{m+n})\).

4.1.2 Recursive Solution with Memoization¶

This time we will use a dictionary called memo to save the results, and if we find that an intermediate result already exists in the memo, we won’t calculate it again.

def lcs_memo(seq1, seq2):
    memo = {}
    def recurse(idx1=0, idx2=0):
        key = (idx1, idx2)
        if key in memo:
            return memo[key]
        elif idx1==len(seq1) or idx2==len(seq2):
            memo[key] = 0
        elif seq1[idx1]==seq2[idx2]:
            memo[key] = 1+recurse(idx1+1, idx2+1)
        else:
            memo[key] = max(recurse(idx1+1, idx2), recurse(idx1, idx2+1))
        return memo[key]
    return recurse(0,0)

result = evaluate_test_cases(lcs_memo, lcq_tests)

TEST CASE #0

Input:
{'seq1': 'serendipitous', 'seq2': 'precipitation'}

Expected Output:
7

Actual Output:
7

Execution Time:
0.424 ms

Test Result:
PASSED

TEST CASE #1

Input:
{'seq1': [1, 3, 5, 6, 7, 2, 5, 2, 3], 'seq2': [6, 2, 4, 7, 1, 5, 6, 2, 3]}

Expected Output:
5

Actual Output:
5

Execution Time:
0.158 ms

Test Result:
PASSED

TEST CASE #2

Input:
{'seq1': 'longest', 'seq2': 'stone'}

Expected Output:
3

Actual Output:
3

Execution Time:
0.294 ms

Test Result:
PASSED

TEST CASE #3

Input:
{'seq1': 'asdfwevad', 'seq2': 'opkpoiklklj'}

Expected Output:
0

Actual Output:
0

Execution Time:
1.353 ms

Test Result:
PASSED

TEST CASE #4

Input:
{'seq1': 'dense', 'seq2': 'condensed'}

Expected Output:
5

Actual Output:
5

Execution Time:
0.072 ms

Test Result:
PASSED

TEST CASE #5

Input:
{'seq1': '', 'seq2': 'opkpoiklklj'}

Expected Output:
0

Actual Output:
0

Execution Time:
0.004 ms

Test Result:
PASSED

TEST CASE #6

Input:
{'seq1': '', 'seq2': ''}

Expected Output:
0

Actual Output:
0

Execution Time:
0.003 ms

Test Result:
PASSED

TEST CASE #7

Input:
{'seq1': 'abcdef', 'seq2': 'badcfe'}

Expected Output:
3

Actual Output:
3

Execution Time:
0.04 ms

Test Result:
PASSED

SUMMARY

TOTAL: 8, PASSED: 8, FAILED: 0

Complexity Analysis:
Time complexity is \(O(m*n)\), much better than \(O(2^{m+n})\)

4.1.3 Dynamic programming¶

Create a table of size (n1+1) * (n2+1) initialized with 0s, where n1 and n2 are the lengths of the sequences. table[i][j] represents the longest common subsequence of seq1[:i] and seq2[:j]. Here’s what the table looks like (source: Kevin Mavani, Medium).

If seq1[i] and seq2[j] are equal, then table[i+1][j+1] = 1 + table[i][j]
If seq1[i] and seq2[j] are not equal, then table[i+1][j+1] = max(table[i][j+1], table[i+1][j])

The time complexity of the dynamic programming approach is \(O(N1 * N2)\).

def lcs_dp(seq1, seq2):
    m, n = len(seq1), len(seq2)
    results = [[0 for _ in range(n+1)] for _ in range(m+1)]
    idx1, idx2 = 0, 0
    for idx1 in range(m):
        for idx2 in range(n):
            if seq1[idx1]==seq2[idx2]:
                results[idx1+1][idx2+1] = 1+results[idx1][idx2]
            else:
                results[idx1+1][idx2+1] = max(results[idx1][idx2+1], results[idx1+1][idx2])
    return results[-1][-1]

result = evaluate_test_cases(lcs_dp, lcq_tests)

TEST CASE #0

Input:
{'seq1': 'serendipitous', 'seq2': 'precipitation'}

Expected Output:
7

Actual Output:
7

Execution Time:
0.231 ms

Test Result:
PASSED

TEST CASE #1

Input:
{'seq1': [1, 3, 5, 6, 7, 2, 5, 2, 3], 'seq2': [6, 2, 4, 7, 1, 5, 6, 2, 3]}

Expected Output:
5

Actual Output:
5

Execution Time:
0.11 ms

Test Result:
PASSED

TEST CASE #2

Input:
{'seq1': 'longest', 'seq2': 'stone'}

Expected Output:
3

Actual Output:
3

Execution Time:
0.057 ms

Test Result:
PASSED

TEST CASE #3

Input:
{'seq1': 'asdfwevad', 'seq2': 'opkpoiklklj'}

Expected Output:
0

Actual Output:
0

Execution Time:
0.13 ms

Test Result:
PASSED

TEST CASE #4

Input:
{'seq1': 'dense', 'seq2': 'condensed'}

Expected Output:
5

Actual Output:
5

Execution Time:
0.064 ms

Test Result:
PASSED

TEST CASE #5

Input:
{'seq1': '', 'seq2': 'opkpoiklklj'}

Expected Output:
0

Actual Output:
0

Execution Time:
0.01 ms

Test Result:
PASSED

TEST CASE #6

Input:
{'seq1': '', 'seq2': ''}

Expected Output:
0

Actual Output:
0

Execution Time:
0.008 ms

Test Result:
PASSED

TEST CASE #7

Input:
{'seq1': 'abcdef', 'seq2': 'badcfe'}

Expected Output:
3

Actual Output:
3

Execution Time:
0.056 ms

Test Result:
PASSED

SUMMARY

TOTAL: 8, PASSED: 8, FAILED: 0

Complexity Analysis:
The time complexity is \(O(m*n)\).

4.2 0-1 Knapsack Problem¶

Problem statement

You’re in charge of selecting a football (soccer) team from a large pool of players. Each player has a cost, and a rating. You have a limited budget. What is the highest total rating of a team that fits within your budget. Assume that there’s no minimum or maximum team size.

General problem statemnt:
Given n elements, each of which has a weight and a profit, determine the maximum profit that can be obtained by selecting a subset of the elements weighing no more than w.

Test cases:

Some generic test cases
All the elements can be included
None of the elements can be included
Only one of the elements can be included
???

Knapsack test cases:

test0 = {
    'input': {
        'capacity': 165,
        'weights': [23, 31, 29, 44, 53, 38, 63, 85, 89, 82],
        'profits': [92, 57, 49, 68, 60, 43, 67, 84, 87, 72]
    },
    'output': 309
}

test1 = {
    'input': {
        'capacity': 3,
        'weights': [4, 5, 6],
        'profits': [1, 2, 3]
    },
    'output': 0
}

test2 = {
    'input': {
        'capacity': 4,
        'weights': [4, 5, 1],
        'profits': [1, 2, 3]
    },
    'output': 3
}

test3 = {
    'input': {
        'capacity': 170,
        'weights': [41, 50, 49, 59, 55, 57, 60],
        'profits': [442, 525, 511, 593, 546, 564, 617]
    },
    'output': 1735
}

test4 = {
    'input': {
        'capacity': 15,
        'weights': [4, 5, 6],
        'profits': [1, 2, 3]
    },
    'output': 6
}

test5 = {
    'input': {
        'capacity': 15,
        'weights': [4, 5, 1, 3, 2, 5],
        'profits': [2, 3, 1, 5, 4, 7]
    },
    'output': 19
}

tests = [test0, test1, test2, test3, test4, test5]

4.2.1 Recursion¶

We’ll write a recursive function that computes max_profit(weights[idx:], profits[idx:], capacity), with idx starting from 0.
If weights[idx] > capacity, the current element is cannot be selected, so the maximum profit is the same as max_profit(weights[idx+1:], profits[idx+1:], capacity).
Otherwise, there are two possibilities: we either pick weights[idx] or don’t. We can recursively compute the maximum

A. If we don’t pick weights[idx], once again the maximum profit for this case is max_profit(weights[idx+1:], profits[idx+1:], capacity)

B. If we pick weights[idx], the maximum profit for this case is profits[idx] + max_profit(weights[idx+1:], profits[idx+1:], capacity - weights[idx]
If weights[idx:] is empty, the maximum profit for this case is 0.

Here’s a visualization of the recursion tree:

Verify that the time complexity of the recursive algorithm is \(O(2^N)\)

def kp_rec(capacity, weights, profits, idx=0):
    if len(weights[idx:]) == 0:
        return 0
    elif weights[idx] > capacity:
        return kp_rec(capacity, weights, profits, idx+1)
    else:
        return max(kp_rec(capacity, weights, profits, idx+1), profits[idx]+kp_rec(capacity-weights[idx], weights, profits, idx+1))

result = evaluate_test_cases(kp_rec, tests)

TEST CASE #0

Input:
{'capacity': 165, 'weights': [23, 31, 29, 44, 53, 38, 63, 85, 89, 82], 'profits': [92, 57, 49, 68, 6...

Expected Output:
309

Actual Output:
309

Execution Time:
0.554 ms

Test Result:
PASSED

TEST CASE #1

Input:
{'capacity': 3, 'weights': [4, 5, 6], 'profits': [1, 2, 3]}

Expected Output:
0

Actual Output:
0

Execution Time:
0.007 ms

Test Result:
PASSED

TEST CASE #2

Input:
{'capacity': 4, 'weights': [4, 5, 1], 'profits': [1, 2, 3]}

Expected Output:
3

Actual Output:
3

Execution Time:
0.012 ms

Test Result:
PASSED

TEST CASE #3

Input:
{'capacity': 170, 'weights': [41, 50, 49, 59, 55, 57, 60], 'profits': [442, 525, 511, 593, 546, 564,...

Expected Output:
1735

Actual Output:
1735

Execution Time:
0.164 ms

Test Result:
PASSED

TEST CASE #4

Input:
{'capacity': 15, 'weights': [4, 5, 6], 'profits': [1, 2, 3]}

Expected Output:
6

Actual Output:
6

Execution Time:
0.02 ms

Test Result:
PASSED

TEST CASE #5

Input:
{'capacity': 15, 'weights': [4, 5, 1, 3, 2, 5], 'profits': [2, 3, 1, 5, 4, 7]}

Expected Output:
19

Actual Output:
19

Execution Time:
0.129 ms

Test Result:
PASSED

SUMMARY

TOTAL: 6, PASSED: 6, FAILED: 0

4.2.2 Recursion with Memoization¶

def kp_memo(capacity, weights, profits):
    memo = {}
    def recurse(idx, remaining_c):
        key = (idx, remaining_c)
        if key in memo:
            return key[memo]
        if idx == len(weights):
            return 0
        elif weights[idx] > remaining_c:
            return recurse(idx+1, remaining_c)
        else:
            return max(recurse(idx+1, remaining_c),
                       profits[idx]+recurse(idx+1, remaining_c-weights[idx]))
    return recurse(0, capacity)

evaluate_test_cases(kp_memo, tests)

TEST CASE #0

Input:
{'capacity': 165, 'weights': [23, 31, 29, 44, 53, 38, 63, 85, 89, 82], 'profits': [92, 57, 49, 68, 6...

Expected Output:
309

Actual Output:
309

Execution Time:
0.56 ms

Test Result:
PASSED

TEST CASE #1

Input:
{'capacity': 3, 'weights': [4, 5, 6], 'profits': [1, 2, 3]}

Expected Output:
0

Actual Output:
0

Execution Time:
0.012 ms

Test Result:
PASSED

TEST CASE #2

Input:
{'capacity': 4, 'weights': [4, 5, 1], 'profits': [1, 2, 3]}

Expected Output:
3

Actual Output:
3

Execution Time:
0.016 ms

Test Result:
PASSED

TEST CASE #3

Input:
{'capacity': 170, 'weights': [41, 50, 49, 59, 55, 57, 60], 'profits': [442, 525, 511, 593, 546, 564,...

Expected Output:
1735

Actual Output:
1735

Execution Time:
0.14 ms

Test Result:
PASSED

TEST CASE #4

Input:
{'capacity': 15, 'weights': [4, 5, 6], 'profits': [1, 2, 3]}

Expected Output:
6

Actual Output:
6

Execution Time:
0.019 ms

Test Result:
PASSED

TEST CASE #5

Input:
{'capacity': 15, 'weights': [4, 5, 1, 3, 2, 5], 'profits': [2, 3, 1, 5, 4, 7]}

Expected Output:
19

Actual Output:
19

Execution Time:
0.108 ms

Test Result:
PASSED

SUMMARY

TOTAL: 6, PASSED: 6, FAILED: 0

[(309, True, 0.56),
 (0, True, 0.012),
 (3, True, 0.016),
 (1735, True, 0.14),
 (6, True, 0.019),
 (19, True, 0.108)]

4.2.3 Dynamic Programming¶

Create a table of size (n+1) * (capacity+1) consisting of all 0s, where is n is the number of elements. table[i][c] represents the maximum profit that can be obtained using the first i elements if the maximum capacity is c. Here’s a visual representation of a filled table (source - geeksforgeeks):

(The 0th row will contain all zeros and is not shown above.)

We’ll fill the table row by row and column by column. table[i][c] can be filled using some values in the row above it.
If weights[i] > c i.e. if the current element can is larger than capacity, then table[i+1][c] is simply equal to table[i][c] (since there’s no way we can pick this element).
If weights[i] <= c then we have two choices: to either pick the current element or not to get the value of table[i+1][c]. We can compare the maximum profit for both these options and pick the better one as the value of table[i][c].

A. If we don’t pick the element with weight weights[i], then once again the maximum profit is table[i][c]

B. If we pick the element with weight weights[i], then the maximum profit is profits[i] + table[i][c-weights[i]], since we have used up some capacity.

Verify that the complexity of the dynamic programming solution is \(O(N * W)\).

def kp_dp(capacity, weights, profits):
    results = [[0 for _ in range(capacity+1)] for _ in range(len(weights)+1)]
    for i in range(len(weights)):
        for c in range(capacity+1):
            if weights[i] > c:
                results[i+1][c] = results[i][c]
            else:
                results[i+1][c] = max(results[i][c],
                                      profits[i]+results[i][c-weights[i]])
    return results[-1][-1]

evaluate_test_cases(kp_dp, tests)

TEST CASE #0

Input:
{'capacity': 165, 'weights': [23, 31, 29, 44, 53, 38, 63, 85, 89, 82], 'profits': [92, 57, 49, 68, 6...

Expected Output:
309

Actual Output:
309

Execution Time:
1.463 ms

Test Result:
PASSED

TEST CASE #1

Input:
{'capacity': 3, 'weights': [4, 5, 6], 'profits': [1, 2, 3]}

Expected Output:
0

Actual Output:
0

Execution Time:
0.018 ms

Test Result:
PASSED

TEST CASE #2

Input:
{'capacity': 4, 'weights': [4, 5, 1], 'profits': [1, 2, 3]}

Expected Output:
3

Actual Output:
3

Execution Time:
0.023 ms

Test Result:
PASSED

TEST CASE #3

Input:
{'capacity': 170, 'weights': [41, 50, 49, 59, 55, 57, 60], 'profits': [442, 525, 511, 593, 546, 564,...

Expected Output:
1735

Actual Output:
1735

Execution Time:
1.11 ms

Test Result:
PASSED

TEST CASE #4

Input:
{'capacity': 15, 'weights': [4, 5, 6], 'profits': [1, 2, 3]}

Expected Output:
6

Actual Output:
6

Execution Time:
0.04 ms

Test Result:
PASSED

TEST CASE #5

Input:
{'capacity': 15, 'weights': [4, 5, 1, 3, 2, 5], 'profits': [2, 3, 1, 5, 4, 7]}

Expected Output:
19

Actual Output:
19

Execution Time:
0.06 ms

Test Result:
PASSED

SUMMARY

TOTAL: 6, PASSED: 6, FAILED: 0

[(309, True, 1.463),
 (0, True, 0.018),
 (3, True, 0.023),
 (1735, True, 1.11),
 (6, True, 0.04),
 (19, True, 0.06)]

Jinzhao's Learning Notes

4 Recursion and Dynamic Programming 递归 & 动态规划

Contents