In the dynamic programming section the lesson walks us through how to find longest common subsequence of “ACCGTT” and “CCAGCA”. Once I finished the extra credit which also prints the subsequence, the function prints “GCC”. Shouldn’t the answer be “AGCC”? Am I missing something?
The given function answer prints as CCG which is indeed the longest common subsequence.
The key here is how do we define subsequence?
It basically has to be a sequence from each string, not necessarily consecutive, but with strictly increasing indexes.
So for CCG the indexes are 1, 2, 3 for the first string and 0,1, 3 for the second. You’ll notice that the second is not consecutive, but they are both in strictly increasing order.
AGCC as a comparison would be
0, 3, 1, 2 in the first string and 2, 3, 0, 1 in the second. They both fail to be in increasing index order.
For contrast, a substring would have the stricter constraint that the indexes have to be both consecutive and in order.
The longest common subsequence (LCS) problem finds the longest subsequence present in two sequences, preserving order but not necessarily contiguous elements.