S _ N O W Y
S U N N _ Y
Cost: 3
_ S N O W _ Y
S U N _ _ N Y
Cost: 5
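The edit distance between two strings is the cost of their best possible alignment, where the cost of an alignment is the number of columns in which the two rows differ (a gap "_" never matches a letter). Do you see that there is no better alignment of SNOWY and SUNNY than the first one shown here, with a cost of 3?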
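The two costs quoted above can be checked by counting mismatched columns. This is only a minimal sketch: it assumes an alignment is written as two equal-length strings with "_" as the gap symbol, and `alignment_cost` is a name made up for this illustration.

```python
def alignment_cost(top: str, bottom: str) -> int:
    """Count the columns in which the two aligned rows differ.

    Both rows must have the same length; "_" marks a gap.
    """
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    return sum(1 for a, b in zip(top, bottom) if a != b)


print(alignment_cost("S_NOWY", "SUNN_Y"))    # 3
print(alignment_cost("_SNOW_Y", "SUN__NY"))  # 5
```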
Edit distance is so named because it can also be thought of as the minimum number of
edits (insertions, deletions, and substitutions of characters) needed to transform the first
string into the second. For instance, the first alignment shown above corresponds to three
edits: insert U, substitute O → N, and delete W.
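The correspondence between an alignment and a list of edits can be read off column by column. The sketch below assumes the same two-row representation with "_" as the gap; `edits_from_alignment` is a hypothetical helper, not something from the original text.

```python
def edits_from_alignment(top: str, bottom: str) -> list:
    """Translate each mismatched column into one edit on the top string."""
    edits = []
    for a, b in zip(top, bottom):
        if a == b:
            continue  # matching column: no edit needed
        if a == "_":
            edits.append(f"insert {b}")  # gap in the top row
        elif b == "_":
            edits.append(f"delete {a}")  # gap in the bottom row
        else:
            edits.append(f"substitute {a} -> {b}")
    return edits


print(edits_from_alignment("S_NOWY", "SUNN_Y"))
# ['insert U', 'substitute O -> N', 'delete W']
```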
When solving a problem by dynamic programming, the most crucial question is: what are the
subproblems? Once they are chosen well, it is an easy matter to write down the algorithm: iteratively solve one subproblem after the other, in order of increasing size.
Our goal is to find the edit distance between two strings x[1..m] and y[1..n].
As an example, take
E X P O N E N T I A L
P O L Y N O M I A L
We have to find the minimum number of operations needed to convert one into the other.
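A natural subproblem is the edit distance between a prefix of the first string, x[1..i], and a prefix of the second, y[1..j]. Call this subproblem E(i, j); the final objective is then E(m, n).
For this to work, we need to somehow express E(i, j) in terms of smaller subproblems.
Let's see: what do we know about the best alignment between x[1..i] and y[1..j]? Well, its
rightmost column can only be one of three things: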
x[i]   or    _     or   x[i]
 _          y[j]        y[j]
The first case incurs a cost of 1 for this column, and it remains to align x[1..i-1] with y[1..j]. But this is exactly the subproblem E(i-1, j)! We seem to be getting somewhere. In the second case, also with cost 1, we still need to align x[1..i] with y[1..j-1]. This is again another subproblem, E(i, j-1). And in the final case, which costs either 1 (if x[i] != y[j]) or 0 (if x[i] = y[j]), what's left is the subproblem E(i-1, j-1). In short, we have expressed E(i, j) in terms of three smaller subproblems: E(i-1, j), E(i, j-1), and E(i-1, j-1). We have no idea which of them is the right one, so we need to try them all and pick the best:
E(i, j) = min{1 + E(i-1, j), 1 + E(i, j-1), diff(i, j) + E(i-1, j-1)},
where for convenience diff(i, j) is defined to be 0 if x[i] = y[j] and 1 otherwise.
For instance, in computing the edit distance between EXPONENTIAL and POLYNOMIAL,
subproblem E(4, 3) corresponds to the prefixes EXPO and POL. The rightmost column of their
best alignment must be one of the following:
O   or   _   or   O
_        L        L
Thus, E(4, 3) = min{1 + E(3, 3), 1 + E(4, 2), 1 + E(3, 2)}.
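The recurrence can be evaluated directly, top-down, as long as each subproblem is cached. The following is only a sketch under a few assumptions: it uses the base cases E(i, 0) = i and E(0, j) = j, shifts indices by one because Python strings are 0-indexed, and the name `edit_distance_recursive` is invented for this example.

```python
from functools import lru_cache


def edit_distance_recursive(x: str, y: str) -> int:
    """Evaluate E(len(x), len(y)) top-down, caching each subproblem."""

    @lru_cache(maxsize=None)
    def E(i: int, j: int) -> int:
        if i == 0:
            return j  # empty prefix of x against y[1..j]: j insertions
        if j == 0:
            return i  # x[1..i] against empty prefix of y: i deletions
        diff = 0 if x[i - 1] == y[j - 1] else 1  # x[i] is x[i - 1] in Python
        return min(1 + E(i - 1, j), 1 + E(i, j - 1), diff + E(i - 1, j - 1))

    return E(len(x), len(y))


# The three candidates for E(4, 3), i.e. the prefixes EXPO and POL.
candidates = [
    1 + edit_distance_recursive("EXP", "POL"),  # 1 + E(3, 3)
    1 + edit_distance_recursive("EXPO", "PO"),  # 1 + E(4, 2)
    1 + edit_distance_recursive("EXP", "PO"),   # 1 + E(3, 2)
]
print(candidates, min(candidates))  # [4, 3, 4] 3
```

For long strings this version can hit Python's recursion limit, which is one reason to fill the table iteratively instead.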
So, the pseudocode:
Here, m is the number of letters in EXPONENTIAL and n is the number of letters in POLYNOMIAL.
[code]
for i = 0, 1, 2, ..., m:
    E(i, 0) = i
for j = 1, 2, ..., n:
    E(0, j) = j
for i = 1, 2, ..., m:
    for j = 1, 2, ..., n:
        E(i, j) = min{E(i-1, j) + 1, E(i, j-1) + 1, E(i-1, j-1) + diff(i, j)}
return E(m, n)
[/code]
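The pseudocode translates almost line for line into Python. Treat this as a sketch rather than a reference implementation: the table is stored as a list of lists, the indices into x and y are shifted by one because Python strings start at 0, and `edit_distance` is simply the name chosen here.

```python
def edit_distance(x: str, y: str) -> int:
    """Fill the (m+1) x (n+1) table row by row and return E[m][n]."""
    m, n = len(x), len(y)
    # E[i][j] holds the edit distance between x[:i] and y[:j].
    E = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        E[i][0] = i
    for j in range(1, n + 1):
        E[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            diff = 0 if x[i - 1] == y[j - 1] else 1
            E[i][j] = min(E[i - 1][j] + 1,
                          E[i][j - 1] + 1,
                          E[i - 1][j - 1] + diff)
    return E[m][n]


print(edit_distance("SNOWY", "SUNNY"))             # 3
print(edit_distance("EXPONENTIAL", "POLYNOMIAL"))  # 6
```

Each of the (m+1)(n+1) entries takes constant time to fill, so the running time is proportional to m times n.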
And in our example, the edit distance turns out to be 6:
E X P O N E N _ T I A L
_ _ P O L Y N O M I A L
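An alignment like this one can be recovered from the filled table by walking back from E(m, n) and, at each step, following one of the three cases that attains the minimum. The sketch below assumes the same table layout as the pseudocode; `best_alignment` is a hypothetical name. Ties may be broken differently, so the rows it prints need not match the alignment shown above, but the cost is the same.

```python
def best_alignment(x: str, y: str):
    """Return (top row, bottom row, cost) for one optimal alignment."""
    m, n = len(x), len(y)
    E = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        E[i][0] = i
    for j in range(n + 1):
        E[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            diff = 0 if x[i - 1] == y[j - 1] else 1
            E[i][j] = min(E[i - 1][j] + 1, E[i][j - 1] + 1, E[i - 1][j - 1] + diff)

    # Walk back from E[m][n], collecting columns from right to left.
    top, bottom = [], []
    i, j = m, n
    while i > 0 or j > 0:
        diff = 0 if i > 0 and j > 0 and x[i - 1] == y[j - 1] else 1
        if i > 0 and j > 0 and E[i][j] == E[i - 1][j - 1] + diff:
            top.append(x[i - 1])      # column x[i] over y[j]
            bottom.append(y[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and E[i][j] == E[i - 1][j] + 1:
            top.append(x[i - 1])      # column x[i] over a gap
            bottom.append("_")
            i -= 1
        else:
            top.append("_")           # column gap over y[j]
            bottom.append(y[j - 1])
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), E[m][n]


top, bottom, cost = best_alignment("EXPONENTIAL", "POLYNOMIAL")
print(" ".join(top))
print(" ".join(bottom))
print(cost)  # 6
```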
Here are 300 DP problems from UVa:
300 DP problems
Before solving them, first learn from here: Links for the tutorial on DP