Edit Distance Alignment
This problem asks:
Given: Two protein strings s and t.
Return: The edit distance dE(s,t) followed by two augmented strings s′ and t′ representing an optimal alignment of s and t.
References
- Alignment
- More on alignment
- Indels
- Gap symbols
- Augmented strings
- Edit alignment score
- Optimal alignment
- Margaret Oakley Dayhoff
- Biopython alignment package
Restate the problem
I’m going to get two protein strings, and I need to return the fewest edits needed to transform the first string into the second, along with an optimal alignment of the two strings.
Solution steps
I read about the alignment package in Biopython and found its PairwiseAligner class to be a good fit for this challenge. After writing code that used it, I got a result on Project Rosalind’s sample dataset that differed from the sample output, but I suspected that was a mistake on the Project Rosalind site.
I submitted an incorrect result on my first attempt at a challenge dataset because printing a PairwiseAligner alignment wraps the output into screen-width sections. I read the documentation, but could not find a way to get the resulting alignment strings in raw format.
Then I decided that using a library to do this wasn’t really teaching me much anyway, so I scrapped that approach entirely and started writing my own solution based on the approach from the previous challenge, Edit Distance.
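To give a sense of what a hand-written version can look like, here is a minimal sketch of the usual dynamic programming approach: fill an edit distance table, then trace back from the bottom-right corner to recover the two augmented strings. This is not my exact submission. The function name edit_distance_alignment and the tie-breaking order in the traceback (diagonal first, then a gap in t, then a gap in s) are assumptions made for illustration.

```python
def edit_distance_alignment(s, t):
    """Return (distance, s_aligned, t_aligned) for one optimal alignment.

    Hypothetical helper for illustration; unit cost for substitutions,
    insertions, and deletions.
    """
    m, n = len(s), len(t)

    # dist[i][j] holds the edit distance between s[:i] and t[:j].
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = i  # delete every symbol of s[:i]
    for j in range(1, n + 1):
        dist[0][j] = j  # insert every symbol of t[:j]

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # gap in t
                dist[i][j - 1] + 1,         # gap in s
            )

    # Trace back from the bottom-right cell, building both strings in reverse.
    s_aln, t_aln = [], []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if s[i - 1] == t[j - 1] else 1
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + cost:
            s_aln.append(s[i - 1])
            t_aln.append(t[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            s_aln.append(s[i - 1])
            t_aln.append("-")
            i -= 1
        else:
            s_aln.append("-")
            t_aln.append(t[j - 1])
            j -= 1

    return dist[m][n], "".join(reversed(s_aln)), "".join(reversed(t_aln))


if __name__ == "__main__":
    distance, s_aligned, t_aligned = edit_distance_alignment("PRETTY", "PRTTEIN")
    print(distance)
    print(s_aligned)
    print(t_aligned)
```

Several different alignments can share the same minimum edit distance, so the two strings this prints depend on the tie-breaking order and may not match a sample output character for character even when the distance is the same.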
I submitted a correct response on my second attempt. This was my 58th correct result. By solving this challenge, I unlocked Project Rosalind’s “Alignment” badge level 1. I’ve solved 5 of 19 alignment challenges in the set.
I was the first person to solve this in 4 days. 1,315 people have solved this before me.