Enumerating k-mers Lexicographically
This problem asks:
Given: A collection of at most 10 symbols defining an ordered alphabet, and a positive integer n (n≤10).
Return: All strings of length n that can be formed from the alphabet, ordered lexicographically
Required reading
Restate the problem
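This challenge is easier to think about if you read “lexicographical” as “alphabetical”. Here they mean the same thing, with one caveat: the ordering is the one defined by the alphabet you are given, which doesn’t have to match the usual A-to-Z order.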
I’m going to get a set of at most 10 letters, already in order, and a number n that is less than or equal to 10. I need to return every possible string of length n that I can make from the original set of letters (a letter can be used more than once), in alphabetical order.
Solution steps
First, I decided that itertools is a general-enough library that using it is fair game for Project Rosalind.
As a result, the solution was:
import itertools

def lexf(A, n):
    # product() yields tuples in the order of A, so the output is already sorted
    for x in itertools.product(A, repeat=n):
        print(''.join(x))
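To try this out, here is a small sketch that returns the strings as a list instead of printing them, which makes the result easier to check. The function name lexf_strings and the input handling are my own assumptions for illustration, not part of the original solution; the alphabet and n are taken from the style of the problem's sample input (symbols separated by spaces, then n).

```python
import itertools

def lexf_strings(alphabet, n):
    """Return all length-n strings over alphabet, in the order the alphabet defines."""
    # Each tuple from product() holds n symbols; join them into one string.
    return [''.join(p) for p in itertools.product(alphabet, repeat=n)]

# Sample-style input: an ordered alphabet and a length (assumed values)
alphabet = "A C G T".split()
result = lexf_strings(alphabet, 2)
print('\n'.join(result))
```

With 4 symbols and n = 2 this produces 4 ** 2 = 16 strings, starting at AA and ending at TT.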
The full code is here.
Python concepts
Working with itertools was the only new element in this challenge.
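If you want to see roughly what itertools.product is doing for this problem, the same ordering can be reproduced with a short recursive generator. This is only a sketch of the idea, not how itertools is implemented, and the name product_strings is hypothetical.

```python
import itertools

def product_strings(alphabet, n):
    """Yield every length-n string over alphabet, in the order the alphabet is given."""
    if n == 0:
        # Exactly one string of length zero: the empty string.
        yield ''
        return
    # Fix the first symbol, then append every possible (n - 1)-length tail.
    for symbol in alphabet:
        for rest in product_strings(alphabet, n - 1):
            yield symbol + rest

# The hand-written version should agree with itertools.product.
by_hand = list(product_strings("ACGT", 3))
by_itertools = [''.join(p) for p in itertools.product("ACGT", repeat=3)]
print(by_hand == by_itertools)
```

Because the outer loop walks the alphabet in the order it was given, the first symbol changes slowest and the last symbol changes fastest, which is what makes the output come out in lexicographic order without any sorting.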
Bioinformatics concepts
This challenge didn’t have much to do with bioinformatics, except that the introduction mentions that it’s good form to list genetic strings in alphabetical order.