Inferring Protein from Spectrum
This problem asks:
Given: A list L of n (n≤100) positive real numbers.
Return: A protein string of length n−1 whose prefix spectrum is equal to L.
References
- Real numbers
- Mass spectrum
- Prefix spectrum
- Monoisotopic mass table
- Stack Overflow on “Get key by value in dictionary”
Restate the problem
I’m given a list of weights, the prefix spectrum. I need to find the differences between consecutive weights, then look each difference up in the monoisotopic mass table to find the corresponding amino acid. Joined in order, those amino acids form the protein string.
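The differencing step can be sketched on its own. This is a minimal illustration, not my submitted code: the function name consecutive_differences and the example spectrum are made up for the sketch (an arbitrary starting mass of 100.0, followed by the masses of G and A from the monoisotopic mass table).

```python
def consecutive_differences(weights):
    """Return the gaps between neighbouring entries of a prefix spectrum."""
    return [later - earlier for earlier, later in zip(weights, weights[1:])]


# Made-up spectrum: arbitrary start of 100.0, then +G (57.02146), then +A (71.03711).
example = [100.0, 157.02146, 228.05857]
gaps = consecutive_differences(example)

# A list of n weights always yields n - 1 gaps, one per amino acid.
print(len(gaps), [round(g, 5) for g in gaps])
```

Each gap is the mass of one amino acid, which is why the protein string has length n−1.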
Solution steps
Looking up the keys of the dictionary based on the values was more difficult than I expected. My solution uses this syntax:
list(rounded_dict.keys())[list(rounded_dict.values()).index(j)]
This builds a list of the dictionary’s values, finds the position of the desired value, and returns the key at the same position in the list of keys. If several keys share a value, it returns the first one. It might have been easier just to rewrite the dictionary so that the weights were the keys.
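Rewriting the dictionary so that the weights are the keys might look something like the sketch below. It is not what I submitted. The names MASSES, mass_to_acid and protein_from_spectrum are hypothetical, the masses are the values from the monoisotopic mass table, and rounding to 2 decimal places is carried over from my own solution.

```python
# Monoisotopic masses of the 20 standard amino acids.
MASSES = {
    'A': 71.03711, 'C': 103.00919, 'D': 115.02694, 'E': 129.04259,
    'F': 147.06841, 'G': 57.02146, 'H': 137.05891, 'I': 113.08406,
    'K': 128.09496, 'L': 113.08406, 'M': 131.04049, 'N': 114.04293,
    'P': 97.05276, 'Q': 128.05858, 'R': 156.10111, 'S': 87.03203,
    'T': 101.04768, 'V': 99.06841, 'W': 186.07931, 'Y': 163.06333,
}

# Inverted table: rounded mass -> amino acid.
# I and L have the same mass, so setdefault keeps whichever is seen first (I).
mass_to_acid = {}
for acid, mass in MASSES.items():
    mass_to_acid.setdefault(round(mass, 2), acid)


def protein_from_spectrum(weights):
    """Translate each gap between consecutive weights into an amino acid."""
    return ''.join(
        mass_to_acid[round(later - earlier, 2)]
        for earlier, later in zip(weights, weights[1:])
    )


# Made-up spectrum: arbitrary start of 100.0, then +G, then +A.
print(protein_from_spectrum([100.0, 157.02146, 228.05857]))
```

With the weights as keys, each lookup is a single dictionary access instead of a scan through a list of values. Because I and L are indistinguishable by mass, the inverted table has 19 entries rather than 20.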
I found that I had to round both the differences between the weights and the masses in the dictionary to 2 decimal places to get reliable matches. As a result, I had to build a second dictionary, rounded_dict, with the same amino acid keys and the rounded masses as values. My lookup routine was:
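for i in range(1, len(L)):
    # gap between consecutive prefix weights = mass of one amino acid
    j = round(L[i] - L[i-1], 2)
    # reverse lookup: print the key whose rounded mass equals j
    print(list(rounded_dict.keys())[list(rounded_dict.values()).index(j)], end='')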
I got a correct result on my first attempt. The protein string was 80 characters long: “IVYQTQCQKPGKSPYAKFDGFYREVAGGDHCWDGEECIIGTFEWGKDEWRHTPQNRATFTGPKKMFWSIIEMDWQFHVII”.
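One alternative to rounding both sides (not the approach used in my solution) is to pick the amino acid whose mass is closest to each difference and reject matches that are too far off. The sketch below is only an illustration: closest_acid and SOME_MASSES are hypothetical names, the table holds just a few entries from the monoisotopic mass table for brevity, and the tolerance of 0.01 is an assumption rather than a value given by the problem.

```python
# Only a few entries of the monoisotopic mass table, for brevity.
SOME_MASSES = {'G': 57.02146, 'A': 71.03711, 'S': 87.03203,
               'Q': 128.05858, 'K': 128.09496}


def closest_acid(gap, masses=SOME_MASSES, tolerance=0.01):
    """Return the amino acid whose mass is nearest to gap, or None if too far."""
    acid = min(masses, key=lambda a: abs(masses[a] - gap))
    return acid if abs(masses[acid] - gap) <= tolerance else None


print(closest_acid(128.0586))  # nearest to Q
print(closest_acid(128.0949))  # nearest to K
print(closest_acid(50.0))      # nothing within tolerance
```

Q and K differ by only about 0.036, so the tolerance has to stay well below that for the two to remain distinguishable.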
Post-solution notes
Challenges solved so far: 54
How many people solved this before me: 1,822
Most recent solve before me: yesterday
Time spent on challenge: 90 minutes
Most time-consuming facet: rounding the differences between the weights and finding a way to look up dictionary keys based on values
Problem explanation: The explanation included reversing the keys and values in the dictionary.