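python - Problem splitting a string and translating characters with a dictionary - bioinformatics OOP
Problem description
I have a problem with my program. This is the part of the code that fails.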
def revcmpl(self):
    # TODO: convert sequence contained in the object
    # to a list called seq
    seq = list(self.seq)
    # TODO: reverse the list in-place
    seq.reverse()
    # TODO: using string method join(), the class dictionary ALPH and a
    # list comprehension, translate the reversed sequence and
    # convert into a string
    seq = list(seq)
    seq_revcmpl = ''.join(DNASeq.ALPH[key] for key in self.seq.split())
    seq_revcmpl = str(seq_revcmpl)
    # TODO: create seqid variable and assign to it the object's seqid
    # and the suffix '_revcmpl'
    seqid = f'{self.seqid}_revcmpl'
    # TODO: create a new object of DNASeq type using the new seqid,
    # title contained in the object and
    # reversed and translated sequence,
    # return the new object
    obj1 = DNASeq(seqid, title, seq_revcmpl)
    return obj1
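I tried to use the string method join(), the class dictionary ALPH, and a list comprehension to translate the reversed sequence and convert it into a string. I tried running this: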
# reload the sequences to have a collection of objects
# that are instances of the up-to-date DNASeq class
seqs = DNASeq.from_file('input/Staphylococcus_MLST_genes.fasta')
# select one of the sequences by its sequence id (seqid)
seq = seqs['yqiL']
new_seq = seq.revcmpl()
print( new_seq )
But I get an error:
KeyError Traceback (most recent call last)
<ipython-input-57-a28b468b9cfe> in <module>
7 seq = seqs['yqiL']
8
----> 9 new_seq = seq.revcmpl()
10
11 print( new_seq )
<ipython-input-43-07d175957482> in revcmpl(self)
211
212 seq = list(seq)
--> 213 seq_revcmpl = ''.join(DNASeq.ALPH[key] for key in self.seq.split())
214 seq_revcmpl = str(seq_revcmpl)
215
<ipython-input-43-07d175957482> in <genexpr>(.0)
211
212 seq = list(seq)
--> 213 seq_revcmpl = ''.join(DNASeq.ALPH[key] for key in self.seq.split())
214 seq_revcmpl = str(seq_revcmpl)
215
KeyError: 'GCGTTTAAAGACGTGCCAGCCTATGATTTAGGTGCGACTTTAATAGAACATATTATTAAAGAGACGGGTTTGAATCCAAGTGAGATTGATGAAGTTATCATCGGTAACGTACTACAAGCAGGACAAGGACAAAATCCAGCACGAATTGCTGCTATGAAAGGTGGCTTGCCAGAAACAGTACCTGCATTTACAGTGAATAAAGTATGTGGTTCTGGGTTAAAGTCGATTCAATTAGCATATCAATCTATTGTGACTGGTGAAAATGACATCGTGCTAGCTGGCGGTATGGAGAATATGTCTCAGTCACCAATGCTTGTCAACAACAGTCGCTTCGGTTTTAAAATGGGACATCAATCAATGGTTGATAGCATGGTATATGATGGTTTAACAGATGTATTTAATCAATATCATATGGGTATTACTGCTGAAAATTTAGTGGAGCAATATGGTATTTCAAGAGAAGAACAAGATACATTTGCTGTAAACTCACAACAAAAAGCAGTACGTGCACAGCAA'
But why? I did split the sequence: seq_revcmpl = ''.join(DNASeq.ALPH[key] for key in self.seq.split())
Solution
The problem is here:
seq_revcmpl = ''.join(DNASeq.ALPH[key] for key in self.seq.split())
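self.seq does not contain any whitespace, so self.seq.split() returns a list with a single item: the sequence itself. The generator expression therefore runs for only one iteration (the list holds just that one long string), and key is the entire sequence. That is why the KeyError shows the whole sequence as the missing key.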
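The difference can be checked on a short string. This is a small illustration only; the sequence below is a made-up stand-in for the real one.

```python
seq = "GCGTTTAAAG"  # short stand-in for the real sequence (assumption)

# No whitespace in the string, so split() yields a single item:
# the whole string.
parts = seq.split()
print(parts)      # ['GCGTTTAAAG']

# Iterating over the string itself yields one character at a time.
chars = list(seq)
print(chars[:4])  # ['G', 'C', 'G', 'T']
```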
I think what you want is:
seq_revcmpl = ''.join(DNASeq.ALPH[key] for key in self.seq)
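Two further details in the posted method look worth checking, although the traceback never reaches them. Iterating over self.seq translates the original order, so the already reversed list seq is probably what should be translated, and the bare name title would likely need to be self.title. Below is a minimal sketch of the whole method under those assumptions. The contents of ALPH, the title attribute, and the constructor body are not shown in the question and are guessed here.

```python
class DNASeq:
    # Assumed complement table; the real ALPH is not shown in the question.
    ALPH = {'A': 'T', 'T': 'A', 'G': 'C', 'C': 'G'}

    def __init__(self, seqid, title, seq):
        # Argument order taken from the DNASeq(seqid, title, ...) call
        # in the question; the attribute names are assumptions.
        self.seqid = seqid
        self.title = title
        self.seq = seq

    def revcmpl(self):
        # Convert the sequence to a list and reverse it in place.
        seq = list(self.seq)
        seq.reverse()
        # Translate base by base, iterating over the reversed list.
        seq_revcmpl = ''.join(DNASeq.ALPH[base] for base in seq)
        seqid = f'{self.seqid}_revcmpl'
        # Use self.title rather than the undefined bare name title.
        return DNASeq(seqid, self.title, seq_revcmpl)


# Made-up example data, only to exercise the method.
original = DNASeq('demo', 'made-up example', 'GCGTTTA')
result = original.revcmpl()
print(result.seqid, result.seq)  # demo_revcmpl TAAACGC
```

If real sequences contain characters outside ALPH (for example N or lowercase bases), the lookup would still raise KeyError, so the table would need entries for them.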