Dictionary: string splitting and character translation problem in bioinformatics OOP



My program has a problem. This is the part of the code that fails.

def revcmpl(self):

    # TODO: convert sequence contained in the object
    #       to a list called seq

    seq = list(self.seq)

    # TODO: reverse the list in-place

    seq.reverse()

    # TODO: using string method join(), the class dictionary ALPH and a
    #       list comprehension, translate the reversed sequence and
    #       convert into a string

    seq = list(seq)
    seq_revcmpl = ''.join(DNASeq.ALPH[key] for key in self.seq.split())
    seq_revcmpl = str(seq_revcmpl)

    # TODO: create seqid variable and assign to it the object's seqid
    #       and the suffix '_revcmpl'

    seqid = f'{self.seqid}_revcmpl'

    # TODO: create a new object of DNASeq type using the new seqid,
    #       title contained in the object and
    #       reversed and translated sequence,
    #       return the new object

    obj1 = DNASeq(seqid, title, seq_revcmpl)
    return obj1

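I tried to use the string method join(), the class dictionary ALPH and a list comprehension to translate the reversed sequence and convert it into a string. I tried running this: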

# reload the sequences to have a collection of objects
# that are instances of the up-to-date DNASeq class
seqs = DNASeq.from_file('input/Staphylococcus_MLST_genes.fasta')
# select one of the sequences by its sequence id (seqid)
seq = seqs['yqiL']
new_seq = seq.revcmpl()
print( new_seq )

But I get an error:

KeyError                                  Traceback (most recent call last)
<ipython-input-57-a28b468b9cfe> in <module>
7 seq = seqs['yqiL']
8 
----> 9 new_seq = seq.revcmpl()
10 
11 print( new_seq )
<ipython-input-43-07d175957482> in revcmpl(self)
211 
212         seq = list(seq)
--> 213         seq_revcmpl = ''.join(DNASeq.ALPH[key] for key in self.seq.split())
214         seq_revcmpl = str(seq_revcmpl)
215 
<ipython-input-43-07d175957482> in <genexpr>(.0)
211 
212         seq = list(seq)
--> 213         seq_revcmpl = ''.join(DNASeq.ALPH[key] for key in self.seq.split())
214         seq_revcmpl = str(seq_revcmpl)
215 
KeyError: 'GCGTTTAAAGACGTGCCAGCCTATGATTTAGGTGCGACTTTAATAGAACATATTATTAAAGAGACGGGTTTGAATCCAAGTGAGATTGATGAAGTTATCATCGGTAACGTACTACAAGCAGGACAAGGACAAAATCCAGCACGAATTGCTGCTATGAAAGGTGGCTTGCCAGAAACAGTACCTGCATTTACAGTGAATAAAGTATGTGGTTCTGGGTTAAAGTCGATTCAATTAGCATATCAATCTATTGTGACTGGTGAAAATGACATCGTGCTAGCTGGCGGTATGGAGAATATGTCTCAGTCACCAATGCTTGTCAACAACAGTCGCTTCGGTTTTAAAATGGGACATCAATCAATGGTTGATAGCATGGTATATGATGGTTTAACAGATGTATTTAATCAATATCATATGGGTATTACTGCTGAAAATTTAGTGGAGCAATATGGTATTTCAAGAGAAGAACAAGATACATTTGCTGTAAACTCACAACAAAAAGCAGTACGTGCACAGCAA'

But why? I did split the sequence: seq_revcmpl = ''.join(DNASeq.ALPH[key] for key in self.seq.split())

The problem is here:

seq_revcmpl = ''.join(DNASeq.ALPH[key] for key in self.seq.split())

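self.seq does not contain any whitespace, so self.seq.split() returns a list with a single item: the sequence itself.

The generator expression therefore runs only one iteration (the list has just one item, one big string), and key is the entire sequence. That whole string is not a key in ALPH, which is why the KeyError shows the full sequence.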
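
To make the difference concrete, here is a small sketch of what split() does on a string without whitespace, compared with iterating over the string directly. The short sequence is a made-up stand-in for self.seq, not data from your file.

```python
# Hypothetical short sequence standing in for self.seq
seq = "GATTACA"

# split() with no arguments splits on whitespace; there is none here,
# so the result is a one-element list holding the whole string.
parts = seq.split()
print(parts)   # ['GATTACA']

# Iterating over the string itself (or calling list() on it)
# yields one character at a time.
chars = list(seq)
print(chars)   # ['G', 'A', 'T', 'T', 'A', 'C', 'A']
```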

I think what you want is:

seq_revcmpl = ''.join(DNASeq.ALPH[key] for key in self.seq)
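
Note that iterating over self.seq translates the sequence in its original order, so on its own this gives the complement rather than the reverse complement. If you want the reversal from seq.reverse() to take effect, you would probably iterate over the reversed list seq instead. Below is a minimal sketch of the whole method under some assumptions: I am guessing that ALPH is a base-to-complement mapping, and that the constructor takes (seqid, title, seq) as in your DNASeq(...) call. The stripped-down class is only for illustration and leaves out from_file. It also passes self.title, since a bare title is not defined inside the method and would raise a NameError unless a global with that name exists.

```python
class DNASeq:
    # Assumed complement table; the real class may define more symbols.
    ALPH = {"A": "T", "T": "A", "G": "C", "C": "G"}

    def __init__(self, seqid, title, seq):
        # Assumed constructor signature, inferred from DNASeq(seqid, title, seq_revcmpl)
        self.seqid = seqid
        self.title = title
        self.seq = seq

    def revcmpl(self):
        # Convert the sequence to a list and reverse it in place
        seq = list(self.seq)
        seq.reverse()

        # Translate each base of the *reversed* list, then join into a string
        seq_revcmpl = "".join([DNASeq.ALPH[base] for base in seq])

        # New id with the '_revcmpl' suffix; the title is taken from the object
        seqid = f"{self.seqid}_revcmpl"
        return DNASeq(seqid, self.title, seq_revcmpl)


# Hypothetical example object, not taken from the FASTA file
original = DNASeq("demo", "hypothetical example", "GATTACA")
result = original.revcmpl()
print(result.seqid, result.seq)   # demo_revcmpl TGTAATC
```

If your real sequences contain symbols that are missing from ALPH (for example N or lowercase letters), the lookup will still raise a KeyError, but this time for a single character, which is much easier to diagnose.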
