Steps for opening the file:
lines = []
with open('seq.3p.peixes.seq') as f:
    lines = f.readlines()
count = 0
for line in lines:
    count += 1
    print(f' {line}')
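Then I would like to use the content of each line as a search term, instead of writing them out one by one. Example of the file (the entries are on separate lines):
MZ051983.1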
MZ051929.1
MZ051921.1
from Bio import Entrez
Entrez.email = "youremail@gmail.com"
search_term = "MZ051983.1, MZ051929.1, MZ051921.1"
# pass the variable itself, not the literal string "search_term"
handle = Entrez.esearch(db="nucleotide", term=search_term, usehistory="y", idtype="acc")
record = Entrez.read(handle)
handle.close()
print(record.keys())
print(record['IdList'])
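Given your examples, you are close to the solution. Depending on your needs, you could write it as a "one-off" script, but I would prefer something a bit more "refined", solved by using functions. Also, in your first code sample you use a counter, but I don't see its point, so feel free to adapt the following code: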
from Bio import Entrez

def get_file_data(file_path):  # file_path could be sys.argv[1] for a standalone script
    with open(file_path) as f:
        # strip the trailing newline from each line and skip blank lines
        return [line.strip() for line in f if line.strip()]

def search_term(term):
    Entrez.email = "youremail@gmail.com"  # for optimisation this should be handled outside of the function, probably using constants or OOP
    handle = Entrez.esearch(db="nucleotide", term=term, usehistory="y", idtype="acc")
    record = Entrez.read(handle)
    handle.close()
    print(record.keys())
    return record['IdList']

if __name__ == "__main__":
    records = []
    for i in get_file_data("file_path.seq"):
        records.append(search_term(i))
    # do something with records
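If one request per line turns out to be too slow, a possible variation is to build a single query from all the lines. This is only a minimal sketch: `read_accessions` and `build_term` are hypothetical helper names (not part of Biopython), and it assumes that the nucleotide database accepts accessions tagged with the `[ACCN]` field and joined with `OR`, which you should check against your own data.

```python
from pathlib import Path


def read_accessions(file_path):
    """Return the non-empty lines of the file, stripped of surrounding whitespace."""
    text = Path(file_path).read_text()
    return [line.strip() for line in text.splitlines() if line.strip()]


def build_term(accessions):
    """Join accessions into one query string.

    Assumed query syntax: 'A[ACCN] OR B[ACCN] OR C[ACCN]'.
    """
    return " OR ".join(f"{acc}[ACCN]" for acc in accessions)
```

The resulting string could then be passed as `term=` to `Entrez.esearch`, in the same way `search_term` does for a single accession.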