How can I use the contents of a txt file as search terms in Entrez.esearch?



This is how I open the file:

lines = []
with open('seq.3p.peixes.seq') as f:
    lines = f.readlines()
count = 0
for line in lines:
    count += 1
    print(f' {line}')

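Then I would like to use the content of each line as a search term, instead of writing them out one by one. Example of the file (each entry is on its own line):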
MZ051983.1
MZ051929.1
MZ051921.1

from Bio import Entrez
Entrez.email  = "youremail@gmail.com"
search_term   = "MZ051983.1, MZ051929.1, MZ051921.1"
# pass the variable itself, not the string literal "search_term"
handle        = Entrez.esearch(db="nucleotide", term=search_term, usehistory="y", idtype="acc")
record        = Entrez.read(handle)
handle.close()
print(record.keys())
record['IdList']

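Given your example, you are close to a solution. Depending on your needs, you could write it as a "one-liner", but I prefer something a bit more "polished" that uses functions. Also, in your first code example you use a counter, but I don't see what it is for, so feel free to adapt the following code: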

from Bio import Entrez

def get_file_data(file_path):  # file_path could be sys.argv[1] for a standalone script
    with open(file_path) as f:
        # strip the trailing newline from each line and skip blank lines
        return [line.strip() for line in f if line.strip()]

def search_term(term):
    Entrez.email  = "youremail@gmail.com"  # for optimisation this should be handled outside of the function, probably using constants or OOP
    handle        = Entrez.esearch(db="nucleotide", term=term, usehistory="y", idtype="acc")
    record        = Entrez.read(handle)
    handle.close()
    print(record.keys())
    return record['IdList']

if __name__ == "__main__":
    records = []
    for term in get_file_data("file_path.seq"):
        records.append(search_term(term))
    # do something with records
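
If one request per line turns out to be too slow, an alternative is to combine several accession numbers into a single query string with the OR operator and send fewer requests. The following is only a minimal sketch under assumptions: the helper names `read_accessions`, `build_term` and `chunked` are hypothetical, the batch size of 2 is arbitrary, and the sample lines stand in for the file contents. It only builds the query strings; each one could then be passed as `term` to `Entrez.esearch`.

```python
def read_accessions(lines):
    """Return non-empty, whitespace-stripped entries from an iterable of lines."""
    return [line.strip() for line in lines if line.strip()]


def build_term(accessions):
    """Join accession numbers into one query string using the OR operator."""
    return " OR ".join(accessions)


def chunked(items, size):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    # Sample lines standing in for the result of f.readlines() (assumption)
    sample = ["MZ051983.1\n", "MZ051929.1\n", "\n", "MZ051921.1\n"]
    accessions = read_accessions(sample)
    for batch in chunked(accessions, 2):  # batch size chosen arbitrarily
        print(build_term(batch))
```

Keep in mind that esearch returns only the first 20 IDs by default, so for larger batches the `retmax` parameter would probably need to be raised as well.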
