I have a file, bam.txt, that looks like this:
exomesinglesample_out/bam/pfg001G.GRCh38DH.target.bam
exomesinglesample_out/bam/pfg002G.GRCh38DH.target.bam
exomesinglesample_out/bam/pfg014G.GRCh38DH.target.bam
And another file, bai.txt:
exomesinglesample_out/bam/pfg001G.GRCh38DH.target.bai
exomesinglesample_out/bam/pfg002G.GRCh38DH.target.bai
exomesinglesample_out/bam/pfg014G.GRCh38DH.target.bai
I want to create a list of dictionaries whose keys are always the same:
keys = ['bam','bam_index']
d = dict.fromkeys(keys)
l = [d for x in range(3)]
print(l)
[{'bam': None, 'bam_index': None}, {'bam': None, 'bam_index': None}, {'bam': None, 'bam_index': None}]
Instead of holding None values, the dictionaries should look like this:
[{'bam': 'exomesinglesample_out/bam/pfg001G.GRCh38DH.target.bam', 'bam_index': 'exomesinglesample_out/bam/pfg001G.GRCh38DH.target.bai'}, {'bam': 'exomesinglesample_out/bam/pfg002G.GRCh38DH.target.bam', 'bam_index': 'exomesinglesample_out/bam/pfg002G.GRCh38DH.target.bai'}, {'bam': 'exomesinglesample_out/bam/pfg014G.GRCh38DH.target.bam', 'bam_index': 'exomesinglesample_out/bam/pfg014G.GRCh38DH.target.bai'}]
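In other words, the first value of the first dictionary in the list must be the first line of bam.txt, the second value of that dictionary must be the first line of bai.txt, and so on through the last line of both files.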
I know that dict values cannot be indexed, since dictionaries are an unordered data type in Python, so perhaps this needs to be solved with tuples, generators, or lists.
You can use zip() to iterate over both files line by line and build each dictionary as you go:
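keys = ['bam', 'bam_index']
l = []
with open('bam.txt', 'r') as f1, open('bai.txt', 'r') as f2:
    # zip() pairs line N of bam.txt with line N of bai.txt
    for lf1, lf2 in zip(f1, f2):
        # strip() removes the trailing newline from each line
        d = {keys[0]: lf1.strip(), keys[1]: lf2.strip()}
        l.append(d)
print(l)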
Output:
[{'bam': 'exomesinglesample_out/bam/pfg001G.GRCh38DH.target.bam', 'bam_index': 'exomesinglesample_out/bam/pfg001G.GRCh38DH.target.bai'}, {'bam': 'exomesinglesample_out/bam/pfg002G.GRCh38DH.target.bam', 'bam_index': 'exomesinglesample_out/bam/pfg002G.GRCh38DH.target.bai'}, {'bam': 'exomesinglesample_out/bam/pfg014G.GRCh38DH.target.bam', 'bam_index': 'exomesinglesample_out/bam/pfg014G.GRCh38DH.target.bai'}]
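If a more compact form is preferred, the same zip() idea can probably be written as a list comprehension wrapped in a small function. This is only a sketch: the function name pair_lines and the KEYS constant are made up for illustration, and it assumes both files list their entries in the same order, one path per line.

```python
KEYS = ("bam", "bam_index")  # hypothetical constant, mirrors the keys in the question


def pair_lines(bam_list_path, bai_list_path, keys=KEYS):
    """Return one dict per pair of lines, mapping each key to a stripped line."""
    with open(bam_list_path) as f1, open(bai_list_path) as f2:
        # The outer zip() walks both files in lockstep; the inner zip()
        # matches each key with the corresponding stripped line.
        return [
            dict(zip(keys, (line1.strip(), line2.strip())))
            for line1, line2 in zip(f1, f2)
        ]
```

Calling pair_lines('bam.txt', 'bai.txt') should give a list with one new dictionary per line pair. Note that zip() stops at the shorter file, so a missing line in either file is silently ignored; on Python 3.10 or later, passing strict=True to the outer zip() raises a ValueError in that case instead.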