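I have two lists: ListA contains tab-delimited strings, and ListB contains short strings that partially match the strings in ListA. I want to sort the strings in ListA according to the order of the partially matching strings in ListB.
What I tried was to loop over ListA, split each line on the tab character, split the fifth column on _, and append the resulting prefix to a temporary list, ListC. I then sorted ListC, but I still don't know how to sort ListA given the sorted ListC.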
ListA = ['rs141130360\tchr1:16495\tC\t653635\tNC_024540.1\tTranscript\tintron_variant,non_coding_transcript_variant\t-\t-\t-\t-\t-\trs3210724\tG\tMODIFIER\t-\t-1\t-\tSNV\tWASH7P\tEntrezGene\tHGNC:38034\ttranscribed_pseudogene\t-\t-\t-\t-\t-\t-\t-\t-\t-\tRefSeq\tG\tG\tOK\t-\t-\t-\t-\t8/10\t-\t-\tNR_024540.1:n.1080+112C>G\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\n',
'rs141130360\tchr1:16495\tC\t100287102\tNR_046018.2\tTranscript\tdownstream_gene_variant\t-\t-\t-\t-\t-\trs3210724\tG\tMODIFIER\t2086\t1\t-\tSNV\tDDX11L1\tEntrezGene\tHGNC:37102\ttranscribed_pseudogene\t-\t-\t-\t-\t-\t-\t-\t-\t-\tRefSeq\tG\tG\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\n',
'rs141130360\tchr1:16495\tC\t102466751\tNG_106918.1\tTranscript\tdownstream_gene_variant\t-\t-\t-\t-\t-\trs3210724\tG\tMODIFIER\t874\t-1\t-\tSNV\tMIR6859-1\tEntrezGene\tHGNC:50039\tmiRNA\t-\t-\t-\t-\t-\t-\t-\t-\t-\tRefSeq\tG\tG\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\n']
ListB = ["NC", "NG", "NM", "NP", "NR", "XM", "XP", "XR", "WP"]
ListC = []
for i in ListA:
    # Column 5 holds the accession, e.g. "NC_024540.1"; keep the part before "_".
    i_split = i.split("\t")[4].split("_")[0]
    ListC.append(i_split)
ListC = sorted(ListC, key=lambda x: ListB.index(x))
print(ListC)
This prints:
['NC', 'NG', 'NR']
My expected result is:
['rs141130360\tchr1:16495\tC\t653635\tNC_024540.1\tTranscript\tintron_variant,non_coding_transcript_variant\t-\t-\t-\t-\t-\trs3210724\tG\tMODIFIER\t-\t-1\t-\tSNV\tWASH7P\tEntrezGene\tHGNC:38034\ttranscribed_pseudogene\t-\t-\t-\t-\t-\t-\t-\t-\t-\tRefSeq\tG\tG\tOK\t-\t-\t-\t-\t8/10\t-\t-\tNR_024540.1:n.1080+112C>G\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\n',
'rs141130360\tchr1:16495\tC\t102466751\tNG_106918.1\tTranscript\tdownstream_gene_variant\t-\t-\t-\t-\t-\trs3210724\tG\tMODIFIER\t874\t-1\t-\tSNV\tMIR6859-1\tEntrezGene\tHGNC:50039\tmiRNA\t-\t-\t-\t-\t-\t-\t-\t-\t-\tRefSeq\tG\tG\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\n',
'rs141130360\tchr1:16495\tC\t100287102\tNR_046018.2\tTranscript\tdownstream_gene_variant\t-\t-\t-\t-\t-\trs3210724\tG\tMODIFIER\t2086\t1\t-\tSNV\tDDX11L1\tEntrezGene\tHGNC:37102\ttranscribed_pseudogene\t-\t-\t-\t-\t-\t-\t-\t-\t-\tRefSeq\tG\tG\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\n']
Rather than sorting a separate list, I would convert ListB into a dictionary that maps each value to its index, then write a function that extracts the prefix from a string and looks it up in the dictionary. That function becomes the key for sorted.
# Map each prefix to its position in ListB, e.g. {"NC": 0, "NG": 1, ...}.
d = {x: i for i, x in enumerate(ListB)}

def get_index(s):
    # Fifth tab-separated column, e.g. "NC_024540.1".
    by_tabs = s.split('\t')
    by_underscore = by_tabs[4].split('_')
    # Rank of the prefix ("NC") according to ListB.
    return d[by_underscore[0]]

listC = sorted(ListA, key=get_index)
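
As a self-contained sketch of the same idea, here is a version you can paste and run on its own. The rows are abbreviated to the first six columns, and the names rows, priority, rank and prefix_rank, as well as the ZZ row, are made up for illustration. The only real difference from get_index is the use of dict.get with a default, so that a prefix missing from the list sorts last instead of raising KeyError. Whether that fallback is what you want is an assumption on my part.

```python
# Abbreviated, hypothetical rows: only the first six tab-separated columns are kept.
rows = [
    "rs141130360\tchr1:16495\tC\t653635\tNC_024540.1\tTranscript\n",
    "rs141130360\tchr1:16495\tC\t100287102\tNR_046018.2\tTranscript\n",
    "rs141130360\tchr1:16495\tC\t102466751\tNG_106918.1\tTranscript\n",
    # Made-up row whose prefix is not in the priority list.
    "rs141130360\tchr1:16495\tC\t000000\tZZ_000000.1\tTranscript\n",
]
priority = ["NC", "NG", "NM", "NP", "NR", "XM", "XP", "XR", "WP"]

# Map each prefix to its position in the priority list.
rank = {prefix: i for i, prefix in enumerate(priority)}


def prefix_rank(row):
    """Return the rank of the accession prefix in column 5, or len(rank) if unknown."""
    accession = row.split("\t")[4]           # e.g. "NC_024540.1"
    prefix = accession.split("_", 1)[0]      # e.g. "NC"
    return rank.get(prefix, len(rank))       # unknown prefixes sort last


sorted_rows = sorted(rows, key=prefix_rank)
sorted_prefixes = [r.split("\t")[4].split("_")[0] for r in sorted_rows]
print(sorted_prefixes)  # ['NC', 'NG', 'NR', 'ZZ']
```

Because sorted is stable, rows that share a prefix keep their original relative order.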