Complete beginner here, so apologies if this is obvious!
I have a file whose lines have the form `name  +/-` or `IG_name  0`, in a long list like this:
S1 +
IG_1 0
S2 -
IG_S3 0
S3 +
S4 -
dnaA +
IG_dnaA 0
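Everything that starts with IG_ has a corresponding name. I want to add the + or - to the IG_name. For example, IG_S3 should be + because S3 is +.
The information is gene names and strand information; IG = intergenic region. Basically, I want to know which strand each intergenic region is on. What I think I want: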
open file
for every line, if the line starts with IG_*
    find the line with *
    print("IG_" and the line it found)
else
    print line
What I have:
with open(sys.argv[2]) as geneInfo:
    with open(sys.argv[1]) as origin:
        for line in origin:
            if line.startswith("IG_"):
                name = line.split("_")[1]
                nname = name[:-3]
                for newline in geneInfo:
                    if re.match(nname, newline):
                        print("IG_"+newline)
            else:
                print(line)
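Here origin is the mixed list, and geneInfo contains only the names, not the IG_names.
With this code, I end up with a list containing only the lines printed by the else statement: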
S1 +
S2 -
S3 +
S4 -
dnaA +
My problem is that I can't tell what is wrong with the search, so I don't know how to (try to) fix it!
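Below is some step-by-step commented code that will hopefully do what you want (although instead of using print, I have collected the results in a list so you can actually use them). I'm not quite sure what is happening in your existing code (especially, how are you handling the two files?).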
s_dict = {}
ig_list = []
with open('genes.txt', 'r') as infile:  # Simulating reading the file you pass in sys.argv
    for line in infile:
        if line.startswith('IG_'):
            ig_list.append(line.split()[0])  # Collect all our IG values for later
        else:
            s_name, value = line.split()  # Separate out the S value and its operator
            s_dict[s_name] = value.strip()  # Add to dictionary to map S to operator
# Now you can go back through your list of IG values and append the appropriate operator
pulled_together = []
for item in ig_list:
    s_value = item.split('_')[1]
    # The following will look for the operator mapped to the S value. If it is
    # not found, it will instead give you 'Not found'
    corresponding_operator = s_dict.get(s_value, 'Not found')
    pulled_together.append([item, corresponding_operator])
print('List structure')
print(pulled_together)
print('\n')
print('Printout of each item in list')
for item in pulled_together:
    print(item[0] + '\t' + item[1])
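On the question of how the two files are handled in the original code: an open file is an iterator that can only be consumed once, so an inner `for newline in geneInfo` loop reads to the end of the file the first time it runs and finds nothing on later passes. A minimal sketch of that behaviour, using `io.StringIO` as a stand-in for the real file (the three sample lines are assumed, not taken from the actual gene info file):

```python
import io

# Stand-in for the gene info file; io.StringIO behaves like an open text file.
gene_info = io.StringIO("S1\t+\nS2\t-\nS3\t+\n")

first_pass = [line.split()[0] for line in gene_info]
second_pass = [line.split()[0] for line in gene_info]  # already at end of file

print(first_pass)   # ['S1', 'S2', 'S3']
print(second_pass)  # []

# Rewinding makes the file readable again, at the cost of rescanning it each time.
gene_info.seek(0)
third_pass = [line.split()[0] for line in gene_info]
print(third_pass)   # ['S1', 'S2', 'S3']
```

Reading the names into a dictionary once, as in the code above, avoids both the exhausted iterator and the repeated scans.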
nname = name[:-3]
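Python's slicing is very powerful on lists (and on strings, as here), but it can be tricky to get right.
When you write [:-3], you take everything except the last three items. The catch is that if the sequence has fewer than three elements, you do not get an error; you get an empty result instead (an empty string in this case).
I think this may be where things go wrong: if there is not much left on the line after the split, the slice returns an empty string. If you could tell us exactly what you expect it to return there, with an example or otherwise, that would help a lot, because I really don't know what you are trying to get from that slice.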
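To see what the slice does in practice, here is a small sketch. The sample string is an assumption about what follows `IG_` on one line (name, a single tab, the value, a newline); the real file may use different separators, which would change the result.

```python
# Assumed remainder of one line after splitting on "_": name, tab, value, newline.
name = "S3\t0\n"
sliced = name[:-3]
print(repr(sliced))  # 'S3'

# With fewer than three characters the slice is silently empty; no IndexError.
short = "S3"[:-3]
print(repr(short))   # ''

# Splitting on whitespace does not depend on how many characters trail the name.
robust = name.split()[0]
print(robust)        # S3
```

Because `[:-3]` relies on exactly three trailing characters, `split()[0]` is the safer way to pull out the name whenever the spacing might vary.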
Does this do what you want?
from __future__ import print_function
import sys
# Read and store all the gene info lines, keyed by name
gene_info = dict()
with open(sys.argv[2]) as gene_info_file:
    for line in gene_info_file:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        name = tokens[0].strip()
        gene_info[name] = line
# Read the other file and lookup the names
with open(sys.argv[1]) as origin_file:
    for line in origin_file:
        if line.startswith("IG_"):
            name = line.split("_")[1]
            nname = name[:-3].strip()
            if nname in gene_info:
                lookup_line = gene_info[nname]
                # The stored line already ends with a newline, so suppress print's own
                print("IG_" + lookup_line, end="")
            else:
                pass  # what do you want to do in this case?
        else:
            print(line, end="")
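If only the mixed file is available, the same dictionary lookup can be done in two passes over its lines. The sketch below is one possible way to do that, not a drop-in replacement: `annotate_strands` is a hypothetical helper name, the sample lines are assumed to be tab-separated, and IG_ entries with no matching gene (such as IG_1) are assumed to keep their original 0.

```python
def annotate_strands(lines):
    """Return the lines with each IG_ entry's value replaced by its gene's strand.

    `lines` must be a list (for example from readlines()), since it is read twice.
    """
    # First pass: map every gene name to its strand.
    strands = {}
    for line in lines:
        fields = line.split()
        if len(fields) == 2 and not fields[0].startswith("IG_"):
            strands[fields[0]] = fields[1]

    # Second pass: keep the original order, filling in strands for IG_ entries.
    annotated = []
    for line in lines:
        fields = line.split()
        if len(fields) == 2 and fields[0].startswith("IG_"):
            gene = fields[0][len("IG_"):]
            # Entries with no matching gene keep their original value.
            annotated.append(fields[0] + "\t" + strands.get(gene, fields[1]))
        else:
            annotated.append(line.rstrip("\n"))
    return annotated


sample = ["S1\t+\n", "IG_1\t0\n", "S2\t-\n", "IG_S3\t0\n",
          "S3\t+\n", "S4\t-\n", "dnaA\t+\n", "IG_dnaA\t0\n"]
for row in annotate_strands(sample):
    print(row)
```

Two passes are needed because an IG_ line can appear before its gene, as IG_S3 does before S3 in the sample.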