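Here is my code. It only works for a single word, and it prints the same word twice. How can I make it go through a list of words and a text file, and print each word with the numbers of the lines it appears on? For example:
index('raven.txt', ['raven', 'mortal', 'dying', 'ghost', 'ghastly', 'evil', 'demon'])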
ghost 9
dying 9
demon 122
evil 99, 106
ghastly 82
mortal 30
My code:
filename = input("type a filename:")
file = open(filename)
counter = 0
lst = []
while True:
    x = input("type word:")
    for line in file.readlines():
        counter += 1
        if line.find(x) >= 0:
            print(x, counter)
By using sets, you can check all the keywords against a line more or less at once, without looping over each keyword separately.
def index(filepath, keywords):
    # Convert the list to a set so each line needs only one intersection
    keys = set(keywords)
    data = {}
    with open(filepath, "r") as fd:
        # enumerate() starts at 0, so the stored line numbers are zero-based
        for i, line in enumerate(fd):
            # Keywords that appear as whitespace-separated tokens on this line
            for key in keys & set(line.split()):
                data.setdefault(key, []).append(i)
    return data

# Python 3: input() and print() replace raw_input() and the print statement
filepath = input("Enter file: ")
keywords = input("Enter keywords: ").split()
data = index(filepath, keywords)
for key in sorted(data):
    print("%s :: %s" % (key, ", ".join(str(i) for i in data[key])))
With the test file below, the output is (line numbers are zero-based):
>python kaka.py
Enter file: test.txt
Enter keywords: help some test
help :: 3, 7
test :: 0, 3, 7
test.txt:
ssdfsdf test sdfsdf
sdf
sdfsdfs
sdf help test
sdfsdf
sdfsdf help test
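One caveat with the set approach: line.split() only yields whitespace-separated tokens, so "ghost," or "Ghost" would not match the keyword "ghost". Below is a minimal sketch of one way around that, assuming case-insensitive matching and 1-based line numbers are what is wanted. The name index_normalized and the sample lines are made up for illustration, and the function takes any iterable of lines (an open file works too).

```python
import string

def index_normalized(lines, keywords):
    """Map each keyword to the 1-based numbers of the lines containing it.

    Hypothetical variant of index(): tokens are lower-cased and stripped of
    surrounding punctuation before the set intersection.
    """
    keys = {word.lower() for word in keywords}
    found = {}
    for line_no, line in enumerate(lines, start=1):
        tokens = {token.strip(string.punctuation).lower() for token in line.split()}
        for key in keys & tokens:
            found.setdefault(key, []).append(line_no)
    return found

# Invented sample text, not taken from raven.txt
sample = [
    "Nothing to see here,",
    "Ghost and demon; evil!",
    "still nothing",
    "An evil ghost.",
]
result = index_normalized(sample, ["ghost", "evil", "raven"])
for word in sorted(result):
    print(word, ", ".join(str(n) for n in result[word]))
# evil 2, 4
# ghost 2, 4
```

Keywords that never occur (here "raven") are simply absent from the result, as in the original index().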
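file.readlines() reads to the end of the file and returns a list of its lines. After that call the file position stays at the end, so the next call (for the next word) returns an empty list and nothing is found. To read the lines again you would have to rewind the file with file.seek(0) or reopen it.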
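To see why a second search on the same open file finds nothing, here is a small sketch. The temporary file and its contents are made up for illustration; only open(), readlines() and seek() matter.

```python
import os
import tempfile

# Write a small throwaway file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first line\nsecond line\n")
    path = tmp.name

with open(path) as file:
    first_read = file.readlines()   # reads up to the end of the file
    second_read = file.readlines()  # position is already at the end
    file.seek(0)                    # rewind to the start
    third_read = file.readlines()

os.remove(path)

print(len(first_read), len(second_read), len(third_read))  # 2 0 2
```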
Instead, you can read all the lines of the file into a list once, and then search that list each time a word is entered.
filename = input("type a filename:")
file = open(filename)
files_lines = [line for line in file]
counter = 0
while True:
    x = input("type word:")
    for line in files_lines:
        counter += 1
        if line.find(x) >= 0:
            print(x, counter)
    # Reset the counter so that for the next search
    # word the line number begins from line number = 0
    counter = 0
The code can be improved further by using enumerate:
filename = input("type a filename:")
file = open(filename)
files_lines = [line for line in file]
while True:
    x = input("type word:")
    line_nos = []
    for line_no, line in enumerate(files_lines, start=1):
        if line.find(x) >= 0:
            line_nos.append(line_no)
    if line_nos:
        print(x, line_nos)
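The question asks for a function called as index(filename, words) rather than an interactive loop. A rough sketch of how the enumerate loop could be wrapped that way follows. The helper name find_lines is an assumption for illustration, and matching is by substring (word in line), which behaves like the line.find(x) >= 0 test above.

```python
def find_lines(lines, words):
    """Return {word: [1-based line numbers]} for words found as substrings."""
    found = {}
    for line_no, line in enumerate(lines, start=1):
        for word in words:
            if word in line:
                found.setdefault(word, []).append(line_no)
    return found

def index(filename, words):
    """Print each matching word followed by its line numbers."""
    with open(filename) as file:
        found = find_lines(file, words)
    for word, line_nos in found.items():
        print(word, ", ".join(str(n) for n in line_nos))

# Small in-memory check with invented lines instead of a real file
demo = find_lines(["an evil demon\n", "nothing\n", "evil again\n"],
                  ["evil", "demon", "ghost"])
print(demo)  # {'evil': [1, 3], 'demon': [1]}
```

Calling index('raven.txt', ['raven', 'mortal', 'dying']) would then print one line per word that occurs in the file, in the order the words are first found. Because the match is by substring, "evil" would also match a line containing "devil".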