How do I read a line from a file, assign it to a variable, search for it in another file, and then move on to the next line of the first file?



This is part of a project I am working on; I have simplified the problem. I have two files. The first file contains a list of terms, for example:

dog
apple
gold
boy
tomato

My second file contains a paragraph that may, but does not have to, contain terms found in the first file. Example:

the dog for some reason had a grand 
appetite for eating golden apples, 
but the dog did not like eating tomatoes
the dog only likes eating gold colored foods

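My goal is to open the first file and assign the term on its first line (in this case "dog") to the variable "wanted_word". Then I want to search for this "wanted_word" in every line of the second file. If the string is found, I want to create a file containing the first three words of each line in which "wanted_word" was found. So my desired output is: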

the dog for
but the dog
the dog only
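
To make the expected output concrete: each output line is the first three whitespace-separated words of a line that contains the term. The following is only a minimal sketch, assuming plain substring matching; the helper name `first_three_words` is made up for illustration and does not appear in the original code.

```python
def first_three_words(line):
    """Return the first three whitespace-separated words of a line."""
    return " ".join(line.split()[:3])


# The example paragraph from the second file, one string per line.
paragraph = [
    "the dog for some reason had a grand",
    "appetite for eating golden apples,",
    "but the dog did not like eating tomatoes",
    "the dog only likes eating gold colored foods",
]

wanted_word = "dog"

# Keep only the lines containing the term, reduced to their first three words.
matches = [first_three_words(line) for line in paragraph if wanted_word in line]
print(matches)  # ['the dog for', 'but the dog', 'the dog only']
```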

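With my current code I can do this much. My problem is that, after creating the file, I want to move on to the string on the next line of the first file (in this case "apple"). The idea is to repeat the whole process, creating a new file each time a string from the first file is found in the second file. If the string is not in the second file, I want the program to move on to the next line.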

My code:

def word_match(Listofwords, string):
    wordnumber = 0
    listOfAssociatedWords = []
    with open(Listofwords, 'r') as read_obj:
        for line in read_obj:
            wordnumber += 1
            if string in line:
                listOfAssociatedWords.append(line.split()[:3])
    return listOfAssociatedWords
#------------------------------------------------------------------------------
Firstfile = open("/Directory/firstfilename", "r")
wanted_word = Firstfile.readline(3) #This part also undermines my goal since I limit the read to 3 chars
Firstfile.close()
#------------------------------------------------------------------------------
matched_words = word_match("/Directory/secondfilename", wanted_word)
NewFile = open(wanted_word + '.txt', "w") #this is for the file creation part
for elem in matched_words:
    NewFile.write(elem[0] + "  " + elem[1] + "  " + elem[2])
    NewFile.write("\n")

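So in the end, following this logic, I would have 4 files: one for every term except "boy", which does not appear in the second file. I know I need a loop, but I am inexperienced with Python and need help.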
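
On the `readline(3)` call flagged in the code comment: the argument caps the number of characters read, so any term longer than three characters is cut short. The sketch below contrasts that with iterating over the file and stripping the trailing newline. It uses `io.StringIO` as a stand-in for the real first file, which is an assumption made only so the example is self-contained.

```python
import io

# Stand-in for the first file; io.StringIO behaves like an open text file.
first_file = io.StringIO("apple\ngold\nboy\n")

# readline(3) reads at most 3 characters, so "apple" is truncated.
truncated = first_file.readline(3)
print(repr(truncated))  # 'app'

# Rewind, then iterate line by line and strip the newline from each term.
first_file.seek(0)
terms = [line.strip() for line in first_file if line.strip()]
print(terms)  # ['apple', 'gold', 'boy']
```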

You need to loop over the words and, inside that loop, iterate over each line:

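with open("/Directory/firstfilename") as words:
    for word in words:
        word = word.strip()  # drop the trailing newline so the match and the filename work
        if not word:
            continue  # skip blank lines in the word list
        found_lines = []
        with open("/Directory/secondfilename") as lines:
            for line in lines:
                if word in line:
                    # keep only the first three words of the matching line
                    found_lines.append(' '.join(line.split()[:3]))
        if found_lines:
            with open(word + '.txt', 'w') as out_file:
                for line in found_lines:
                    out_file.write(line + '\n')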
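
The same nested-loop idea can be wrapped in a function and tried on the example data from the question. This is a sketch under a few assumptions: the function name `write_matches` and the `out_dir` parameter are hypothetical, the second file is read into memory once rather than reopened for every word, and matching is a plain substring test (so "gold" also matches "golden").

```python
import os
import tempfile


def write_matches(words_path, text_path, out_dir):
    """Write <word>.txt in out_dir for each word found in text_path.

    Returns the list of words for which a file was created.
    """
    # Read the paragraph once; splitlines() drops the newline characters.
    with open(text_path) as text_file:
        lines = text_file.read().splitlines()

    created = []
    with open(words_path) as words_file:
        for raw in words_file:
            word = raw.strip()
            if not word:
                continue  # ignore blank lines in the word list
            found = [" ".join(l.split()[:3]) for l in lines if word in l]
            if found:
                out_path = os.path.join(out_dir, word + ".txt")
                with open(out_path, "w") as out_file:
                    out_file.write("\n".join(found) + "\n")
                created.append(word)
    return created


# Try it on the question's example data inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    words_path = os.path.join(tmp, "words")
    text_path = os.path.join(tmp, "paragraph")
    with open(words_path, "w") as f:
        f.write("dog\napple\ngold\nboy\ntomato\n")
    with open(text_path, "w") as f:
        f.write(
            "the dog for some reason had a grand\n"
            "appetite for eating golden apples,\n"
            "but the dog did not like eating tomatoes\n"
            "the dog only likes eating gold colored foods\n"
        )
    created = write_matches(words_path, text_path, tmp)
    with open(os.path.join(tmp, "dog.txt")) as f:
        dog_output = f.read()

print(created)  # ['dog', 'apple', 'gold', 'tomato']
```

No file is created for "boy", since it does not occur anywhere in the paragraph.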

This should write a new file (words0.txt, words1.txt, and so on) for each word that is found in the paragraph file:

def wordMatch(wordListFileLocation, paragraphListFileLocation):
    fileCounter = 0
    # read the paragraph file once and reuse it for every word
    with open(paragraphListFileLocation) as file:
        paragraphs = file.readlines()
    with open(wordListFileLocation) as wordListFile:
        for word in wordListFile:
            word = word.strip()  # drop the trailing newline before matching
            if not word:
                continue
            matching = [s for s in paragraphs if word in s]
            if matching:
                with open('words{0}.txt'.format(fileCounter), 'w') as newFile:
                    for s in matching:
                        # first three words of each matching line
                        newFile.write(' '.join(s.split()[:3]) + '\n')
                fileCounter += 1
