Search a main text file for each word from another text file, and append the word with Python if it is not found



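I need help with Python code for the following situation.

I have two text files: a main file and a list file. The main file contains many words, and I need to update it whenever the list file contains new words.

I need to search the main file for each word in the list file. If a word is not found in the main file, I need to append that new word to the main file.

I have code that updates the file when a single string is not found. However, I need to search for every word from the list file.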

Main_File = "file path"
list_file = "file path"
needle = "word"  # placeholder: the single string being searched for

with open(Main_File, "r+") as file:
    for line in file:
        if needle in line:
            break
    else:  # not found, we are at the EOF
        file.write(needle)  # append missing data
# This code appends one specific word if it is not found in the file,
# but I need to search for each word from another file.

You can load the main file with mmap and search it for the words from the list file, like this:

import mmap

mainFilePath = "mainFile.txt"
listFilePath = "listFile.txt"
newWords = []

# open main file with mmap
with open(mainFilePath, 'r') as mainFile:
    mainFileMmap = mmap.mmap(mainFile.fileno(), 0, access=mmap.ACCESS_READ)
    # open list file and search for words in main file with mmap.find()
    with open(listFilePath, 'r') as listFile:
        for line in listFile:
            line = line.replace("\r", "").replace("\n", "")  # remove line endings (quick and dirty)
            if mainFileMmap.find(line.encode()) == -1:
                newWords.append(line)
    mainFileMmap.close()  # release the mapping before reopening the file for appending

# append new words to main file
with open(mainFilePath, 'a') as mainFile:
    for newWord in set(newWords):
        mainFile.write("\n{}".format(newWord))
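
Note that mmap.find() does a substring search, so a list word such as "cat" counts as found if the main file contains "concatenate". If whole-word matching is what you want, a possible variation is to run a regular expression over the mapping instead. The sketch below assumes words are separated by whitespace, that both files are UTF-8, and that the main file is not empty (mmap raises ValueError when mapping an empty file). The function name find_missing_words is made up for illustration.

```python
import mmap
import re


def find_missing_words(main_path, list_path):
    """Return list-file words that do not occur as whole words in the main file."""
    missing = []
    with open(main_path, "rb") as main_file:
        # Assumption: main_path is non-empty, otherwise mmap raises ValueError.
        with mmap.mmap(main_file.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            with open(list_path, "r", encoding="utf-8") as list_file:
                for line in list_file:
                    word = line.strip()
                    if not word or word in missing:
                        continue  # skip blank lines and repeated list entries
                    # (?<!\S) and (?!\S) demand whitespace or a file boundary
                    # on each side, so substrings of longer words do not match.
                    pattern = rb"(?<!\S)" + re.escape(word.encode("utf-8")) + rb"(?!\S)"
                    if re.search(pattern, mm) is None:
                        missing.append(word)
    return missing
```

The returned list can then be appended to the main file in the same way as newWords above.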

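If the words in the main file fit in memory, you can load them into a set and check whether each list word is already in the main file, as sketched in the code below: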

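main_file_path = "mainFile.txt"  # placeholder paths
list_file_path = "listFile.txt"

with open(main_file_path) as main_file:
    main_file_words = set(main_file.read().split())  # load words from your main file

with open(list_file_path) as list_file, open(main_file_path, "a") as main_file:
    for word in list_file.read().split():  # read list file
        if word not in main_file_words:
            main_file_words.add(word)
            main_file.write("\n" + word)  # append to the main file, not the list file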
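
For reuse, the set-based idea can be wrapped in a function. This is only a sketch under a few assumptions: both files are UTF-8, words are whitespace-separated, and the main file fits in memory. The name append_missing_words is hypothetical. Compared with the snippet above, it keeps the new words in list-file order, returns them to the caller, and makes sure the first appended word starts on its own line even when the main file lacks a trailing newline.

```python
def append_missing_words(main_path, list_path):
    """Append to main_path every word from list_path that it lacks; return them in order."""
    with open(main_path, "r", encoding="utf-8") as main_file:
        content = main_file.read()
    known = set(content.split())

    with open(list_path, "r", encoding="utf-8") as list_file:
        candidates = list_file.read().split()

    added = []
    for word in candidates:
        if word not in known:
            known.add(word)  # also guards against duplicates inside the list file
            added.append(word)

    if added:
        with open(main_path, "a", encoding="utf-8") as main_file:
            # Start on a fresh line if the file does not already end with one.
            if content and not content.endswith("\n"):
                main_file.write("\n")
            main_file.write("\n".join(added) + "\n")
    return added
```

Calling append_missing_words("mainFile.txt", "listFile.txt") a second time with unchanged files should append nothing, because every list word is then already in the set.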
