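I'm trying to write a program that opens a text document and replaces every four-letter word with **. I've been working on it for hours and can't seem to get anywhere. I'm hoping someone can help me with this. Here is what I have so far. Any help is greatly appreciated!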
def censor():
    filename = input("Enter name of file: ")
    file = open(filename, 'r')
    file1 = open(filename, 'w')
    for element in file:
        words = element.split()
        if len(words) == 4:
            file1 = element.replace(words, "xxxx")
            alist.append(bob)
    print (file)
    file.close()
Here is a revised version; I don't know whether it is any better.
def censor():
    filename = input("Enter name of file: ")
    file = open(filename, 'r')
    file1 = open(filename, 'w')
    i = 0
    for element in file:
        words = element.split()
        for i in range(len(words)):
            if len(words[i]) == 4:
                file1 = element.replace(i, "xxxx")
            i = i+1
    file.close()
for element in file:
    words = element.split()
    for word in words:
        if len(word) == 4:
            ...  # etc etc
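Here's why:
Say the first line in your file is 'hello, my name is john'. Then, on the first iteration of the loop, element = 'hello, my name is john'
and words = ['hello,','my','name','is','john']
You need to check what is inside each word, hence for word in words.
It's also worth noting that your current approach pays no attention to punctuation. Look at the first word in words above: it is 'hello,' with the comma still attached.
To get rid of the punctuation, say this instead: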
import string
# blah blah blah ...
for word in words:
    cleaned_word = word.strip(string.punctuation)
    if len(cleaned_word) == 4:
        ...  # etc etc
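To make the idea above concrete, here is a rough sketch applied to a single line. The helper name censor_line and the "xxxx" mask are made up for this example, and it assumes punctuation only sits at the start or end of a word.

```python
import string

def censor_line(line, mask="xxxx"):
    """Censor words that are four characters long once the punctuation
    around them is stripped. Hypothetical helper for illustration only."""
    out = []
    for word in line.split():
        cleaned_word = word.strip(string.punctuation)
        if len(cleaned_word) == 4:
            # Swap only the cleaned part, so "john," keeps its comma.
            word = word.replace(cleaned_word, mask)
        out.append(word)
    return " ".join(out)

print(censor_line("hello, my name is john"))
```

Because only the stripped part is replaced, any punctuation attached to a censored word survives in the output.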
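Here's a hint: len(words) returns the number of words on the current line, not the length of any particular word. You need to add code that looks at each word on the line and decides whether it needs to be replaced.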
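A quick way to see the difference, using a sample line chosen just for illustration:

```python
line = "hello, my name is john"
words = line.split()

# Number of words on the line (what the original check was comparing to 4).
print(len(words))

# Length of each individual word (what the check should look at).
print([len(word) for word in words])
```

Note that 'hello,' counts as six characters here because the comma is still attached.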
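Also, if the file is more complicated than a simple list of words (for example, if it contains punctuation that needs to be preserved), it may be worth using a regular expression to do the job.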
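A minimal sketch of the regular-expression route might look like the following. The names FOUR_LETTER_WORD and censor_text are invented for this example, and it assumes a "word" means a run of ASCII letters, so a contraction such as "they're" would come out as "xxxx're".

```python
import re

# \b is a word boundary, so this matches exactly four letters standing alone.
FOUR_LETTER_WORD = re.compile(r"\b[A-Za-z]{4}\b")

def censor_text(text, mask="xxxx"):
    """Replace every standalone four-letter word, leaving punctuation,
    spacing and line breaks untouched. Illustrative helper only."""
    return FOUR_LETTER_WORD.sub(mask, text)

print(censor_text("hello, my name is john!"))
```

Since the substitution works on the raw text rather than on split-up words, nothing has to be joined back together afterwards.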
It could be something like this:
def censor():
    filename = input("Enter name of file: ")
    # Read everything first, so the file can safely be reopened for writing.
    with open(filename, 'r') as f:
        lines = f.readlines()
    newLines = []
    for line in lines:
        words = line.split()
        for i, word in enumerate(words):
            if len(word) == 4:
                words[i] = '**'  # assignment, not a comparison
        newLines.append(' '.join(words))
    with open(filename, 'w') as f:
        for line in newLines:
            f.write(line + '\n')

def censor(filename):
    """Takes a file and writes it into file censored.txt with every 4-letter word replaced by xxxx"""
    infile = open(filename)
    content = infile.read()
    infile.close()
    outfile = open('censored.txt', 'w')
    # maketrans needs both strings to be the same length: one blank per punctuation mark
    table = str.maketrans('.,;:!?', '      ')
    noPunc = content.translate(table)  # replace all punctuation marks with blanks, so they won't tie two words together
    wordList = noPunc.split(' ')
    for word in wordList:
        if '\n' in word:
            # newline characters should not count toward the word's length
            count = word.count('\n')
            wordLen = len(word) - count
        else:
            wordLen = len(word)
        if wordLen == 4:
            censoredWord = word.replace(word, 'xxxx ')
            outfile.write(censoredWord)
        else:
            outfile.write(word + ' ')
    outfile.close()