Replacing four-letter words in Python

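I am trying to write a program that opens a text document and replaces every four-letter word with **. I have been working on this program for several hours and am not getting anywhere. I was hoping someone could help me out. Here is what I have so far. Any help is greatly appreciated!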

def censor():
    filename = input("Enter name of file: ")
    file = open(filename, 'r')
    file1 = open(filename, 'w')
    for element in file:
        words = element.split()
        if len(words) == 4:
            file1 = element.replace(words, "xxxx")
            alist.append(bob)
        print (file)
    file.close()

Here is a revised version; I don't know whether it is any better.

def censor():
    filename = input("Enter name of file: ")
    file = open(filename, 'r')
    file1 = open(filename, 'w')
    i = 0
    for element in file:
        words = element.split()
        for i in range(len(words)):
            if len(words[i]) == 4:
                file1 = element.replace(i, "xxxx")
                i = i+1
    file.close()

for element in file:
    words = element.split()
    for word in words:
        if len(word) == 4:
            ...  # replace this word

Here is why:

Say the first line of your file is 'hello, my name is john'. Then on the first iteration of the loop, element = 'hello, my name is john' and words = ['hello,', 'my', 'name', 'is', 'john'].

You need to check the length of each individual word, hence for word in words.

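It is also worth noting that your current approach pays no attention to punctuation. Look at the first word in words above: 'hello,' still carries its trailing comma, so it counts as six characters.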

To get rid of the punctuation, write this instead:

import string
# ... open the file and split each line into words, as before ...
for word in words:
    cleaned_word = word.strip(string.punctuation)
    if len(cleaned_word) == 4:
        ...  # replace this word
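As a rough sketch of how the stripped word could drive the replacement, here is one possible helper. The name censor_line and the "xxxx" default are assumptions made for illustration, not part of the answer above.

```python
import string

def censor_line(line, replacement="xxxx"):
    """Hypothetical helper: censor four-letter words in one line of text."""
    result = []
    for word in line.split():
        # Ignore leading/trailing punctuation when measuring the word.
        core = word.strip(string.punctuation)
        if len(core) == 4:
            # Swap only the letters; surrounding punctuation stays in place.
            word = word.replace(core, replacement, 1)
        result.append(word)
    return " ".join(result)

print(censor_line("hello, my name is john."))  # hello, my xxxx is xxxx.
```

Note that rejoining with a single space collapses any runs of whitespace in the original line, which may or may not matter for your file.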

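Here is a hint: len(words) returns the number of words on the current line, not the length of any particular word. You need to add code that looks at each word on the line and decides whether it needs to be replaced.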
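The difference is easy to see in an interactive session. This small sketch uses a made-up sample line:

```python
line = "hello, my name is john"
words = line.split()

print(len(words))                # 5: the number of words on the line
print([len(w) for w in words])   # [6, 2, 4, 2, 4]: the length of each word

# Checking each word individually finds the four-letter ones.
four_letter = [w for w in words if len(w) == 4]
print(four_letter)               # ['name', 'john']
```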

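Also, if the file is more complicated than a simple list of words (for example, if it contains punctuation that needs to be preserved), it may be worth using a regular expression to do the job.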
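One way the regular-expression approach might look is sketched below. The pattern, the function name censor_text, and the "xxxx" replacement are all assumptions; adjust them to whatever "word" should mean for your file.

```python
import re

# Exactly four ASCII letters with a word boundary on each side (an assumed definition of "word").
FOUR_LETTER_WORD = re.compile(r"\b[A-Za-z]{4}\b")

def censor_text(text, replacement="xxxx"):
    """Hypothetical helper: censor four-letter words, leaving punctuation and line breaks intact."""
    return FOUR_LETTER_WORD.sub(replacement, text)

print(censor_text("hello, my name is john.\nGood luck!"))
```

Because only the matched letters are substituted, punctuation and whitespace are preserved as they are. One caveat: an apostrophe counts as a word boundary, so "that's" would become "xxxx's" with this pattern.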

It could be something like this:

def censor():
    filename = input("Enter name of file: ")
    # Read everything first so the file can safely be reopened for writing.
    with open(filename, 'r') as f:
        lines = f.readlines()
    newLines = []
    for line in lines:
        words = line.split()
        for i, word in enumerate(words):
            if len(word) == 4:
                words[i] = '**'  # assignment (=), not comparison (==)
        newLines.append(' '.join(words))
    # Overwrite the original file with the censored lines.
    with open(filename, 'w') as f:
        for line in newLines:
            f.write(line + '\n')

def censor(filename):
    """Takes a file and writes it into file censored.txt with every 4-letter word replaced by xxxx"""
    infile = open(filename)
    content = infile.read()
    infile.close()
    outfile = open('censored.txt', 'w')
    table = content.maketrans('.,;:!?', '      ')
    noPunc = content.translate(table)  # replace all punctuation marks with blanks, so they won't tie two words together
    wordList = noPunc.split(' ')
    for word in wordList:
        # Newlines stay attached to words after split(' '), so leave them out of the length.
        if '\n' in word:
            count = word.count('\n')
            wordLen = len(word) - count
        else:
            wordLen = len(word)
        if wordLen == 4:
            # Replace only the letters so an attached newline is kept.
            censoredWord = word.replace(word.strip('\n'), 'xxxx') + ' '
            outfile.write(censoredWord)
        else:
            outfile.write(word + ' ')
    outfile.close()
