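I'm trying to write a Python word counter that counts the words in a file and stores the counts in a dictionary. However, my counter only counts each word once and I don't know why. Also, is there a way to do this without collections.Counter?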

cloud = {}
val = 0
with open('objects.txt', 'r') as file:
    for line in file:
        for thing in line:
            new_thing = thing.strip(' ')
            cloud[new_thing] = val
            for new_thing in cloud:
                cloud[new_thing] = cloud.get(new_thing, val) + 1

In your code, on every pass through the loop you set

cloud[new_thing] = 0

which resets the counter for the word new_thing.

Since you already use cloud.get(new_thing, 0), which returns 0 when the key new_thing is not found, you can simply delete that line.

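Besides resetting the value of each new_thing to 0 (cloud[new_thing] = 0), there is another major problem: the loop for new_thing in cloud: walks over every key already stored in cloud and increments all of them, not just the word you have just read. The loop is unnecessary, because a dictionary is accessed directly by key; you do not need to iterate over it to reach one entry.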

You can replace

new_thing = thing.strip(string.punctuation)
cloud[new_thing] = 0
for new_thing in cloud:
    cloud[new_thing] = cloud.get(new_thing, 0) + 1

with just:

new_thing = thing.strip(string.punctuation)
cloud[new_thing] = cloud.get(new_thing, 0) + 1
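
For reference, here is a minimal self-contained sketch of the corrected loop with a plain dictionary. The function name count_words and the sample lines are made up for illustration, and it assumes words are separated by whitespace; any iterable of lines, such as an open file, could be passed in instead of the list.

```python
import string


def count_words(lines):
    """Count words in an iterable of lines using a plain dict."""
    cloud = {}
    for line in lines:
        # split() breaks the line into whitespace-separated words
        for thing in line.split():
            new_thing = thing.strip(string.punctuation)
            if not new_thing:
                continue  # the token consisted only of punctuation
            # get() returns 0 for unseen words, so no initialisation is needed
            cloud[new_thing] = cloud.get(new_thing, 0) + 1
    return cloud


sample = ["the cat, the dog.", "the end!"]  # stand-in for the file's lines
print(count_words(sample))  # {'the': 3, 'cat': 1, 'dog': 1, 'end': 1}
```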

Alternatively, use collections.Counter, as others have suggested. It already does what you are trying to do and would probably make the task easier.
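
As a rough illustration of the Counter approach, the sketch below counts words from a small in-memory list that stands in for the lines of the file (the sample data is invented). It relies only on Counter.update and on the fact that a Counter returns 0 for keys it has not seen.

```python
from collections import Counter

lines = ["red green red", "blue red green"]  # stand-in for the file's lines

cloud = Counter()
for line in lines:
    # update() adds one count for every element of the iterable it is given
    cloud.update(line.split())

print(cloud["red"])      # 3
print(cloud["missing"])  # 0, absent keys do not raise KeyError
```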

You can use the setdefault method of Python dictionaries:

for new_thing in line.split():  # iterate over the words, not over cloud
    count = cloud.setdefault(new_thing, 0)
    cloud[new_thing] = count + 1

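I would extract the part that splits the file into lines and words and strips the punctuation:

import collections
import string

def strip_punctuation(lines):
    for line in lines:
        # iterating over a string yields characters, as in the question's code
        for word in line:
            yield word.strip(string.punctuation)

with open('objects.txt', 'r') as file:
    cloud = collections.Counter(strip_punctuation(file))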

Or, more concisely, with itertools.chain and map:

import collections
import itertools
import string

with open('objects.txt', 'r') as file:
    words = itertools.chain.from_iterable(file)
    words_no_punctuation = map(lambda x: x.strip(string.punctuation), words)
    cloud = collections.Counter(words_no_punctuation)


P.S.: for thing in line: does not split the line into words; it splits it into characters. I guess you meant for thing in line.split():
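
The difference is easy to check in isolation. The sample line below is invented for illustration; the snippet only shows what each form of iteration produces.

```python
line = "to be or not to be\n"  # a sample line, as read from a file

chars = list(line)    # iterating a str yields one-character strings
words = line.split()  # split() yields whitespace-separated words

print(chars[:5])  # ['t', 'o', ' ', 'b', 'e']
print(words)      # ['to', 'be', 'or', 'not', 'to', 'be']
```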

With that change, the itertools version becomes:

import collections
import itertools
import string

with open('objects.txt', 'r') as file:
    words_per_line = map(lambda line: line.split(), file)
    words = itertools.chain.from_iterable(words_per_line)
    words_no_punctuation = map(lambda x: x.strip(string.punctuation), words)
    cloud = collections.Counter(words_no_punctuation)
