Skip duplicate words when counting words in a string



I have this code, which counts how many times each word occurs in a string:

s = "Python is great but Java is also great"
f_s = s.split()
for word in f_s:
    str_f = s.count(word)
    print('There are' , str_f , '[',word,'] from ' , s)

The output is:

There are 1 [ Python ] from  Python is great but Java is also great
There are 2 [ is ] from  Python is great but Java is also great
There are 2 [ great ] from  Python is great but Java is also great
There are 1 [ but ] from  Python is great but Java is also great
There are 1 [ Java ] from  Python is great but Java is also great
There are 2 [ is ] from  Python is great but Java is also great
There are 1 [ also ] from  Python is great but Java is also great
There are 2 [ great ] from  Python is great but Java is also great 

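The for loop iterates over every word, but I want to skip the repeated ones ("is" and "great") so that each is only counted once. I don't know which condition I should put in the if statement. Any help would be appreciated!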

It's best to use Counter, which makes a single pass over the words:

>>> from collections import Counter
>>> counter = Counter("Python is great but Java is also great".split())
>>> for word, count in counter.items():
...     print(word, count)
Python 1
is 2
great 2
but 1
Java 1
also 1

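Since Counter is a dict subclass, and dicts preserve insertion order (as of Python 3.7), the words come out in the order they first appear.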

This is better because calling s.count(word) for every word rescans the whole string each time, which looks like O(n^2) complexity, and that is bad.
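To make the comparison concrete, here is a small sketch that puts the two approaches side by side. The helper names count_with_str_count and count_with_counter are made up for this illustration. It also shows a second issue with s.count(word): str.count counts substring occurrences, so a word such as "is" is also found inside "this".

```python
from collections import Counter


def count_with_str_count(text):
    """Quadratic approach: rescans the whole text once per word."""
    # str.count counts substring matches, not whole words.
    return {word: text.count(word) for word in text.split()}


def count_with_counter(text):
    """Single pass over the split words; compares whole words only."""
    return dict(Counter(text.split()))


s = "Python is great but Java is also great"
# On the question's input both approaches happen to agree.
print(count_with_str_count(s) == count_with_counter(s))  # True

tricky = "this is it"
print(count_with_str_count(tricky)["is"])  # 2, because "this" contains "is"
print(count_with_counter(tricky)["is"])    # 1
```

On the question's sentence the two give the same counts, so the difference only shows up on inputs where one word is contained in another.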

Without any other libraries:

counter = {}
for word in f_s:  # f_s is the word list from s.split() in the question
    counter.setdefault(word, 0)  # start at 0 the first time a word is seen
    counter[word] += 1
print(counter)
# Output
# {'Python': 1, 'is': 2, 'great': 2, 'but': 1, 'Java': 1, 'also': 1}
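If the goal is to keep the original loop and message and only skip words that were already reported, one possible condition is a membership test against a set of words seen so far. The sketch below is one way to do that, not the only one; the names seen and lines are introduced here for illustration, and it counts with f_s.count(word) (list.count, which compares whole words) rather than s.count(word).

```python
s = "Python is great but Java is also great"
f_s = s.split()

seen = set()   # words that have already been reported
lines = []     # collected messages, one per distinct word
for word in f_s:
    if word in seen:
        continue  # repeated word: skip it
    seen.add(word)
    # list.count matches whole list items, unlike str.count on the raw string
    lines.append(f"There are {f_s.count(word)} [ {word} ] from {s}")

print("\n".join(lines))
```

This prints one line per distinct word, in order of first appearance. Each f_s.count(word) call still scans the whole list, so the loop remains quadratic in the worst case; the dictionary-based versions avoid that.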
