Why does Python's 'for word in words:' iterate over individual characters instead of words?

When I run the following code on a string words:

def word_feats(words):
    return dict([(word, True) for word in words])
print(word_feats("I love this sandwich."))

The dict I get back is keyed by individual characters instead of words:

{'a': True, ' ': True, 'c': True, 'e': True, 'd': True, 'I': True, 'h': True, 'l': True, 'o': True, 'n': True, 'i': True, 's': True, 't': True, 'w': True, 'v': True, '.': True}

What am I doing wrong?

You need to explicitly split the string on whitespace:

def word_feats(words):
    return dict([(word, True) for word in words.split()])

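This uses str.split() without arguments, which splits on arbitrary-width whitespace (including tabs and line separators). A string is otherwise a sequence of individual characters, so iterating over it directly just loops over each character.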
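
To make the difference concrete, here is a small sketch. The sample string and the variable names are made up for illustration; it contrasts direct iteration, split() with no arguments, and split(' ') with an explicit single-space separator:

```python
text = "I  love\tthis\nsandwich."  # double space, a tab and a newline

# Iterating a string directly yields one character at a time.
first_chars = [ch for ch in text][:3]

# split() with no arguments treats any run of whitespace as one separator.
words = text.split()

# split(' ') splits on each single space only: the double space leaves an
# empty string, and the tab and newline are not treated as separators.
by_space = text.split(' ')

print(first_chars)  # ['I', ' ', ' ']
print(words)        # ['I', 'love', 'this', 'sandwich.']
print(by_space)     # ['I', '', 'love\tthis\nsandwich.']
```

For free-form text, the no-argument form is usually the one you want.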

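Splitting into words has to be an explicit step you perform yourself, though, because different use cases need different rules for breaking a string into parts. Does punctuation count, for example? What about parentheses or quotes? Should words grouped by parentheses or quotes perhaps stay together? And so on.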
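
As one possible answer to the punctuation question, here is a rough sketch that drops punctuation by extracting word-like runs with a regular expression. The name word_feats_no_punct and the pattern are assumptions chosen for illustration, not the only reasonable definition of a "word":

```python
import re

def word_feats_no_punct(text):
    # Assumption: a "word" is a run of letters, digits, underscores or
    # apostrophes. Everything else (spaces, periods, commas) is a separator.
    tokens = re.findall(r"[\w']+", text)
    return {token: True for token in tokens}

print(word_feats_no_punct("I love this sandwich."))
# {'I': True, 'love': True, 'this': True, 'sandwich': True}
```

Note that the key is now 'sandwich' rather than 'sandwich.'; whether that is what you want depends on your use case.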

If all you are doing is setting every value to True, it is more efficient to use dict.fromkeys():

def word_feats(words):
    return dict.fromkeys(words.split(), True)

Demo:

>>> def word_feats(words):
...     return dict.fromkeys(words.split(), True)
... 
>>> print(word_feats("I love this sandwich."))
{'I': True, 'this': True, 'love': True, 'sandwich.': True}

You have to split the words string:

def word_feats(words):
    return dict([(word, True) for word in words.split()])
print(word_feats("I love this sandwich."))

Example

>>> words = 'I love this sandwich.'
>>> words = words.split()
>>> words
['I', 'love', 'this', 'sandwich.']

You can also split on other characters:

>>> s = '23/04/2014'
>>> s = s.split('/')
>>> s
['23', '04', '2014']

Your code

def word_feats(words):
    return dict([(word, True) for word in words.split()])
print(word_feats("I love this sandwich."))
[OUTPUT]
{'I': True, 'love': True, 'this': True, 'sandwich.': True}
