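Python word occurrences

>I am trying to open and read a text file and count how many times each word occurs; for example, if the word "better" is in the text, its frequency would be 8. I have attached my code below. I am getting the following error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x97 in position 861: invalid start byte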

file = open('IntroductoryCS.txt')
wordcount = {}
for word in file.read().split():
    if word not in wordcount:
        wordcount[word] = 1
    else:
        wordcount[word] += 1
for k, v in wordcount.items():
    print(k, v)

I am using IDLE 3.5.1.

It looks like your IntroductoryCS.txt is not encoded in UTF-8.

You should specify the encoding in the open() call.

Like this:

file=open('IntroductoryCS.txt', encoding='<your_encoding_here>')

See the documentation for open().

I don't know what encoding your file uses, but try this:

file=open('IntroductoryCS.txt', encoding='latin-1')

The list of available encodings is in the Python documentation.
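
Byte 0x97 is an em dash in Windows-1252 (cp1252), which suggests, though does not prove, that the file was saved in that encoding. latin-1 never raises an error because it maps every byte to some character, but it would turn 0x97 into an invisible control character rather than a dash. Here is a minimal sketch, assuming the file is either UTF-8 or cp1252; count_words and its encodings parameter are hypothetical names, not part of the original code:

```python
from collections import Counter


def count_words(path, encodings=("utf-8", "cp1252")):
    """Count whitespace-separated words, trying each encoding in turn.

    The default encoding list is an assumption: UTF-8 first, then
    Windows-1252, where byte 0x97 is an em dash.
    """
    for enc in encodings:
        try:
            with open(path, encoding=enc) as f:
                return Counter(f.read().split())
        except UnicodeDecodeError:
            continue  # this encoding does not fit the bytes; try the next one
    raise ValueError("none of the given encodings could decode " + path)
```

Calling count_words('IntroductoryCS.txt').items() gives the same (word, count) pairs that the loop in the question prints.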

Your code works fine.

Try saving the txt file as UTF-8: open the file in Notepad, choose Save As, and select UTF-8 as the encoding.
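
If you would rather not go through Notepad, the same conversion can be sketched in Python. This assumes the file is currently in cp1252, which is a guess based on the 0x97 byte; convert_to_utf8 and the output file name are hypothetical:

```python
def convert_to_utf8(src, dst, source_encoding="cp1252"):
    """Re-save a text file as UTF-8.

    source_encoding is an assumption about the input file; change it
    if the file turns out to use a different encoding.
    """
    # newline="" keeps the original line endings untouched
    with open(src, encoding=source_encoding, newline="") as f:
        text = f.read()
    with open(dst, "w", encoding="utf-8", newline="") as f:
        f.write(text)
```

After running convert_to_utf8('IntroductoryCS.txt', 'IntroductoryCS_utf8.txt'), the original code should be able to read the new file without the encoding argument, provided the default encoding on your machine is UTF-8.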
