Cannot open a .csv file (downloaded from Outlook) with pandas; 'utf-8' codec error



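I am trying to download and save a .csv file that I receive in my Outlook inbox every day. The code saves the file to a local folder. When I try to open it with pandas, I get a 'utf-8' codec error.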

import os
from enum import Enum
import win32com.client as win32
from datetime import datetime
import pandas as pd

class OutlookFolder(Enum):
    olFolderInbox = 6

outlook = win32.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(OutlookFolder.olFolderInbox.value)
messages = inbox.Items
# Keep only the messages from one sender
messages = messages.Restrict("[SenderEmailAddress] = 'ABCD@XYZ.com'")
t = datetime.today().date()
outputDir = r"C:\Users\ABCD\Documents\ABCD\Daily Seed File"

try:
    for message in list(messages):
        # Only process messages received today
        if message.ReceivedTime.date() == t:
            try:
                s = message.Sender
                for attachment in message.Attachments:
                    # Append the received date to the attachment name
                    fn = attachment.FileName[:-4] + "_" + str(message.ReceivedTime.date()) + ".csv"
                    attachment.SaveAsFile(os.path.join(outputDir, fn))
                    print(f"attachment {attachment.FileName} from {s} saved")
            except Exception as e:
                print("error when saving the attachment:" + str(e))
except Exception as e:
    print("error when processing email messages:" + str(e))

fp = os.path.join(outputDir, fn)
df = pd.read_csv(fp)  # this call raises the UnicodeDecodeError
df.head()

I get the following error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

You could try the encoding parameter:

df = pd.read_csv(fp, encoding='utf8')
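To decide which value to pass as the encoding, it may help to look at the first bytes of the saved file. The sketch below checks for a byte-order mark (BOM) using only the standard library. The helper name sniff_bom_encoding and its default argument are made up for illustration, and a file without a BOM simply falls back to the default, so treat the result as a hint rather than a guarantee.

```python
import codecs

# Byte-order marks, checked longest first because the UTF-32 LE mark
# begins with the same two bytes as the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom_encoding(path, default="utf-8"):
    """Return a codec name based on the file's byte-order mark, if any.

    Hypothetical helper: it only recognises files that start with a BOM
    and returns `default` for everything else.
    """
    with open(path, "rb") as f:
        head = f.read(4)
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return default
```

The byte 0xff at position 0 in the error message is consistent with the UTF-16 little-endian mark (FF FE), so for such a file the helper would be expected to return "utf-16", which can then be passed on as pd.read_csv(fp, encoding=sniff_bom_encoding(fp)).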

The file is UTF-16 encoded and uses a tab (\t) as the delimiter. The following works:

df = pd.read_csv(fp, encoding='utf-16', sep="\t")
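As a quick check of that call without needing the Outlook attachment, the sketch below writes a small tab-separated file in UTF-16 and reads it back. The file name and the sample rows are invented for illustration; only the encoding and sep arguments mirror the answer.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data standing in for the Outlook attachment.
sample_path = os.path.join(tempfile.mkdtemp(), "sample_utf16.csv")
with open(sample_path, "w", encoding="utf-16", newline="") as f:
    f.write("id\tname\n1\talpha\n2\tbeta\n")

# Python's "utf-16" codec writes a byte-order mark first, so the raw
# file starts with FF FE (little-endian) or FE FF (big-endian).
with open(sample_path, "rb") as f:
    first_bytes = f.read(2)

# Same arguments as in the answer: UTF-16 text, tab-delimited columns.
df_sample = pd.read_csv(sample_path, encoding="utf-16", sep="\t")
print(df_sample)
```

If the real attachment turns out to be comma-separated rather than tab-separated, the same call with only the encoding argument should be enough.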
