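I have scraped several websites and consolidated the output into a text file. When I try to send that file as an email through smtplib, I get an encoding error: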
", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 156: invalid start byte
Here is my code. As far as I can tell, there is nothing special in the text file:
import requests, os, smtplib, codecs
from bs4 import BeautifulSoup
from email.mime.text import MIMEText
homeworkResults = open('homeworkResults.txt','r', encoding= 'utf-8')
homeworkContent = homeworkResults.read()
#homeworkContent.encode()
homeworkResults.close()
print("attempting email...")
smtpObj = smtplib.SMTP('smtp.gmail.com', 587)
smtpObj.ehlo()
smtpObj.starttls()
smtpObj.login('someemail@gmail.com','Password')
smtpObj.sendmail('someemail@gmail.com' , 'anotheremail@gmail.com','Subject: Kids Homework Update\n\n ' + homeworkContent)
smtpObj.quit()
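It worked after I added codecs. in front of the open call.
The code below also skips decoding errors through the errors argument. Setting errors to ignore or replace makes codecs.open() tolerate bytes in the file that are not valid UTF-8: ignore silently drops them, while replace substitutes the replacement character U+FFFD. The default is strict, which raises the UnicodeDecodeError shown in the question.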
import requests, os, smtplib, codecs
from bs4 import BeautifulSoup
from email.mime.text import MIMEText
homeworkResults = codecs.open('homeworkResults.txt','r', encoding= 'utf-8', errors='ignore')
homeworkContent = homeworkResults.read()
#homeworkContent.encode()
homeworkResults.close()
print("attempting email...")
smtpObj = smtplib.SMTP('smtp.gmail.com', 587)
smtpObj.ehlo()
smtpObj.starttls()
smtpObj.login('someemail@gmail.com','Password')
smtpObj.sendmail('someemail@gmail.com' , 'anotheremail@gmail.com','Subject: Kids Homework Update\n\n ' + homeworkContent)
smtpObj.quit()
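To make the difference between the errors settings concrete, here is a minimal sketch that decodes a short byte string containing 0xa0, the byte named in the traceback. The sample bytes are made up for illustration, and it uses bytes.decode(), which accepts the same error-handler names as codecs.open(). The last line rests on an assumption: 0xa0 is a no-break space in cp1252 (and Latin-1), so the file may have been written in one of those encodings rather than UTF-8, but that has not been verified against the real file.

```python
# Hypothetical sample data: 0xa0 is the byte reported in the traceback.
raw = b"Math:\xa0page 12"

# errors="strict" is the default and raises on the invalid byte.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors="ignore" silently drops the offending byte.
ignored = raw.decode("utf-8", errors="ignore")

# errors="replace" substitutes U+FFFD so the loss stays visible.
replaced = raw.decode("utf-8", errors="replace")

# Assumption: if the file is really cp1252, decoding with that codec
# keeps the character (a no-break space) instead of discarding it.
as_cp1252 = raw.decode("cp1252")

print(strict_failed)
print(repr(ignored))
print(repr(replaced))
print(repr(as_cp1252))
```

If the text really is cp1252, passing encoding='cp1252' to open() would preserve the characters, whereas errors='ignore' throws them away. Which one is appropriate depends on how the scraped file was written.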