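I need to iterate over all the mails in a GMAIL inbox. I also need to download all the attachments of each mail (some mails have 4-5 attachments). I found some help here: https://stackoverflow.com/a/27556667/8996442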
def save_attachments(self, msg, download_folder="/tmp"):
    for part in msg.walk():
        if part.get_content_maintype() == 'multipart':
            continue
        if part.get('Content-Disposition') is None:
            continue
        filename = part.get_filename()
        print(filename)
        att_path = os.path.join(download_folder, filename)
        if not os.path.isfile(att_path):
            fp = open(att_path, 'wb')
            fp.write(part.get_payload(decode=True))
            fp.close()
        return att_path
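However, it downloads only one attachment per email (although the author of that post says it normally downloads all of them, doesn't it?). print(filename) shows only one attachment. Any idea why?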
from imap_tools import MailBox
# get all attachments from INBOX and save them to files
with MailBox('imap.my.ru').login('acc', 'pwd', 'INBOX') as mailbox:
    for msg in mailbox.fetch():
        for att in msg.attachments:
            print(att.filename, att.content_type)
            with open('/my/{}/{}'.format(msg.uid, att.filename), 'wb') as f:
                f.write(att.payload)
https://pypi.org/project/imap-tools/
* I am the author of this library.
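As already pointed out in the comments, the immediate problem is that return exits the for loop and leaves the function, and it does so as soon as the first attachment has been saved.
Depending on what exactly you want to accomplish, change the code so that you only return after all iterations of msg.walk() have finished. Here is one attempt which returns a list of attachment file names: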
import os

def save_attachments(self, msg, download_folder="/tmp"):
    att_paths = []
    for part in msg.walk():
        if part.get_content_maintype() == 'multipart':
            continue
        if part.get('Content-Disposition') is None:
            continue
        filename = part.get_filename()
        # Don't print
        # print(filename)
        att_path = os.path.join(download_folder, filename)
        if not os.path.isfile(att_path):
            # Use a context manager for robustness
            with open(att_path, 'wb') as fp:
                fp.write(part.get_payload(decode=True))
            # Then you don't need to explicitly close
            # fp.close()
        # Append this one to the list we are collecting
        att_paths.append(att_path)
    # We are done looping and have processed all attachments now
    # Return the list of file names
    return att_paths
See the inline comments for what I changed and why.
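In general, avoid calling print() from inside a worker function; either use logging to emit diagnostics in a way the caller can control, or just return the information and let the caller decide whether to present it to the user.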
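As a rough illustration of the logging advice, here is a minimal sketch of the same function written as a plain function (no self) that reports through a module-level logger. The demo message, its file names, and the temporary download folder are made up for this example and are not part of the original code.

```python
import logging
import os
import tempfile
from email.message import EmailMessage

logger = logging.getLogger(__name__)


def save_attachments(msg, download_folder):
    """Save every part that carries a Content-Disposition; return the paths."""
    att_paths = []
    for part in msg.walk():
        if part.get_content_maintype() == "multipart":
            continue
        if part.get("Content-Disposition") is None:
            continue
        att_path = os.path.join(download_folder, part.get_filename())
        # Diagnostics go through logging, so the caller decides whether to see them
        logger.debug("saving %s", att_path)
        if not os.path.isfile(att_path):
            with open(att_path, "wb") as fp:
                fp.write(part.get_payload(decode=True))
        att_paths.append(att_path)
    return att_paths


# Caller side: opt in to diagnostics, then decide what to show the user
logging.basicConfig(level=logging.DEBUG)
demo = EmailMessage()  # hypothetical message built only for this example
demo.set_content("Two files attached.")
demo.add_attachment(b"one", maintype="application",
                    subtype="octet-stream", filename="one.bin")
demo.add_attachment(b"two", maintype="application",
                    subtype="octet-stream", filename="two.bin")
saved = save_attachments(demo, tempfile.mkdtemp())
print("saved:", [os.path.basename(p) for p in saved])
```

Dropping the logging.basicConfig() call, or raising the level, silences the per-file messages without touching the worker function.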
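Not all MIME parts have a Content-Disposition: header; in fact, I would expect this to miss most attachments, and possibly to extract some inline parts. A better approach is probably to check whether the part has Content-Disposition: attachment, and otherwise proceed to extract if either there is no Content-Disposition: or the Content-Type: is not text/plain or text/html. Perhaps see also: What are the "parts" in a multipart email?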
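The selection rule can be sketched roughly as follows. This takes one possible, fairly conservative reading of the heuristic: a part explicitly marked as an attachment is always extracted, and any other non-multipart part is extracted only when its type is neither text/plain nor text/html. The helper name looks_like_attachment and the demo message are hypothetical and exist only for this example.

```python
from email.message import EmailMessage

# Types that are usually the message body rather than an attachment
TEXT_BODY_TYPES = {"text/plain", "text/html"}


def looks_like_attachment(part):
    """Guess whether a MIME part should be saved as an attachment."""
    if part.get_content_maintype() == "multipart":
        return False  # containers hold parts, they are not payloads themselves
    if part.get_content_disposition() == "attachment":
        return True   # explicitly marked, regardless of its type
    # No disposition, or inline: treat anything that is not body text as a file
    return part.get_content_type() not in TEXT_BODY_TYPES


# Hypothetical message: a plain-text body plus two attachments
demo = EmailMessage()
demo.set_content("See the attached files.")
demo.add_attachment(b"first", maintype="application",
                    subtype="octet-stream", filename="a.bin")
demo.add_attachment("second", subtype="plain", filename="b.txt")

names = [p.get_filename() for p in demo.walk() if looks_like_attachment(p)]
print(names)
```

Note that a part selected this way may have no file name at all, so a real implementation would need a fallback name before calling os.path.join().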