Handling Hebrew files and folders with Python 3.4



I wrote a program in Python 3.4 that processes emails and saves specific attachments to a file server.

Each file is saved to a specific destination that depends on the sender's email address.

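My problem is that the destination folders and the attachments are both named in Hebrew, and for some attachments I get an error saying the path does not exist. That cannot be right, because within the same email one attachment may fail while the others succeed (the destination folder is determined by the sender's address).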

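I want to debug this, but I cannot get Python to correctly display the file path it is trying to save to. (The path is a mix of Hebrew and English and always displays as a mess, even though saving to the file server works 95% of the time.)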

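So my question is: what should I add to this code so that it handles Hebrew correctly? Should I encode or decode something? Are there characters I should avoid when working with files?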

Here is the main code that fails:

try:
    found_attachments = False
    for att in msg.Attachments:
        _, extension = split_filename(str(att))
        # check if attachment is not inline
        if str(att) not in msg.HTMLBody:
            if extension in database[sender][TYPES]:
                file = create_file(str(att), database[sender][PATH], database[sender][FORMAT], time_stamp)
                # This is where the program fails:
                att.SaveAsFile(file)
                print("Created:", file)
                found_attachments = True
    if found_attachments:
        items_processed.append(msg)
    else:
        items_no_att.append(msg)
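except Exception as exc:
    # report the actual error instead of hiding it behind a bare except
    print("Error with attachment: " + str(att) + " , in: " + str(msg) + " (" + repr(exc) + ")")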

And the create_file function:

def create_file(att, location, format, timestamp):
    """
    process an attachment to make it a file
    :param att: the name of the attachment
    :param location: the path to the file
    :param format: the format of the file
    :param timestamp: the time and date the attachment was created
    :return: return the file created
    """
    # create the file by the given format
    if format == "":
        output_file = location + "\\" + att
    else:
        # split file to name and type
        filename, extension = split_filename(att)
        # extract and format the time sent on
        time = str(timestamp.time()).replace(":", ".")[:-3]
        # extract and format the date sent on
        day = str(timestamp.date())
        day = day[-2:] + day[4:-2] + day[:4]
        # initiate the output file
        output_file = format
        # add the original file name where needed
        output_file = output_file.replace(FILENAME, filename)
        # add the sent date where needed
        output_file = output_file.replace(DATE, day)
        # add the time sent where needed
        output_file = output_file.replace(TIME, time)
        # add the path and type
        output_file = location + "\\" + output_file + "." + extension
        print(output_file)
    # add an index to the file if necessary and return it
    index = get_file_index(output_file)
    if index:
        filename, extension = split_filename(output_file)
        return filename + "(" + str(index) + ")." + extension
    else:
        return output_file

Thanks in advance. I am happy to explain more or provide more code if needed.

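I found that the problem was not with Hebrew. There is a limit on the number of characters that the path plus file name can hold (255 characters).

The files that failed exceeded that limit, which caused the error.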
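
One way to guard against this is to check the length of the full path before saving and trim the file name when it is too long. The following is only a minimal sketch: the helper name shorten_path and the constant MAX_PATH_LENGTH are hypothetical, and the value 255 is simply the figure quoted above (the effective limit may differ depending on the Windows version and configuration).

```python
import ntpath  # Windows path rules, usable on any platform

# Assumption: limit taken from the figure quoted in the post, not a verified constant.
MAX_PATH_LENGTH = 255


def shorten_path(path, limit=MAX_PATH_LENGTH):
    """Return path unchanged if it fits, otherwise trim the file name stem.

    Hypothetical helper; the folder part and the extension are kept as-is.
    """
    excess = len(path) - limit
    if excess <= 0:
        return path
    folder, name = ntpath.split(path)
    stem, ext = ntpath.splitext(name)
    if len(stem) <= excess:
        # nothing sensible left to trim: the folder itself is too long
        raise ValueError("cannot shorten path to %d characters: %r" % (limit, path))
    return ntpath.join(folder, stem[:len(stem) - excess] + ext)
```

In the code from the question, this could be applied to the result of create_file before calling att.SaveAsFile. Printing len(file) and ascii(file) next to it may also help when debugging, since ascii() shows the Hebrew characters as escape sequences rather than relying on the console to render mixed-direction text.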
