winmail.dat attachment is corrupted when read with ActionMailer in a Rails application



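I am reading emails with ActionMailer in a Ruby on Rails application (Ruby 1.9.3, Rails 3.2.13). One email has a winmail.dat file (ms-tnef) attached, and I am using the tnef gem to extract its contents.

The problem is that when I read the attachment from the mail, it comes out corrupted and tnef cannot extract the files from it.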

$ tnef winmail.dat
ERROR: invalid checksum, input file may be corrupted

If I extract the winmail.dat attachment with any mail application, the extracted winmail.dat works fine with tnef and I get its contents.

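Comparing the two files, I noticed:

- the original file is larger (76k vs. 72k)
- they differ in line endings: the original file uses the Windows format (0D0A), while the file saved by Rails uses the Linux format (0A)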
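A comparison like this can be sketched with plain Ruby. The helper name `line_ending_counts` and the sample strings are assumptions made for illustration; with real files you would pass `File.binread(path)` for each copy.

```ruby
# Count CRLF pairs and bare LF bytes in a string of raw bytes.
def line_ending_counts(bytes)
  data = bytes.dup.force_encoding(Encoding::BINARY)
  crlf = data.scan("\r\n").size
  { crlf: crlf, bare_lf: data.count("\n") - crlf }
end

# Stand-in samples (hypothetical); real data would come from File.binread.
original = "ab\r\ncd\r\n".dup.force_encoding(Encoding::BINARY)
saved    = "ab\ncd\n".dup.force_encoding(Encoding::BINARY)

puts line_ending_counts(original).inspect
puts line_ending_counts(saved).inspect
puts "size difference: #{original.bytesize - saved.bytesize} bytes"
```

A copy whose CRLF pairs have been rewritten to LF is smaller by one byte per pair, which is consistent with the size difference noted above.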

I wrote this test:

it 'should extract winmail.dat from email and extract its contents' do
    file_path = "#{::Rails.root}/spec/files/winmail-dat-001.eml"
    message = Mail::Message.new(File.read(file_path))
    anexo = message.attachments[0]
    files = []
    Tnef.unpack(anexo) do |file|
      files << File.basename(file)
    end
    puts files.inspect
    files.size.should == 2
end

It fails with the following messages:

WARNING: invalid checksum, input file may be corrupted
Invalid RTF CRC, input file may be corrupted
WARNING: invalid checksum, input file may be corrupted
Assertion failed: ((attr->lvl_type == LVL_MESSAGE) || (attr->lvl_type == LVL_ATTACHMENT)), function attr_read, file attr.c, line 240.
Errno::EPIPE: Broken pipe

anexo = message.attachments[0]
 => #<Mail::Part:2159872060, Multipart: false, Headers: <Content-Type: application/ms-tnef; name="winmail.dat">, <Content-Transfer-Encoding: quoted-printable>, <Content-Disposition: attachment; filename="winmail.dat">>

I also tried saving it to disk as binary and reading it back, but I got the same result:

it 'should extract winmail.dat from email and extract its contents' do
    file_path = "#{::Rails.root}/spec/files/winmail-dat-001.eml"
    message = Mail::Message.new(File.read(file_path))
    anexo = message.attachments[0]
    tmpfile_name = "#{::Rails.root}/tmp/#{anexo.filename}"
    File.open(tmpfile_name, 'w+b', 0644) { |f| f.write anexo.body.decoded }
    anexo = File.open(tmpfile_name)
    files = []
    Tnef.unpack(anexo) do |file|
      files << File.basename(file)
    end
    puts files.inspect
    files.size.should == 2
end

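How should I read the attachment?

The method anexo.body.decoded calls the decode method of the encoding (from Mail::Encodings) that best suits the attachment, which in your case is quoted_printable.

Some of these encodings (7bit, 8bit, and quoted_printable) perform a conversion that changes the different kinds of line breaks into a platform-specific line break. quoted_printable calls .to_lf, and that is what corrupts the winmail.dat file: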

  # Decode the string from Quoted-Printable. Cope with hard line breaks
  # that were incorrectly encoded as hex instead of literal CRLF.
  def self.decode(str)
    str.gsub(/(?:=0D=0A|=0D|=0A)\r\n/, "\r\n").unpack("M*").first.to_lf
  end

mail/core_extensions/string.rb:

def to_lf
  to_str.gsub(/\n|\r\n|\r/) { "\n" }
end
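To see why that final step is destructive for binary data, here is a minimal sketch that uses only the Ruby standard library. `normalize_to_lf` is a hypothetical stand-in that mirrors the to_lf method shown above, and the seven sample bytes are arbitrary values chosen for illustration, not taken from a real winmail.dat.

```ruby
# Hypothetical stand-in for the gem's String#to_lf, using the same pattern.
def normalize_to_lf(str)
  str.gsub(/\n|\r\n|\r/) { "\n" }
end

# Quoted-printable text for seven arbitrary binary bytes; 0x0D and 0x0A
# appear here as data, not as a line break.
encoded = "=78=9F=0D=0A=22=0D=01"

decoded = encoded.unpack("M*").first   # plain quoted-printable decoding
damaged = normalize_to_lf(decoded)     # effect of the extra line-ending step

puts decoded.bytes.map { |b| format("%02X", b) }.join(" ")
puts damaged.bytes.map { |b| format("%02X", b) }.join(" ")
puts "#{decoded.bytesize} bytes -> #{damaged.bytesize} bytes"
```

The decoded bytes are intact after `unpack("M*")`. Only the normalization afterwards drops and rewrites bytes, which is the kind of change that would make a checksum over the payload fail.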

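To fix this, you have to perform the same decoding without the final .to_lf. To do that, you can create a new encoding that does not corrupt the file and use it for your attachment.

Create the file lib/encodings/tnef_encoding.rb: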

require 'mail/encodings/7bit'
module Mail
  module Encodings
    # Encoding to handle Microsoft TNEF format
    # It's pretty similar to quoted_printable, except for the 'to_lf' (decode) and 'to_crlf' (encode)
    class TnefEncoding < SevenBit
      NAME='tnef'
      PRIORITY = 2
      def self.can_encode?(str)
        EightBit.can_encode? str
      end
      def self.decode(str)
        # **difference here** removed '.to_lf'
        str.gsub(/(?:=0D=0A|=0D|=0A)\r\n/, "\r\n").unpack("M*").first
      end
      def self.encode(str)
        # **difference here** removed '.to_crlf'
        [str.to_lf].pack("M")
      end
      def self.cost(str)
        # These bytes probably do not need encoding
        c = str.count("\x9\xA\xD\x20-\x3C\x3E-\x7E")
        # Everything else turns into =XX where XX is a
        # two digit hex number (taking 3 bytes)
        total = (str.bytesize - c)*3 + c
        total.to_f/str.bytesize
      end
      private
      Encodings.register(NAME, self)
    end
  end
end

To use your custom encoding, you must first register it:

Mail::Encodings.register('tnef', Mail::Encodings::TnefEncoding)

Then set it as the preferred encoding for the attachment:

anexo.body.encoding('tnef')

Your test then becomes:

it 'should extract winmail.dat from email and extract its contents' do
    file_path = "#{::Rails.root}/spec/files/winmail-dat-001.eml"
    message = Mail::Message.new(File.read(file_path))
    anexo = message.attachments[0]
    tmpfile_name = "#{::Rails.root}/tmp/#{anexo.filename}"
    Mail::Encodings.register('tnef', Mail::Encodings::TnefEncoding)
    anexo.body.encoding('tnef')
    File.open(tmpfile_name, 'w+b', 0644) { |f| f.write anexo.body.decoded }
    anexo = File.open(tmpfile_name)
    files = []
    Tnef.unpack(anexo) do |file|
        files << File.basename(file)
    end
    puts files.inspect
    files.size.should == 2
end

Hope this helps!
