PDF integrity verification fails



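I am trying to verify the integrity of a PDF file using bash commands.

Using dd, I extracted the PDF's signed content and the detached PKCS#7 object.

Then I decoded the PKCS#7 object as follows.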

xxd -r -p pkcs7_extracted > pkcs7_extracted.bin
openssl asn1parse -inform DER <pkcs7_extracted.bin >pkcs7_extracted_decoded

From the decoded PKCS#7 object I got some useful information, such as

 0:d=0  hl=4 l=5498 cons: SEQUENCE         
 4:d=1  hl=2 l=   9 prim: OBJECT            :pkcs7-signedData
 15:d=1  hl=4 l=5483 cons: cont [ 0 ]        
 19:d=2  hl=4 l=5479 cons: SEQUENCE          
 23:d=3  hl=2 l=   1 prim: INTEGER           :01
 26:d=3  hl=2 l=  15 cons: SET               
 28:d=4  hl=2 l=  13 cons: SEQUENCE          
 30:d=5  hl=2 l=   9 prim: OBJECT            :sha256
 41:d=5  hl=2 l=   0 prim: NULL              
 43:d=3  hl=2 l=  11 cons: SEQUENCE          
 ...
 5154:d=7  hl=2 l=   9 prim: OBJECT            :contentType
 5165:d=7  hl=2 l=  11 cons: SET               
 5167:d=8  hl=2 l=   9 prim: OBJECT            :pkcs7-data
 5178:d=6  hl=2 l=  47 cons: SEQUENCE          
 5180:d=7  hl=2 l=   9 prim: OBJECT            :messageDigest
 5191:d=7  hl=2 l=  34 cons: SET               
 5193:d=8  hl=2 l=  32 prim: OCTET STRING      [HEX DUMP]:18B399D208A08815DDF23C93B1B63B13757A6AA24B1932569D7A69D0DB3A34C2
 5227:d=5  hl=2 l=  13 cons: SEQUENCE          
 5229:d=6  hl=2 l=   9 prim: OBJECT            :sha256WithRSAEncryption
 5240:d=6  hl=2 l=   0 prim: NULL              
 5242:d=5  hl=4 l= 256 prim: OCTET STRING      [HEX DUMP]:8F4B21914173EC57E6B0533BB5E04FB7054F23AC299C1BDBF589ED164A3EABB611727BE9117AAC3161D9C18DCA08BD113DD3AA90E5922009FA12BA59E7F6587E81CD79BDED09F862C2C76F35D950926F1A31A3DCCE999A52DCE0C7F67D081E81A44397E8AF96A1051B8E51F2E2271221B06D05C9895E1846B1DBE02B558F5B9EF97C7EB0FF9A7C71A9764D5E205900818F07E82027D79D3F9A5AA72B3A0CF131F1B890D0BCBF3C4DD8A0229FABE15F6C2CA0CE079EB925B3998A1A6190596A88D8F07C1C12B8750636E69108E30E643A653B285A400080C9C5590C112451F6D69BAFC2686D6F1107B37A5DB36B9F797C49E61D4B44E62E17DD541778DE763AC5
 5502:d=0  hl=2 l=   0 prim: EOC              

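In particular, I noticed that the messageDigest field equals the digest I computed over the signed content obtained using the ByteRange.

I then extracted the encrypted hash, decrypted it with my public key, and decoded the result again with the asn1parse command.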

dd if=pkcs7_extracted.bin of=extracted.sign.bin bs=1 skip=$((5242 + 4)) count=256
#decrypt
openssl rsautl -verify -pubin -inkey publickey.pem < extracted.sign.bin > verified.bin
#decode of result
openssl asn1parse -inform der -in verified.bin

The result is this object

0:d=0  hl=2 l=  49 cons: SEQUENCE          
2:d=1  hl=2 l=  13 cons: SEQUENCE          
4:d=2  hl=2 l=   9 prim: OBJECT            :sha256
15:d=2  hl=2 l=   0 prim: NULL              
17:d=1  hl=2 l=  32 prim: OCTET STRING      [HEX DUMP]:EBAA31519CD0CCA793FEC34AA6BDD8DFA5E4D5F63BA4711F6C8ECE5D20FEF393

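I am fairly sure the decryption worked, because the object decodes correctly and, as I expected, contains a sha256 object. But as you can see, the digest value is different...

Am I looking in the wrong place? I don't know how to verify the integrity.

Also, Acrobat does of course validate the integrity of this signed document.

Thanks in advance!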

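Note that in a SignedData object there are multiple hash values to consider, and these are usually not equal.

Have a look at the definition of Cryptographic Message Syntax (CMS) objects in RFC 3852.

(RFC 3852 is the RFC referenced by the current PDF specification, ISO 32000-1; so even though it has been obsoleted by RFC 5652, the changes in the newer RFC may not apply in this context.)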

  SignedData ::= SEQUENCE {
    version CMSVersion,
    digestAlgorithms DigestAlgorithmIdentifiers,
    encapContentInfo EncapsulatedContentInfo,
    certificates [0] IMPLICIT CertificateSet OPTIONAL,
    crls [1] IMPLICIT RevocationInfoChoices OPTIONAL,
    signerInfos SignerInfos }
...
  SignerInfo ::= SEQUENCE {
    version CMSVersion,
    sid SignerIdentifier,
    digestAlgorithm DigestAlgorithmIdentifier,
    signedAttrs [0] IMPLICIT SignedAttributes OPTIONAL,
    signatureAlgorithm SignatureAlgorithmIdentifier,
    signature SignatureValue,
    unsignedAttrs [1] IMPLICIT UnsignedAttributes OPTIONAL }
...
  SignedAttributes ::= SET SIZE (1..MAX) OF Attribute
...
  signedAttrs is a collection of attributes that are signed.  The
  field is optional, but it MUST be present if the content type of
  the EncapsulatedContentInfo value being signed is not id-data.
  SignedAttributes MUST be DER encoded, even if the rest of the
  structure is BER encoded.  Useful attribute types, such as signing
  time, are defined in Section 11.  If the field is present, it MUST
  contain, at a minimum, the following two attributes:
     A content-type attribute having as its value the content type
     of the EncapsulatedContentInfo value being signed.  Section
     11.1 defines the content-type attribute.  However, the
     content-type attribute MUST NOT be used as part of a
     countersignature unsigned attribute as defined in section 11.4.
     A message-digest attribute, having as its value the message
     digest of the content.  Section 11.2 defines the message-digest
     attribute.
...
  The result of the message digest calculation process depends on
  whether the signedAttrs field is present.  When the field is absent,
  the result is just the message digest of the content as described
  above.  When the field is present, however, the result is the message
  digest of the complete DER encoding of the SignedAttrs value
  contained in the signedAttrs field.  Since the SignedAttrs value,
  when present, must contain the content-type and the message-digest
  attributes, those values are indirectly included in the result.

Thus, your observation

"the messageDigest field equals the digest I computed over the signed content obtained using the ByteRange"

 5178:d=6  hl=2 l=  47 cons: SEQUENCE          
 5180:d=7  hl=2 l=   9 prim: OBJECT            :messageDigest
 5191:d=7  hl=2 l=  34 cons: SET               
 5193:d=8  hl=2 l=  32 prim: OCTET STRING      [HEX DUMP]:18B399D208A08815DDF23C93B1B63B13757A6AA24B1932569D7A69D0DB3A34C2

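indicates that the correct data was signed, because the value of the message-digest attribute is supposed to be the message digest of the content.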
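For readers who want to reproduce that check, here is a minimal sketch of hashing the signed byte ranges. The function name byterange_sha256 and its argument order are made up for illustration; the four numbers are assumed to come from the PDF's /ByteRange array (start1 len1 start2 len2), and SHA-256 is assumed because that is the digest algorithm named in the dump above.

```shell
# Hash the two signed byte ranges of a PDF as one continuous stream.
# Arguments: file, then the four /ByteRange numbers: start1 len1 start2 len2.
# (Hypothetical helper, not part of the original post.)
byterange_sha256() {
    file=$1 start1=$2 len1=$3 start2=$4 len2=$5
    {
        # tail -c +N counts from 1, so a 0-based offset K becomes +$((K + 1))
        tail -c "+$((start1 + 1))" "$file" | head -c "$len1"
        tail -c "+$((start2 + 1))" "$file" | head -c "$len2"
    } | openssl dgst -sha256 -r | cut -d' ' -f1 | tr 'a-f' 'A-F'
}
```

The output is upper-case hex so that it can be compared directly with the [HEX DUMP] value of the messageDigest attribute printed by asn1parse.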

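But as the RFC excerpt above also says, the data signed by the actual inner signature bytes (the ones you decrypted) is not this message digest of the content but the collection of attributes in signedAttrs, or more precisely the digest of its complete DER encoding.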

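Therefore, you must not verify these signature bytes against the hash of the content, but against the hash of the signed attributes as described in the RFC.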
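A rough sketch of computing that signed-attributes hash follows. One detail from RFC 3852 section 5.4 matters here: for the digest calculation the IMPLICIT [0] tag of signedAttrs is not used; an EXPLICIT SET OF tag is used instead, so the leading byte 0xA0 has to be swapped for 0x31 before hashing. The function name signed_attrs_sha256 is hypothetical, and the offset of the signedAttrs element is not visible in the dump above (it lies in the elided part), so it has to be read from the full asn1parse output: look for the cont [ 0 ] line inside the SignerInfo and pass its offset and hl + l.

```shell
# Hash the signedAttrs of a DER-encoded SignedData the way CMS requires.
# Arguments: DER file, offset of the signedAttrs "cont [ 0 ]" element as
# printed by asn1parse, and its total size (hl + l from the same line).
# (Hypothetical helper; the offset must be looked up, it is not in the post.)
signed_attrs_sha256() {
    der=$1 offset=$2 size=$3
    {
        # Emit the universal SET OF tag 0x31 (octal 061) in place of the
        # context-specific tag 0xA0 that is stored inside SignerInfo ...
        printf '\061'
        # ... followed by the unchanged length and content bytes.
        tail -c "+$((offset + 2))" "$der" | head -c "$((size - 1))"
    } | openssl dgst -sha256 -r | cut -d' ' -f1 | tr 'a-f' 'A-F'
}
```

If the signature is intact, the value printed by this function should be the one to compare with the OCTET STRING recovered from the decrypted signature bytes, rather than the messageDigest value.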

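PS: The OP has meanwhile found another answer on the topic of CMS signed data verification, which also illustrates more graphically how to identify which attributes are signed and which are not.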

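PPS: The OP verified by decrypting the signature bytes, extracting the contained hash, and comparing it with the actual hash. This is fine for RSA-based signatures. DSA- or ECDSA-based signatures, however, cannot be decrypted, so the hash cannot be extracted; they have to be verified using dedicated verification routines.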
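One way to get such a verification routine without writing it by hand is to let OpenSSL verify the whole detached CMS container, which works independently of the signature algorithm. This is only a sketch: the function name verify_detached_cms is made up, and the second argument is assumed to be a file holding the concatenated ByteRange bytes (no such file name appears in the question).

```shell
# Let OpenSSL run the complete CMS verification of a detached signature.
# Arguments: DER-encoded signature container, file with the signed bytes.
# -noverify skips certificate chain validation, so this only checks that the
# signature matches the content, not whether the signer is trusted.
# (Hypothetical helper, not part of the original post.)
verify_detached_cms() {
    openssl cms -verify -binary -noverify \
        -inform DER -in "$1" -content "$2" -out /dev/null
}
```

The function returns a zero exit status when OpenSSL accepts the signature and a non-zero status otherwise, so it can be used directly in an if statement.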

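PPPS: There are different styles of integrated PDF signatures. While the style used here (PKCS#7/CAdES detached) is the most common and recommended one, a generic solution must first check which style is used and verify accordingly.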
