内存转储格式为 gdb 中的 xxd



我正在尝试检查一个缓冲区,其中包含二进制格式的消息,但也包含字符串数据。例如,我正在使用以下 C 代码:

int main (void) {
    char buf[100] = "x01x02x03x04String DataxAAxBBxCC";
    return 0;
}

我想获得 buf 中内容的十六进制转储,格式类似于 xxd(我不在乎它是否完全匹配,我真正想要的是与可打印字符并排的十六进制转储(。

在 GDB 中,我可以使用类似的东西:

(gdb) x /100bx buf
0x7fffffffdf00: 0x01    0x02    0x03    0x04    0x53    0x74    0x72    0x69
0x7fffffffdf08: 0x6e    0x67    0x20    0x44    0x61    0x74    0x61    0xaa
0x7fffffffdf10: 0xbb    0xcc    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffdf18: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffdf20: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffdf28: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffdf30: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffdf38: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffdf40: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffdf48: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffdf50: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffdf58: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00

这很好,但很难以这种方式挑选字符串......或者我可以使用

(gdb) x /100bs buf
0x7fffffffdf00:  "01020304String Data252273314"
0x7fffffffdf13:  ""
0x7fffffffdf14:  ""
0x7fffffffdf15:  ""
0x7fffffffdf16:  ""
0x7fffffffdf17:  ""
...

这使得很难阅读二进制部分......我正在处理的实际消息中也有很多ascii nul,所以实际上它看起来一团糟。

我能想到的最好的办法就是这样做:

(gdb) dump binary memory dump.bin buf buf+100

然后

$ xxd dump.bin
0000000: 0102 0304 5374 7269 6e67 2044 6174 61aa  ....String Data.
0000010: bbcc 0000 0000 0000 0000 0000 0000 0000  ................
0000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000060: 0000 0000                                ....

但每次都这样做很痛苦。我想有人以前想要这个,所以想知道是否有人在 gdb 中找到了一种方法。 另外,您会以这种方式丢失原始内存中的地址。

我使用的是内置 python 支持的 GDB 7.4,所以我对使用漂亮的打印机或类似打印机的想法持开放态度,但我不知道如何设置。

(gdb) define xxd
>dump binary memory dump.bin $arg0 $arg0+$arg1
>shell xxd dump.bin
>end
(gdb) xxd &j 10 
0000000: 0000 0000 0000 0000 0000 0000 4d8c a7f7  ............M...
0000010: ff7f 0000 0000 0000 0000 0000 c8d7 ffff  ................
0000020: ff7f 0000 0000 0000

似乎很容易;-(

你可以写一个Python脚本(现代GDB版本有嵌入式Python解释器(来做同样的事情,并摆脱"掏空"的需要。

更新:

这是一个可能的Python实现(将其保存到xxd.py(:

class XXD(gdb.Command):
  def __init__(self):
    super(XXD, self).__init__("xxd", gdb.COMMAND_USER)
  def _PrintLine(self, offset, bytes, size):
      print('{:08x}: '.format(offset), end='')
      todo = size
      while todo >= 4:
        print(''.join('{:02x}'.format(b) for b in bytes[0:4]), end='')
        todo -= 4
        bytes = bytes[3:]
        if todo:
          print(' ', end='')
      # Print any remaining bytes
      print(''.join('{:02x}'.format(b) for b in bytes[0:todo]), end='')
      print()
      return size
  def invoke(self, arg, from_tty):
    args = arg.split()
    if len(args) != 2:
      print("xxd: <addr> <count>")
      return
    size = int(args[1])
    addr = gdb.parse_and_eval(args[0])
    inferior = gdb.inferiors()[0]
    bytes = inferior.read_memory(addr, size).tobytes()
    offset = int(addr)
    while size > 0:
      n = self._PrintLine(offset, bytes, min(len(bytes), 16))
      size -= n
      offset += n
      bytes = bytes[n:]
XXD()

像这样使用它:

// Sample program x.c
char foo[] = "abcdefghijklmopqrstuvwxyz";
int main() { return 0; }
gcc -g x.c
gdb -q ./a.out
(gdb) source xxd.py
Temporary breakpoint 1, main () at x.c:3
3       int main() { return 0; }
(gdb) xxd &foo[0] 18
00404030: 61626364 64656667 6768696a 6a6b6c6d
00404040: 7273

所以,我最终玩了一下python接口,并想出了这个:

import gdb
from curses.ascii import isgraph
def groups_of(iterable, size, first=0):
    first = first if first != 0 else size
    chunk, iterable = iterable[:first], iterable[first:]
    while chunk:
        yield chunk
        chunk, iterable = iterable[:size], iterable[size:]
class HexDump(gdb.Command):
    def __init__(self):
        super (HexDump, self).__init__ ('hex-dump', gdb.COMMAND_DATA)
    def invoke(self, arg, from_tty):
        argv = gdb.string_to_argv(arg)
        if len(argv) != 2:
            raise gdb.GdbError('hex-dump takes exactly 2 arguments.')
        addr = gdb.parse_and_eval(argv[0]).cast(
            gdb.lookup_type('void').pointer())
        try:
            bytes = int(gdb.parse_and_eval(argv[1]))
        except ValueError:
            raise gdb.GdbError('Byte count numst be an integer value.')
        inferior = gdb.selected_inferior()
        align = gdb.parameter('hex-dump-align')
        width = gdb.parameter('hex-dump-width')
        if width == 0:
            width = 16
        mem = inferior.read_memory(addr, bytes)
        pr_addr = int(str(addr), 16)
        pr_offset = width
        if align:
            pr_offset = width - (pr_addr % width)
            pr_addr -= pr_addr % width
        for group in groups_of(mem, width, pr_offset):
            print '0x%x: ' % (pr_addr,) + '   '*(width - pr_offset),
            print ' '.join(['%02X' % (ord(g),) for g in group]) + 
                '   ' * (width - len(group) if pr_offset == width else 0) + ' ',
            print ' '*(width - pr_offset) +  ''.join(
                [g if isgraph(g) or g == ' ' else '.' for g in group])
            pr_addr += width
            pr_offset = width
class HexDumpAlign(gdb.Parameter):
    def __init__(self):
        super (HexDumpAlign, self).__init__('hex-dump-align',
                                            gdb.COMMAND_DATA,
                                            gdb.PARAM_BOOLEAN)
    set_doc = 'Determines if hex-dump always starts at an "aligned" address (see hex-dump-width'
    show_doc = 'Hex dump alignment is currently'
class HexDumpWidth(gdb.Parameter):
    def __init__(self):
        super (HexDumpWidth, self).__init__('hex-dump-width',
                                            gdb.COMMAND_DATA,
                                            gdb.PARAM_INTEGER)
    set_doc = 'Set the number of bytes per line of hex-dump'
    show_doc = 'The number of bytes per line in hex-dump is'
HexDump()
HexDumpAlign()
HexDumpWidth()

我意识到这可能不是最美丽和最优雅的解决方案,但它可以完成工作并作为初稿工作。 它可以包含在~/.gdbinit中,例如:

python
sys.path.insert(0, '/path/to/module/dir')
import hexdump
end

然后可以与上面的程序一起使用,如下所示:

(gdb) hex-dump buf 100
0x7fffffffdf00:  01 02 03 04 53 74 72 69 6E 67 20 44 61 74 61 AA  ....String Data.
0x7fffffffdf10:  BB CC 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0x7fffffffdf20:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0x7fffffffdf30:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0x7fffffffdf40:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0x7fffffffdf50:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0x7fffffffdf60:  00 00 00 00                                      ....

还有其他一些很好的触摸:

(gdb) set hex-dump-align on
Determines if hex-dump always starts at an "aligned" address (see hex-dump-width
(gdb) hex-dump &buf[5] 95
0x7fffffffdf00:                 74 72 69 6E 67 20 44 61 74 61 AA       tring Data.
0x7fffffffdf10:  BB CC 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0x7fffffffdf20:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0x7fffffffdf30:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0x7fffffffdf40:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0x7fffffffdf50:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0x7fffffffdf60:  00 00 00 00                                      ....
(gdb) set hex-dump-width 8
Set the number of bytes per line of hex-dump
(gdb) hex-dump &buf[5] 95
0x7fffffffdf00:                 74 72 69       tri
0x7fffffffdf08:  6E 67 20 44 61 74 61 AA  ng Data.
0x7fffffffdf10:  BB CC 00 00 00 00 00 00  ........
0x7fffffffdf18:  00 00 00 00 00 00 00 00  ........
0x7fffffffdf20:  00 00 00 00 00 00 00 00  ........
0x7fffffffdf28:  00 00 00 00 00 00 00 00  ........
0x7fffffffdf30:  00 00 00 00 00 00 00 00  ........
0x7fffffffdf38:  00 00 00 00 00 00 00 00  ........
0x7fffffffdf40:  00 00 00 00 00 00 00 00  ........
0x7fffffffdf48:  00 00 00 00 00 00 00 00  ........
0x7fffffffdf50:  00 00 00 00 00 00 00 00  ........
0x7fffffffdf58:  00 00 00 00 00 00 00 00  ........
0x7fffffffdf60:  00 00 00 00              ....

不承诺没有错误:)。 如果人们感兴趣,我可能会把它贴在 github 或其他东西上。

我只用 GDB 7.4 测试过它。

我自己的贡献,来自Employed Russian solution和Roger Lipscombe评论:

  • 使用 xxd,
  • 保留地址 (xxd -o(
  • 大小参数是可选的
  • 包括小型文档

脚本(使用 gdb 7.8.1 测试(:

define xxd
  if $argc < 2
    set $size = sizeof(*$arg0)
  else
    set $size = $arg1
  end
  dump binary memory dump.bin $arg0 ((void *)$arg0)+$size
  eval "shell xxd -o %d dump.bin; rm dump.bin", ((void *)$arg0)
end
document xxd
  Dump memory with xxd command (keep the address as offset)
  xxd addr [size]
    addr -- expression resolvable as an address
    size -- size (in byte) of memory to dump
            sizeof(*addr) is used by default
end

例子:

(gdb) p &m_data
$1 = (data_t *) 0x200130dc <m_data>
(gdb) p sizeof(m_data)
$2 = 32
(gdb) xxd &m_data 32
200130dc: 0300 0000 e87c 0400 0000 0000 0100 0000  .....|..........
200130ec: 0c01 0000 b831 0020 0100 0000 0100 0000  .....1. ........
(gdb) xxd &m_data
200130dc: 0300 0000 e87c 0400 0000 0000 0100 0000  .....|..........
200130ec: 0c01 0000 b831 0020 0100 0000 0100 0000  .....1. ........
(gdb) help xxd
  Dump memory with xxd command (keep the address as offset)
  xxd addr [size]
    addr -- expression resolvable as an address
    size -- size (in byte) of memory to dump
            sizeof(*addr) is used by default

用户致命错误解决方案的改编版本

  • 适用于 Python 3
  • 添加了十六进制标题
  • 长度参数可选
  • 重命名为 HD

例子

高清0xbfffe4f1

高清 0xbfffe4f1 500

import gdb
from curses.ascii import isgraph
def groups_of(iterable, size, first=0):
    first = first if first != 0 else size
    chunk, iterable = iterable[:first], iterable[first:]
    while chunk:
        yield chunk
        chunk, iterable = iterable[:size], iterable[size:]
class HexDump(gdb.Command):
    def __init__(self):
        super (HexDump, self).__init__ ('hd', gdb.COMMAND_DATA)
    def invoke(self, arg, from_tty):
        argv = gdb.string_to_argv(arg)
        addr = gdb.parse_and_eval(argv[0]).cast(
            gdb.lookup_type('void').pointer())
        if len(argv) == 2:
             try:
                 bytes = int(gdb.parse_and_eval(argv[1]))
             except ValueError:
                 raise gdb.GdbError('Byte count numst be an integer value.')
        else:
             bytes = 500
        inferior = gdb.selected_inferior()
        align = gdb.parameter('hex-dump-align')
        width = gdb.parameter('hex-dump-width')
        if width == 0:
            width = 16
        mem = inferior.read_memory(addr, bytes)
        pr_addr = int(str(addr), 16)
        pr_offset = width
        if align:
            pr_offset = width - (pr_addr % width)
            pr_addr -= pr_addr % width
        start=(pr_addr) & 0xff;

        print ('            ' , end="")
        print ('  '.join(['%01X' % (i&0x0f,) for i in range(start,start+width)]) , end="")
        print ('  ' , end="")       
        print (' '.join(['%01X' % (i&0x0f,) for i in range(start,start+width)]) )
        for group in groups_of(mem, width, pr_offset):
            print ('0x%x: ' % (pr_addr,) + '   '*(width - pr_offset), end="")
            print (' '.join(['%02X' % (ord(g),) for g in group]) + 
                '   ' * (width - len(group) if pr_offset == width else 0) + ' ', end="")    
            print (' '*(width - pr_offset) +  ' '.join(
                [chr( int.from_bytes(g, byteorder='big')) if isgraph( int.from_bytes(g, byteorder='big')   ) or g == ' ' else '.' for g in group]))
            pr_addr += width
            pr_offset = width
class HexDumpAlign(gdb.Parameter):
    def __init__(self):
        super (HexDumpAlign, self).__init__('hex-dump-align',
                                            gdb.COMMAND_DATA,
                                            gdb.PARAM_BOOLEAN)
    set_doc = 'Determines if hex-dump always starts at an "aligned" address (see hex-dump-width'
    show_doc = 'Hex dump alignment is currently'
class HexDumpWidth(gdb.Parameter):
    def __init__(self):
        super (HexDumpWidth, self).__init__('hex-dump-width',
                                            gdb.COMMAND_DATA,
                                            gdb.PARAM_INTEGER)
    set_doc = 'Set the number of bytes per line of hex-dump'
    show_doc = 'The number of bytes per line in hex-dump is'
HexDump()
HexDumpAlign()
HexDumpWidth()

不幸的是,@FatalError和@gunthor的版本对我不起作用,所以我自己写了另一个。这是它的样子:

(gdb) xxd hello_string 0xc
00000001_00000f87:                  48 656c 6c6f 0957 6f72         Hello.Wor
00000001_00000f90: 6c64 0a                                  ld.

较新版本的 xxd 支持 -o 标志,该标志允许指定要添加到显示的偏移量(将始终从 0000000 开始(。

如果xxd -o不可用,这里有一个替代品,可以正确对齐并显示xxd'd的位置的地址。

xxd命令:

define xxd
    dump binary memory /tmp/dump.bin $arg0 $arg0+$arg1
    eval "shell xxd-o %p /tmp/dump.bin", $arg0
end

可以说是丑陋的perl脚本xxd-o(xxd偏移量(:

#!/usr/bin/env perl
use IPC::Open2; 
$SIG{'__WARN__'} = sub{ die "$0: $!n" };
my $offset = shift // "0";
$offset = oct($offset) if $offset =~ /^0/;
my $base = $offset >= 2**32 ? 16 : 8;
my $zeroes = $offset % 16;
my $padding = 1 + int($zeroes / 2) + 2*$zeroes;
my $bytestr = "" x $zeroes;
{ local $/; $bytestr .= <> }
open2(*XXD_OUT, *XXD_IN, "xxd") or die "xxd is not available!";
print XXD_IN $bytestr; close XXD_IN;
if ($zeroes) {
    $_ = <XXD_OUT>;
    s/^(.{50}).{$zeroes}/$1 . (' ' x $zeroes)/ge;
    s/^([[:xdigit:]]+:).{$padding}/$1 . (' ' x $padding)/ge;
    my $newoff = sprintf("%0${base}x",hex($1)+$offset) =~ s/^(.{8})(.{8})$/$1_$2/r;
    s/^([[:xdigit:]]+):/$newoff:/g;
    print
}
while (<XXD_OUT>) {
    s/^([[:xdigit:]]+)(?=:)/sprintf("%0${base}x", hex($1)+$offset-$offset%16) =~ s[^(.{8})(.{8})$][$1_$2]r/ge;
    print
}

欢迎改进! :-(

最新更新