Identifying an unknown structured data format, possibly class-oriented JSON



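I'm currently working with a Sophos UTM, pushing wireless statistics to another platform, and trying to digest the data format shown below.
It is cleanly structured and looks like class-oriented JSON, but I can't quite work out what it is or how to convert it into something usable. Any ideas? I've thought about writing a JSON converter in PHP, but I'm worried I might just be missing a piece of the puzzle.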

{
'clients' => {
'0c:2c:54:xx:xx:xx' => {
'ap' => '',
'connected_time_sec' => 1720,
'connected_time_str' => '00:28:40',
'hwaddr' => '0c:2c:54:xx:xx:xx',
'ip' => '172.16.28.206',
'last_rxrate_bps' => '1048576',
'last_rxrate_str' => '1024.0 kbit/s',
'last_txrate_bps' => '6815744',
'last_txrate_str' => '6.5 Mbit/s',
'lastseen_str' => '2018-11-04 18:06:37',
'lastseen_ts' => 1541351197,
'mesh_id' => '',
'mesh_mode' => 'none',
'name' => 'HUAWEI_P20_Pro',
'radio_id' => '0',
'signal_per' => '39',
'ssid' => 'ssid',
'vendor' => 'unknown'
},
'44:d8:84:xx:xx:xx' => {
'ap' => 'A40001AXX8FXXXX',
'connected_time_sec' => 534992,
'connected_time_str' => '06:04:36:32',
'hwaddr' => '44:d8:84:xx:xx:xx',
'ip' => '172.16.28.149',
'last_rxrate_bps' => '1048576',
'last_rxrate_str' => '1024.0 kbit/s',
'last_txrate_bps' => '60607488',
'last_txrate_str' => '57.8 Mbit/s',
'lastseen_str' => '2018-11-04 20:44:28',
'lastseen_ts' => 1541360668,
'mesh_id' => '',
'mesh_mode' => 'none',
'name' => 'iMac-OBC',
'radio_id' => '0',
'signal_per' => '65',
'ssid' => 'ssid',
'vendor' => 'Apple'
}
},
'connected' => {
'A40001AXX8FXXXX' => {
'associated_clients' => [
  'ab:cd:ef:gh:ij:kl',
  '44:d8:84:xx:xx:xx',
],
'bss' => undef,
'id' => 'A40001AXX8FXXXX',
'ip' => '192.168.10.11',
'lan_mac' => '00:1a:8c:xx:xx:xx',
'location' => 'AP30',
'type' => 'AP30',
'wifi_mac' => '00:1a:8c:xx:xx:xx'
},
'A4000EASIJDFSDOI' => {
'associated_clients' => [],
'bss' => undef,
'id' => 'A4000EASIJDFSDOI',
'ip' => '192.168.10.12',
'lan_mac' => '00:1a:8c:xx:xx:xx',
'location' => 'AP30',
'type' => 'AP30',
'wifi_mac' => '00:1a:8c:xx:xx:xx'
}
},
'disconnected' => {},
'lastupdate' => 1541360678
}

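Your sample data needs 4 fixes on its road to valid-JSON town.

Make the following replacements:

  1. Replace => with :
  2. Replace ' with "
  3. Remove every , that is followed by zero or more whitespace characters and then a ]
  4. Wrap undef values in double quotes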

Code:

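// Paste the raw dump between the markers (nowdoc: no variable interpolation).
$almostjson = <<<'ALMOSTJSON'
...your input string
ALMOSTJSON;
// In order: => to :, ' to ", drop trailing commas before ], quote bare undef values.
$json = preg_replace(['~=>~', "~'~", '~,(?=\s*\])~', '~:\s+\Kundef~'], [':', '"', '', '"$0"'], $almostjson);
var_export(json_decode($json, true));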
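
To see the four replacements act on something concrete, here is a minimal, self-contained run on a trimmed fragment of the sample above. The fragment and the variable names `$patterns` and `$replacements` are chosen only for illustration; treat this as a sketch rather than a definitive converter.

```php
<?php
// Trimmed fragment of the sample dump (nowdoc, so nothing is interpolated).
$almostjson = <<<'ALMOSTJSON'
{
'connected' => {
'A40001AXX8FXXXX' => {
'associated_clients' => [
  'ab:cd:ef:gh:ij:kl',
  '44:d8:84:xx:xx:xx',
],
'bss' => undef,
'id' => 'A40001AXX8FXXXX'
}
},
'lastupdate' => 1541360678
}
ALMOSTJSON;

$patterns = [
    '~=>~',           // 1. key/value separator
    "~'~",            // 2. single quotes
    '~,(?=\s*\])~',   // 3. trailing comma before a closing bracket
    '~:\s+\Kundef~',  // 4. bare undef after a colon (\K keeps the colon out of the match)
];
$replacements = [':', '"', '', '"$0"'];

// preg_replace applies the pattern/replacement pairs in array order.
$json = preg_replace($patterns, $replacements, $almostjson);
$data = json_decode($json, true);

var_export($data['connected']['A40001AXX8FXXXX']['bss']);  // 'undef'
echo PHP_EOL;
var_export($data['lastupdate']);                           // 1541360678
echo PHP_EOL;
```

The colons inside the MAC addresses are untouched because only => is rewritten, and the trailing comma in associated_clients is dropped before decoding.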

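Calling regex functions on a string that contains key-value relationships is prone to unintended matches. This "solution" should be regarded as a band-aid until the data source is improved. If the string ever contains awkward text (for example, an apostrophe or => inside a value), this solution may silently fail in the future.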
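
One way to make such a failure loud instead of silent is to let json_decode throw. The sketch below assumes PHP 7.3 or later (for JSON_THROW_ON_ERROR); the function name almostJsonToArray is hypothetical and not part of the original answer.

```php
<?php
/**
 * Hypothetical helper: applies the four replacements from the answer,
 * then raises an exception if the result is not valid JSON.
 */
function almostJsonToArray(string $almostjson): array
{
    $json = preg_replace(
        ['~=>~', "~'~", '~,(?=\s*\])~', '~:\s+\Kundef~'],
        [':', '"', '', '"$0"'],
        $almostjson
    );
    if ($json === null) {
        // preg_replace returns null when the regex engine itself fails.
        throw new RuntimeException('preg_replace failed, error code ' . preg_last_error());
    }

    // JSON_THROW_ON_ERROR raises JsonException instead of returning null.
    $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    if (!is_array($data)) {
        throw new UnexpectedValueException('Expected a JSON object or array at the top level');
    }
    return $data;
}

try {
    // An apostrophe inside a value breaks the quote replacement.
    almostJsonToArray("{ 'name' => 'Bob's iPhone' }");
} catch (JsonException $e) {
    echo 'Conversion failed: ', $e->getMessage(), PHP_EOL;
}
```

This only detects input that ends up as invalid JSON. A stray => inside a value would still be rewritten and decoded without complaint, so the caveat above stands.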
