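Summary:
I need to use jq to filter PII data out of a complex JSON object. I'm not good at destructuring and/or multi-pass scripting. I want to keep the non-PII properties rather than delete the PII properties, because if the backend adds a new PII property without telling me, I want to avoid leaking it.
In the simple case, I can easily "reconstruct" the desired JSON object from the input object, like this: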
{
  "data": {
    "id": "123",
    "pii": "sensitive"
  },
  "return-code": 200
}
jq '{data: {id: .data.id}, "return-code": ."return-code"}'
Once arrays are added to the mix, I can't see how to use this approach to solve the problem.
Simplified example of the complex object
Input:
{
  "customers": [
    {
      "id": "00000000001",
      "dateOfBirth": "sensitive DOB",
      "preferences": [
        {
          "preference-id": "0001",
          "pii-value": "senstive value 1"
        },
        {
          "preference-id": "0002",
          "pii-value": "senstive value 2"
        }
      ]
    },
    {
      "id": "00000000002",
      "dateOfBirth": "sensitive DOB",
      "preferences": [
        {
          "preference-id": "0003",
          "pii-value": "senstive value 3"
        },
        {
          "preference-id": "0004",
          "pii-value": "senstive value 4"
        }
      ]
    }
  ]
}
Desired output:
{
  "customers": [
    {
      "id": "00000000001",
      "preferences": [
        {
          "preference-id": "0001"
        },
        {
          "preference-id": "0002"
        }
      ]
    },
    {
      "id": "00000000002",
      "preferences": [
        {
          "preference-id": "0003"
        },
        {
          "preference-id": "0004"
        }
      ]
    }
  ]
}
Attempted approach with arrays:
jq '{ customers: [ { id: .customers[].id, preferences: [ .customers[].preferences ] } ]}'
The result starts combining data across different customers:
{
  "customers": [
    {
      "id": "00000000001",
      "preferences": [
        [
          {
            "preference-id": "0001",
            "pii-value": "senstive value 1"
          },
          {
            "preference-id": "0002",
            "pii-value": "senstive value 2"
          }
        ],
        [
          {
            "preference-id": "0003",
            "pii-value": "senstive value 3"
          },
          {
            "preference-id": "0004",
            "pii-value": "senstive value 4"
          }
        ]
      ]
    },
    {
      "id": "00000000002",
      "preferences": [
        [
          {
            "preference-id": "0001",
            "pii-value": "senstive value 1"
          },
          {
            "preference-id": "0002",
            "pii-value": "senstive value 2"
          }
        ],
        [
          {
            "preference-id": "0003",
            "pii-value": "senstive value 3"
          },
          {
            "preference-id": "0004",
            "pii-value": "senstive value 4"
          }
        ]
      ]
    }
  ]
}
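I really don't think this approach will work, and I'm at a loss for alternatives. This is a simplified example; the actual JSON is considerably larger, with many arrays at different nesting levels.
Any suggestions for approaches I could investigate?
Use a user-defined function to pick specific paths from the JSON: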
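# Keep only the values at the given paths; everything else is dropped.
def pick(paths):
  . as $in
  | reduce path(paths) as $path (null;
      setpath($path; $in | getpath($path))
    );

# Allowlist: each customer's id and each preference's preference-id.
pick(.customers[] | .id, .preferences[]."preference-id")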
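To try the filter locally, something like the following should work. It is only a sketch: it assumes `jq` is on the PATH, writes the question's sample input to a temporary file created with `mktemp`, and uses `-c` for compact output. The variable names `input` and `result` are just for illustration.

```shell
# Save the sample input from the question to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
{
  "customers": [
    {
      "id": "00000000001",
      "dateOfBirth": "sensitive DOB",
      "preferences": [
        {"preference-id": "0001", "pii-value": "senstive value 1"},
        {"preference-id": "0002", "pii-value": "senstive value 2"}
      ]
    },
    {
      "id": "00000000002",
      "dateOfBirth": "sensitive DOB",
      "preferences": [
        {"preference-id": "0003", "pii-value": "senstive value 3"},
        {"preference-id": "0004", "pii-value": "senstive value 4"}
      ]
    }
  ]
}
EOF

# Define pick inline, then list only the paths that are allowed through.
result=$(jq -c '
  def pick(paths):
    . as $in
    | reduce path(paths) as $path (null;
        setpath($path; $in | getpath($path)));
  pick(.customers[] | .id, .preferences[]."preference-id")
' "$input")

echo "$result"
rm -f "$input"
```

Because only the listed paths are copied into the result, a field the backend adds later (at any nesting level) stays out of the output until it is added to the `pick(...)` call.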
Online demo
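For comparison, the attempt in the question can also be repaired without a helper function. The mixing happens because `.customers[]` appears twice inside a single object constructor, and each occurrence iterates over all customers independently of the other. Iterating once per level and building each object from the current element avoids that. A minimal sketch, assuming `jq` is installed and using a trimmed copy of the sample input (one preference per customer) held in an illustrative `input` variable:

```shell
# Trimmed sample input: two customers, one preference each.
input='{"customers":[
  {"id":"00000000001","dateOfBirth":"sensitive DOB",
   "preferences":[{"preference-id":"0001","pii-value":"senstive value 1"}]},
  {"id":"00000000002","dateOfBirth":"sensitive DOB",
   "preferences":[{"preference-id":"0003","pii-value":"senstive value 3"}]}
]}'

# Iterate customers once; inside each customer, iterate its own preferences.
result=$(printf '%s' "$input" | jq -c '
  {customers: [
    .customers[]
    | {id: .id,
       preferences: [.preferences[] | {"preference-id": ."preference-id"}]}
  ]}
')

echo "$result"
```

This keeps the allowlist property too, since every object is rebuilt from named fields only. The trade-off is that the filter mirrors the full nesting of the document, which may get verbose for the larger real payload; the `pick` approach lists paths instead.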