I want to parse nested JSON into CSV. The data looks similar to this:
{"tables":[{"name":"PrimaryResult","columns":[{"name":"name","type":"string"},{"name":"id","type":"string"},{"name":"custom","type":"dynamic"}]"rows":[["Alpha","1","{"age":"23","number":"xyz"}]]]}
The CSV file I want is:
name id age number
alpha 1 23 xyz
I tried:
jq -r ".tables | .[] | .columns | map(.name)|@csv" demo.json > demo.csv
jq -r ".tables | .[] | .rows |.[]|@csv" demo.json >> demo.csv
But I did not get the expected result.
Output:
name id custom
alpha 1 {"age":"23","number":"xyz}
Expected:
name id age number
alpha 1 23 xyz
Assuming valid JSON input:
{
  "tables": [
    {
      "name": "PrimaryResult",
      "columns": [
        { "name": "name", "type": "string" },
        { "name": "id", "type": "string" },
        { "name": "custom", "type": "dynamic" }
      ],
      "rows": [
        "Alpha",
        "1",
        "{\"age\":\"23\",\"number\":\"xyz\"}"
      ]
    }
  ]
}
and assuming fixed headers:
jq -r '["name", "id", "age", "number"],
(.tables[].rows | [.[0,1], (.[2] | fromjson | .age, .number)])
| @csv' input.json
Output:
"name","id","age","number"
"Alpha","1","23","xyz"
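If any of these assumptions is wrong, you need to clarify your requirements, for example:
- How are the column names determined?
- What should happen if the input contains multiple tables?
- Do the "dynamic" objects always have the same shape? Or can they sometimes contain fewer, more, or different columns?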
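If the rows turn out to be nested one level deeper, as the sample in the question suggests (an array of row arrays rather than a single flat row), the fixed-header filter above might only need `.rows[]` in place of `.rows`. A minimal sketch, assuming that shape; the inline `json` variable is a made-up sample, not the asker's actual data:

```shell
# Hypothetical sample: "rows" is an array of arrays, and the third field
# of each row is a JSON-encoded string.
json='{"tables":[{"name":"PrimaryResult","columns":[{"name":"name","type":"string"},{"name":"id","type":"string"},{"name":"custom","type":"dynamic"}],"rows":[["Alpha","1","{\"age\":\"23\",\"number\":\"xyz\"}"]]}]}'

# Same fixed headers as before; .rows[] visits each inner row array,
# and fromjson decodes the embedded object before picking its fields.
result=$(printf '%s\n' "$json" | jq -r '["name", "id", "age", "number"],
  (.tables[].rows[] | [.[0,1], (.[2] | fromjson | .age, .number)])
  | @csv')

printf '%s\n' "$result"
```

This still hard-codes both the header line and the keys read from the decoded object, so it only fits data where every row carries `age` and `number`.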
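Assuming the .rows array is a two-dimensional array of rows and fields, and that a column of type "dynamic" always holds a JSON-encoded object whose fields represent further columns, which may or may not be present in every row.
You can then transpose the headers array together with the rows array, so that each column is processed as a whole according to its type. In particular, all keys of the "dynamic" column are collected on the fly. Finally, transpose back to get row-based CSV output.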
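The two intermediate steps described above can be inspected on their own. The following is only an illustrative sketch on a trimmed-down, made-up input of the same shape; the variable names `json`, `heads`, and `keys` are hypothetical:

```shell
# Trimmed-down sample in the assumed shape: one "string" column, one "dynamic" column.
json='{"tables":[{"columns":[{"name":"name","type":"string"},{"name":"custom","type":"dynamic"}],"rows":[["Alpha","{\"age\":\"23\",\"number\":\"123\"}"],["Beta","{\"age\":\"45\",\"word\":\"xyz\"}"]]}]}'

# Step 1: after transposing, each array is one column: its header object
# first, then one value per row. Print the column name and its value count.
heads=$(printf '%s\n' "$json" |
  jq -c '.tables[] | [.columns, .rows[]] | transpose[] | [first.name, length - 1]')

# Step 2: for the "dynamic" column, drop the header, decode every value,
# and take the sorted union of all keys seen in any row.
keys=$(printf '%s\n' "$json" |
  jq -c '.tables[] | [.columns, .rows[]] | transpose[]
         | select(first.type == "dynamic")
         | .[1:] | map(fromjson) | map(keys[]) | unique')

printf '%s\n' "$heads" "$keys"
```

Here `heads` holds `["name",2]` and `["custom",2]`, and `keys` holds `["age","number","word"]`, which become the extra header names once the columns are transposed back into rows.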
Input (I added another row for illustration):
{
  "tables": [
    {
      "name": "PrimaryResult",
      "columns": [
        {
          "name": "name",
          "type": "string"
        },
        {
          "name": "id",
          "type": "string"
        },
        {
          "name": "custom",
          "type": "dynamic"
        }
      ],
      "rows": [
        [
          "Alpha",
          "1",
          "{\"age\":\"23\",\"number\":\"123\"}"
        ],
        [
          "Beta",
          "2",
          "{\"age\":\"45\",\"word\":\"xyz\"}"
        ]
      ]
    }
  ]
}
Filter:
jq -r '
.tables[] | [.columns, .rows[]] | transpose | map(
if first.type == "string" then first |= .name
elif first.type == "dynamic" then
.[1:] | map(fromjson)
| (map(keys[]) | unique) as $keys
| [$keys, (.[] | [.[$keys[]]])] | transpose[]
else empty end
)
| transpose[] | @csv
'
Output:
"name","id","age","number","word"
"Alpha","1","23","123",
"Beta","2","45",,"xyz"