为每个数组元素生成单独的CSV记录



我有一个JSON:

{
"Country": "USA",
"State": "TX",
"Employees": [
{
"Name": "Name1",
"address": "SomeAdress1"
}
]
}
{
"Country": "USA",
"State": "FL",
"Employees": [
{
"Name": "Name2",
"address": "SomeAdress2"
},
{
"Name": "Name3",
"address": "SomeAdress3"
}
]
}
{
"Country": "USA",
"State": "CA",
"Employees": [
{
"Name": "Name4",
"address": "SomeAdress4"
}
]
}

我想使用jq以csv格式获得以下结果:

国家,州,姓名,地址

USA, TX, Name1, SomeAdress1
USA, FL, Name2, SomeAdress2
USA, FL, Name3, SomeAdress3
USA, CA, Name4, SomeAdress4

我有以下jq:

jq -r '.|[.Country,.State,(.Employees[]|.Name,.address)] | @csv'

,我得到以下与第二行有更多的列比需要的。我想把这些额外的列放在单独的行中:

"USA","TX","Name1","SomeAdress1"
"USA","FL","Name2","SomeAdress2","Name3","SomeAdress3"
"USA","CA","Name4","SomeAdress4"

我想要下面的结果:

"USA","TX","Name1","SomeAdress1"
"USA","FL","Name2","SomeAdress2"
"USA","FL","Name3","SomeAdress3"
"USA","CA","Name4","SomeAdress4"

您需要为每个员工生成一个单独的数组。

[.Country, .State] + (.Employees[] | [.Name, .address]) | @csv
<<p><一口>在线演示/一口>

您可以将根对象存储在一个变量中,然后展开Employees数组:

$ jq -r '. as $root | .Employees[]|[$root.Country, $root.State, .Name, .address] | @csv'
"USA","TX","Name1","SomeAdress1"
"USA","FL","Name2","SomeAdress2"
"USA","FL","Name3","SomeAdress3"
"USA","CA","Name4","SomeAdress4"

其他答案都很好,但我想谈谈为什么你的尝试不起作用,以及为什么它看起来应该起作用。

你想知道为什么:

jq -r '.|[.Country,.State,(.Employees[]|.Name,.address)] | @csv'

生产:

"USA","TX","Name1","SomeAdress1"
"USA","FL","Name2","SomeAdress2","Name3","SomeAdress3"
"USA","CA","Name4","SomeAdress4"

可能是因为:

jq '{Country:.Country,State:.State,Name:(.Employees[]|.Name)}'

生产:

{
"Country": "USA",
"State": "TX",
"Name": "Name1"
}
{
"Country": "USA",
"State": "FL",
"Name": "Name2"
}
{
"Country": "USA",
"State": "FL",
"Name": "Name3"
}
{
"Country": "USA",
"State": "CA",
"Name": "Name4"
}

事实证明,区别在于[...]{...}jq滤波器中的作用。在数组构造函数[...]中,方括号、逗号和所有的全部内容是一个过滤器,它被完全求值,并将所有结果组合到一个数组中。里面的每个逗号都是排序运算符,这意味着先从左边的过滤器中生成所有值,然后再从右边的过滤器中生成所有值。相比之下,{...}对象构造函数中的逗号是语法的部分,只是分隔对象的字段。如果任何字段表达式产生多个值,则生成多个完整对象。如果多个字段表达式产生多个值,那么对于每个组合产生的值,将得到一个完整的对象。

当你这样做的时候:

jq -r '.|[.Country,.State,(.Employees[]|.Name,.address)] | @csv'
^      ^                   ^
1      2                   3

的问题是标记为"1", "2"one_answers"3"都在做同样的事情,在左边计算所有过滤器的值,然后在右边计算所有过滤器的值。然后,数组构造函数捕获所有这些变量并生成单个数组。数组构造函数永远不会为一个输入创建多个数组。

因此,考虑到这一点,您需要确保展开.Employees[]的地方不在数组构造函数中。这里有另一个选项可以添加到您已经有的答案中:
jq -r '.Employee=.Employees[]|[.Country,.State,.Employee.Name,.Employee.address]|@csv'

演示甚至:

jq -r '.Employees[] as $e|[.Country,.State,$e.Name,$e.address]|@csv'

演示

最新更新