我有一些JSON文件,全部具有相同的结构(到处都是相同的键,相应的值可能会有所不同(。我想将与某些密钥关联的值收集到列表中,并将这些列表作为与这些密钥关联的值存储在新的JSON文件中。
作为例如,请考虑这三个文件,其中我对密钥number_items
和相应值感兴趣。第一个文件 -
[
{
"box_id": 1,
"number_items": 4
},
{
"box_id": 3,
"number_items": 15
},
{
"box_id": 6,
"number_items": 2
}
]
第二个文件 -
[
{
"box_id": 1,
"number_items": 7
},
{
"box_id": 3,
"number_items": 15
},
{
"box_id": 6,
"number_items": 4
}
]
第三文件 -
[
{
"box_id": 1,
"number_items": 5
},
{
"box_id": 3,
"number_items": 9
},
{
"box_id": 6,
"number_items": 0
}
]
这些应该合并为看起来像这样的东西 -
[
{
"box_id": 1,
"number_items": [
4,
7,
5
]
},
{
"box_id": 3,
"number_items": [
15,
15,
9
]
},
{
"box_id": 6,
"number_items": [
2,
4,
0
]
}
]
可以使用jq
完成此操作吗?如果不是,那将是一个好方法?请注意,实际场景由150个以上的文件组成,带有3个键,我想合并到列表中。
您可以通过简单地将它们全部传递给输入,将文件与相似的结构合并。它们的内容将按照它们的顺序进行流式传输。
然后,您可以将它们读入单个数组,将对象按box_id
分组,然后映射结果。
$ jq -n '
[inputs[]] | group_by(.box_id)
| map({box_id:.[0].box_id, number_items:map(.number_items)})
' input{1,2,3}.json
生产:
[
{
"box_id": 1,
"number_items": [
4,
7,
5
]
},
{
"box_id": 3,
"number_items": [
15,
15,
9
]
},
{
"box_id": 6,
"number_items": [
4,
2,
0
]
}
]
似乎在某些平台上将项目分组时似乎没有保留订单。就我而言,在Windows 64位版本上运行会产生此产品。因此,请注意,如果您想使用group_by
。当然,如果您想避免使用此过滤器,您可以采用其他方法,但这更方便使用。
我想收集与某些键相关的值
这是一个解决方案,它以相同的方式处理除分组密钥外的所有键。它还优雅地处理丢失键,并不取决于JQ的sort
的稳定性。该解决方案基于通用过滤器merge/0
,定义如下:
# Combine an array of objects into a single object, ans, with array-valued keys,
# such that for every key, k, in the i-th object of the input array, a,
# ans[k][i] = a[i][k]
# null is used as padding if a value is missing.
# Example:
# [{a:1, b:2}, {b:3, c:4}] | merge
# produces:
# {"a":[1,null],"b":[2,3],"c":[null,4]}
def merge:
def allkeys: map(keys) | add | unique;
allkeys as $allkeys
| reduce .[] as $in ({};
reduce $allkeys[] as $k (.;
. + {($k): (.[$k] + [$in[$k]]) } ));
然后可以将解决给定问题的解决方案提出为:
transpose | map(merge) | map( .box_id |= .[0] )
调用:
jq -s -f merge.jq input{1,2,3}.json
输出:如问题所示。
更健壮的解决方案
上面的解决方案假定每个文件中box_id
的顺序均匀性。OP要求似乎需要此假设,但是为了安全性和鲁棒性,将首先分类对象:
map(sort_by(.box_id)) | transpose | map( merge | (.box_id |= .[0]) )
请注意,这仍然假定任何输入文件中的box_id
值都不丢失。
更强大的解决方案
如果任何输入文件中可能会缺少某些box_id
值,则添加缺失值是适当的。这可以在以下过滤器的帮助下完成:
# Input: a matrix of objects (that is, an array of rows of objects),
# each of which is assumed to have a distinguished field, f,
# with distinct values on each row;
# Output: a rectangular matrix such that every row, r, of the output
# matrix includes the elements of the corresponding row of the input
# matrix, with additional elements as necessary so that (r |
# map(.id) | sort) is the same for all rows r.
#
def rectanglize(f):
def ids: [.[][] | f] | unique;
def it: . as $in | {} | (f = $in);
ids as $ids
| map( . + ( $ids - [.[]|f] | map(it) ) )
;
将所有内容整合在一起,主管道变为:
rectanglize(.id)
| map(sort_by(.box_id))
| transpose
| map( merge | .box_id |= .[0] )
取决于您尝试保存此新文件(本地VS服务器(的位置,有几种不同的方法。据我所知,如果不使用一个可用的插件(如何使用JavaScript将数据写入JSON文件(,则无法在本地保存文件。如果要将其保存到服务器,则使用JavaScript将其保存到服务器,最好使用背景语言完成。
这是将多个JSON文件的内容组合到所需格式中的一种方法。
// send json files you want combined, and a new file path and name (path/to/filename.json)
function combineJsonFiles(files, newFileName) {
var combinedJson = [];
// iterate through each file
$.each(files, function(key, fileName) {
// load json file
// wait to combine until loaded. without this 'when().done()', boxes would return 'undefined'
$.when(loadJsonFile(fileName)).done(function(boxes) {
// combine json from file with combinedJson array
combinedJson = combineJson(boxes, combinedJson);
// check if this is the last file
if (key == files.length-1) {
// puts into json format
combinedJson = JSON.stringify(combinedJson);
// your json is now ready to be saved to a file
}
});
});
}
function loadJsonFile(fileName) {
return $.getJSON(fileName);
}
function combineJson(boxes, combinedJson) {
// iterate through each box
$.each(boxes, function(key, box) {
// use grep to search if this box's id is already included
var matches = $.grep(combinedJson, function(e) { return e.box_id == box.box_id; });
// if there are no matches, add box to the combined file
if (matches.length == 0) {
var newBox = { box_id: box.box_id };
// iterate through properties of box
for (var property in box) {
// check to ensure that properties are not inherited from base class
if (box.hasOwnProperty(property)) {
// will ignore if property is box_id
if (property !== 'box_id') {
// box is reformatted to make the property type into array
newBox[property] = [box[property]];
}
}
}
combinedJson.push(newBox);
} else {
// select first match (there should never be more than one)
var match = matches[0];
// iterate through properties of box
for (var property in box) {
// check to ensure that properties are not inherited from base class
if (box.hasOwnProperty(property)) {
// will ignore if property is box_id
if (property !== 'box_id') {
// add property to the already existing box in the combined file
match[property].push(box[property]);
}
}
}
}
});
return combinedJson;
}
var jsonFiles = ['path/to/data.json', 'path/to/data2.json', 'path/to/data3.json'];
combineJsonFiles(jsonFiles, 'combined_json.json');
JSON输出看起来像:
[{"box_id":1,"number_items":[4,7,5]},{"box_id":3,"number_items":[15,15,9]},{"box_id":6,"number_items":[2,4,0]}]
希望这会有所帮助!