Appending JSON data from one column to another using keys/IDs


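I have three tables like the ones below. Each has a key column and a field column holding arbitrary JSON data that describes that key: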

> json1
key                                        field
1 hg8oxoi4 "components":{"a": "21","b": "12","c": "34"}
2 gic3bv14 "components":{"a": "78","b": "66","c": "54"}
3 yo47wglq  "components":{"a": "6","b": "12","c": "12"}
4 vibidd0l   "components":{"a": "45","b": "5","c": "1"}
> json2
key                                          field
1 hg8oxoi4 "last_recall": {"date": "012118","size": "43"}
2 vibidd0l "last_recall": {"date": "101618","size": "12"}
> json3
key                           field
1 hg8oxoi4 "other_fields":{"people": "11"}
2 gic3bv14 "other_fields":{"people": "10"}
3 yo47wglq  "other_fields":{"people": "4"}

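What is the best way to combine all three tables into one, making sure the keys are matched to each other and that keys with data in some tables but not in others are handled correctly? Ideally, each field would be appended to the others, so that the field column of the new table is a single JSON object holding the different pieces of data.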

Edit: here is the expected output.

> json4
key
1 hg8oxoi4
2 gic3bv14
3 yo47wglq
4 vibidd0l
                                                 field
1 {"components":{"a": "21","b": "12","c": "34"},"last_recall": {"date": "012118","size": "43"},"other_fields":{"people": "11"}}
2                                                {"components":{"a": "78","b": "66","c": "54"},"other_fields":{"people": "10"}}
3                                                  {"components":{"a": "6","b": "12","c": "12"},"other_fields":{"people": "4"}}
4                                   {"components":{"a": "45","b": "5","c": "1"},"last_recall": {"date": "101618","size": "12"}}

Edit 2: the dput output for json1 and json2

> dput(json1)
structure(list(key = c("hg8oxoi4", "gic3bv14", "yo47wglq", "vibidd0l"
), field = c("\"components\":{\"a\": \"21\",\"b\": \"12\",\"c\": \"34\"}", 
"\"components\":{\"a\": \"78\",\"b\": \"66\",\"c\": \"54\"}", 
"\"components\":{\"a\": \"6\",\"b\": \"12\",\"c\": \"12\"}", 
"\"components\":{\"a\": \"45\",\"b\": \"5\",\"c\": \"1\"}")), .Names = c("key", 
"field"), row.names = c(NA, -4L), class = "data.frame")
> dput(json2)
structure(list(key = c("hg8oxoi4", "vibidd0l"), field = c("\"last_recall\": {\"date\": \"012118\",\"size\": \"43\"}", 
"\"last_recall\": {\"date\": \"101618\",\"size\": \"12\"}")), .Names = c("key", 
"field"), row.names = c(NA, -2L), class = "data.frame")

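After collecting the data sets into a list, we merge them on "key". With all = TRUE this is a full outer join, so a key that is missing from one table gets NA in that table's column: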

out <- Reduce(function(...) merge(..., all = TRUE, by = "key"), 
              mget(ls(pattern = "^json\\d+$")))
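To see what the merge step produces, here is a minimal self-contained sketch. It rebuilds the three tables from the values shown in the question and merges an explicit list instead of relying on `mget(ls(...))`, which would also pick up any other object in the workspace whose name matches the pattern. The names `tables` and `merged`, and the renaming of each `field` column, are assumptions made for illustration and are not part of the original answer.

```r
# Toy stand-ins for the three tables (values copied from the question).
json1 <- data.frame(
  key   = c("hg8oxoi4", "gic3bv14", "yo47wglq", "vibidd0l"),
  field = c('"components":{"a": "21","b": "12","c": "34"}',
            '"components":{"a": "78","b": "66","c": "54"}',
            '"components":{"a": "6","b": "12","c": "12"}',
            '"components":{"a": "45","b": "5","c": "1"}'),
  stringsAsFactors = FALSE
)
json2 <- data.frame(
  key   = c("hg8oxoi4", "vibidd0l"),
  field = c('"last_recall": {"date": "012118","size": "43"}',
            '"last_recall": {"date": "101618","size": "12"}'),
  stringsAsFactors = FALSE
)
json3 <- data.frame(
  key   = c("hg8oxoi4", "gic3bv14", "yo47wglq"),
  field = c('"other_fields":{"people": "11"}',
            '"other_fields":{"people": "10"}',
            '"other_fields":{"people": "4"}'),
  stringsAsFactors = FALSE
)

# Hypothetical step: give each field column its own name so that merge()
# does not need to disambiguate them with .x/.y suffixes.
tables <- list(components = json1, last_recall = json2, other_fields = json3)
tables <- Map(function(df, nm) setNames(df, c("key", nm)), tables, names(tables))

# Full outer join on "key": every key is kept, missing fragments become NA.
merged <- Reduce(function(x, y) merge(x, y, by = "key", all = TRUE), tables)
print(merged)
```

Note that `merge()` sorts the result by the key column by default, so the row order differs from the order in json1.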

Then, paste the non-NA elements together row by row:

out$field <- apply(out[-1], 1, function(x) paste(x[!is.na(x)], collapse=", "))
out[c("key", "field")]
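The pasted string above is a comma-separated list of fragments, while the expected output in the question wraps each row in braces so that it forms one JSON object. A possible variation, sketched here under that assumption, wraps the collapsed fragments in `{` and `}`. The data frame `merged` is a small hand-built stand-in for a merged table, and `combine_fragments` is a hypothetical helper name, not something from the original answer.

```r
# Hand-built stand-in for a merged table: one fragment column per source
# table, with NA where a key had no row in that table.
merged <- data.frame(
  key          = c("hg8oxoi4", "gic3bv14"),
  components   = c('"components":{"a": "21","b": "12","c": "34"}',
                   '"components":{"a": "78","b": "66","c": "54"}'),
  last_recall  = c('"last_recall": {"date": "012118","size": "43"}', NA),
  other_fields = c('"other_fields":{"people": "11"}',
                   '"other_fields":{"people": "10"}'),
  stringsAsFactors = FALSE
)

# Join the non-NA fragments of one row with commas and wrap them in braces.
combine_fragments <- function(x) {
  paste0("{", paste(x[!is.na(x)], collapse = ","), "}")
}

# merged[-1] drops the key column; apply(..., 1, ...) works row by row.
merged$field <- apply(merged[-1], 1, combine_fragments)
result <- merged[c("key", "field")]
print(result)
```

Dropping the NA entries before pasting is what keeps keys with missing tables from producing stray "NA" text or dangling commas in the combined object.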
