R: convert a data frame to a nested JSON file/object grouped by column names



I want to convert a data frame into a nested JSON object, using the column names to decide where the nested objects are created.

I have made a toy example to explain the problem. Given this data frame:

df <- read.csv(textConnection(
"id,name,allergies.pollen,allergies.pet,attributes.height,attributes.gender
x,alice,no,yes,175,female
y,bob,yes,yes,180,male"))

Or in a more readable format:

  id  name allergies.pollen allergies.pet attributes.height attributes.gender
1  x alice               no           yes               175            female
2  y   bob              yes           yes               180              male

Then I would like the following JSON object:

'[
  {
    "id": "x",
    "name": "alice",
    "allergies":
    {
      "pollen": "no",
      "pet": "yes"
    },
    "attributes":
    {
      "height": "175",
      "gender": "female"
    }
  },
  {
    "id": "y",
    "name": "bob",
    "allergies":
    {
      "pollen": "yes",
      "pet": "yes"
    },
    "attributes":
    {
      "height": "180",
      "gender": "male"
    }
  }
]'

So it should automatically group the columns at a fixed delimiter, here ".".

Ideally, it should also be able to handle objects nested more than one level deep, for example allergies.pet.cat and allergies.pet.dog.

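My best idea for solving this is to write a function that recursively calls jsonlite::toJSON and extracts the category with stringr::str_extract("^[^.]*"), but I have not been able to make that work.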
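The prefix extraction described above can be sketched on its own, before any recursion is involved. This is only an illustration, assuming base R's sub() as a stand-in for stringr::str_extract() so that no extra package is needed; the variable names cols, prefix, rest and groups are made up for the example.

```r
# Column names from the toy data frame in the question.
cols <- c("id", "name", "allergies.pollen", "allergies.pet",
          "attributes.height", "attributes.gender")

# Everything before the first "." is the group; a name without "."
# is its own group.
prefix <- sub("[.].*$", "", cols)

# Everything after the first "." is the key inside the group;
# names without "." are left unchanged.
rest <- sub("^[^.]*[.]", "", cols)

# Collect the original column names under their prefix.
groups <- split(cols, prefix)
str(groups)
```

Groups with more than one member (allergies and attributes here) are the ones that would become nested objects; id and name stay as plain fields.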

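Here is a function that seems to work. The only glitch is a possible collision, for example having both allergies.pet and allergies.pet.cat; it does not raise an error, but the result may be non-standard.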

New data:

df <- read.csv(textConnection(
"id,name,allergies.pollen,allergies.pet,attributes.height,attributes.gender,allergies.pet.cat
x,alice,no,yes,175,female,quux
y,bob,yes,yes,180,male,unk"))

The function:

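func <- function(x) {
  # Group column names by the part before the first ".".
  grps <- split(names(x), gsub("[.].*", "", names(x)))
  for (nm in names(grps)) {
    # Nest when the group has several columns, or one dotted column.
    if (length(grps[[nm]]) > 1 || !nm %in% grps[[nm]]) {
      # Build the sub-frame, stripping the "prefix." from each name.
      x[[nm]] <- setNames(subset(x, select = grps[[nm]]),
                          gsub("^[^.]+[.]", "", grps[[nm]]))
      # Drop the original flat columns, but keep the new nested one.
      x[, setdiff(grps[[nm]], nm)] <- NULL
    }
  }
  # Recurse into the nested frames to handle deeper levels.
  for (nm in names(x)) {
    if (is.data.frame(x[[nm]])) {
      x[[nm]] <- func(x[[nm]])
    }
  }
  if (any(grepl("[.]", names(x)))) func(x) else x
}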

See how this nests all of the "."-separated columns into frames:

str(df)
# 'data.frame': 2 obs. of  7 variables:
#  $ id               : chr  "x" "y"
#  $ name             : chr  "alice" "bob"
#  $ allergies.pollen : chr  "no" "yes"
#  $ allergies.pet    : chr  "yes" "yes"
#  $ attributes.height: int  175 180
#  $ attributes.gender: chr  "female" "male"
#  $ allergies.pet.cat: chr  "quux" "unk"
newdf <- func(df)
str(newdf)
# 'data.frame': 2 obs. of  4 variables:
#  $ id        : chr  "x" "y"
#  $ name      : chr  "alice" "bob"
#  $ allergies :'data.frame':   2 obs. of  2 variables:
#   ..$ pollen: chr  "no" "yes"
#   ..$ pet   :'data.frame':    2 obs. of  2 variables:
#   .. ..$ pet: chr  "yes" "yes"
#   .. ..$ cat: chr  "quux" "unk"
#  $ attributes:'data.frame':   2 obs. of  2 variables:
#   ..$ height: int  175 180
#   ..$ gender: chr  "female" "male"
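The pet frame that contains its own pet column is the collision case: allergies.pet is both a value and the prefix of allergies.pet.cat. If that shape is unwanted, one option is to detect such names before nesting. The helper below is a hypothetical sketch, not part of the function above; find_collisions and its sep argument are names invented for this example.

```r
# Hypothetical helper: return the column names that are also the
# prefix of another dotted column name.
find_collisions <- function(nms, sep = ".") {
  is_prefix <- vapply(
    nms,
    # TRUE when some other name starts with "<nm><sep>".
    function(nm) any(startsWith(nms, paste0(nm, sep))),
    logical(1)
  )
  unname(nms[is_prefix])
}

find_collisions(c("id", "name", "allergies.pollen", "allergies.pet",
                  "attributes.height", "attributes.gender",
                  "allergies.pet.cat"))
# [1] "allergies.pet"
```

A colliding column could then be renamed (for instance to allergies.pet.value, an arbitrary choice) before the data frame is nested.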

From here, it is straightforward to jsonify:

jsonlite::toJSON(newdf, pretty = TRUE)
# [
#   {
#     "id": "x",
#     "name": "alice",
#     "allergies": {
#       "pollen": "no",
#       "pet": {
#         "pet": "yes",
#         "cat": "quux"
#       }
#     },
#     "attributes": {
#       "height": 175,
#       "gender": "female"
#     }
#   },
#   {
#     "id": "y",
#     "name": "bob",
#     "allergies": {
#       "pollen": "yes",
#       "pet": {
#         "pet": "yes",
#         "cat": "unk"
#       }
#     },
#     "attributes": {
#       "height": 180,
#       "gender": "male"
#     }
#   }
# ] 
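Since the question also mentions a JSON file, here is a minimal sketch of writing a nested data frame to disk and reading it back. It assumes the jsonlite package is installed; the nested data frame is built by hand as a stand-in for newdf so that the snippet runs on its own, and the temporary file path is only for illustration.

```r
library(jsonlite)

# A small nested data frame built by hand, standing in for `newdf`.
nested <- data.frame(id = c("x", "y"), name = c("alice", "bob"))
nested$allergies <- data.frame(pollen = c("no", "yes"),
                               pet = c("yes", "yes"))
nested$attributes <- data.frame(height = c(175L, 180L),
                                gender = c("female", "male"))

# Write the nested structure to a JSON file.
path <- tempfile(fileext = ".json")
write_json(nested, path, pretty = TRUE)

# Reading it back with flatten = TRUE turns the nested objects into
# dotted column names again, which gives a quick round-trip check.
flat <- fromJSON(path, flatten = TRUE)
names(flat)
```

The flattened names should match the dotted columns of the original input, which makes this a convenient sanity check on the nesting step.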
