R to hierarchical JSON using jsonlite?



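My end goal is to create a tree visualization with D3.js from a hierarchical JSON file.

The hierarchy I need to represent is shown in the diagram below, where A has children B, C, D; B has children E, F, G; C has children H, I; and D has no children. Nodes will have multiple key:value pairs. For simplicity, I have listed only three.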

                            |-- name:E
                            |   type:dkBlue
                            |   id:005
                            |
                            |-- name:F
            |-- name:B -----|   type:medBlue
            |   type:blue   |   id:006
            |   id:002      |
            |               |-- name:G
            |                   type:ltBlue
name:A -----|                   id:007
type:colors |
id:001      |-- name:C -----|-- name:H
            |   type:red    |   type:dkRed
            |   id:003      |   id:008
            |               |
            |               |-- name:I
            |                   type:medRed
            |                   id:009
            |-- name:D
                type:green
                id:004

My source data in R looks like this:

nodes <-read.table(header = TRUE, text = "
ID name type
001 A   colors
002 B   blue
003 C   red
004 D   green
005 E   dkBlue
006 F   medBlue
007 G   ltBlue
008 H   dkRed
009 I   medRed
")
links <- read.table(header = TRUE, text = "
startID  relation endID    
001      hasSubCat 002
001      hasSubCat 003
001      hasSubCat 004
002      hasSubCat 005
002      hasSubCat 006
002      hasSubCat 007
003      hasSubCat 008
003      hasSubCat 009
")

I need to convert this into the following JSON:

{"name": "A",
 "type": "colors",
 "id": "001",
 "children": [
   {"name": "B",
    "type": "blue",
    "id": "002",
    "children": [
      {"name": "E",
       "type": "dkBlue",
       "id": "005"},
      {"name": "F",
       "type": "medBlue",
       "id": "006"},
      {"name": "G",
       "type": "ltBlue",
       "id": "007"}
    ]},
   {"name": "C",
    "type": "red",
    "id": "003",
    "children": [
      {"name": "H",
       "type": "dkRed",
       "id": "008"},
      {"name": "I",
       "type": "medRed",
       "id": "009"}
    ]},
   {"name": "D",
    "type": "green",
    "id": "004"}
 ]}

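I would appreciate any help you can provide!

[Update 2017-04-18]

Following Ian's reference, I looked at data.tree for R. If I restructure my data as shown below, I can recreate the hierarchy. Note that I have lost the relation type (hasSubCat) between nodes, whose value may differ for each link/edge in real life. I am willing to let that go (for now) if I can get a workable hierarchy. The revised data for data.tree: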

library(data.tree)
df <-read.table(header = TRUE, text = "
paths  type     id 
A      colors   001
A/B    blue     002
A/B/E  dkBlue   005
A/B/F  medBlue  006
A/B/G  ltBlue   007
A/C    red      003
A/C/H  dkRed    008
A/C/I  medRed   009
A/D    green    004
")
myPaths <- as.Node(df, pathName = "paths")
myPaths$leafCount / (myPaths$totalCount - myPaths$leafCount)
print(myPaths, "type", "id", limit = 25)

The printout shows the hierarchy I sketched in the original post, and even includes the key:value pairs for each node. Nice!

  levelName    type id
1 A          colors  1
2  ¦--B        blue  2
3  ¦   ¦--E  dkBlue  5
4  ¦   ¦--F medBlue  6
5  ¦   °--G  ltBlue  7
6  ¦--C         red  3
7  ¦   ¦--H   dkRed  8
8  ¦   °--I  medRed  9
9  °--D       green  4

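Once again, I don't know how to convert this from a tree to nested JSON. Like most examples, the one at https://ipub.com/data-tree-to-networkd3/ assumes the key:value pairs are only on leaf nodes, not on branch nodes. I think the answer is to create a nested list to feed into RJSONIO or jsonlite, but I don't know how to do that.

data.tree is very useful and is probably the better way to reach your goal. For fun, I'll submit a more roundabout way to get your nested JSON using igraph and d3r.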

nodes <-read.table(header = TRUE, text = "
ID name type
001 A   colors
002 B   blue
003 C   red
004 D   green
005 E   dkBlue
006 F   medBlue
007 G   ltBlue
008 H   dkRed
009 I   medRed
")
links <- read.table(header = TRUE, text = "
startID  relation endID    
001      hasSubCat 002
001      hasSubCat 003
001      hasSubCat 004
002      hasSubCat 005
002      hasSubCat 006
002      hasSubCat 007
003      hasSubCat 008
003      hasSubCat 009
")
library(d3r)
library(dplyr)
library(igraph)
# make it an igraph
gf <- graph_from_data_frame(links[,c(1,3,2)],vertices = nodes)
# if we know that this is a tree with root as "A"
#  we can do something like this
df_tree <- dplyr::bind_rows(
  lapply(
    all_shortest_paths(gf,from="A")$res,
    function(x){data.frame(t(names(unclass(x))), stringsAsFactors=FALSE)}
  )
)
# we can discard the first column
df_tree <- df_tree[,-1]
# then make df_tree[1,1] as 1 (A)
df_tree[1,1] <- "A"
# now add node attributes to our data.frame
df_tree <- df_tree %>%
  # let's get the last non-NA in each row so we can join with nodes
  mutate(
    last_non_na = apply(df_tree, MARGIN=1, function(x){tail(na.exclude(x),1)})
  ) %>%
  # now join with nodes
  left_join(
    nodes,
    by = c("last_non_na" = "name")
  ) %>%
  # now remove last_non_na column
  select(-last_non_na)
# use d3r to nest as we would like
nested <- df_tree %>%
  d3_nest(value_cols = c("ID", "type"))

Consider walking down the levels one at a time, iteratively converting the data frame rows into a multi-nested list:

library(jsonlite)
...
df2list <- function(i) as.vector(nodes[nodes$name == i,])
# GRANDPARENT LEVEL
jsonlist <- as.list(nodes[nodes$name=='A',])
# PARENT LEVEL       
jsonlist$children <- lapply(c('B','C','D'), function(i) as.list(nodes[nodes$name == i,]))
# CHILDREN LEVEL
jsonlist$children[[1]]$children <- lapply(c('E','F','G'), df2list)
jsonlist$children[[2]]$children <- lapply(c('H','I'), df2list)
toJSON(jsonlist, pretty=TRUE)

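However, with this method you will notice that some inner child elements of length one are enclosed in brackets. Because R cannot hold complex types inside a character vector, the entire object must be a list type, which is output in brackets.

So consider cleaning up the extra brackets with nested gsub calls, which still renders valid JSON:

output <- toJSON(jsonlist, pretty=TRUE)
gsub('"\\]\n', '"\n', gsub('"\\],\n', '",\n', gsub('": \\["', '": "', output)))

Final output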

{
  "ID": "001",
  "name": "A",
  "type": "colors",
  "children": [
    {
      "ID": "002",
      "name": "B",
      "type": "blue",
      "children": [
        {
          "ID": "005",
          "name": "E",
          "type": "dkBlue"
        },
        {
          "ID": "006",
          "name": "F",
          "type": "medBlue"
        },
        {
          "ID": "007",
          "name": "G",
          "type": "ltBlue"
        }
      ]
    },
    {
      "ID": "003",
      "name": "C",
      "type": "red",
      "children": [
        {
          "ID": "008",
          "name": "H",
          "type": "dkRed"
        },
        {
          "ID": "009",
          "name": "I",
          "type": "medRed"
        }
      ]
    },
    {
      "ID": "004",
      "name": "D",
      "type": "green"
    }
  ]
}
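
As a possible alternative to the regex cleanup, jsonlite's toJSON accepts an auto_unbox argument that writes length-one vectors as scalars instead of one-element arrays. The sketch below only illustrates the difference on a single toy node (the node list is invented for this example, not taken from the data above):

```r
library(jsonlite)

# Toy node, invented for illustration; every field is a length-one vector
node <- list(ID = "001", name = "A", type = "colors")

# Default behaviour: length-one vectors are written as one-element arrays
boxed <- toJSON(node)

# auto_unbox = TRUE: length-one vectors are written as JSON scalars
unboxed <- toJSON(node, auto_unbox = TRUE)

print(boxed)
print(unboxed)
```

Applied to the nested jsonlist above, the same argument should make the gsub step unnecessary, although I have not checked that against every edge case.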

A nice, if somewhat hard to follow, way to do this is with a self-referencing (recursive) function, as follows...

nodes <- read.table(header = TRUE, colClasses = "character", text = "
ID name type
001 A   colors
002 B   blue
003 C   red
004 D   green
005 E   dkBlue
006 F   medBlue
007 G   ltBlue
008 H   dkRed
009 I   medRed
")
links <- read.table(header = TRUE, colClasses = "character", text = "
startID  relation endID    
001      hasSubCat 002
001      hasSubCat 003
001      hasSubCat 004
002      hasSubCat 005
002      hasSubCat 006
002      hasSubCat 007
003      hasSubCat 008
003      hasSubCat 009
")
convert_hier <- function(linksDf, nodesDf, sourceId = "startID", 
                         targetId = "endID", nodesID = "ID") {
  makelist <- function(nodeid) {
    child_ids <- linksDf[[targetId]][which(linksDf[[sourceId]] == nodeid)]
    if (length(child_ids) == 0) 
      return(as.list(nodesDf[nodesDf[[nodesID]] == nodeid, ]))
    c(as.list(nodesDf[nodesDf[[nodesID]] == nodeid, ]), 
      children = list(lapply(child_ids, makelist)))
  }
  ids <- unique(c(linksDf[[sourceId]], linksDf[[targetId]]))
  rootid <- ids[! ids %in% linksDf[[targetId]]]
  jsonlite::toJSON(makelist(rootid), pretty = TRUE, auto_unbox = TRUE)
}
convert_hier(links, nodes)
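
The two ideas doing the work in convert_hier are the root lookup (an ID that appears as a source but never as a target) and the recursion in makelist. Here is a stripped-down sketch of just those two ideas on a toy edge list; the names edges and build are made up for illustration, and it assumes the links form a single tree with exactly one root:

```r
library(jsonlite)

# Toy edge list, invented for illustration: a -> b, a -> c, b -> d
edges <- data.frame(
  parent = c("a", "a", "b"),
  child  = c("b", "c", "d"),
  stringsAsFactors = FALSE
)

# The root is the only id that is a parent but never a child
root <- setdiff(edges$parent, edges$child)

# Recursively build a nested list; leaves get no "children" entry
build <- function(id) {
  kids <- edges$child[edges$parent == id]
  node <- list(name = id)
  if (length(kids) > 0) node$children <- lapply(kids, build)
  node
}

tree <- build(root)
json <- toJSON(tree, pretty = TRUE, auto_unbox = TRUE)
print(json)
```

Because children is an unnamed list, jsonlite writes it as a JSON array, which is the shape D3 tree layouts usually expect.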

A few notes...

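  1. I added colClasses = "character" to the read.table commands so that the ID numbers are not coerced to integers without leading zeros and the strings are not converted to factors.
  2. I wrapped everything in the convert_hier function to make it easier to adapt to other scenarios, but the real magic is in the makelist function.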
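
To illustrate note 1, this small sketch reads the same text with and without colClasses = "character". The two-row table is a cut-down stand-in for the nodes data, and the variable names are made up for the example:

```r
# Cut-down stand-in for the nodes table
txt <- "
ID  name
001 A
002 B
"

# Default: read.table guesses types, so the IDs become integers 1 and 2
default_read <- read.table(header = TRUE, text = txt)

# colClasses = "character": every column stays a string, zeros included
char_read <- read.table(header = TRUE, colClasses = "character", text = txt)

str(default_read$ID)
str(char_read$ID)
```

With the character version, ID values such as "001" survive the round trip to JSON unchanged.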
