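My end goal is to create a tree visualization with D3.js from a hierarchical JSON file.
The hierarchy I need to represent is shown in the diagram below, where A has children B, C, D; B has children E, F, G; C has children H, I; and D has no children. Each node will have multiple key:value pairs. For simplicity, I have listed only three.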
                                |-- name:E
                                |   type:dkBlue
                                |   id:005
                                |
                                |-- name:F
             |-- name:B --------|   type:medBlue
             |   type:blue      |   id:006
             |   id:002         |
             |                  |-- name:G
             |                      type:ltBlue
name:A ------|                      id:007
type:colors  |
id:001       |-- name:C --------|-- name:H
             |   type:red       |   type:dkRed
             |   id:003         |   id:008
             |                  |
             |                  |
             |                  |-- name:I
             |                      type:medRed
             |                      id:009
             |-- name:D
                 type:green
                 id:004
My source data in R looks like this:
nodes <-read.table(header = TRUE, text = "
ID name type
001 A colors
002 B blue
003 C red
004 D green
005 E dkBlue
006 F medBlue
007 G ltBlue
008 H dkRed
009 I medRed
")
links <- read.table(header = TRUE, text = "
startID relation endID
001 hasSubCat 002
001 hasSubCat 003
001 hasSubCat 004
002 hasSubCat 005
002 hasSubCat 006
002 hasSubCat 007
003 hasSubCat 008
003 hasSubCat 009
")
I need to convert this into the following JSON:
{"name": "A",
 "type": "colors",
 "id": "001",
 "children": [
   {"name": "B",
    "type": "blue",
    "id": "002",
    "children": [
      {"name": "E",
       "type": "dkBlue",
       "id": "005"},
      {"name": "F",
       "type": "medBlue",
       "id": "006"},
      {"name": "G",
       "type": "ltBlue",
       "id": "007"}
    ]},
   {"name": "C",
    "type": "red",
    "id": "003",
    "children": [
      {"name": "H",
       "type": "dkRed",
       "id": "008"},
      {"name": "I",
       "type": "medRed",
       "id": "009"}
    ]},
   {"name": "D",
    "type": "green",
    "id": "004"}
 ]}
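I would appreciate any help you can provide!
[Update 2017-04-18]
Based on Ian's reference, I looked at R's data.tree. If I restructure my data as shown below, I can recreate the hierarchy. Note that I have lost the relation type (hasSubCat) between nodes, whose value may differ for each link/edge in real life. I am willing to let that go (for now) if I can get a workable hierarchy. The revised data for data.tree: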
df <-read.table(header = TRUE, text = "
paths type id
A colors 001
A/B blue 002
A/B/E dkBlue 005
A/B/F medBlue 006
A/B/G ltBlue 007
A/C red 003
A/C/H dkRed 008
A/C/I medRed 009
A/D green 004
")
library(data.tree)  # provides as.Node()
myPaths <- as.Node(df, pathName = "paths")
myPaths$leafCount / (myPaths$totalCount - myPaths$leafCount)
print(myPaths, "type", "id", limit = 25)
The printout shows the hierarchy I sketched in the original post, including the key:value pairs for each node. Good!
      levelName    type id
1 A              colors  1
2  ¦--B            blue  2
3  ¦   ¦--E      dkBlue  5
4  ¦   ¦--F     medBlue  6
5  ¦   °--G      ltBlue  7
6  ¦--C             red  3
7  ¦   ¦--H       dkRed  8
8  ¦   °--I      medRed  9
9  °--D           green  4
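Once again, I am stuck on how to convert this from a tree to nested JSON. Like most examples, the one at https://ipub.com/data-tree-to-networkd3/ assumes the key:value pairs exist only on leaf nodes, not on branch nodes. I think the answer is to build a nested list to feed into RJSONIO or jsonlite, but I do not know how to do that.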
data.tree is very useful and is probably the better way to achieve your goal. For fun, I will submit a more roundabout way to get your nested JSON using igraph and d3r.
nodes <-read.table(header = TRUE, text = "
ID name type
001 A colors
002 B blue
003 C red
004 D green
005 E dkBlue
006 F medBlue
007 G ltBlue
008 H dkRed
009 I medRed
")
links <- read.table(header = TRUE, text = "
startID relation endID
001 hasSubCat 002
001 hasSubCat 003
001 hasSubCat 004
002 hasSubCat 005
002 hasSubCat 006
002 hasSubCat 007
003 hasSubCat 008
003 hasSubCat 009
")
library(d3r)
library(dplyr)
library(igraph)
# make it an igraph
gf <- graph_from_data_frame(links[, c(1, 3, 2)], vertices = nodes)
# if we know that this is a tree with root as "A"
# we can do something like this
df_tree <- dplyr::bind_rows(
  lapply(
    all_shortest_paths(gf, from = "A")$res,
    function(x) {data.frame(t(names(unclass(x))), stringsAsFactors = FALSE)}
  )
)
# we can discard the first column
df_tree <- df_tree[, -1]
# then make df_tree[1,1] as 1 (A)
df_tree[1, 1] <- "A"
# now add node attributes to our data.frame
df_tree <- df_tree %>%
  # let's get the last non-NA in each row so we can join with nodes
  mutate(
    last_non_na = apply(df_tree, MARGIN = 1, function(x) {tail(na.exclude(x), 1)})
  ) %>%
  # now join with nodes
  left_join(
    nodes,
    by = c("last_non_na" = "name")
  ) %>%
  # now remove last_non_na column
  select(-last_non_na)
# use d3r to nest as we would like
nested <- df_tree %>%
  d3_nest(value_cols = c("ID", "type"))
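For comparison, here is a minimal sketch of the data.tree route mentioned at the top of this answer. It assumes the path-based table from the question's update, and it relies on data.tree's `as.list()` method for nodes with `mode = "explicit"` and `unname = TRUE`. The variable names `tree`, `nested` and `json` are illustrative only. This is one possible approach, not necessarily the one the asker ended up using.

```r
library(data.tree)
library(jsonlite)

# Path-based table from the question's update.
# colClasses = "character" is an added assumption so ids keep their leading zeros.
df <- read.table(header = TRUE, colClasses = "character", text = "
paths type    id
A     colors  001
A/B   blue    002
A/B/E dkBlue  005
A/B/F medBlue 006
A/B/G ltBlue  007
A/C   red     003
A/C/H dkRed   008
A/C/I medRed  009
A/D   green   004
")

tree <- as.Node(df, pathName = "paths")

# mode = "explicit" collects the children under a "children" element;
# unname = TRUE drops the child names so jsonlite emits a JSON array
# and stores each node's name in a "name" field instead.
nested <- as.list(tree, mode = "explicit", unname = TRUE)

json <- toJSON(nested, pretty = TRUE, auto_unbox = TRUE)
cat(json)
```

Because the attributes are attached to every row of the table, branch nodes such as B and C should keep their `type` and `id` alongside their `children`.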
Consider walking down the levels, iteratively converting data frame columns into a multi-nested list:
library(jsonlite)
...
df2list <- function(i) as.vector(nodes[nodes$name == i,])
# GRANDPARENT LEVEL
jsonlist <- as.list(nodes[nodes$name=='A',])
# PARENT LEVEL
jsonlist$children <- lapply(c('B','C','D'), function(i) as.list(nodes[nodes$name == i,]))
# CHILDREN LEVEL
jsonlist$children[[1]]$children <- lapply(c('E','F','G'), df2list)
jsonlist$children[[2]]$children <- lapply(c('H','I'), df2list)
toJSON(jsonlist, pretty=TRUE)
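However, with this method you will notice that some inner children of length-one elements are enclosed in brackets. Because R cannot hold complex types inside a character vector, the entire object must be a list type, which is output in brackets.
So consider using nested gsub calls to clean up the extra brackets, which still renders valid JSON: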
output <- toJSON(jsonlist, pretty=TRUE)
gsub('"\\]\n', '"\n', gsub('"\\],\n', '",\n', gsub('": \\["', '": "', output)))
Final output:
{
  "ID": "001",
  "name": "A",
  "type": "colors",
  "children": [
    {
      "ID": "002",
      "name": "B",
      "type": "blue",
      "children": [
        {
          "ID": "005",
          "name": "E",
          "type": "dkBlue"
        },
        {
          "ID": "006",
          "name": "F",
          "type": "medBlue"
        },
        {
          "ID": "007",
          "name": "G",
          "type": "ltBlue"
        }
      ]
    },
    {
      "ID": "003",
      "name": "C",
      "type": "red",
      "children": [
        {
          "ID": "008",
          "name": "H",
          "type": "dkRed"
        },
        {
          "ID": "009",
          "name": "I",
          "type": "medRed"
        }
      ]
    },
    {
      "ID": "004",
      "name": "D",
      "type": "green"
    }
  ]
}
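As an alternative to stripping brackets with regular expressions, jsonlite's `auto_unbox = TRUE` argument writes length-one atomic vectors as JSON scalars. The sketch below uses a small hand-built list (a hypothetical, trimmed version of the example tree, not the `jsonlist` object above) to show the difference. For plain character values this should make the gsub step unnecessary.

```r
library(jsonlite)

# A tiny hand-built tree, trimmed from the example data for illustration.
tree <- list(
  ID = "001", name = "A",
  children = list(
    list(ID = "002", name = "B"),
    list(ID = "004", name = "D")
  )
)

boxed   <- toJSON(tree)                     # scalars are written as ["001"], ["A"], ...
unboxed <- toJSON(tree, auto_unbox = TRUE)  # scalars are written as "001", "A", ...

cat(boxed, "\n")
cat(unboxed, "\n")
```

Only length-one atomic vectors are unboxed; the `children` list is still written as a JSON array, which is what D3 expects.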
A nice, if somewhat hard to follow, way to do this is with a self-referencing (recursive) function, as shown below...
nodes <- read.table(header = TRUE, colClasses = "character", text = "
ID name type
001 A colors
002 B blue
003 C red
004 D green
005 E dkBlue
006 F medBlue
007 G ltBlue
008 H dkRed
009 I medRed
")
links <- read.table(header = TRUE, colClasses = "character", text = "
startID relation endID
001 hasSubCat 002
001 hasSubCat 003
001 hasSubCat 004
002 hasSubCat 005
002 hasSubCat 006
002 hasSubCat 007
003 hasSubCat 008
003 hasSubCat 009
")
convert_hier <- function(linksDf, nodesDf, sourceId = "startID",
                         targetId = "endID", nodesID = "ID") {
  # recursively build the nested list for one node and all of its descendants
  makelist <- function(nodeid) {
    child_ids <- linksDf[[targetId]][which(linksDf[[sourceId]] == nodeid)]

    # leaf node: return only its own attributes
    if (length(child_ids) == 0)
      return(as.list(nodesDf[nodesDf[[nodesID]] == nodeid, ]))

    # branch node: its attributes plus a "children" list built by recursion
    c(as.list(nodesDf[nodesDf[[nodesID]] == nodeid, ]),
      children = list(lapply(child_ids, makelist)))
  }

  # the root is the ID that never appears as a link target
  ids <- unique(c(linksDf[[sourceId]], linksDf[[targetId]]))
  rootid <- ids[! ids %in% linksDf[[targetId]]]

  jsonlite::toJSON(makelist(rootid), pretty = TRUE, auto_unbox = TRUE)
}

convert_hier(links, nodes)
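A few notes...
- I added colClasses = "character" to the read.table commands so that the ID numbers are not coerced to integers without leading zeros and the strings are not converted to factors.
- I wrapped everything in the convert_hier function to make it easier to adapt to other scenarios, but the real magic is in the makelist function.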
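To illustrate the first note, here is a small base-R sketch comparing the two ways of reading the same text. The two-row table and the variable names `default_read` and `character_read` are made up for the demonstration.

```r
txt <- "
ID  name
001 A
002 B
"

# Default type detection: the ID column is read as integer.
default_read <- read.table(header = TRUE, text = txt)

# Forcing every column to character keeps the IDs exactly as written.
character_read <- read.table(header = TRUE, text = txt, colClasses = "character")

str(default_read$ID)    # integer vector, leading zeros are gone
str(character_read$ID)  # character vector, leading zeros preserved
```

Keeping the IDs as character also means the values in links and nodes compare as identical strings inside makelist.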