R:在R中创建递归二叉树



我对编写递归二叉树算法很感兴趣。给定以下数据,其中我已经对协变x进行了排序

mydata <- data.frame(x = c(10, 20, 25, 35), y = c(-10.5, 6.5, 7.5, -7.5))
> mydata
x     y
1 10 -10.5
2 20   6.5
3 25   7.5
4 35  -7.5

我将拆分树,使左侧节点始终包含父节点的第一个实例,右侧节点包含父节点中的其余实例(奇怪的拆分方式,但请耐心等待(。从本质上讲,我希望我的树看起来像这样,最大高度为3。

[-10.5, 6.5, 7.5, -7.5]
/         
[-10.5]        [6.5, 7.5, -7.5]
/      
[6.5]       [7.5, -7.5]

我希望函数的最终输出返回一个包含所有节点的列表:

> final_tree
[[1]]
[[1]][[1]]
x     y
1 10 -10.5
2 20   6.5
3 25   7.5
4 35  -7.5

[[2]]
[[2]][[1]]
x     y
1 10 -10.5

[[2]][[2]]
x     y
1 20   6.5
2 25   7.5
3 35  -7.5

[[3]]
[[3]][[1]]
NULL
[[3]][[2]]
NULL
[[3]][[3]]
x     y
1 20   6.5

[[3]][[4]]
x     y
1 25   7.5
2 35  -7.5

到目前为止,我拥有的是:

# Initialize empty tree
create_empty_tree <- function(max_height) sapply(1:max_height, function(k) replicate(2**(k-1),c()))
# Create empty tree with max_height = 3
tree_struc <- create_empty_tree(max_height = 3)
grow_tree <- function(node_parent, max_height, tree_struc, height){
# Sort x
sorted_x <- sort(node_parent$x)
# Fix best split index at 1
best_split_ind <- 1
# Assign instances to left or right nodes
group <- ifelse(node_parent$x <= node_parent$x[best_split_ind], "left", "right")
node_left <- node_parent[which(group == "left"), ]
node_right <- node_parent[which(group == "right"), ]
# Recursive call on left and right nodes
if(height < max_height){
tree_struc[[height]] <- node_parent
tree_struc[[height + 1]][[1]] <- grow_tree(node_parent = node_left, max_height = max_height, tree_struc = tree_struc, height = height + 1)
tree_struc[[height + 1]][[2]] <- grow_tree(node_parent = node_right, max_height = max_height, tree_struc = tree_struc, height = height + 1)
}
return(tree_struc)
}
grow_tree(node_parent = mydata, max_height = 3, tree_struc = tree_struc, height = 1)

生成的树不正确。我认为这与我如何递归调用左右子节点上的函数有关。有人能给我指正确的方向吗?

也许你可以试试下面的函数grow_tree

create_empty_tree <- function(max_height) sapply(1:max_height, function(k) replicate(2**(k-1),c()))
grow_tree <- function(node_parent,max_height = nrow(node_parent)) {
tree_struc <- create_empty_tree(max_height)
for (i in 1:length(tree_struc)) {
tree_struc[[i]][[2**(i-1)]] <- `row.names<-`(tail(node_parent,nrow(node_parent)-i+1),NULL)
if (i > 1) tree_struc[[i]][[2**(i-1)-1]] <- `row.names<-`(node_parent[i-1,],NULL)
}
tree_struc
}

示例

> grow_tree(mydata,3)
[[1]]
[[1]][[1]]
x     y
1 10 -10.5
2 20   6.5
3 25   7.5
4 35  -7.5

[[2]]
[[2]][[1]]
x     y
1 10 -10.5
[[2]][[2]]
x    y
1 20  6.5
2 25  7.5
3 35 -7.5

[[3]]
[[3]][[1]]
NULL
[[3]][[2]]
NULL
[[3]][[3]]
x   y
1 20 6.5
[[3]][[4]]
x    y
1 25  7.5
2 35 -7.5

> grow_tree(mydata)
[[1]]
[[1]][[1]]
x     y
1 10 -10.5
2 20   6.5
3 25   7.5
4 35  -7.5

[[2]]
[[2]][[1]]
x     y
1 10 -10.5
[[2]][[2]]
x    y
1 20  6.5
2 25  7.5
3 35 -7.5

[[3]]
[[3]][[1]]
NULL
[[3]][[2]]
NULL
[[3]][[3]]
x   y
1 20 6.5
[[3]][[4]]
x    y
1 25  7.5
2 35 -7.5

[[4]]
[[4]][[1]]
NULL
[[4]][[2]]
NULL
[[4]][[3]]
NULL
[[4]][[4]]
NULL
[[4]][[5]]
NULL
[[4]][[6]]
NULL
[[4]][[7]]
x   y
1 25 7.5
[[4]][[8]]
x    y
1 35 -7.5

最新更新