R: Splitting a dataset by column value using a loop



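I have been trying to write a loop that splits a dataset into multiple datasets based on the values in a column. However, the dataset is in a format I have not worked with before: a list that contains both lists and data.tables.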

table1 <- data.table::data.table(
  Scenario = rep(c("A", "B", "C", "D"), 4),
  A = c(rep("x", 4), rep("b", 4), rep("s", 4), rep("u", 4)),
  Correlation = c(1,     0.125, 0.1, 0,
                  0.125, 1,     0.2, 0,
                  0.1,   0.2,   1,   0,
                  0,     0,     0,   1),
  Matrix = "IM",
  stringsAsFactors = FALSE,
  check.names = FALSE
)
table2 <- data.table::data.table(
  Scenario = rep(c("A", "B", "C", "D"), 4),
  A = c(rep("x", 4), rep("b", 4), rep("s", 4), rep("u", 4)),
  Correlation = c(1,     0.125, 0.1, 0,
                  0.125, 1,     0.2, 0,
                  0.1,   0.2,   1,   0,
                  0,     0,     0,   1),
  Matrix = "IM",
  stringsAsFactors = FALSE,
  check.names = FALSE
)
table3 <- data.table::data.table(
  Scenario = rep(c("A", "B", "C", "D"), 4),
  A = c(rep("x", 4), rep("b", 4), rep("s", 4), rep("u", 4)),
  Correlation = c(1,     0.125, 0.1, 0,
                  0.125, 1,     0.2, 0,
                  0.1,   0.2,   1,   0,
                  0,     0,     0,   1),
  Matrix = "IM",
  stringsAsFactors = FALSE,
  check.names = FALSE
)
list1 <- list("a" = "2019", "b" = "2020", "c" = "2021")
list2 <- list("a" = "test", "b" = "test", "c" = "test")
input_data <- list("table1" = table1, "table2" = table2, "table3" = table3,
                   "list1" = list1, "list2" = list2)

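I need a loop that splits this dataset by every unique value in the Scenario column. The first resulting dataset (for the Scenario value "A") can be reproduced with: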

table1 <- data.table::data.table(
  Scenario = rep("A", 4),
  A = c("x", "b", "s", "u"),
  Correlation = c(1, 0.125, 0.1, 0),
  Matrix = "IM",
  stringsAsFactors = FALSE,
  check.names = FALSE
)
table2 <- data.table::data.table(
  Scenario = rep("A", 4),
  A = c("x", "b", "s", "u"),
  Correlation = c(1, 0.125, 0.1, 0),
  Matrix = "IM",
  stringsAsFactors = FALSE,
  check.names = FALSE
)
table3 <- data.table::data.table(
  Scenario = rep("A", 4),
  A = c("x", "b", "s", "u"),
  Correlation = c(1, 0.125, 0.1, 0),
  Matrix = "IM",
  stringsAsFactors = FALSE,
  check.names = FALSE
)
list1 <- list("a" = "2019", "b" = "2020", "c" = "2021")
list2 <- list("a" = "test", "b" = "test", "c" = "test")
input_data <- list("table1" = table1, "table2" = table2, "table3" = table3,
                   "list1" = list1, "list2" = list2)

Please let me know if more information is needed.

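You can write a function that wraps lapply and uses inherits to check the type of each object in the list. If the object inherits from data.frame (a data.table does too) and contains a column named Scenario, you can simply subset it. Items that are not data frames or data.tables, or that have no column named Scenario, are left unchanged: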

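get_scenario <- function(S) {
  # Note: input_data is looked up in the enclosing (global) environment
  lapply(input_data, function(x) {
    # Leave anything that is not a data frame / data.table untouched
    if (!inherits(x, "data.frame")) {
      return(x)
    }
    # Leave tables without a Scenario column untouched
    if (!"Scenario" %in% names(x)) {
      return(x)
    }
    # Keep only the rows for the requested scenario
    x[x$Scenario == S, ]
  })
}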

This gives:

get_scenario("A")
#> $table1
#>    Scenario A Correlation Matrix
#> 1:        A x       1.000     IM
#> 2:        A b       0.125     IM
#> 3:        A s       0.100     IM
#> 4:        A u       0.000     IM
#> 
#> $table2
#>    Scenario A Correlation Matrix
#> 1:        A x       1.000     IM
#> 2:        A b       0.125     IM
#> 3:        A s       0.100     IM
#> 4:        A u       0.000     IM
#> 
#> $table3
#>    Scenario A Correlation Matrix
#> 1:        A x       1.000     IM
#> 2:        A b       0.125     IM
#> 3:        A s       0.100     IM
#> 4:        A u       0.000     IM
#> 
#> $list1
#> $list1$a
#> [1] "2019"
#> 
#> $list1$b
#> [1] "2020"
#> 
#> $list1$c
#> [1] "2021"
#> 
#> 
#> $list2
#> $list2$a
#> [1] "test"
#> 
#> $list2$b
#> [1] "test"
#> 
#> $list2$c
#> [1] "test"
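
Since the question asks for a loop, here is a minimal sketch of the same filtering written with an explicit for loop. The names demo_data and filter_scenario_loop are hypothetical and not part of the original answer, and the example uses small base R data frames in place of the data.tables above so that it runs without extra packages. The inherits check treats both kinds of table the same way.

```r
# Hypothetical miniature of input_data, built from base R objects only
demo_data <- list(
  table1 = data.frame(Scenario = c("A", "B", "A"),
                      Correlation = c(1, 0.2, 0.1)),
  list1  = list(a = "2019")
)

# for-loop variant: start from a copy of the list and overwrite only
# the elements that are tables with a Scenario column
filter_scenario_loop <- function(data, S) {
  out <- data
  for (nm in names(data)) {
    x <- data[[nm]]
    if (inherits(x, "data.frame") && "Scenario" %in% names(x)) {
      out[[nm]] <- x[x$Scenario == S, ]
    }
  }
  out
}

res <- filter_scenario_loop(demo_data, "A")
```

Passing the list in as an argument, rather than reading it from the global environment, is a design choice of this sketch: it lets the same function be reused on other lists.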

If you want all the subgroups together in one super-list, you can do:

lapply(c("A", "B", "C"), get_scenario)
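
The call above lists the scenario values by hand. As a rough sketch, the values could instead be collected from the data, and the resulting super-list named by scenario. The names demo_data, get_scenario_from, and by_scenario are assumptions made for illustration, and base R data frames stand in for the data.tables so that the example is self-contained.

```r
# Hypothetical stand-in for input_data (base R only)
demo_data <- list(
  table1 = data.frame(Scenario = rep(c("A", "B"), 2),
                      Correlation = c(1, 0.5, 0.5, 1)),
  list1  = list(a = "2019", b = "2020")
)

# Same idea as get_scenario(), but the list is passed in explicitly
get_scenario_from <- function(data, S) {
  lapply(data, function(x) {
    if (!inherits(x, "data.frame") || !"Scenario" %in% names(x)) {
      return(x)
    }
    x[x$Scenario == S, ]
  })
}

# Flag the elements that are tables with a Scenario column
is_scenario_table <- vapply(
  demo_data,
  function(x) inherits(x, "data.frame") && "Scenario" %in% names(x),
  logical(1)
)

# Every distinct Scenario value found in any of those tables
scenarios <- unique(unlist(lapply(
  demo_data[is_scenario_table],
  function(x) as.character(x$Scenario)
)))

# One filtered copy of the list per scenario, named by scenario
by_scenario <- lapply(
  setNames(scenarios, scenarios),
  function(s) get_scenario_from(demo_data, s)
)
```

Each subgroup can then be reached by name, for example by_scenario$A or by_scenario[["B"]], instead of by position.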
