Passing column names to a function in R to exclude certain columns


I have a long-format dataset that contains columns with a) the year-over-year percentage change, b) the year-over-year absolute change, and c) other data.

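I need to write a function that, depending on the value of a TRUE/FALSE argument I call difference, excludes the columns whose names contain either ABS_CHANGE and EUR, or PERC_CHANGE and EUR, and then returns the resulting data frame.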

Here is a reproducible code block:

df=structure(list(SCENARIO = c("BC", "BC", "BC", "BC"), INSTITUTE = c("BCR", 
"BCR", "BCR", "BCR"), METHOD_DEC = c("BIL", "CARLA", 
"CARLA", "CARLA"), CLASS = c("SME", "BANK", "CORPORATE", 
"SME"), EUR_Y_2021 = c(13446986L, 0L, 0L, 0L), EUR_Y_2022 = c(16460885L, 
133047L, 728991L, 665L), ABS_CHANGE_N_2021 = c(0L, 0L, 0L, 0L
), ABS_CHANGE_N_2022 = c(1815796L, -1039290L, 2768626L, -499L
), PERC_CHANGE_N_2022 = c(0.0227073699984259, -0.00992854123296549, 
0.0608814672317806, -0.233723653395784), PERC_CHANGE_N_2023 = c(0.0722801890040687, 
-0.0115649941812915, 0.145799497480829, -0.402341920374707)), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -4L), groups = structure(list(
CLASS = c("BANK", "CORPORATE", "SME", "SME"), METHOD_DEC = c("CARLA", 
"CARLA", "BIL", "CARLA"), INSTITUTE = c("BCR", "BCR", 
"BCR", "BCR"), SCENARIO = c("BC", "BC", "BC", "BC"), .rows = structure(list(
2L, 3L, 1L, 4L), ptype = integer(0), class = c("vctrs_list_of", 
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -4L), .drop = TRUE))

This is what I came up with:

test_func <- function(df, difference) {
if (difference==TRUE) {
df=df %>% select(-contains("ABS_CHANGE" | contains("EUR"))
} else {                 
df=df %>% select(-contains("PERC_CHANGE" | contains("EUR"))
}
}
return (df)
test_func(df,difference=FALSE)

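You are missing a couple of parentheses (around the contains calls) and have misplaced a curly brace, so return(df) ends up outside the function and nothing is returned. Based on your code, try: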

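library(dplyr)

test_func <- function(df, difference) {
  if (difference == TRUE) {
    # drop the absolute-change columns and the EUR columns
    df = df %>% select(-(contains("ABS_CHANGE") | contains("EUR")))
  } else {
    # drop the percentage-change columns and the EUR columns
    df = df %>% select(-(contains("PERC_CHANGE") | contains("EUR")))
  }
  return (df)
}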

Output (first with difference = TRUE, then with difference = FALSE):

# A tibble: 4 × 6
# Groups:   CLASS, METHOD_DEC, INSTITUTE, SCENARIO [4]
SCENARIO INSTITUTE METHOD_DEC CLASS     PERC_CHANGE_N_2022 PERC_CHANGE_N_2023
<chr>    <chr>     <chr>      <chr>                  <dbl>              <dbl>
1 BC       BCR       BIL        SME                  0.0227              0.0723
2 BC       BCR       CARLA      BANK                -0.00993            -0.0116
3 BC       BCR       CARLA      CORPORATE            0.0609              0.146 
4 BC       BCR       CARLA      SME                 -0.234              -0.402 
# A tibble: 4 × 6
# Groups:   CLASS, METHOD_DEC, INSTITUTE, SCENARIO [4]
SCENARIO INSTITUTE METHOD_DEC CLASS     ABS_CHANGE_N_2021 ABS_CHANGE_N_2022
<chr>    <chr>     <chr>      <chr>                 <int>             <int>
1 BC       BCR       BIL        SME                       0           1815796
2 BC       BCR       CARLA      BANK                      0          -1039290
3 BC       BCR       CARLA      CORPORATE                 0           2768626
4 BC       BCR       CARLA      SME                       0              -499

Update: added the output.
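As a side note on the parenthesised form above, the two contains() calls can probably be collapsed into one, since contains() accepts a character vector and takes the union of the matches. The following is only a minimal sketch on a small, ungrouped stand-in data frame (toy and its values are made up for illustration, not taken from the question), and it uses ! for the complement, which tidyselect treats the same way as a leading minus here.

```r
library(dplyr)

# Hypothetical stand-in for the question's data: one column per "kind".
toy <- data.frame(
  CLASS = c("SME", "BANK"),
  EUR_Y_2021 = c(1L, 2L),
  ABS_CHANGE_N_2021 = c(0L, 5L),
  PERC_CHANGE_N_2022 = c(0.1, -0.2)
)

# Form used in the answer: negate the union of two helpers.
a <- toy %>% select(-(contains("ABS_CHANGE") | contains("EUR")))

# Equivalent sketch: one contains() call with a vector of substrings.
b <- toy %>% select(!contains(c("ABS_CHANGE", "EUR")))

names(a)
names(b)
```

Both selections should keep the same columns, so which one to use is mostly a matter of taste.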

Since the only difference between the two branches is "ABS" versus "PERC", we can write the function as

test_func <- function(df, difference = TRUE) {
  nm <- if(difference) 'ABS_CHANGE' else 'PERC_CHANGE'
  df %>%
    select(-contains(nm), -contains("EUR"))
}

Testing:

> test_func(df)
# A tibble: 4 × 6
# Groups:   CLASS, METHOD_DEC, INSTITUTE, SCENARIO [4]
SCENARIO INSTITUTE METHOD_DEC CLASS     PERC_CHANGE_N_2022 PERC_CHANGE_N_2023
<chr>    <chr>     <chr>      <chr>                  <dbl>              <dbl>
1 BC       BCR       BIL        SME                  0.0227              0.0723
2 BC       BCR       CARLA      BANK                -0.00993            -0.0116
3 BC       BCR       CARLA      CORPORATE            0.0609              0.146 
4 BC       BCR       CARLA      SME                 -0.234              -0.402 
> test_func(df, FALSE)
# A tibble: 4 × 6
# Groups:   CLASS, METHOD_DEC, INSTITUTE, SCENARIO [4]
SCENARIO INSTITUTE METHOD_DEC CLASS     ABS_CHANGE_N_2021 ABS_CHANGE_N_2022
<chr>    <chr>     <chr>      <chr>                 <int>             <int>
1 BC       BCR       BIL        SME                       0           1815796
2 BC       BCR       CARLA      BANK                      0          -1039290
3 BC       BCR       CARLA      CORPORATE                 0           2768626
4 BC       BCR       CARLA      SME                       0              -499
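One thing that may be worth keeping in mind with contains(): it matches the substring anywhere in the name and ignores case by default. That is harmless for the columns in this question, but it could drop more than intended on other data. The sketch below illustrates this on a made-up data frame (toy and the neurology_score column are hypothetical) and shows starts_with("EUR_") as one possible stricter alternative, assuming the EUR columns always begin with that prefix.

```r
library(dplyr)

# Hypothetical data: neurology_score merely happens to contain "eur".
toy <- data.frame(
  CLASS = "SME",
  EUR_Y_2021 = 1L,
  neurology_score = 2L,
  ABS_CHANGE_N_2021 = 0L
)

# contains() is case-insensitive and matches anywhere in the name,
# so neurology_score is dropped along with EUR_Y_2021.
loose <- toy %>% select(-contains("EUR"))

# starts_with() only looks at the beginning of the name.
strict <- toy %>% select(-starts_with("EUR_"))

names(loose)
names(strict)
```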

Instead of a logical argument, the function could take the column-name substring itself as an argument. In that case we don't need any if/else

test_function <- function(df, col_sub) {
  df %>%
    select(-contains(col_sub), -contains("EUR"))
}

Then test it:

test_function(df, "ABS_CHANGE")
test_function(df, "PERC_CHANGE")
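A free-form substring has one drawback: a misspelled value matches nothing, and contains() then silently drops only the EUR columns. If only the two kinds from the question are ever needed, base R's match.arg() could be used to validate the argument. This is a minimal sketch under that assumption; drop_change_cols, kind, and the toy data frame are hypothetical names introduced for illustration.

```r
library(dplyr)

# Hypothetical variant: 'kind' must be one of the two known substrings.
drop_change_cols <- function(df, kind = c("ABS_CHANGE", "PERC_CHANGE")) {
  kind <- match.arg(kind)  # errors on an unknown value; defaults to the first
  df %>% select(!contains(c(kind, "EUR")))
}

# Small made-up data frame to try it on.
toy <- data.frame(
  CLASS = c("SME", "BANK"),
  EUR_Y_2021 = c(1L, 2L),
  ABS_CHANGE_N_2021 = c(0L, 5L),
  PERC_CHANGE_N_2022 = c(0.1, -0.2)
)

names(drop_change_cols(toy))                 # drops ABS_CHANGE and EUR columns
names(drop_change_cols(toy, "PERC_CHANGE"))  # drops PERC_CHANGE and EUR columns
```

Calling drop_change_cols(toy, "FOO") should stop with an error from match.arg() rather than quietly returning a partially filtered data frame.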