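I have a function that takes a data frame and a character vector of column names as arguments. It should return the data frame with a new column whose elements are the values of those columns concatenated, as shown below.
Data frame: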
df <- data.frame(x = c("A","B","C","D","E"), y = c("1","2","3","4","5"),
z = c("Test1","Test2", "Test3","Test4","Test5"),
w =c("B1","B2","B3","B4","B5"))
If the user defines the vector as vec <- c("x","y"), the output should be:
newcol <- function(df, vec){
df <- df %>% mutate(newcolumn = paste(get("x"),get("y"), sep = ","))
return (df)
}
newcol(df, vec)
x y z w newcolumn
1 A 1 Test1 B1 A,1
2 B 2 Test2 B2 B,2
3 C 3 Test3 B3 C,3
4 D 4 Test4 B4 D,4
5 E 5 Test5 B5 E,5
If vec <- c("x","y", "z"), the output should be:
newcol <- function(df, vec){
df <- df %>% mutate(newcolumn = paste(get("x"),get("y"), get("z"), sep = ","))
return (df)
}
newcol(df, vec)
x y z w newcolumn
1 A 1 Test1 B1 A,1,Test1
2 B 2 Test2 B2 B,2,Test2
3 C 3 Test3 B3 C,3,Test3
4 D 4 Test4 B4 D,4,Test4
5 E 5 Test5 B5 E,5,Test5
I would like to know how this concatenation can be done dynamically, for whatever columns vec names.
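Use paste with !!!, as shown. Here . is the data frame supplied by the %>% pipe, and !!! splices the selected columns into paste as separate arguments.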
library(dplyr)
newcol <- function(df, vec){
  df %>% mutate(newcolumn = paste(!!!.[vec], sep = ","))
}
newcol(df, c("x", "y", "z"))
## x y z w newcolumn
## 1 A 1 Test1 B1 A,1,Test1
## 2 B 2 Test2 B2 B,2,Test2
## 3 C 3 Test3 B3 C,3,Test3
## 4 D 4 Test4 B4 D,4,Test4
## 5 E 5 Test5 B5 E,5,Test5
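A variant of the same idea, sketched here as an assumption rather than as part of the answer above: convert the strings to symbols with rlang's syms() and splice those instead of the columns themselves. This does not rely on the . placeholder. The name newcol_syms is hypothetical.

```r
library(dplyr)
library(rlang)

df <- data.frame(x = c("A", "B", "C"), y = c("1", "2", "3"),
                 z = c("Test1", "Test2", "Test3"))

# Hypothetical helper: syms() turns each column name into a symbol,
# and !!! splices the symbols into paste() as separate arguments.
newcol_syms <- function(df, vec) {
  df %>% mutate(newcolumn = paste(!!!syms(vec), sep = ","))
}

newcol_syms(df, c("x", "y", "z"))
```

Because the column names are resolved inside mutate(), a name in vec that is not a column of df should raise an error rather than silently producing something else.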
This also works, and it has no package dependencies:
newcol <- function(df, vec){
cbind(df, newcolumn = apply(df[vec], 1, paste, collapse = ","))
}
If a space after each comma is acceptable, this works too:
newcol <- function(df, vec){
cbind(df, newcolumn = apply(df[vec], 1, toString))
}
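Another base R possibility, given as a minimal sketch and not taken from the answer above: pass the selected columns to paste() through do.call(). apply() first coerces the data frame to a matrix, which may change how non-character columns are formatted, whereas this version pastes the columns directly. The name newcol_base and its sep argument are hypothetical.

```r
df <- data.frame(x = c("A", "B", "C"), y = c("1", "2", "3"),
                 z = c("Test1", "Test2", "Test3"))

# Hypothetical helper: each selected column becomes one argument to paste().
# unname() drops the column names so that a column called "sep" or
# "collapse" cannot be mistaken for a paste() argument.
newcol_base <- function(df, vec, sep = ",") {
  cols <- unname(as.list(df[vec]))
  df$newcolumn <- do.call(paste, c(cols, sep = sep))
  df
}

newcol_base(df, c("x", "z"))
```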
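Using unite from tidyr. Wrapping vec in all_of() tells tidyselect that it is an external character vector of column names:
library(tidyr)
newcol <- function(df, vec){
  df %>% unite("newcol", all_of(vec), sep = ",", remove = FALSE)
}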
vec <- c("x","z")
newcol(df,vec)
Output:
newcol x y z w
1 A,Test1 A 1 Test1 B1
2 B,Test2 B 2 Test2 B2
3 C,Test3 C 3 Test3 B3
4 D,Test4 D 4 Test4 B4
5 E,Test5 E 5 Test5 B5
If you want to get really clever, you can use rlang and tidy evaluation principles to pass the arguments as bare column names rather than strings:
df <- data.frame(x = c("A","B","C","D","E"), y = c("1","2","3","4","5"),
z = c("Test1","Test2", "Test3","Test4","Test5"),
w =c("B1","B2","B3","B4","B5"))
library(rlang)
library(dplyr)
newcol <- function(df, ...) {
vec <- enquos(...)
df <- df %>% mutate(newcolumn = paste(!!!vec, sep = ","))
return(df)
}
df |>
newcol(x, y)
#> x y z w newcolumn
#> 1 A 1 Test1 B1 A,1
#> 2 B 2 Test2 B2 B,2
#> 3 C 3 Test3 B3 C,3
#> 4 D 4 Test4 B4 D,4
#> 5 E 5 Test5 B5 E,5
df |>
newcol(x, y, z)
#> x y z w newcolumn
#> 1 A 1 Test1 B1 A,1,Test1
#> 2 B 2 Test2 B2 B,2,Test2
#> 3 C 3 Test3 B3 C,3,Test3
#> 4 D 4 Test4 B4 D,4,Test4
#> 5 E 5 Test5 B5 E,5,Test5
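If the column names still arrive as a character vector, as in the question, a function written this way can probably still be used by splicing symbols into the dots at the call site. This is a sketch under that assumption; newcol_dots is a hypothetical name that repeats the definition above so the example is self-contained.

```r
library(rlang)
library(dplyr)

df <- data.frame(x = c("A", "B", "C"), y = c("1", "2", "3"),
                 z = c("Test1", "Test2", "Test3"))

# Same approach as above: capture the dots as quosures, splice into paste().
newcol_dots <- function(df, ...) {
  vec <- enquos(...)
  df %>% mutate(newcolumn = paste(!!!vec, sep = ","))
}

# Bare names, as in the answer above
newcol_dots(df, x, y)

# Strings: convert to symbols and splice them into the dots
vec <- c("x", "y", "z")
newcol_dots(df, !!!syms(vec))
```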