Dynamically combining R data frame columns



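I have a function that takes a data frame and a character vector of column names as arguments. It returns the data frame with a new column whose elements are the values of the named columns concatenated, as follows:

Data frame: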

df <- data.frame(x = c("A","B","C","D","E"), y = c("1","2","3","4","5"),
                 z = c("Test1","Test2", "Test3","Test4","Test5"),
                 w = c("B1","B2","B3","B4","B5"))

If the user defines the vector as vec <- c("x","y"), the output should be:

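library(dplyr)

vec <- c("x","y")
# Hard-coded version: the column names are written out by hand, vec is ignored
newcol <- function(df, vec){
  df <- df %>% mutate(newcolumn = paste(get("x"),get("y"), sep = ","))
  return (df)
}
newcol(df, vec)
  x y     z  w newcolumn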
1 A 1 Test1 B1       A,1
2 B 2 Test2 B2       B,2
3 C 3 Test3 B3       C,3
4 D 4 Test4 B4       D,4
5 E 5 Test5 B5       E,5

If vec <- c("x","y", "z"), the output should be:

vec <- c("x","y", "z")
# Hard-coded version again, now with three columns
newcol <- function(df, vec){
  df <- df %>% mutate(newcolumn = paste(get("x"),get("y"), get("z"), sep = ","))
  return (df)
}
newcol(df, vec)
  x y     z  w newcolumn
1 A 1 Test1 B1 A,1,Test1
2 B 2 Test2 B2 B,2,Test2
3 C 3 Test3 B3 C,3,Test3
4 D 4 Test4 B4 D,4,Test4
5 E 5 Test5 B5 E,5,Test5

How can this concatenation be done dynamically, for whatever columns are named in vec?

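Use paste with !!!, as shown here. The !!! operator splices the selected columns into the paste call as separate arguments.

library(dplyr)

newcol <- function(df, vec){
  # .[vec] is the piped-in data frame restricted to the requested columns
  df %>% mutate(newcolumn = paste(!!!.[vec], sep = ","))
}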
newcol(df, c("x", "y", "z"))
##   x y     z  w newcolumn
## 1 A 1 Test1 B1 A,1,Test1
## 2 B 2 Test2 B2 B,2,Test2
## 3 C 3 Test3 B3 C,3,Test3
## 4 D 4 Test4 B4 D,4,Test4
## 5 E 5 Test5 B5 E,5,Test5
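
A variant of the same idea, sketched here, turns the column names into symbols with rlang::syms() before splicing them, so the function does not rely on the pipe's `.` placeholder. The helper name newcol_syms, the sep argument, and the input check are assumptions added for illustration, not part of the answer above.

```r
library(dplyr)
library(rlang)

df <- data.frame(
  x = c("A", "B", "C", "D", "E"),
  y = c("1", "2", "3", "4", "5"),
  z = c("Test1", "Test2", "Test3", "Test4", "Test5"),
  w = c("B1", "B2", "B3", "B4", "B5")
)

# Hypothetical helper: paste together the columns named in `vec`
newcol_syms <- function(df, vec, sep = ",") {
  # Fail early if a requested column does not exist
  stopifnot(is.character(vec), all(vec %in% names(df)))
  # Convert the strings to symbols, e.g. "x" becomes the symbol x
  cols <- syms(vec)
  # Splice the symbols into paste() as separate arguments
  df %>% mutate(newcolumn = paste(!!!cols, sep = sep))
}

res <- newcol_syms(df, c("x", "y", "z"))
print(res)
```

With vec <- c("x", "y", "z"), the spliced call is equivalent to writing paste(x, y, z, sep = ",") by hand inside mutate.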

This also works, and has no package dependencies.

newcol <- function(df, vec){
  # Paste the selected columns together row by row
  cbind(df, newcolumn = apply(df[vec], 1, paste, collapse = ","))
}
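
Another base R possibility is sketched below: do.call() hands the selected columns to paste() as separate arguments, so the concatenation is vectorized over whole columns rather than done row by row. It also skips the conversion to a matrix that apply() performs first, which can change how non-character columns are formatted. The name newcol_base and the sep argument are assumptions for illustration.

```r
df <- data.frame(
  x = c("A", "B", "C", "D", "E"),
  y = c("1", "2", "3", "4", "5"),
  z = c("Test1", "Test2", "Test3", "Test4", "Test5"),
  w = c("B1", "B2", "B3", "B4", "B5")
)

# Hypothetical helper: base R only, no package dependencies
newcol_base <- function(df, vec, sep = ",") {
  # Fail early if a requested column does not exist
  stopifnot(is.character(vec), all(vec %in% names(df)))
  # Drop the column names so that a column called "sep" or "collapse"
  # cannot be mistaken for an argument of paste()
  cols <- unname(as.list(df[vec]))
  # Equivalent to paste(col1, col2, ..., sep = sep)
  df$newcolumn <- do.call(paste, c(cols, sep = sep))
  df
}

res <- newcol_base(df, c("x", "y"))
print(res)
```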

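If a space after each comma is acceptable, this works too:

newcol <- function(df, vec){
  # toString() joins the values with ", " (comma followed by a space)
  cbind(df, newcolumn = apply(df[vec], 1, toString))
}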

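Using unite from tidyr:

library(dplyr)
library(tidyr)

newcol <- function(df, vec){
  # all_of() marks vec as a character vector of column names
  df <- df %>% unite("newcol", all_of(vec), sep = ",", remove = FALSE)
  return (df)
}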
vec <- c("x","z")
newcol(df,vec)

Output:

   newcol x y     z  w
1 A,Test1 A 1 Test1 B1
2 B,Test2 B 2 Test2 B2
3 C,Test3 C 3 Test3 B3
4 D,Test4 D 4 Test4 B4
5 E,Test5 E 5 Test5 B5

If you want to get really clever, you can use rlang and tidyselect-style principles to pass the columns as bare names rather than strings:

df <- data.frame(x = c("A","B","C","D","E"), y = c("1","2","3","4","5"),
                 z = c("Test1","Test2", "Test3","Test4","Test5"),
                 w = c("B1","B2","B3","B4","B5"))
library(rlang)
library(dplyr)

newcol <- function(df, ...) {
  # Capture the unquoted column names as quosures
  vec <- enquos(...)
  # Splice them into paste() as separate arguments
  df <- df %>% mutate(newcolumn = paste(!!!vec, sep = ","))
  return(df)
}

df |>
  newcol(x, y)
#>   x y     z  w newcolumn
#> 1 A 1 Test1 B1       A,1
#> 2 B 2 Test2 B2       B,2
#> 3 C 3 Test3 B3       C,3
#> 4 D 4 Test4 B4       D,4
#> 5 E 5 Test5 B5       E,5
df |>
  newcol(x, y, z)
#>   x y     z  w newcolumn
#> 1 A 1 Test1 B1 A,1,Test1
#> 2 B 2 Test2 B2 B,2,Test2
#> 3 C 3 Test3 B3 C,3,Test3
#> 4 D 4 Test4 B4 D,4,Test4
#> 5 E 5 Test5 B5 E,5,Test5
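
If full tidyselect syntax is wanted (ranges such as x:z, or helpers such as starts_with()), one option is to forward the dots to tidyr::unite(), which accepts a tidyselect specification. This is a sketch only; the name newcol_sel and the sep argument are assumptions for illustration. As in the unite answer above, the new column is placed in front of the first selected column rather than at the end.

```r
library(tidyr)

df <- data.frame(
  x = c("A", "B", "C", "D", "E"),
  y = c("1", "2", "3", "4", "5"),
  z = c("Test1", "Test2", "Test3", "Test4", "Test5"),
  w = c("B1", "B2", "B3", "B4", "B5")
)

# Hypothetical helper: forward a tidyselect specification to unite()
newcol_sel <- function(df, ..., sep = ",") {
  # remove = FALSE keeps the original columns alongside the new one
  unite(df, "newcolumn", ..., sep = sep, remove = FALSE)
}

res_names <- newcol_sel(df, x, y)   # bare column names
res_range <- newcol_sel(df, x:z)    # a tidyselect range
print(res_range)
```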
