I have this data frame:
Name Pr VP Tr Me Sa Ar
Alicia 1 0 0 0 1 0
Bonnie 0 1 1 0 0 0
Cathy 1 1 1 1 1 1
Daphne 1 0 0 0 0 1
Elena 0 0 0 1 1 1
Faye 0 0 0 0 0 1
I want to produce this data frame, which adds a column listing, for each row, the names of the columns that contain a 1:
Name Pr VP Tr Me Sa Ar Nominations
Alicia 1 0 0 0 1 0 Pr, Sa
Bonnie 0 1 1 0 0 0 VP, Tr
Cathy 1 1 1 1 1 1 Pr, VP, Tr, Me, Sa, Ar
Daphne 1 0 0 0 0 1 Pr, Ar
Elena 0 0 0 1 1 1 Me, Sa, Ar
Faye 0 0 0 0 0 1 Ar
I would prefer a tidyverse solution, but base R is also welcome.
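We can loop over the rows with apply and MARGIN = 1, then use toString (a wrapper around paste with collapse = ", ") to join the names of the columns where the row value 'x' is 1: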
df1$Nominations <- apply(df1[-1], 1, function(x) toString(names(x)[x == 1]))
df1$Nominations
#[1] "Pr, Sa" "VP, Tr" "Pr, VP, Tr, Me, Sa, Ar" "Pr, Ar"
#[5] "Me, Sa, Ar" "Ar"
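To see what the anonymous function does, it may help to run it on a single row first. The sketch below uses a small made-up subset (the object `small` and the row "Nobody" are hypothetical, added only for illustration) and also shows that a row with no 1s yields an empty string.

```r
# Hypothetical subset of the data, plus one row with no 1s (illustration only)
small <- data.frame(
  Name = c("Alicia", "Bonnie", "Nobody"),
  Pr = c(1L, 0L, 0L),
  VP = c(0L, 1L, 0L),
  Sa = c(1L, 0L, 0L)
)

# apply() passes each row to the function as a named vector;
# this reproduces that vector for the first row
one_row <- unlist(small[1, -1])
names(one_row)[one_row == 1]
# "Pr" "Sa"

# The full call: one comma-separated string per row
nominations <- apply(small[-1], 1, function(x) toString(names(x)[x == 1]))
nominations
```

Because toString() on a zero-length vector returns "", the "Nobody" row gets an empty string rather than NA.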
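Or, using the tidyverse, reshape to 'long' format with pivot_longer, group by 'Name', summarise by pasting the 'name' entries where 'value' is 1, and join the result with the original dataset: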
library(dplyr)
library(tidyr)
df1 %>%
  pivot_longer(cols = -Name) %>%
  group_by(Name) %>%
  summarise(Nominations = toString(name[as.logical(value)])) %>%
  right_join(df1, by = "Name") %>%
  select(names(df1), everything())
# A tibble: 6 x 8
# Name Pr VP Tr Me Sa Ar Nominations
# <chr> <int> <int> <int> <int> <int> <int> <chr>
#1 Alicia 1 0 0 0 1 0 Pr, Sa
#2 Bonnie 0 1 1 0 0 0 VP, Tr
#3 Cathy 1 1 1 1 1 1 Pr, VP, Tr, Me, Sa, Ar
#4 Daphne 1 0 0 0 0 1 Pr, Ar
#5 Elena 0 0 0 1 1 1 Me, Sa, Ar
#6 Faye 0 0 0 0 0 1 Ar
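The pipeline above can also be run one step at a time to inspect the intermediate objects. This is only a sketch on a made-up subset: the names `small`, `long`, `nominations` and `result` are hypothetical, and left_join is used here in place of right_join so that the wide table stays on the left.

```r
library(dplyr)
library(tidyr)

# Hypothetical subset of the data, for illustration only
small <- tibble(
  Name = c("Alicia", "Bonnie"),
  Pr = c(1L, 0L),
  VP = c(0L, 1L),
  Sa = c(1L, 0L)
)

# Step 1: one row per (Name, column) pair;
# pivot_longer() names the new columns "name" and "value" by default
long <- pivot_longer(small, cols = -Name)

# Step 2: per Name, keep the column names whose value is 1 and collapse them
nominations <- long %>%
  group_by(Name) %>%
  summarise(Nominations = toString(name[value == 1]))

# Step 3: attach the new column to the wide data
result <- left_join(small, nominations, by = "Name")
result
```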
data
df1 <- structure(list(Name = c("Alicia", "Bonnie", "Cathy", "Daphne",
"Elena", "Faye"), Pr = c(1L, 0L, 1L, 1L, 0L, 0L), VP = c(0L,
1L, 1L, 0L, 0L, 0L), Tr = c(0L, 1L, 1L, 0L, 0L, 0L), Me = c(0L,
0L, 1L, 0L, 1L, 0L), Sa = c(1L, 0L, 1L, 0L, 1L, 0L), Ar = c(0L,
0L, 1L, 1L, 1L, 1L)), class = "data.frame", row.names = c(NA,
-6L))
Making my own data for the example, and using the data.table package:
library(data.table)
dt1 <- data.table(
"name" = LETTERS[1:5],
"V1" = c(1,0,0,1,0),
"V2" = c(1,0,1,0,1))
dt1[melt(dt1, id.vars = "name")[value == 1, .(.(variable)), keyby = name], on = "name"]
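Using akrun's data (df1 is a plain data.frame there, so convert it to a data.table first):
setDT(df1)  # converts df1 to a data.table by reference
df1[melt(df1, id.vars = "Name")[value == 1, .(.(variable)), keyby = Name], on = "Name"]
gives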
Name Pr VP Tr Me Sa Ar V1
1: Alicia 1 0 0 0 1 0 Pr,Sa
2: Bonnie 0 1 1 0 0 0 VP,Tr
3: Cathy 1 1 1 1 1 1 Pr,VP,Tr,Me,Sa,Ar
4: Daphne 1 0 0 0 0 1 Pr,Ar
5: Elena 0 0 0 1 1 1 Me,Sa,Ar
6: Faye 0 0 0 0 0 1 Ar
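It melts the table into a long data.table with name as the id, subsets to the rows where value is 1, collects the original column names (the values of variable) into a list per name, and then joins the result back to the original table.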
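The same steps can be written out separately. This is a sketch rather than the answer's exact code: the object names `long`, `noms` and `result` are made up, toString() replaces the list column so that the result is a plain character column, and the join is run from dt1's side. In a join of the form dt1[noms, on = "name"], the output has one row per row of noms, so a name with no 1s (here "B") would be dropped; joining as noms[dt1, on = "name"] keeps it with NA.

```r
library(data.table)

# Same toy table as in the answer above
dt1 <- data.table(
  name = LETTERS[1:5],
  V1 = c(1, 0, 0, 1, 0),
  V2 = c(1, 0, 1, 0, 1)
)

# Step 1: long format, one row per (name, variable) pair
long <- melt(dt1, id.vars = "name")

# Step 2: keep the 1s and collapse the column names into one string per name
noms <- long[value == 1,
             .(Nominations = toString(as.character(variable))),
             keyby = name]

# Step 3: join from dt1's side so that names without any 1 are kept (as NA)
result <- noms[dt1, on = "name"]
result
```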