Suppose I have a dataset that looks like this:
ID  Street    street type  Crime
1   Main      ST           N
2   Main      ST           Y
3   Pleasant  AVE          Y
4   Pleasant  AVE          Y
5   Harris    BLVD         N
6   Lincoln   Road         Y
7   Lincoln   Road         Y
8   Lincoln   Road         Y
9   Breezy    Ave          Y
10  Breezy    Ave          N
11  Rose      ST           N
12  Rose      ST           N
13  Rose      ST           N
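I want something like the table below. Note that rows are grouped by the "Street" column and the flag depends on "Crime": if at least one "Y" appears in "Crime" within a group, assign Y to every row of that group.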
ID  Street    street type  Crime  Flag
1   Main      ST           N      Y
2   Main      ST           Y      Y
3   Pleasant  AVE          Y      Y
4   Pleasant  AVE          Y      Y
5   Harris    BLVD         N      N
6   Lincoln   Road         Y      Y
7   Lincoln   Road         Y      Y
8   Lincoln   Road         Y      Y
9   Breezy    Ave          Y      Y
10  Breezy    Ave          N      Y
11  Rose      ST           N      N
12  Rose      ST           N      N
13  Rose      ST           N      N
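Try any():
library(dplyr)
df %>%
  group_by(Street) %>%
  # any() is TRUE when at least one row of the street has Crime == "Y"
  mutate(Flag = as.character(any(Crime == "Y"))) %>%
  # map the "TRUE"/"FALSE" strings to the "Y"/"N" codes
  mutate(Flag = recode(Flag, "TRUE" = "Y", "FALSE" = "N"))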
Street Crime Flag
<chr> <chr> <chr>
1 Main N Y
2 Main Y Y
3 Pleasant Y Y
4 Pleasant Y Y
5 Harris N N
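To see why the recode step is needed, here is a minimal base R sketch of what any() returns for a single group. The vector name crime is hypothetical and stands in for the Crime values of one street.

```r
# Hypothetical Crime values for one street (not taken from the real data)
crime <- c("N", "Y")

is_y <- crime == "Y"          # element-wise comparison: FALSE TRUE
has_crime <- any(is_y)        # collapses to a single TRUE for the group

# as.character() gives "TRUE"/"FALSE", which is why a recode step follows
as_text <- as.character(has_crime)

# Mapping the single logical straight to the codes avoids the recode
flag <- if (has_crime) "Y" else "N"
flag
```

Inside mutate() after group_by(), the single value is recycled to every row of the group.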
library(dplyr)
df %>%
  group_by(Street) %>%
  # Overwrites Crime with the group-level result instead of adding a Flag column
  mutate(Crime = if_else(any(Crime == "Y"),"Y","N"))
# A tibble: 13 x 4
# Groups: Street [6]
ID Street street_type Crime
<int> <chr> <chr> <chr>
1 1 Main ST Y
2 2 Main ST Y
3 3 Pleasant AVE Y
4 4 Pleasant AVE Y
5 5 Harris BLVD N
6 6 Lincoln Road Y
7 7 Lincoln Road Y
8 8 Lincoln Road Y
9 9 Breezy Ave Y
10 10 Breezy Ave Y
11 11 Rose ST N
12 12 Rose ST N
13 13 Rose ST N
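If the original Crime values should be kept, the same idea can write to a new column, and the grouping can be dropped afterwards. This is only a sketch on a small made-up data frame; toy and flagged are hypothetical names, and it assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical toy data with the same column names as in the question
toy <- data.frame(
  Street = c("Main", "Main", "Harris"),
  Crime  = c("N", "Y", "N")
)

flagged <- toy %>%
  group_by(Street) %>%
  # Write to a new column so the original Crime values are preserved
  mutate(Flag = if_else(any(Crime == "Y"), "Y", "N")) %>%
  # Drop the grouping so later steps operate on the whole table
  ungroup()

flagged
```

Without ungroup(), the result stays grouped by Street, as the "# Groups: Street [6]" line in the printed tibble indicates.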
Data:
df <- structure(list(ID = 1:13, Street = c("Main", "Main", "Pleasant",
"Pleasant", "Harris", "Lincoln", "Lincoln", "Lincoln", "Breezy",
"Breezy", "Rose", "Rose", "Rose"), street_type = c("ST", "ST",
"AVE", "AVE", "BLVD", "Road", "Road", "Road", "Ave", "Ave", "ST",
"ST", "ST"), Crime = c("N", "Y", "Y", "Y", "N", "Y", "Y", "Y",
"Y", "N", "N", "N", "N")), class = "data.frame", row.names = c(NA,
-13L))
Alternatively, use max, because "Y" compares greater than "N" as a string:
df %>%
group_by(Street) %>%
mutate(Flag = max(Crime))
This is the shortest solution here, and it keeps your original data structure:
ID Street street_type Crime Flag
<int> <chr> <chr> <chr> <chr>
1 1 Main ST N Y
2 2 Main ST Y Y
3 3 Pleasant AVE Y Y
4 4 Pleasant AVE Y Y
5 5 Harris BLVD N N
6 6 Lincoln Road Y Y
7 7 Lincoln Road Y Y
8 8 Lincoln Road Y Y
9 9 Breezy Ave Y Y
10 10 Breezy Ave N Y
11 11 Rose ST N N
12 12 Rose ST N N
13 13 Rose ST N N
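The max trick depends on how character vectors are compared, so it may help to check the behaviour on its own. A small base R sketch follows; the vectors crime_mixed, crime_none and crime_na are hypothetical and not part of the data above.

```r
# Hypothetical Crime vectors for three different groups
crime_mixed <- c("N", "Y", "N")
crime_none  <- c("N", "N")
crime_na    <- c("N", NA, "Y")

y_after_n <- "Y" > "N"                   # TRUE: "Y" sorts after "N"

m_mixed <- max(crime_mixed)              # "Y", at least one Y in the group
m_none  <- max(crime_none)               # "N", no Y in the group

# A missing value propagates unless it is removed explicitly
m_na    <- max(crime_na)                 # NA
m_na_rm <- max(crime_na, na.rm = TRUE)   # "Y"
```

This only works because the two codes happen to sort in the right order. With other labels, or with missing values in Crime, the any() version is probably the safer choice.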
Here is a base R option:
df$Flag <- 'N'
df$Flag[df$Street %in% unique(df$Street[df$Crime == 'Y'])] <- 'Y'
df
# ID Street street_type Crime Flag
#1 1 Main ST N Y
#2 2 Main ST Y Y
#3 3 Pleasant AVE Y Y
#4 4 Pleasant AVE Y Y
#5 5 Harris BLVD N N
#6 6 Lincoln Road Y Y
#7 7 Lincoln Road Y Y
#8 8 Lincoln Road Y Y
#9 9 Breezy Ave Y Y
#10 10 Breezy Ave N Y
#11 11 Rose ST N N
#12 12 Rose ST N N
#13 13 Rose ST N N
Initialise the Flag column to 'N'. Then change Flag to 'Y' for every row whose Street value has at least one row where Crime is 'Y'.
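The one-line assignment can be unpacked into separate steps. The sketch below uses a small made-up data frame; toy, streets_with_crime and on_crime_street are hypothetical names introduced only for illustration.

```r
# Hypothetical toy data with the same column names as in the question
toy <- data.frame(
  Street = c("Main", "Main", "Harris", "Breezy", "Breezy"),
  Crime  = c("N", "Y", "N", "Y", "N")
)

# Step 1: streets that have at least one row with Crime == "Y"
streets_with_crime <- unique(toy$Street[toy$Crime == "Y"])

# Step 2: logical mask marking every row on one of those streets
on_crime_street <- toy$Street %in% streets_with_crime

# Step 3: default everything to "N", then overwrite the masked rows
toy$Flag <- "N"
toy$Flag[on_crime_street] <- "Y"

toy
```

The unique() call is optional, since %in% gives the same mask with or without duplicates in the lookup vector.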
Using transform in base R:
transform(df, Flag = c("N", "Y")[1 + Street %in% Street[Crime == "Y"]])
ID Street street_type Crime Flag
1 1 Main ST N Y
2 2 Main ST Y Y
3 3 Pleasant AVE Y Y
4 4 Pleasant AVE Y Y
5 5 Harris BLVD N N
6 6 Lincoln Road Y Y
7 7 Lincoln Road Y Y
8 8 Lincoln Road Y Y
9 9 Breezy Ave Y Y
10 10 Breezy Ave N Y
11 11 Rose ST N N
12 12 Rose ST N N
13 13 Rose ST N N
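The indexing in the transform call may look cryptic at first. Adding 1 to a logical vector turns FALSE/TRUE into the positions 1/2, which then select "N" or "Y" from the two-element lookup vector. Because %in% binds more tightly than +, the expression is read as 1 + (Street %in% ...). A small base R sketch, where mask is a hypothetical logical vector:

```r
# Hypothetical mask: TRUE where the row's street has at least one "Y"
mask <- c(FALSE, TRUE, TRUE)

positions <- 1 + mask                  # FALSE -> 1, TRUE -> 2
by_index  <- c("N", "Y")[positions]    # pick "N" or "Y" by position

# ifelse() expresses the same mapping more explicitly
by_ifelse <- ifelse(mask, "Y", "N")

by_index
```

Both forms give the same result here. ifelse() is arguably easier to read, while the lookup version is a common compact idiom.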