r - Remove rows from a long dataset by group, based on certain conditions



I have this df:

library(lubridate)
Date <- c("2020-10-01", "2020-10-02", "2020-10-03", "2020-10-04", 
"2020-10-01", "2020-10-02", "2020-10-03", "2020-10-04",
"2020-10-01", "2020-10-02", "2020-10-03", "2020-10-04")
Date <- as_date(Date)
Country <- c("USA", "USA", "USA", "USA", 
"Mexico", "Mexico", "Mexico", "Mexico",
"Japan", "Japan", "Japan","Japan")
Value_A <- c(0,40,0,0,25,29,34,0,20,25,27,0)
df<- data.frame(Date, Country, Value_A)
View(df)
Date      Country Value_A
<date>     <chr>     <dbl>
1 2020-10-01 USA           0
2 2020-10-02 USA          40
3 2020-10-03 USA           0
4 2020-10-04 USA           0
5 2020-10-01 Mexico       25
6 2020-10-02 Mexico       29
7 2020-10-03 Mexico       34
8 2020-10-04 Mexico        0
9 2020-10-01 Japan        20
10 2020-10-02 Japan        25
11 2020-10-03 Japan        27
12 2020-10-04 Japan         0

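I am trying to remove the rows that contain zeros, but only when those zeros fall in the last two rows of each Country group. So the result would be: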

Date      Country Value_A
<date>     <chr>     <dbl>
1 2020-10-01 USA           0
2 2020-10-02 USA          40
5 2020-10-01 Mexico       25
6 2020-10-02 Mexico       29
7 2020-10-03 Mexico       34
9 2020-10-01 Japan        20
10 2020-10-02 Japan        25
11 2020-10-03 Japan        27

I would appreciate it if anyone could help :(

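We can get this result with a few tidyverse operations. We group_by Country and sort by Date in descending order. After that, we number the rows within each group with row_number(), so that rn = 1 is the latest date for each country. Finally, we filter on the condition you described: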

library(tidyverse)
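df %>%
  group_by(Country) %>%
  arrange(desc(Date)) %>%               # latest dates first
  mutate(rn = row_number()) %>%         # position within each country, 1 = latest
  filter(!(Value_A == 0 & rn <= 2))     # drop zeros found in the last two rows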
#   Date       Country Value_A    rn
# 1 2020-10-03 Mexico       34     2
# 2 2020-10-03 Japan        27     2
# 3 2020-10-02 USA          40     3
# 4 2020-10-02 Mexico       29     3
# 5 2020-10-02 Japan        25     3
# 6 2020-10-01 USA           0     4
# 7 2020-10-01 Mexico       25     4
# 8 2020-10-01 Japan        20     4
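
If the extra rn column and the descending order are not wanted, the same idea can probably be written without arrange(). This is only a minimal sketch, assuming the "last two rows" of a group means its two latest dates. It passes desc(Date) directly to dplyr's row_number(), which ranks the values within each group and breaks ties by row order. The df is rebuilt here only so that the snippet runs on its own:

```r
library(dplyr)

# Same data as in the question, rebuilt so this snippet is self-contained.
df <- data.frame(
  Date = as.Date(rep(c("2020-10-01", "2020-10-02", "2020-10-03", "2020-10-04"), times = 3)),
  Country = rep(c("USA", "Mexico", "Japan"), each = 4),
  Value_A = c(0, 40, 0, 0, 25, 29, 34, 0, 20, 25, 27, 0)
)

result <- df %>%
  group_by(Country) %>%
  # row_number(desc(Date)) is 1 for the latest date in each country, 2 for the next.
  # Rows are never reordered, and no helper column is kept.
  filter(!(Value_A == 0 & row_number(desc(Date)) <= 2)) %>%
  ungroup()

print(result)
```

On the sample data this should keep the eight rows listed in the question, in their original order.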

Another approach is to use rank(desc(Date)), which ranks the dates within each group without reordering the rows:

library(tidyverse)
df %>%
  group_by(Country) %>%
  mutate(rank_date = rank(desc(Date))) %>%      # 1 = latest date in each country
  filter(!(rank_date <= 2 & Value_A == 0))      # drop zeros in the two latest rows
#   Date       Country Value_A rank_date
# 1 2020-10-01 USA           0         4
# 2 2020-10-02 USA          40         3
# 3 2020-10-01 Mexico       25         4
# 4 2020-10-02 Mexico       29         3
# 5 2020-10-03 Mexico       34         2
# 6 2020-10-01 Japan        20         4
# 7 2020-10-02 Japan        25         3
# 8 2020-10-03 Japan        27         2
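
For anyone who prefers not to load the tidyverse, the same ranking idea can be sketched in base R with ave(). This is an illustration rather than part of the answers above. The name rank_from_end is made up for this example, and ties.method = "first" is an assumption about how duplicate dates within a country should be handled (the sample data has none):

```r
# Same data as in the question, rebuilt so this snippet is self-contained.
df <- data.frame(
  Date = as.Date(rep(c("2020-10-01", "2020-10-02", "2020-10-03", "2020-10-04"), times = 3)),
  Country = rep(c("USA", "Mexico", "Japan"), each = 4),
  Value_A = c(0, 40, 0, 0, 25, 29, 34, 0, 20, 25, 27, 0)
)

# Rank each row within its country: 1 = latest date, 2 = second latest, ...
# rank_from_end is a hypothetical helper name used only in this sketch.
rank_from_end <- ave(
  as.numeric(df$Date), df$Country,
  FUN = function(d) rank(-d, ties.method = "first")
)

# Keep every row except zeros that sit in the two latest rows of a country.
result_base <- df[!(df$Value_A == 0 & rank_from_end <= 2), ]

print(result_base)
```

The original row names are preserved, so they can be compared directly with the expected table in the question.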
