如果任何受影响列中的任何值大于r中的5,则如何按组筛选并将所有值移动到新列中



我有一个这样的Datafaame:

 dt <- tibble(
 TRIAL = c("A", "A", "A", "B", "B", "B", "C", "C", "C","D","D","D"),
 RL = c(1, NA, 3, 1, 6, 3, 2, 3, 1, 0, 1.5, NA),
 SL = c(6, 1.5, 1, 0, 0, 1, 1, 2, 0, 1, 1.5, NA),
 HC = c(0, 1, 5, 6,7, 8, 9, 3, 4, 5, 4, 2)
 )

# A tibble: 12 x 4
  TRIAL    RL    SL    HC
  <chr> <dbl> <dbl> <dbl>
  1 A       1     6       0
  2 A      NA     1.5     1
  3 A       3     1       5
  4 B       1     0       6
  5 B       6     0       7
  6 B       3     1       8
  7 C       2     1       9
  8 C       3     2       3
  9 C       1     0       4
 10 D       0     1       5
 11 D       1.5   1.5     4
 12 D      NA    NA       2

我想通过TRIAL对数据帧进行分组,并按组检查RL和SL中的值,如果任一列中的值大于5,则将该特定组的RL和SL的所有值分别移动到RLCT和SLCT。

# A tibble: 12 x 6
TRIAL    HC  RLCT  SLCT    SL    RL
<chr> <dbl> <dbl> <dbl> <dbl> <dbl>
 1 A         0     1   6    NA    NA  
 2 A         1    NA   1.5  NA    NA  
 3 A         5     3   1    NA    NA  
 4 B         6     1   0    NA    NA  
 5 B         7     6   0    NA    NA  
 6 B         8     3   1    NA    NA  
 7 C         9    NA  NA     1     3  
 8 C         3    NA  NA     3     5  
 9 C         4    NA  NA     1     1  
10 D         5    NA  NA     1     0  
11 D         4    NA  NA     1.5   1.5
12 D         2    NA  NA    NA    NA 

当我运行下面的代码时,我没有得到预期的输出

dt0 <- dt %>%
   mutate(RLCT = NA,
          SLCT = NA) %>%
   group_by(TRIAL) %>%
   filter(!any(RL > 5.0 | SL > 5.0))
dt1 <- dt %>%
   group_by(TRIAL) %>%
   filter(any(RL > 5.0 | SL > 5.0)) %>%
   mutate(RLCT = RL,
          SLCT = SL) %>%
   rbind(dt0, .) %>%
   mutate(RL = ifelse(!is.na(RLCT), NA, RL),
          SL = ifelse(!is.na(SLCT), NA, SL)) %>% arrange(TRIAL)
 

这是我得到的

# A tibble: 9 x 6
# Groups:   TRIAL [3]
   TRIAL    RL    SL    HC  RLCT  SLCT
   <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
 1 A        NA    NA     0     1   6  
 2 A        NA    NA     1    NA   1.5
 3 A        NA    NA     5     3   1  
 4 B        NA    NA     6     1   0  
 5 B        NA    NA     7     6   0  
 6 B        NA    NA     8     3   1  
 7 C         2     1     9    NA  NA  
 8 C         3     2     3    NA  NA  
 9 C         1     0     4    NA  NA  

您可以定义一个列来存储条件,将RLSL替换为across中的ifelse

dt %>%
  group_by(TRIAL) %>%
  mutate(cond = any(RL > 5.0 | SL > 5.0, na.rm = TRUE),
         across(c(RL, SL), ~ ifelse(cond, ., NA), .names = "{.col}CT"),
         across(c(RL, SL), ~ ifelse(!cond, ., NA)),
         cond = NULL)

结果:

# A tibble: 12 x 6
# Groups:   TRIAL [4]
   TRIAL    RL    SL    HC  RLCT  SLCT
   <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
 1 A      NA    NA       0     1   6  
 2 A      NA    NA       1    NA   1.5
 3 A      NA    NA       5     3   1  
 4 B      NA    NA       6     1   0  
 5 B      NA    NA       7     6   0  
 6 B      NA    NA       8     3   1  
 7 C       2     1       9    NA  NA  
 8 C       3     2       3    NA  NA  
 9 C       1     0       4    NA  NA  
10 D       0     1       5    NA  NA  
11 D       1.5   1.5     4    NA  NA  
12 D      NA    NA       2    NA  NA  

对于dplyr,您可以使用group_modify():

library(dplyr)
dt %>%
  group_by(TRIAL) %>%
  group_modify(~ {
    if(any(select(.x, c(RL, SL)) > 5, na.rm = TRUE)) {
      rename_with(.x, ~ paste0(.x, 'CT'), c(RL, SL))
    } else {
      .x
    }
  })
输出
# A tibble: 12 × 6
# Groups:   TRIAL [4]
   TRIAL  RLCT  SLCT    HC    RL    SL
   <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
 1 A         1   6       0  NA    NA  
 2 A        NA   1.5     1  NA    NA  
 3 A         3   1       5  NA    NA  
 4 B         1   0       6  NA    NA  
 5 B         6   0       7  NA    NA  
 6 B         3   1       8  NA    NA  
 7 C        NA  NA       9   2     1  
 8 C        NA  NA       3   3     2  
 9 C        NA  NA       4   1     0  
10 D        NA  NA       5   0     1  
11 D        NA  NA       4   1.5   1.5
12 D        NA  NA       2  NA    NA

相关内容

  • 没有找到相关文章

最新更新