R: keep one of the duplicated rows



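I have a dataset with rows that are duplicated across the cell, drug, and pathway columns but differ in the activity column. For example, in the first two rows of the data frame below, everything from cell and drug through pathway is identical except the activity: the first row has RESISTANT and the second row has SENSITIVE. I want to keep the second row, the one with SENSITIVE in the activity column.

How can I do this? I want to apply it to every set of rows in the data frame that is duplicated in this way, keeping the second of the duplicated rows each time.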

**cell**  **drug**         **pathway**   **activity**
AU656     5-FLORO          OTHER          RESISTANT
AU656     5-FLORO          OTHER          SENSITIVE
AU656     ALISERTIB        MITOSIS        INTERMEDIATE
AU656     ALISERTIB        MITOSIS        RESISTANT
AU656     AFITINIB         EGFR           SENSITIVE
AU656     AZD6482          PI3K           INTERMEDIATE
AU656     DORAMAPIMOD      JNK            INTERMEDIATE
AU656     DORAMAPIMOD      JNK            SENSITIVE

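We group by cell, drug, and pathway, then `slice` the second row if it exists. The index is `min(2, n())`, the smaller of 2 and the group size `n()`, so a group with two or more rows returns its second row and a group of size 1 returns its first (and only) row.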

library(dplyr)

df1 %>%
  group_by(cell, drug, pathway) %>%
  # row 2 when the group has at least two rows, otherwise row 1
  slice(min(2, n())) %>%
  ungroup()

-output

# A tibble: 5 × 4
cell  drug        pathway activity    
<chr> <chr>       <chr>   <chr>       
1 AU656 5-FLORO     OTHER   SENSITIVE   
2 AU656 AFITINIB    EGFR    SENSITIVE   
3 AU656 ALISERTIB   MITOSIS RESISTANT   
4 AU656 AZD6482     PI3K    INTERMEDIATE
5 AU656 DORAMAPIMOD JNK     SENSITIVE   

data

df1 <- structure(list(cell = c("AU656", "AU656", "AU656", "AU656", "AU656", 
"AU656", "AU656", "AU656"), drug = c("5-FLORO", "5-FLORO", "ALISERTIB", 
"ALISERTIB", "AFITINIB", "AZD6482", "DORAMAPIMOD", "DORAMAPIMOD"
), pathway = c("OTHER", "OTHER", "MITOSIS", "MITOSIS", "EGFR", 
"PI3K", "JNK", "JNK"), activity = c("RESISTANT", "SENSITIVE", 
"INTERMEDIATE", "RESISTANT", "SENSITIVE", "INTERMEDIATE", "INTERMEDIATE", 
"SENSITIVE")), class = "data.frame", row.names = c(NA, -8L))
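
For comparison, here is a minimal base R sketch that needs no packages. It is not part of the answer above and rests on one assumption: no group has more than two rows, so "the last row of each group" and "the second row" are the same thing. The `toy` data frame and the `keys` and `last_rows` names are made up for illustration.

```r
# Toy data with the same columns as df1 (illustrative values only)
toy <- data.frame(
  cell     = c("AU656", "AU656", "AU656"),
  drug     = c("5-FLORO", "5-FLORO", "AFITINIB"),
  pathway  = c("OTHER", "OTHER", "EGFR"),
  activity = c("RESISTANT", "SENSITIVE", "SENSITIVE"),
  stringsAsFactors = FALSE
)

# Columns that define a duplicate
keys <- c("cell", "drug", "pathway")

# duplicated(..., fromLast = TRUE) flags every row that has a later
# duplicate, so negating it keeps only the last row of each key combination
last_rows <- toy[!duplicated(toy[keys], fromLast = TRUE), ]
rownames(last_rows) <- NULL

print(last_rows)
```

Unlike the `group_by()` pipeline, this keeps the rows in their original order instead of sorting them by the grouping columns. If a group could hold three or more rows, it would return the last row rather than the second, so the two approaches are only interchangeable under the assumption stated above.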
