r - How to do a complex multi-column bind of two tables



I have a data frame with a column type, which contains all possible "types":

comment       type
used         enter 
used         open
used         close
used         update
not_used     delete

I get a second data frame from a database, but not every "type" is necessarily present in it. Here is an example of that table:

ID    date            type           value
a1    2020-09-01       enter          18
a1    2020-09-01       close          15
a1    2020-09-02       enter          4
a2    2020-09-01       close          10
b1    2020-09-02       update         10

As you can see, ID a1 has only two types: enter and close. a2 has only close, and b1 has only update.
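
To make the example reproducible, the two tables above can be rebuilt by hand. This is a minimal sketch: the names df1 and df2 are taken from the code later in this thread, and storing date as a character column is an assumption (a database driver may well return a Date instead).

```r
# Reference table: every possible type together with its comment
df1 <- data.frame(
  comment = c("used", "used", "used", "used", "not_used"),
  type    = c("enter", "open", "close", "update", "delete"),
  stringsAsFactors = FALSE
)

# Observed data: only some types appear for each ID and date
# (date kept as character here; this is an assumption)
df2 <- data.frame(
  ID    = c("a1", "a1", "a1", "a2", "b1"),
  date  = c("2020-09-01", "2020-09-01", "2020-09-02", "2020-09-01", "2020-09-02"),
  type  = c("enter", "close", "enter", "close", "update"),
  value = c(18, 15, 4, 10, 10),
  stringsAsFactors = FALSE
)
```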

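I want to bind these two tables so that every "type" missing from my table gets a value of zero for each ID and date. How can I bind the two tables to get this: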

comment            ID    date            type           value
used               a1    2020-09-01       enter          18
used               a1    2020-09-01       open           0
used               a1    2020-09-01       close          15
used               a1    2020-09-01       update         0
not_used           a1    2020-09-01       delete         0
used               a1    2020-09-02       enter          4
used               a1    2020-09-02       open           0
used               a1    2020-09-02       close          0
used               a1    2020-09-02       update         0
not_used           a1    2020-09-02       delete         0
used               a2    2020-09-01       enter          0
used               a2    2020-09-01       open           0
used               a2    2020-09-01       close          10
used               a2    2020-09-01       update         0
not_used           a2    2020-09-01       delete         0
used               b1    2020-09-02       enter          0
used               b1    2020-09-02       open           0
used               b1    2020-09-02       close          0
used               b1    2020-09-02       update         10
not_used           b1    2020-09-02       delete         0

As you can see, I also kept the column "comment". How can I do that?

This does not keep the column "comment":

library(dplyr)
library(tidyr)
df2 %>%
  mutate(type = factor(type, levels = df1$type)) %>%
  group_by(ID, date) %>%
  complete(type, fill = list(value = 0))

Try this:

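library(tidyverse)
# Complete the missing types per ID/date, then bring comment back from df1
new <- df2 %>%
  # factor levels from df1 tell complete() which types must exist
  mutate(type = factor(type, levels = df1$type)) %>%
  group_by(ID, date) %>%
  complete(type, fill = list(value = 0)) %>%
  # naming the key avoids the "Joining, by" message
  left_join(df1, by = "type")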

Output:

# A tibble: 20 x 5
# Groups:   ID, date [4]
   ID    date       type   value comment
   <chr> <chr>      <chr>  <dbl> <chr>
 1 a1    2020-09-01 enter     18 used
 2 a1    2020-09-01 open       0 used
 3 a1    2020-09-01 close     15 used
 4 a1    2020-09-01 update     0 used
 5 a1    2020-09-01 delete     0 not_used
 6 a1    2020-09-02 enter      4 used
 7 a1    2020-09-02 open       0 used
 8 a1    2020-09-02 close      0 used
 9 a1    2020-09-02 update     0 used
10 a1    2020-09-02 delete     0 not_used
11 a2    2020-09-01 enter      0 used
12 a2    2020-09-01 open       0 used
13 a2    2020-09-01 close     10 used
14 a2    2020-09-01 update     0 used
15 a2    2020-09-01 delete     0 not_used
16 b1    2020-09-02 enter      0 used
17 b1    2020-09-02 open       0 used
18 b1    2020-09-02 close      0 used
19 b1    2020-09-02 update    10 used
20 b1    2020-09-02 delete     0 not_used
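
An alternative that avoids the factor step, offered as a sketch rather than a replacement for the answer above: take each ID/date pair that occurs in df2, pair it with every row of df1, and join the observed values back in. It assumes dplyr 1.1.0 or later for cross_join(), and df1 and df2 are rebuilt from the question's tables so the snippet runs on its own.

```r
library(dplyr)

# Tables rebuilt from the question (date as character is an assumption)
df1 <- data.frame(
  comment = c("used", "used", "used", "used", "not_used"),
  type    = c("enter", "open", "close", "update", "delete"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  ID    = c("a1", "a1", "a1", "a2", "b1"),
  date  = c("2020-09-01", "2020-09-01", "2020-09-02", "2020-09-01", "2020-09-02"),
  type  = c("enter", "close", "enter", "close", "update"),
  value = c(18, 15, 4, 10, 10),
  stringsAsFactors = FALSE
)

result <- df2 %>%
  distinct(ID, date) %>%                             # ID/date pairs that exist
  cross_join(df1) %>%                                # each pair x every type
  left_join(df2, by = c("ID", "date", "type")) %>%   # attach observed values
  mutate(value = coalesce(value, 0)) %>%             # unobserved types become 0
  select(comment, ID, date, type, value)
```

This puts comment first, matching the column order asked for in the question. Types that never occur for an ID/date pair come back as NA from the join and are set to 0 by coalesce().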

I think a simple solution is to do a full_join afterwards:

df2 %>%
  mutate(type = factor(type, levels = df1$type)) %>%
  group_by(ID, date) %>%
  complete(type, fill = list(value = 0)) %>%
  full_join(df1, by = "type")
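
Because complete() has already created a row for every level taken from df1$type, full_join and left_join should return the same rows here; the two would only differ if df1 held a type with no matching row on the left. A quick check under that reasoning, with df1 and df2 rebuilt from the question's tables (via_left and via_full are illustrative names):

```r
library(dplyr)
library(tidyr)

# Tables rebuilt from the question (date as character is an assumption)
df1 <- data.frame(
  comment = c("used", "used", "used", "used", "not_used"),
  type    = c("enter", "open", "close", "update", "delete"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  ID    = c("a1", "a1", "a1", "a2", "b1"),
  date  = c("2020-09-01", "2020-09-01", "2020-09-02", "2020-09-01", "2020-09-02"),
  type  = c("enter", "close", "enter", "close", "update"),
  value = c(18, 15, 4, 10, 10),
  stringsAsFactors = FALSE
)

# Shared first step: one row per ID/date/type, missing values filled with 0
completed <- df2 %>%
  mutate(type = factor(type, levels = df1$type)) %>%
  group_by(ID, date) %>%
  complete(type, fill = list(value = 0)) %>%
  ungroup()

via_left <- left_join(completed, df1, by = "type")
via_full <- full_join(completed, df1, by = "type")

# Both joins should give 20 rows with no unmatched rows added by full_join
nrow(via_left)
nrow(via_full)
```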
