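R - How to summarize the number of days since the first date and the number of distinct days per ID, for large data frames

The data frame df1 summarizes detections (Date) of individuals (ID) over time. As a short example: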

library(lubridate)  # provides ymd()
df1 <- data.frame(ID = c(1,2,1,2,1,2,1,2,1,2),
                  Date = ymd(c("2016-08-21","2016-08-24","2016-08-23","2016-08-29","2016-08-27","2016-09-02","2016-09-01","2016-09-09","2016-09-01","2016-09-10")))
df1
   ID       Date
1   1 2016-08-21
2   2 2016-08-24
3   1 2016-08-23
4   2 2016-08-29
5   1 2016-08-27
6   2 2016-09-02
7   1 2016-09-01
8   2 2016-09-09
9   1 2016-09-01
10  2 2016-09-10

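I want to summarize the number of days since the first detection of the individual (Ndays) and the number of distinct days on which the individual has been detected since the first time it was detected (Ndifdays).

In addition, I would like to include in this summary table a variable called Prop that simply divides Ndifdays by Ndays.

The summary table I expect is: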

> Result
  ID Ndays Ndifdays  Prop
1  1    11        4 0.364 # Between 21st Aug and 1st Sept there are 11 days.
2  2    17        5 0.294 # Between 24th Aug and 10th Sept there are 17 days.

Does anyone know how to do this?

You can do this with the summarising functions in dplyr:

library(dplyr)
df1 %>%
   group_by(ID) %>%
   summarise(Ndays =  as.integer(max(Date) - min(Date)), 
             Ndifdays = n_distinct(Date), 
             Prop = Ndifdays/Ndays)
#     ID Ndays Ndifdays  Prop
#   <dbl> <int>    <int> <dbl>
#1     1    11        4 0.364
#2     2    17        5 0.294
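
One edge case worth considering: an ID detected on a single date has Ndays equal to 0, so the division returns Inf. A minimal sketch of one way to guard against that, assuming NA is an acceptable value for Prop in that case; the data frame df2 and its ID 3 are hypothetical and only illustrate the situation:

```r
library(dplyr)

# Hypothetical data: ID 3 is detected on one date only
df2 <- data.frame(
  ID   = c(1, 1, 1, 3),
  Date = as.Date(c("2016-08-21", "2016-08-23", "2016-09-01", "2016-08-25"))
)

res <- df2 %>%
  group_by(ID) %>%
  summarise(
    Ndays    = as.integer(max(Date) - min(Date)),
    Ndifdays = n_distinct(Date),
    # Return NA instead of Inf when first and last detection coincide
    Prop     = if_else(Ndays > 0, Ndifdays / Ndays, NA_real_)
  )
res
```

Whether NA, 1, or something else is the right value for a single-date individual depends on how Prop is used afterwards.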

A data.table version would be

library(data.table)
df12 <- setDT(df1)[, .(Ndays = as.integer(max(Date) - min(Date)), 
                       Ndifdays = uniqueN(Date)), by = ID]
df12$Prop <- df12$Ndifdays/df12$Ndays
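
For a large data frame, Prop can also be added by reference in the same chain with `:=`, which avoids the separate assignment step. This is a sketch rather than a benchmarked recommendation; it uses as.data.table() instead of setDT() so that the original df1 stays a plain data frame, and the name res is only illustrative:

```r
library(data.table)

df1 <- data.frame(
  ID   = c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2),
  Date = as.Date(c("2016-08-21", "2016-08-24", "2016-08-23", "2016-08-29",
                   "2016-08-27", "2016-09-02", "2016-09-01", "2016-09-09",
                   "2016-09-01", "2016-09-10"))
)

# Summarise per ID, then add Prop by reference in a chained step
res <- as.data.table(df1)[
  , .(Ndays    = as.integer(max(Date) - min(Date)),
      Ndifdays = uniqueN(Date)),
  by = ID
][, Prop := Ndifdays / Ndays]
res[]
```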

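And in base R with aggregate

df12 <- aggregate(Date ~ ID, df1, function(x) c(Ndays = as.integer(max(x) - min(x)), Ndifdays = length(unique(x))))
# aggregate() returns both values in a single matrix column named Date; unpack it
df12 <- cbind(df12["ID"], df12$Date)
df12$Prop <- df12$Ndifdays / df12$Ndays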
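
If the matrix column that aggregate() returns is inconvenient, another base R route is to split the data by ID and build one summary row per group. A minimal sketch, where summarise_id is a hypothetical helper name and not part of any package:

```r
df1 <- data.frame(
  ID   = c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2),
  Date = as.Date(c("2016-08-21", "2016-08-24", "2016-08-23", "2016-08-29",
                   "2016-08-27", "2016-09-02", "2016-09-01", "2016-09-09",
                   "2016-09-01", "2016-09-10"))
)

# Hypothetical helper: one summary row for the detections of a single ID
summarise_id <- function(d) {
  ndays <- as.integer(max(d$Date) - min(d$Date))  # days between first and last detection
  ndif  <- length(unique(d$Date))                 # number of distinct detection dates
  data.frame(ID = d$ID[1], Ndays = ndays, Ndifdays = ndif, Prop = ndif / ndays)
}

# Split by ID, summarise each piece, and stack the rows
res <- do.call(rbind, lapply(split(df1, df1$ID), summarise_id))
rownames(res) <- NULL
res
```

This is likely slower than the data.table version on very large inputs, but it needs no packages.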

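After grouping by 'ID', take the diff of the range of 'Date' to create 'Ndays', get the number of unique 'Date' values with n_distinct to create 'Ndifdays', and divide the number of distinct dates by 'Ndays' to get 'Prop'.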

library(dplyr)    
df1 %>%
   group_by(ID) %>%
   summarise(Ndays =  as.integer(diff(range(Date))), 
         Ndifdays = n_distinct(Date), 
         Prop = Ndifdays/Ndays)
# A tibble: 2 x 4
#     ID Ndays Ndifdays  Prop
#  <dbl> <int>    <int> <dbl>
#1     1    11        4 0.364
#2     2    17        5 0.294
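
As a quick check on the arithmetic, the dates for ID 1 can be worked through by hand in base R. The sketch below only confirms that diff(range(x)) and max(x) - min(x) give the same span for these dates, and that the duplicated 2016-09-01 is counted once; the variable names are illustrative:

```r
# Detection dates for ID 1, copied from df1
d <- as.Date(c("2016-08-21", "2016-08-23", "2016-08-27",
               "2016-09-01", "2016-09-01"))

via_range  <- as.integer(diff(range(d)))    # span computed from range()
via_maxmin <- as.integer(max(d) - min(d))   # span computed from max() and min()
ndif       <- length(unique(d))             # distinct detection dates

c(via_range = via_range, via_maxmin = via_maxmin, Ndifdays = ndif)
round(ndif / via_range, 3)
```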
