I have a data frame like this:
df <- data.frame(
  "Date"    = c("2010-05-27", "2010-05-27", "2010-07-08", "2010-07-09", "2010-07-09"),
  "Access1" = c(5, 5, NA, 7, 7),
  "Access2" = c(6, 6, NA, 7, 7),
  "Hour1"   = c(0.55, 0.55, NA, 0.01, 0.01),
  "Hour2"   = c(0.55, 0.55, NA, 0.01, 0.01)
)
        Date Access1 Access2 Hour1 Hour2
1 2010-05-27       5       6  0.55  0.55
2 2010-05-27       5       6  0.55  0.55
3 2010-07-08      NA      NA    NA    NA
4 2010-07-09       7       7  0.01  0.01
5 2010-07-09       7       7  0.01  0.01
I want to reshape it into a long (vertical) table and sum Hour by Access and Date:
        Date Access Hour
1 2010-05-27      5 1.10
2 2010-05-27      6 1.10
3 2010-07-08     NA   NA
4 2010-07-09      7 0.04
I tried gather and melt, but neither worked. Can you suggest a solution?
library(reshape2)
library(dplyr)

dat <- melt(df, id = c("Date", "Hour1", "Hour2")) %>%  # melt "Accesses" only
  mutate(sumHour = Hour1 + Hour2) %>%                  # sum "Hours"
  distinct() %>%
  group_by(Date, value) %>%
  summarise(Hour = sum(sumHour)) %>%
  rename(Access = value) %>%                           # rename for wanted output
  arrange(Date)                                        # order by date
This should give you the desired output.
I changed my solution; it should work now. I simply took the two columns Access2 and Hour2 and appended them as additional rows before summing.
library(dplyr)

df %>%
  select(Date, Access = Access1, Hour = Hour1) %>%
  rbind(df %>% select(Date, Access = Access2, Hour = Hour2)) %>%
  group_by(Date, Access) %>%
  summarise(Hour = sum(Hour))
You can try a tidyverse approach:
library(tidyverse)

df %>%
  rownames_to_column() %>%
  gather(k, v, -Date, -rowname) %>%
  separate(k, into = c("type", "num"), sep = "(?<=[A-Za-z])(?=[0-9])") %>%
  spread(type, v) %>%
  group_by(Date, Access) %>%
  summarise(Hour = sum(Hour))
# A tibble: 4 x 3
# Groups:   Date [3]
  Date       Access  Hour
  <fct>       <dbl> <dbl>
1 2010-05-27      5  1.1
2 2010-05-27      6  1.1
3 2010-07-08     NA NA
4 2010-07-09      7  0.04
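
For what it's worth, gather and spread have been superseded in newer tidyr releases, so the same reshape can probably be written in one step with pivot_longer. This is only a sketch, assuming tidyr 1.0.0 or later and dplyr 1.0.0 or later; the column name "num" and the variable name "result" are my own choices, not something from the question.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
df <- data.frame(
  Date    = c("2010-05-27", "2010-05-27", "2010-07-08", "2010-07-09", "2010-07-09"),
  Access1 = c(5, 5, NA, 7, 7),
  Access2 = c(6, 6, NA, 7, 7),
  Hour1   = c(0.55, 0.55, NA, 0.01, 0.01),
  Hour2   = c(0.55, 0.55, NA, 0.01, 0.01)
)

result <- df %>%
  # Split each column name into a stem (Access/Hour) and a numeric suffix.
  # ".value" keeps the stem as an output column, so Access1/Hour1 and
  # Access2/Hour2 are stacked as pairs. "num" is a hypothetical name for
  # the suffix column.
  pivot_longer(
    cols = -Date,
    names_to = c(".value", "num"),
    names_pattern = "([A-Za-z]+)([0-9]+)"
  ) %>%
  group_by(Date, Access) %>%
  # sum() propagates NA, so the all-NA row for 2010-07-08 stays NA
  summarise(Hour = sum(Hour), .groups = "drop")

print(result)
```

Because every Access/Hour pair is stacked before summing, no rowname helper column is needed here, and the NA group is kept rather than dropped.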