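I have data in the following format:
date : "01-01-2019", "02-01-2019", "03-01-2019"......
number_of_tickets : 1500, 1200, 2000......
This is data for the past two years, and I need to calculate the total number of tickets opened per year and per month, something like the table below: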
       Jan    Feb    Mar....
2019 20570  18702  35078
2020 19794  11325  42723......
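I have been trying the lubridate and dplyr packages, using summarise, mutate, and a number of other functions, but I have not made any progress.
Any help would be greatly appreciated!
Thanks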
I think this is what you are looking for:
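# Toy data in the same shape as the question
df <- data.frame(
  number_of_tickets = c(1500, 1200, 2000, 1000, 2000, 3000),
  date = c("01-01-2019", "02-01-2019", "03-01-2019",
           "01-01-2020", "02-01-2020", "03-01-2020"))

# Parse the strings as day-month-year dates
df$date <- as.Date(df$date, format = "%d-%m-%Y")
head(df)

# Extract the month and the two-digit year ("%Y" would give four digits)
df$month <- format(df$date, "%m")
df$year <- format(df$date, "%y")
head(df)

# Sum the tickets for each month/year combination
aggregate(number_of_tickets ~ month + year,
          data = df,
          sum)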
The output of the last call is:
  month year number_of_tickets
1    01   19              4700
2    01   20              6000
HTH
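The question asks for years as rows and months as columns, whereas aggregate() returns one row per month/year pair. Below is a minimal base R sketch of one way to get the wide layout with xtabs(). It assumes the dates are day-month-year, and it moves two of the toy rows into February (an assumption made here for illustration only) so that more than one month column is filled. The built-in month.abb vector supplies English month labels regardless of the session locale.

```r
# Toy data; two rows are placed in February purely for illustration.
df <- data.frame(
  number_of_tickets = c(1500, 1200, 2000, 1000, 2000, 3000),
  date = c("01-01-2019", "02-01-2019", "03-02-2019",
           "01-01-2020", "02-01-2020", "03-02-2020"))
df$date <- as.Date(df$date, format = "%d-%m-%Y")

# Map the month number to a label from month.abb and keep calendar order
# by making it a factor with all twelve levels.
df$month <- factor(month.abb[as.integer(format(df$date, "%m"))],
                   levels = month.abb)
df$year <- format(df$date, "%Y")  # four-digit year

# Rows are years, columns are months, cells are summed ticket counts.
wide <- xtabs(number_of_tickets ~ year + month, data = df)
wide
```

Months with no data show up as 0 columns here. Passing drop.unused.levels = TRUE to xtabs() should drop them if that is preferred.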
With tidyverse and lubridate, you can also proceed as follows:
library(lubridate)
library(tidyverse)

df <- data.frame(
  number_of_tickets = c(1500, 1200, 2000, 1000, 2000, 3000),
  date = c("01-01-2019", "02-01-2019", "03-02-2019",
           "01-01-2020", "02-01-2020", "03-02-2020"))

# Parse the date once, derive month and year, then spread months into columns
df %>%
  mutate(date = as.Date(date, format = "%d-%m-%Y"),
         month = month(date),
         year = year(date)) %>%
  pivot_wider(id_cols = year, names_from = month,
              values_from = number_of_tickets, values_fn = sum)
# A tibble: 2 x 3
   year   `1`   `2`
  <dbl> <dbl> <dbl>
1  2019  2700  2000
2  2020  3000  3000
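Since the question mentions summarise and mutate, here is a hedged sketch of the same result built with group_by() and summarise() before reshaping. It assumes dplyr, tidyr and lubridate are installed and uses the same toy data as this answer. The column name tickets is made up for the example. month(label = TRUE) returns abbreviated month names in the session locale, so the column headers may differ on a non-English system.

```r
library(dplyr)
library(tidyr)
library(lubridate)

df <- data.frame(
  number_of_tickets = c(1500, 1200, 2000, 1000, 2000, 3000),
  date = c("01-01-2019", "02-01-2019", "03-02-2019",
           "01-01-2020", "02-01-2020", "03-02-2020"))

# Long format: one row per year/month with the summed ticket count.
monthly <- df %>%
  mutate(date = dmy(date),                  # parse day-month-year strings
         year = year(date),
         month = month(date, label = TRUE)) %>%
  group_by(year, month) %>%
  summarise(tickets = sum(number_of_tickets), .groups = "drop")

# Wide format: one row per year, one column per month.
wide <- pivot_wider(monthly, names_from = month, values_from = tickets)
wide
```

Keeping the long monthly table around can be handy, because it is the shape most plotting and modelling functions expect, while the wide table is mainly for display.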
With the pivottabler library:
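library(pivottabler)
library(lubridate)

# df is the data frame from the previous answer, with date still a string
df$date <- as.Date(df$date, format = "%d-%m-%Y")
df$Month <- month(df$date)
df$Year <- year(df$date)

# Quick pivot table: months as rows, years as columns, summed tickets in cells
qpvt(df, rows = "Month",
     columns = "Year",
     calculations = "sum(number_of_tickets)")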
       2019  2020  Total
1      2700  3000   5700
2      2000  3000   5000
Total  4700  6000  10700