我有长格式的月度同比数据,我试图用两列spread
。我见过的唯一例子包括一个key
。
> dput(df)
structure(list(ID = c("a", "a", "a", "a", "a", "a", "a", "a",
"a", "b", "b", "b", "b", "b", "b", "b", "b", "b"), Year = c(2015L,
2015L, 2015L, 2016L, 2016L, 2016L, 2017L, 2017L, 2017L, 2015L,
2015L, 2015L, 2016L, 2016L, 2016L, 2017L, 2017L, 2017L), Month = c(1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L), Value = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 6L, 7L, 8L,
9L, 10L, 11L, 12L, 13L, 14L)), .Names = c("ID", "Year", "Month",
"Value"), class = "data.frame", row.names = c(NA, -18L))
我正在尝试将其转换为数据格式,其中年份为第 2:5 列,每month
每ID
一行
ID Month 2015 2016 2017
a 1 1 2 3
a 2 1 2 3
a 3 1 2 3
a 1 6 9 12
a 2 7 10 13
a 3 8 11 14
我尝试了以下错误:
by_month_over_years = spread(df,key = c(Year,Month), Value)
Error: `var` must evaluate to a single number or a column name, not an integer vector
library(tidyr)
library(dplyr)
df %>% group_by(ID) %>% spread(Year, Value)
# A tibble: 6 x 5
# Groups: ID [2]
ID Month `2015` `2016` `2017`
<chr> <int> <int> <int> <int>
1 a 1 1 2 3
2 a 2 1 2 3
3 a 3 1 2 3
4 b 1 6 9 12
5 b 2 7 10 13
6 b 3 8 11 14
library(reshape2) # or data.table, for dcast
dcast(df, ID + Month ~ Year)
# ID Month 2015 2016 2017
# 1 a 1 1 2 3
# 2 a 2 1 2 3
# 3 a 3 1 2 3
# 4 b 1 6 9 12
# 5 b 2 7 10 13
# 6 b 3 8 11 14
这是一个带有reshape
的base R
选项
reshape(df, idvar = c('ID', 'Month'), direction = 'wide', timevar = 'Year')
# ID Month Value.2015 Value.2016 Value.2017
#1 a 1 1 2 3
#2 a 2 1 2 3
#3 a 3 1 2 3
#10 b 1 6 9 12
#11 b 2 7 10 13
#12 b 3 8 11 14