I am looking for a way to reshape a dataset like this:
mydata<-data.frame(var=rep(c("A","B","C"),each=3),code=rep(c("x","y","z"),3),yearA=1:9,yearB=10:18,yearC=20:28)
That is, from this:
var code yearA yearB yearC
A x 1 10 20
A y 2 11 21
A z 3 12 22
B x 4 13 23
B y 5 14 24
B z 6 15 25
C x 7 16 26
C y 8 17 27
C z 9 18 28
to this:
code year var.A var.B var.C
x yearA 1 4 7
x yearB 10 13 16
x yearC 20 23 26
y yearA 2 5 8
y yearB 11 14 17
y yearC 21 24 27
z yearA 3 6 9
z yearB 12 15 18
z yearC 22 25 28
I tried melt, reshape, and so on, but the result is not what I want. Any ideas?
Thanks
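library(reshape2)
# Melt to long format: var and code identify each row, the year columns are measured
mydata.melt <- melt(mydata, id.vars = c("var", "code"))
# Cast back to wide format with one column per level of var
mydata.dcast <- dcast(mydata.melt, code + variable ~ var)
mydata.dcast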
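The dcast call above names its columns A, B, C and variable, whereas the question asks for var.A, var.B, var.C and year. A minimal sketch of one way to get those names with reshape2, assuming the same mydata as in the question (the names molten and wide are just illustrative):

```r
library(reshape2)

mydata <- data.frame(var = rep(c("A", "B", "C"), each = 3),
                     code = rep(c("x", "y", "z"), 3),
                     yearA = 1:9, yearB = 10:18, yearC = 20:28)

# Melt, naming the stacked column "year" instead of the default "variable"
molten <- melt(mydata, id.vars = c("var", "code"), variable.name = "year")

# Prefix the levels of var so the cast columns come out as var.A, var.B, var.C
molten$var <- paste0("var.", molten$var)

# One row per code/year pair, one column per level of var
wide <- dcast(molten, code + year ~ var, value.var = "value")
wide
```

Renaming before the cast keeps everything in one pipeline; renaming the columns of the dcast result afterwards would work just as well.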
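Answers using reshape2 have already been given.
Here is another solution using Hadley's newer tidyr package (a replacement for, and partial successor to, reshape2):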
library("tidyr")
library("dplyr")
mydata <- data.frame(var = rep(c("A", "B", "C"), each = 3),
                     code = rep(c("x", "y", "z"), 3),
                     yearA = 1:9, yearB = 10:18, yearC = 20:28)
mydata %>%
  gather(year, value, yearA:yearC) %>%
  mutate(var = paste0("var", ".", var)) %>%
  spread(var, value)
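In current tidyr releases gather() and spread() are superseded by pivot_longer() and pivot_wider(). The same reshape can be sketched with those functions, assuming tidyr 1.0.0 or later is installed (the name result is just illustrative):

```r
library(tidyr)
library(dplyr)

mydata <- data.frame(var = rep(c("A", "B", "C"), each = 3),
                     code = rep(c("x", "y", "z"), 3),
                     yearA = 1:9, yearB = 10:18, yearC = 20:28)

result <- mydata %>%
  # Stack yearA..yearC into a year/value pair of columns
  pivot_longer(yearA:yearC, names_to = "year", values_to = "value") %>%
  # Spread var into columns, prefixing each new column name with "var."
  pivot_wider(names_from = var, values_from = value, names_prefix = "var.") %>%
  # Sort rows by code and year, as in the desired output
  arrange(code, year)
result
```

Here names_prefix takes the place of the mutate() step, since the prefix is added while the columns are created.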
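This answer is nearly the same as Jot eN's, but differs slightly in showing what recast can do. Keep in mind that you normally would not call melt() followed by recast(), because the latter includes the former.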
> mfoo <- melt(mydata)
> mfoo
var code variable value
1 A x yearA 1
2 A y yearA 2
3 A z yearA 3
4 B x yearA 4
5 B y yearA 5
6 B z yearA 6
7 C x yearA 7
8 C y yearA 8
9 C z yearA 9
10 A x yearB 10
11 A y yearB 11
12 A z yearB 12
13 B x yearB 13
14 B y yearB 14
15 B z yearB 15
16 C x yearB 16
17 C y yearB 17
18 C z yearB 18
19 A x yearC 20
20 A y yearC 21
21 A z yearC 22
22 B x yearC 23
23 B y yearC 24
24 B z yearC 25
25 C x yearC 26
26 C y yearC 27
27 C z yearC 28
> recast(mfoo, code + variable ~ var)
Using var, code, variable as id variables
$data
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 10 13 16
[3,] 20 23 26
[4,] 2 5 8
[5,] 11 14 17
[6,] 21 24 27
[7,] 3 6 9
[8,] 12 15 18
[9,] 22 25 28
$labels
$labels[[1]]
code variable
1 x yearA
2 x yearB
3 x yearC
4 y yearA
5 y yearB
6 y yearC
7 z yearA
8 z yearB
9 z yearC
$labels[[2]]
var
1 A
2 B
3 C
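So all you have to do is cbind the first two list elements. Don't be discouraged: melt and recast take some time to get used to. I always have to re-teach myself how to organize the formula to get the desired output.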
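Since recast() already includes the melt step, it can also be called directly on the original data frame. A minimal sketch, assuming a current reshape2, where I would expect a plain data frame back rather than the list printed above (the name wide is just illustrative):

```r
library(reshape2)

mydata <- data.frame(var = rep(c("A", "B", "C"), each = 3),
                     code = rep(c("x", "y", "z"), 3),
                     yearA = 1:9, yearB = 10:18, yearC = 20:28)

# recast() melts using the given id columns, then casts with the formula.
# Note that the argument is spelled id.var here, not id.vars as in melt().
wide <- recast(mydata, code + variable ~ var, id.var = c("var", "code"))
wide
```

The left-hand side of the formula lists the columns that stay as row identifiers, and the right-hand side lists the column whose values become the new column names.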
Here is the clunky base R reshape:
d <- read.table(text='var code yearA yearB yearC
A x 1 10 20
A y 2 11 21
A z 3 12 22
B x 4 13 23
B y 5 14 24
B z 6 15 25
C x 7 16 26
C y 8 17 27
C z 9 18 28', header=TRUE, stringsAsFactors=FALSE)
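# Step 1: stack the three year columns into a single value column v
long <- reshape(d, direction='long', varying=list(3:5), idvar=c('code', 'var'),
                timevar='year', v.names='v', times=c('A', 'B', 'C'))
# Step 2: spread var back out into the columns v.A, v.B, v.C
reshape(long, direction='wide', idvar=c('code', 'year'), timevar='var')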
# code year v.A v.B v.C
# x.A.A x A 1 4 7
# y.A.A y A 2 5 8
# z.A.A z A 3 6 9
# x.A.B x B 10 13 16
# y.A.B y B 11 14 17
# z.A.B z B 12 15 18
# x.A.C x C 20 23 26
# y.A.C y C 21 24 27
# z.A.C z C 22 25 28