Converting a data structure in R



The data structure I am working with looks like this:

head(data)
  ID Gender      Location   Generation                           Question Response
1  2   Male South America Generation X Q0. Vote in the upcoming election?    0: No
2  2   Male South America Generation X                     Q1. Pulse Rate    0: No
3  2   Male South America Generation X                     Q2. Metabolism    0: No
4  2   Male South America Generation X                  Q3.Blood Pressure   1: Yes
5  2   Male South America Generation X                    Q4. Temperature    0: No
6  2   Male South America Generation X         Q5. Galvanic Skin Response   1: Yes

The column names of this data frame are:

> colnames(data)
[1] "ID"         "Gender"     "Location"   "Generation" "Question"   "Response" 

The Question column holds the question that was asked, and the Response column holds the answer given. What I would like to end up with is:

> colnames(final_data)
 [1] "ID"                                 "Gender"                            
 [3] "Location"                           "Generation"                        
 [5] "Q0. Vote in the upcoming election?" "Q1. Pulse Rate"                    
 [7] "Q134. Good Job Skills"              "Q135. Sense of Humor"              
 [9] "Q136. Intelligence"                 "Q137.Can Play Jazz"                
[11] "Q138.Likes the Beatles"             "Q139. Snobbiness"                  
[13] "Q140.Ability to lift heavy objects" "Q141.Grace under pressure"         
[15] "Q142.Grace on the dance floor"      "Q143.Likes animals"                
[17] "Q144.Makes good coffee"             "Q145.Eats all his/her vegetables"  
[19] "Q2. Metabolism"                     "Q3.Blood Pressure"                 
[21] "Q4. Temperature"                    "Q5. Galvanic Skin Response"        
[23] "Q6. Breathing"                      "Q7. Perspiration"                  
[25] "Q8.Pupil Dilation"                  "Q9. Adrenaline Production" 

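At the moment, each row records a single attribute for an ID. In other words, every row holds only one question and response for one ID, so the answers for a single ID are spread across many rows.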

I found another question about this here, but I could not follow it. Can anyone help?

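I'm not sure exactly how you want the data to look in the end. My guess is that you want ID, Gender, Location and Generation as the first 4 columns, with each question turned into a column name and its responses as the values under that column. For that you can use the reshape2 package in R. melt and dcast are its usual pair of functions, but your data is already in long format (one row per ID and question), so dcast alone is enough. Melting first would replace the Question and Response columns with generic variable and value columns, and the dcast call would then fail because it could no longer find Question.

library(reshape2)
# ID, Gender, Location and Generation stay as the key columns (left of ~);
# each distinct value of Question becomes a column, filled from Response
final_data <- dcast(data, ID + Gender + Location + Generation ~ Question, value.var = "Response")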

I think this should solve the problem.
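To make the long-to-wide step concrete, here is a minimal sketch on a made-up data frame. The name `toy` and all of its values are hypothetical and only mimic the layout of `data`; the one assumption that matters is that each ID/Question pair appears at most once.

```r
library(reshape2)

# Hypothetical toy data in the same long layout as `data`:
# one row per (ID, Question) pair.
toy <- data.frame(
  ID         = c(2, 2, 3, 3),
  Gender     = c("Male", "Male", "Female", "Female"),
  Location   = c("South America", "South America", "Europe", "Europe"),
  Generation = c("Generation X", "Generation X", "Generation Y", "Generation Y"),
  Question   = c("Q1. Pulse Rate", "Q2. Metabolism",
                 "Q1. Pulse Rate", "Q2. Metabolism"),
  Response   = c("0: No", "1: Yes", "1: Yes", "0: No"),
  stringsAsFactors = FALSE
)

# If an ID/Question pair occurs more than once, dcast() falls back to
# counting rows (it prints a message about defaulting to length),
# so check for duplicates before reshaping.
stopifnot(anyDuplicated(toy[, c("ID", "Question")]) == 0)

# Key columns go left of ~; each distinct Question becomes a column
# whose cells are taken from Response.
toy_wide <- dcast(toy, ID + Gender + Location + Generation ~ Question,
                  value.var = "Response")
print(toy_wide)
```

The result has one row per ID. Because Question is a character column here, the new columns come out in sorted text order, which is probably why Q134 appears before Q2 in the desired column list above.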
