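I am using R to pull census data with "tidycensus", but it returns the different variables for the same geography as separate rows, rather than one row per geography with a column for each variable.
I have tried various transpose, gather, and spread functions, but I cannot get the spread values to collapse into a single row. My code is as follows: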
Median_Inc <- get_acs(
  geography = "county subdivision",
  table = "B06011",
  state = "MA",
  county = c("Middlesex", "Essex", "Suffolk", "Plymouth", "Norfolk", "Worcester")
)
It produces a table:
2500901260 Amesbury Town city, Essex County, Massachusetts B06011_001 37891
2500901260 Amesbury Town city, Essex County, Massachusetts B06011_002 37402
2500901260 Amesbury Town city, Essex County, Massachusetts B06011_003 47925
2500901260 Amesbury Town city, Essex County, Massachusetts B06011_004 NA
2500901260 Amesbury Town city, Essex County, Massachusetts B06011_005 27303
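These are the results I expected, but what I want is a table with a single row per geography, where the columns are the variable names, for example: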
GEOID NAME B06011_001 B06011_002 B06011_003 B06011_004 B06011_005
2500901260 Amesbury Town city, Essex County, Massachusetts 37891 37402 47925 NA 27303
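I did not change the get_acs call, but with a small amount of reshaping you can get what you want.
The original data, stored in a data frame named tab: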
Num City County State Code value
1 2500901260 Amesbury Town city Essex County Massachusetts B06011_001 37891
2 2500901260 Amesbury Town city Essex County Massachusetts B06011_002 37402
3 2500901260 Amesbury Town city Essex County Massachusetts B06011_003 47925
4 2500901260 Amesbury Town city Essex County Massachusetts B06011_004 NA
5 2500901260 Amesbury Town city Essex County Massachusetts B06011_005 27303
Give it column names:
colnames(tab) <- c("Num", "City", "County", "State", "Code", "value")
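For anyone who wants to follow along without calling the Census API, tab can be rebuilt by hand from the five rows shown above. This is only a sketch: storing Num as a character string and building the frame with data.frame() are my assumptions, not something taken from the original data.

```r
# Rebuild the five example rows by hand (values copied from the table above).
tab <- data.frame(
  Num    = rep("2500901260", 5),          # GEOID, kept as character (assumption)
  City   = rep("Amesbury Town city", 5),
  County = rep("Essex County", 5),
  State  = rep("Massachusetts", 5),
  Code   = sprintf("B06011_%03d", 1:5),   # B06011_001 ... B06011_005
  value  = c(37891, 37402, 47925, NA, 27303),
  stringsAsFactors = FALSE
)

# Inspect the structure: 5 rows, 6 columns, one row per variable code.
str(tab)
```

Built this way, the columns already carry the names used in the rest of the answer.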
After reshaping:
library(reshape2)
# Cast long to wide: one row per Num/City/County/State, one column per Code.
data_wide <- dcast(tab, Num + City + County + State ~ Code, value.var = "value")
Num City County State B06011_001 B06011_002 B06011_003 B06011_004 B06011_005
1 2500901260 Amesbury Town city Essex County Massachusetts 37891 37402 47925 NA 27303
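The same reshape can also be sketched with tidyr's pivot_wider, working on the columns that get_acs normally returns in its default long format (GEOID, NAME, variable, estimate, moe). One likely reason spread would not collapse the rows is the moe column: it differs from row to row, so it gets treated as part of the row identity. The frame acs_long below is a hand-built stand-in, and its moe values are made-up placeholders rather than real margins of error.

```r
library(tidyr)

# Stand-in for the get_acs() result. Estimates are the ones from the question;
# the moe values are made-up placeholders, not real ACS margins of error.
acs_long <- data.frame(
  GEOID    = rep("2500901260", 5),
  NAME     = rep("Amesbury Town city, Essex County, Massachusetts", 5),
  variable = sprintf("B06011_%03d", 1:5),
  estimate = c(37891, 37402, 47925, NA, 27303),
  moe      = c(1, 2, 3, NA, 5),
  stringsAsFactors = FALSE
)

# id_cols names the columns that identify a row. moe is left out, so it is
# dropped and can no longer keep the five rows apart.
acs_wide <- pivot_wider(
  acs_long,
  id_cols     = c(GEOID, NAME),
  names_from  = variable,
  values_from = estimate
)
acs_wide
```

If the margins of error are not needed at all, dropping moe before reshaping has the same effect. get_acs also accepts output = "wide", which returns one row per geography directly, with the estimate and margin columns suffixed E and M (for example B06011_001E and B06011_001M).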