The STRATUM values in the OECD data are very long. For simplicity, I want to recode them into shorter, more precise names, as shown in the code below.
pisaMas[, SchoolType := ifelse(STRATUM == "National Secondary School", "Public",
                        ifelse(STRATUM == "Religious School", "Religious",
                        ifelse(STRATUM == "MOE Technical School", "Technical", "0")))]
pisaMas[, table(SchoolType)]
I would like to know whether there is a simpler way to do this using the data.table package.
The current development version of data.table has a new function, fcase (modeled on SQL's CASE WHEN), for exactly this situation:
pisaMas[ , SchoolType := fcase(
STRATUM == "National Secondary School", "Public",
STRATUM == "Religious School", "Religious",
STRATUM == "MOE Technical School", "Technical",
default = ''
)]
pisaMas[ , table(SchoolType)]
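To try fcase without the PISA file, here is a minimal sketch on made-up data. The table `toy`, its values, and the "Other" default are assumptions for illustration only, and it needs a data.table version that ships fcase.

```r
library(data.table)

# Hypothetical toy data standing in for pisaMas
toy <- data.table(STRATUM = c("National Secondary School",
                              "Religious School",
                              "MOE Technical School",
                              "Some Other School"))

# Conditions are checked in order; rows matching none of them get the default
toy[, SchoolType := fcase(
  STRATUM == "National Secondary School", "Public",
  STRATUM == "Religious School",          "Religious",
  STRATUM == "MOE Technical School",      "Technical",
  default = "Other"
)]

toy[, table(SchoolType)]
```

Note that the default has to be the same type as the other return values, here character.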
To get the development version, try
install.packages(
'data.table', type = 'source',repos = 'http://Rdatatable.github.io/data.table'
)
If the simple install does not work, see the Installation wiki for more details:
https://github.com/Rdatatable/data.table/wiki/Installation
You can also solve this with a lookup table; see this Q&A for details:
https://stackoverflow.com/a/36391018/3576984
This is what I came up with after some thought.
#' First I create a function (rname.SchType) that maps each old name to a new name using else if.
#' A literal backslash inside an R string has to be written as \\.
rname.SchType <- function(x) {
  if (is.na(x)) NA_character_
  else if (x == "MYS - stratum 01: MOE National Secondary School\\Other States") "Public"
  else if (x == "MYS - stratum 02: MOE Religious School\\Other States") "Religious"
  else if (x == "MYS - stratum 03: MOE Technical School\\Other States") "Technical"
  else if (x == "MYS - stratum 04: MOE Fully Residential School") "SBP"
  else if (x == "MYS - stratum 05: non-MOE MARA Junior Science College\\Other States") "MARA"
  else if (x == "MYS - stratum 06: non-MOE Other Schools\\Other States") "Private"
  else if (x == "MYS - stratum 07: Perlis non-“MOE Fully Residential Schools”") "Perlis Fully Residential"
  else if (x == "MYS - stratum 08: Wilayah Persekutuan Putrajaya non-“MOE Fully Residential Schools”") "Putrajaya Fully Residential"
  else if (x == "MYS - stratum 09: Wilayah Persekutuan Labuan non-“MOE Fully Residential Schools”") "Labuan Fully Residential"
  else NA_character_  # unmatched values become NA so the result stays a character vector
}
With the function I just created, I can apply it in a single line by calling base R's sapply inside data.table, which avoids cluttered code and looks simpler:
pisaMalaysia[, jenisSekolah := sapply(STRATUM, rname.SchType)]
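A possible variant of the same idea, sketched on made-up data: vapply declares the expected return type, so a function that forgets a case fails loudly instead of silently producing a list column. The function `recode_one` and the table `toy` are hypothetical and shortened for illustration.

```r
library(data.table)

# Hypothetical, shortened recoding function for illustration only
recode_one <- function(x) {
  if (is.na(x)) NA_character_
  else if (x == "A long name") "Short"
  else if (x == "Another long name") "Other"
  else NA_character_
}

toy <- data.table(STRATUM = c("A long name", "Another long name", NA, "Unknown"))

# vapply checks that every call returns exactly one character value;
# USE.NAMES = FALSE avoids carrying the input strings along as names
toy[, jenisSekolah := vapply(STRATUM, recode_one, character(1), USE.NAMES = FALSE)]
```

This still calls the function once per row, so it is not vectorized.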
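I think I finally have the answer to the question above! This answer overcomes the "not vectorized" problem that @Roland mentioned. Thank you, sir! In my opinion it is much faster, even though it took me weeks to understand the concept and to find the right question on the web!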
First, I create a new data.table with two columns: one holds the original names and the other holds the desired school names.
lookUpStratum <- data.table(
  STRATUM = c("MYS - stratum 01: MOE National Secondary School\\Other States",
              "MYS - stratum 02: MOE Religious School\\Other States",
              "MYS - stratum 03: MOE Technical School\\Other States",
              "MYS - stratum 04: MOE Fully Residential School",
              "MYS - stratum 05: non-MOE MARA Junior Science College\\Other States",
              "MYS - stratum 06: non-MOE Other Schools\\Other States",
              "MYS - stratum 07: Perlis non-“MOE Fully Residential Schools”",
              "MYS - stratum 08: Wilayah Persekutuan Putrajaya non-“MOE Fully Residential Schools”",
              "MYS - stratum 09: Wilayah Persekutuan Labuan non-“MOE Fully Residential Schools”"),
  SCH.TYPE = c("Public",
               "Religious",
               "Technical",
               "SBP",
               "MARA",
               "Private",
               "Perlis Fully Residential",
               "Putrajaya Fully Residential",
               "Labuan Fully Residential"))
The answer lies in setDT (which coerces lists and data.frames to data.tables by reference).
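I used this line of code, which I read here. It looks a bit long, but it solved my problem! To be honest, I understood this one before I understood the shorter code further down.
setDT(pisaMalaysia)[, SCH.TYPE := lookUpStratum$SCH.TYPE[match(pisaMalaysia$STRATUM, lookUpStratum$STRATUM)]]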
A few minutes later, I finally managed to understand the code here and produced this:
setDT(pisaMalaysia)[lookUpStratum, SCH.TYPE1 := i.SCH.TYPE, on = c(STRATUM = "STRATUM")]
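To see how the two lookups relate, here is a minimal sketch on made-up data. The tables `lookup` and `toy`, their values, and the column names `via_match` and `via_join` are assumptions for illustration. It also shows what happens to a value that is missing from the lookup table.

```r
library(data.table)

# Hypothetical lookup table and data
lookup <- data.table(STRATUM  = c("long name 1", "long name 2"),
                     SCH.TYPE = c("Public", "Religious"))
toy <- data.table(STRATUM = c("long name 2", "long name 1", "not in lookup"))

# Approach 1: match() returns, for each row, its position in the lookup table
toy[, via_match := lookup$SCH.TYPE[match(STRATUM, lookup$STRATUM)]]

# Approach 2: update join; the i. prefix refers to the column coming from lookup
toy[lookup, via_join := i.SCH.TYPE, on = "STRATUM"]

print(toy)
```

In both approaches the row with no match in the lookup table ends up as NA, so checking for NA afterwards is a quick way to spot strata that were left out.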
I got both of these answers from the same post here.
To check that everything is the same:
table(pisaMalaysia$SCH.TYPE)
table(pisaMalaysia$SCH.TYPE1)
#' original data
pisaMalaysia[,table(STRATUM)]
Result:
> table(pisaMalaysia$SCH.TYPE)
   Labuan Fully Residential                        MARA    Perlis Fully Residential
                         54                         122                          78
                    Private                      Public Putrajaya Fully Residential
                        385                        4929                          78
                  Religious                         SBP                   Technical
                        273                        2661                         281
> table(pisaMalaysia$SCH.TYPE1)
   Labuan Fully Residential                        MARA    Perlis Fully Residential
                         54                         122                          78
                    Private                      Public Putrajaya Fully Residential
                        385                        4929                          78
                  Religious                         SBP                   Technical
                        273                        2661                         281
> pisaMalaysia[,table(STRATUM)]
STRATUM
MYS - stratum 01: MOE National Secondary School\Other States
4929
MYS - stratum 02: MOE Religious School\Other States
273
MYS - stratum 03: MOE Technical School\Other States
281
MYS - stratum 04: MOE Fully Residential School
2661
MYS - stratum 05: non-MOE MARA Junior Science College\Other States
122
MYS - stratum 06: non-MOE Other Schools\Other States
385
MYS - stratum 07: Perlis non-“MOE Fully Residential Schools”
78
MYS - stratum 08: Wilayah Persekutuan Putrajaya non-“MOE Fully Residential Schools”
78
MYS - stratum 09: Wilayah Persekutuan Labuan non-“MOE Fully Residential Schools”
54
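Rather than comparing the printed tables by eye, the check can also be done in code. This is a minimal sketch on made-up data; the table `toy` and its values are assumptions for illustration.

```r
library(data.table)

# Hypothetical toy data: two recoded columns that should agree
toy <- data.table(
  STRATUM   = c("s1", "s2", "s1", "s3"),
  SCH.TYPE  = c("Public", "Religious", "Public", "Technical"),
  SCH.TYPE1 = c("Public", "Religious", "Public", "Technical")
)

# TRUE only if the two columns match element by element (including NAs)
same <- toy[, identical(SCH.TYPE, SCH.TYPE1)]

# Cross-tabulate old against new names; each stratum should map to one type
xtab <- toy[, table(STRATUM, SCH.TYPE, useNA = "ifany")]

print(same)
print(xtab)
```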
Thanks! I hope this helps others too.