Is there a tidier/simpler way to write this data.table R code?



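The STRATUM names in the OECD data are very long. For simplicity I have typed shortened versions of them here, and I want to recode them into shorter, more precise names, as in the code below.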

pisaMas[, `:=`(
  SchoolType = ifelse(STRATUM == "National Secondary School", "Public",
               ifelse(STRATUM == "Religious School", "Religious",
               ifelse(STRATUM == "MOE Technical School", "Technical", 0)))
)]
pisaMas[, table(SchoolType)]

I would like to know whether there is a simpler way to do this using the data.table package.

The current development version of data.table has a new function, fcase (modeled on SQL's CASE WHEN), for exactly this situation:

pisaMas[ , SchoolType := fcase(
  STRATUM == "National Secondary School", "Public",
  STRATUM == "Religious School", "Religious",
  STRATUM == "MOE Technical School", "Technical",
  default = ''
)]
pisaMas[ , table(SchoolType)]
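To try the fcase approach without the PISA data, here is a minimal sketch on a hypothetical toy table (the table `toy` and its "Other" row are invented for illustration). It assumes an installed data.table that provides fcase, and it uses NA rather than '' as the default so that unmatched rows are easy to spot.

```r
library(data.table)

# Hypothetical toy data standing in for pisaMas; the "Other" row is invented
# to show what happens when no condition matches.
toy <- data.table(
  STRATUM = c("National Secondary School", "Religious School",
              "MOE Technical School", "Other")
)

# Conditions are checked in order; the first TRUE one decides the value.
toy[, SchoolType := fcase(
  STRATUM == "National Secondary School", "Public",
  STRATUM == "Religious School",          "Religious",
  STRATUM == "MOE Technical School",      "Technical",
  default = NA_character_
)]

print(toy)
```

All values, including the default, must share one type, which is why the default here is NA_character_ and not a number such as 0.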

To get the development version, try

install.packages(
'data.table', type = 'source',repos = 'http://Rdatatable.github.io/data.table'
)

If the simple install does not work, you can check the Installation wiki for more details:

https://github.com/Rdatatable/data.table/wiki/Installation
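Before installing a development build, it may be worth checking whether the data.table you already have exports fcase. The following is a small sketch using only base R helpers; the variable name `has_fcase` is just illustrative.

```r
# Check whether the installed data.table already exports fcase()
# before going to the trouble of installing the development version.
if (requireNamespace("data.table", quietly = TRUE)) {
  has_fcase <- "fcase" %in% getNamespaceExports("data.table")
  message("data.table ", as.character(packageVersion("data.table")),
          if (has_fcase) " provides fcase()" else " does not provide fcase()")
} else {
  has_fcase <- FALSE
  message("data.table is not installed")
}
```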

You could also solve this with a lookup table; see this Q&A for details:

https://stackoverflow.com/a/36391018/3576984
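As a rough illustration of the lookup idea (not necessarily the method used in the linked answer), a named character vector can act as a tiny lookup table in base R. The labels below are the shortened ones from the question; `school_lookup`, `stratum` and the "Unknown School" entry are hypothetical.

```r
# Named character vector: names are the long labels, values the short ones.
school_lookup <- c(
  "National Secondary School" = "Public",
  "Religious School"          = "Religious",
  "MOE Technical School"      = "Technical"
)

# Hypothetical input, including one label that is not in the lookup.
stratum <- c("Religious School", "National Secondary School",
             "MOE Technical School", "Unknown School")

# Indexing by name is vectorized; unmatched labels come back as NA.
school_type <- unname(school_lookup[stratum])
print(school_type)
```

Adding a new category then means adding one entry to the vector instead of another nested condition.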

Here is what I came up with after some thought.

#' First I create a function (rname.SchType) that maps each old name to a new name using else if.
#' The backslash in the data has to be written as "\\" inside an R string literal.
rname.SchType <- function(x) {
  if (is.na(x)) NA_character_
  else if (x == "MYS - stratum 01: MOE National Secondary School\\Other States") "Public"
  else if (x == "MYS - stratum 02: MOE Religious School\\Other States") "Religious"
  else if (x == "MYS - stratum 03: MOE Technical School\\Other States") "Technical"
  else if (x == "MYS - stratum 04: MOE Fully Residential School") "SBP"
  else if (x == "MYS - stratum 05: non-MOE MARA Junior Science College\\Other States") "MARA"
  else if (x == "MYS - stratum 06: non-MOE Other Schools\\Other States") "Private"
  else if (x == "MYS - stratum 07: Perlis non-“MOE Fully Residential Schools”") "Perlis Fully Residential"
  else if (x == "MYS - stratum 08: Wilayah Persekutuan Putrajaya non-“MOE Fully Residential Schools”") "Putrajaya Fully Residential"
  else if (x == "MYS - stratum 09: Wilayah Persekutuan Labuan non-“MOE Fully Residential Schools”") "Labuan Fully Residential"
  else NA_character_  # no match: return NA rather than NULL so sapply() still yields a vector
}

Using the function I just created, I applied it inside data.table in a single line with base R's sapply(), which avoids cluttered code and looks simpler:

pisaMalaysia[,`:=`(jenisSekolah = sapply(STRATUM,rname.SchType))]

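I think I have finally found the answer to the question above! This answer overcomes the "not vectorized" problem that @Roland mentioned, thank you sir! In my view it is much faster, even though it took me a few weeks to understand the concept and to find the right question on the web!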

First, I create a new data.table with two columns: the first holds the original names and the second holds the desired school names.

# Backslashes in the data are escaped as "\\" in the string literals.
lookUpStratum <- data.table(
  STRATUM = c("MYS - stratum 01: MOE National Secondary School\\Other States",
              "MYS - stratum 02: MOE Religious School\\Other States",
              "MYS - stratum 03: MOE Technical School\\Other States",
              "MYS - stratum 04: MOE Fully Residential School",
              "MYS - stratum 05: non-MOE MARA Junior Science College\\Other States",
              "MYS - stratum 06: non-MOE Other Schools\\Other States",
              "MYS - stratum 07: Perlis non-“MOE Fully Residential Schools”",
              "MYS - stratum 08: Wilayah Persekutuan Putrajaya non-“MOE Fully Residential Schools”",
              "MYS - stratum 09: Wilayah Persekutuan Labuan non-“MOE Fully Residential Schools”"),
  SCH.TYPE = c("Public",
               "Religious",
               "Technical",
               "SBP",
               "MARA",
               "Private",
               "Perlis Fully Residential",
               "Putrajaya Fully Residential",
               "Labuan Fully Residential"))

The answer lies in setDT() (which coerces lists and data.frames to data.table by reference).

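This line of code, which I found in another post, looks a bit long, but it solved my problem! To be honest, I understood this version first, before I understood the shorter code further down.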

setDT(pisaMalaysia)[,SCH.TYPE := lookUpStratum$SCH.TYPE[match(pisaMalaysia$STRATUM,lookUpStratum$STRATUM)]]
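To see what match() contributes in the line above, here is a small base R sketch on hypothetical vectors (`long_names`, `short_names` and `stratum` are invented stand-ins for the lookup columns and the STRATUM column).

```r
# Hypothetical miniature lookup table (plain vectors, no packages needed).
long_names  <- c("stratum A", "stratum B", "stratum C")
short_names <- c("Public", "Religious", "Technical")

# Hypothetical column to recode; "stratum Z" has no entry in the lookup.
stratum <- c("stratum B", "stratum A", "stratum B", "stratum Z", "stratum C")

# match() returns, for each element of stratum, its position in long_names
# (NA where there is no match).
idx <- match(stratum, long_names)
print(idx)

# Indexing the short names by those positions does the recoding in one step.
recoded <- short_names[idx]
print(recoded)
```

Because match() works on the whole vector at once, no per-row function call is needed.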

A few minutes later, I finally managed to understand that code and produced this version:

setDT(pisaMalaysia)[lookUpStratum,SCH.TYPE1 := i.SCH.TYPE, on = c(STRATUM = "STRATUM")]
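The same update-join pattern can be tried on a toy example. This is a sketch under the assumption that `students` and `lookup` (both hypothetical) play the roles of pisaMalaysia and lookUpStratum; since the key column has the same name in both tables, on = "STRATUM" is enough.

```r
library(data.table)

# Hypothetical miniature versions of pisaMalaysia and lookUpStratum
students <- data.table(id = 1:5,
                       STRATUM = c("stratum B", "stratum A", "stratum B",
                                   "stratum Z", "stratum C"))
lookup <- data.table(STRATUM  = c("stratum A", "stratum B", "stratum C"),
                     SCH.TYPE = c("Public", "Religious", "Technical"))

# Update join: for rows of students that match a row of lookup on STRATUM,
# copy lookup's SCH.TYPE (referred to as i.SCH.TYPE) into a new column.
# Rows without a match ("stratum Z") are left as NA.
students[lookup, SCH.TYPE := i.SCH.TYPE, on = "STRATUM"]

print(students)
```

The join adds the column by reference, so the row order of `students` is unchanged and no copy of the table is made.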

I got both of these answers from the same post.

To check that everything is the same:

table(pisaMalaysia$SCH.TYPE)
table(pisaMalaysia$SCH.TYPE1)
#' original data
pisaMalaysia[,table(STRATUM)]

Result:

> table(pisaMalaysia$SCH.TYPE)
   Labuan Fully Residential                        MARA    Perlis Fully Residential
                         54                         122                          78
                    Private                      Public Putrajaya Fully Residential
                        385                        4929                          78
                  Religious                         SBP                   Technical
                        273                        2661                         281
> table(pisaMalaysia$SCH.TYPE1)
   Labuan Fully Residential                        MARA    Perlis Fully Residential
                         54                         122                          78
                    Private                      Public Putrajaya Fully Residential
                        385                        4929                          78
                  Religious                         SBP                   Technical
                        273                        2661                         281
> pisaMalaysia[,table(STRATUM)]
STRATUM
MYS - stratum 01: MOE National Secondary School\Other States 
         4929 
MYS - stratum 02: MOE Religious School\Other States 
          273 
MYS - stratum 03: MOE Technical School\Other States 
          281 
MYS - stratum 04: MOE Fully Residential School 
         2661 
MYS - stratum 05: non-MOE MARA Junior Science College\Other States 
          122 
MYS - stratum 06: non-MOE Other Schools\Other States 
          385 
MYS - stratum 07: Perlis non-“MOE Fully Residential Schools” 
           78 
MYS - stratum 08: Wilayah Persekutuan Putrajaya non-“MOE Fully Residential Schools” 
           78 
MYS - stratum 09: Wilayah Persekutuan Labuan non-“MOE Fully Residential Schools” 
           54 
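Comparing tables only compares counts per category, and table() drops NA values by default, so two columns could in principle give the same table while differing row by row. A stricter check is sketched below on a hypothetical toy table `dt`; with the real data the same expressions would be run on pisaMalaysia.

```r
library(data.table)

# Hypothetical toy table holding two recoded columns that should agree
dt <- data.table(
  SCH.TYPE  = c("Public", "Religious", NA, "Technical"),
  SCH.TYPE1 = c("Public", "Religious", NA, "Technical")
)

# TRUE only if the columns match element by element, NAs included
same <- dt[, identical(SCH.TYPE, SCH.TYPE1)]
print(same)

# Count rows left unmatched (NA) in either column
n_na <- dt[, sum(is.na(SCH.TYPE) | is.na(SCH.TYPE1))]
print(n_na)
```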

Thanks! I hope this helps others too.
