How to replace words with numbers in a dataset in RStudio



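I saw a question here that is similar to mine, "Replace specific column words with numbers or blanks", but none of the solutions seemed to help in my case.

What I want to do is convert: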

Question    Response
1           Sometimes
2           Almost Always
3           Sometimes
4           Almost Never
5           Often

to:

Question    Response
    1           2
    2           4
    3           2
    4           1
    5           3

where Almost Never = 1, Sometimes = 2, Often = 3, and Almost Always = 4.

I imported the data from Excel, and it is in a data frame called STAI22 (I think).

I tried:

STAI22[STAI22$Response == "Almost never",]$Response = 1
STAI22[STAI22$Response == "sometimes",]$Response = 2
STAI22[STAI22$Response == "often",]$Response = 3
STAI22[STAI22$Response == "Almost always",]$Response = 4

But I get these warnings and errors:

> STAI22[STAI22$Response == "Almost Always",]$Response = "4"
Warning message:
In `[<-.factor`(`*tmp*`, iseq, value = "4") :
  invalid factor level, NA generated
> STAI22[STAI22$Response == "Often",]$Response = "3"
Error in `[<-.data.frame`(`*tmp*`, STAI22$Response == "Often", , value = list( : 
  missing values are not allowed in subscripted assignments of data frames
> STAI22[STAI22$Response == "Sometimes",]$Response = "2"
Error in `[<-.data.frame`(`*tmp*`, STAI22$Response == "Sometimes", , value = list( : 
  missing values are not allowed in subscripted assignments of data frames
> STAI22[STAI22$Response == "Almost Never",]$Response = "1"
Error in `[<-.data.frame`(`*tmp*`, STAI22$Response == "Almost Never",  : 
  missing values are not allowed in subscripted assignments of data frames

It has no effect on my data!
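
The `[<-.factor` in the warning suggests that `Response` was imported as a factor. Assigning "4", which is not one of its levels, then produces NA, and the later comparisons contain NA, which subscripted assignment on a data frame rejects. A minimal sketch of that reading, using a small stand-in data frame (`df` and `broken` are hypothetical names, not the real STAI22):

```r
# Hypothetical stand-in for STAI22, with Response stored as a factor
df <- data.frame(
  Question = 1:5,
  Response = factor(c("Sometimes", "Almost Always", "Sometimes",
                      "Almost Never", "Often"))
)

# Assigning a value that is not an existing level yields NA (with a warning)
broken <- df
broken$Response[broken$Response == "Almost Always"] <- "4"
print(broken$Response)

# Converting to character first lets the replacement go through
df$Response <- as.character(df$Response)
df$Response[df$Response == "Almost Always"] <- "4"
print(df$Response)
```

If this is what is happening, converting the column with `as.character()` before replacing values (or importing it as text in the first place) should avoid both the warning and the follow-on errors.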

You can use case_when from dplyr.

dplyr version 0.5.0

df <- read.table(text="Question    Response
1           Sometimes
2           'Almost Always'
3           Sometimes
4           'Almost Never'
5           Often",header=TRUE, stringsAsFactors=FALSE)
library(dplyr)
df%>%
  mutate(Response=case_when(
    .$Response=="Sometimes" ~ 2,
    .$Response=="Almost Always" ~ 4,
    .$Response=="Almost Never" ~ 1,
    .$Response=="Often" ~ 3
      ))
  Question Response
1        1        2
2        2        4
3        3        2
4        4        1
5        5        3

dplyr version 0.7.0

df <- read.table(text="Question    Response
1           Sometimes
2           'Almost Always'
3           Sometimes
4           'Almost Never'
5           Often",header=TRUE, stringsAsFactors=FALSE)
library(dplyr)
df%>%
  mutate(Response=case_when(
    Response=="Sometimes" ~ 2,
    Response=="Almost Always" ~ 4,
    Response=="Almost Never" ~ 1,
    Response=="Often" ~ 3
      ))
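
If you would rather not depend on dplyr, the same mapping can be sketched in base R by turning the column into a factor whose levels are listed in scoring order and then taking the integer codes. This is an alternative technique, not part of the answer above, and the name `levels_in_order` is made up for illustration:

```r
df <- data.frame(
  Question = 1:5,
  Response = c("Sometimes", "Almost Always", "Sometimes",
               "Almost Never", "Often"),
  stringsAsFactors = FALSE
)

# Levels listed in scoring order: position 1 scores 1, position 4 scores 4
levels_in_order <- c("Almost Never", "Sometimes", "Often", "Almost Always")

# Integer codes follow the level order; labels not in the list become NA
df$Response <- as.integer(factor(df$Response, levels = levels_in_order))
print(df)
```

This only works when the scores are consecutive integers starting at 1, as they are here; for arbitrary scores, case_when or a lookup vector is the safer choice.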

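Yes! With the help of several different answers I finally managed to do it (for those who are as rubbish at R as I am, here is a ridiculously simplified explanation of what I did):

I started with a data frame: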

Question    Response
1           Somewhat
2           Very much so
3           Somewhat
4           Not at all
5           Moderately so

I created a lookup table:

lookup <- c("Not at all" = 1, "Somewhat" = 2, "Moderately so" = 3, "Very much so" = 4)

I created a new column in my dataset:

Datasetname["Response2"] <- NA #Just fills the column with NA
Question    Response         Response2
1           Somewhat            NA
2           Very much so        NA
3           Somewhat            NA
4           Not at all          NA
5           Moderately so       NA

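Then I added the new values to that new column:

# Index the lookup vector by each response label (as.character guards against factors)
Datasetname$Response2 <- lookup[as.character(Datasetname$Response)]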
Question    Response            Response2
1           Somewhat            2
2           Very much so        4
3           Somewhat            2
4           Not at all          1
5           Moderately so       3

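Hooray!

Thanks, everyone, for the suggestions. For some reason this was the only approach that worked for me (I may have misunderstood some of the other suggestions).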
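
The lookup-vector steps above can be run end to end as a small sketch. The data frame here is a hand-built stand-in for the real dataset, and the `unname()` call is an optional addition that is not in the original steps:

```r
# Hypothetical stand-in for the real data set
Datasetname <- data.frame(
  Question = 1:5,
  Response = c("Somewhat", "Very much so", "Somewhat",
               "Not at all", "Moderately so"),
  stringsAsFactors = FALSE
)

# Named vector: the names are the labels, the values are the scores
lookup <- c("Not at all" = 1, "Somewhat" = 2,
            "Moderately so" = 3, "Very much so" = 4)

# Index the lookup by label; as.character() guards against factor columns,
# and unname() drops the labels that would otherwise be kept as names
Datasetname$Response2 <- unname(lookup[as.character(Datasetname$Response)])
print(Datasetname)
```

Any label that is missing from `lookup` (for example because of a difference in capitalisation) comes back as NA, so it is worth checking the new column with `anyNA()` afterwards.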
