I'm having some trouble translating some Stata code into R.
Stata code:
gen joint_gpw = sbud_jpw * q44 if sbud_jpw < 888 & q44 < 888
gen sbud_gpw_all = sbud_gpw if sbud_gpw < 888
replace sbud_gpw_all = q31 if sbud_gpw_all ==. & q31 < 888
replace sbud_gpw_all = joint_gpw if sbud_gpw_all ==. & joint_gpw !=.
replace sbud_gpw_all = 888 if q16_1 == 0 & sbud_gpw_all ==.
replace sbud_gpw_all = 888 if (sbud_gpw == 888 & q31 == 888 & sbud_jpw == 888 & q44 == 888) & sbud_gpw_all ==.
replace sbud_gpw_all = 999 if (sbud_gpw == 999 | q31 == 999 | sbud_jpw == 999 | q44 == 999 | (q44 !=. & sbud_jpw == 888)) & sbud_gpw_all ==.
Here is the R code I tried:
dat%>%
dplyr::mutate(joint_gpw = ifelse((sbud_jpw<888 & q44<888),sbud_jpw * q44,NA))%>%
dplyr::mutate(sbud_gpw_all = ifelse(sbud_gpw < 888,sbud_gpw,NA))%>%
dplyr::mutate(sbud_gpw_all = ifelse((sbud_gpw_all= NA & q31<888),q31,NA))%>%
dplyr::mutate(sbud_gpw_all = ifelse((sbud_gpw_all = NA & joint_gpw != NA),joint_gpw,NA))%>%
dplyr::mutate(sbud_gpw_all) = ifelse((q16_1 = 0 & sbud_gpw_all = NA),888,NA)%>%
dplyr::mutate(sbud_gpw_all) = ifelse((sbud_gpw = 888 & q31 = 888 & sbud_jpw = 888 & q44 = 888) & sbud_gpw_all = NA,888,NA)%>%
dplyr::mutate(sbud_gpw_all) = ifelse(((sbud_gpw = 999 | q31 = 999 | sbud_jpw = 999 | q44 = 999 | (q44 != NA & sbud_jpw == 888)) & sbud_gpw_all = NA)),999,NA)
The error I got:
Error: unexpected '=' in:
" dplyr::mutate(sbud_gpw_all) = ifelse((q16_1 = 0 & sbud_gpw_all = NA),888,NA)%>%
dplyr::mutate(sbud_gpw_all) = ifelse((sbud_gpw = 888 & q31 = 888 & sbud_jpw = 888 & q44 = 888) & sbud_gpw_all ="
I would like to know whether these two pieces of code are equivalent. Any help is much appreciated. Thanks!
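The error comes from the closing parenthesis ) right after sbud_gpw_all in the last three lines: mutate(sbud_gpw_all) = ifelse(...) should be mutate(sbud_gpw_all = ifelse(...)). The comparisons inside ifelse() also need == rather than =.
Also, although it doesn't throw an error, each mutate() call overwrites sbud_gpw_all, because the fallback value in ifelse() is NA rather than the existing value. I don't know Stata, and you didn't provide a minimal reproducible example, but I have a feeling your code could work like this: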
library(dplyr)

dat %>%
  mutate(
    # product of the two variables, only when neither holds a special code
    joint_gpw = if_else(sbud_jpw < 888 & q44 < 888, sbud_jpw * q44, NA_real_),
    # conditions are checked top to bottom; the first match per row wins
    sbud_gpw_all = case_when(
      sbud_gpw < 888 ~ sbud_gpw,
      q31 < 888 ~ q31,
      !is.na(joint_gpw) ~ joint_gpw,
      q16_1 == 0 ~ 888,
      sbud_gpw == 888 & q31 == 888 & sbud_jpw == 888 & q44 == 888 ~ 888,
      sbud_gpw == 999 | q31 == 999 | sbud_jpw == 999 | q44 == 999 | (!is.na(q44) & sbud_jpw == 888) ~ 999
    )
  )
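This first creates the column joint_gpw with dplyr::if_else(), using the condition sbud_jpw < 888 & q44 < 888. After that, a set of conditions (the part before each ~) is checked in order. For each row, the first condition that matches supplies the value (the part after the ~ operator).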
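To make the first-match rule concrete, here is a minimal sketch on made-up data. The tibble toy and its columns a and b are hypothetical and only stand in for the survey variables; 888 and 999 play the role of the special codes.

```r
library(dplyr)

# Hypothetical toy data, not the asker's data set.
toy <- tibble(
  a = c(5, 888, 888, NA),
  b = c(7, 3, 999, NA)
)

res <- toy %>%
  mutate(
    out = case_when(
      a < 888 ~ a,               # row 1 matches here; later conditions are ignored for it
      b < 888 ~ b,               # row 2: first condition is FALSE, this one is TRUE
      a == 999 | b == 999 ~ 999  # row 3: only this condition is TRUE
      # row 4: every condition is NA, so nothing matches and the result is NA
    )
  )

res$out
```

Rows that match no condition get NA, which is why a missing value in every input column leaves the result missing, much like the Stata code does.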
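Note that, as Sotos pointed out in the comments, NAs in R are checked with is.na(x), not with == or !=, because those comparisons always return NA. I left out the NA checks on most lines because they are implied by the sequential nature of case_when(): once a condition matches a row, later conditions are ignored for that row. NA_real_ is the numeric NA value. With if_else() and case_when(), you have to be explicit about data types.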
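The NA behaviour can be checked in base R with a small sketch. The vectors x and fallback are invented for illustration. The last step mirrors a single Stata line of the form replace x = fallback if x ==. & fallback < 888, keeping the existing value wherever the condition does not hold instead of overwriting it with NA. Keep in mind that Stata treats missing values as larger than any number, so x < 888 is false for a missing x there, whereas in R it is NA.

```r
x <- c(10, NA, 888)
fallback <- c(1, 2, 999)

x == NA      # NA NA NA: comparing with NA never gives TRUE or FALSE
is.na(x)     # FALSE TRUE FALSE

# Fill only the missing entries, and only from valid (< 888) fallback values;
# everything else keeps its current value.
filled <- ifelse(is.na(x) & fallback < 888, fallback, x)
filled       # 10 2 888
```

Chaining several such steps reproduces the sequential replace logic one line at a time, which can be easier to compare against the Stata original than a single case_when() call.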