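I'm trying to create a column in a data frame whose values depend on other values in the same data frame. Here is the head() of the data frame I'm starting with.
Edit: this is the starting data frame with the unnecessary columns excluded for this exercise. The real one has many other columns besides these two: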
> head(df)
# A tibble: 6 x 2
Responded `Response Rate`
<chr> <chr>
1 0% 0%
2 0% 0%
3 0% 0%
4 100% 100%
5 0% 0%
6 100% 0%
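I want a column named "Completion Rate" populated using the following conditions:
if Responded is 0%, the value should be NA (or NULL, whichever R uses for missing data);
else, take the value of Response Rate.
The output should be: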
> head(df)
# A tibble: 6 x 3
Responded `Response Rate` `Completion Rate`
<chr> <chr> <chr>
1 0% 0% NA
2 0% 0% NA
3 0% 0% NA
4 100% 100% 100%
5 0% 0% NA
6 100% 0% 0%
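I tried using mutate and replace to create the new column without any interim steps, with no joy. If someone could demonstrate how to do that, it would be great.
I then tried to build Completion Rate by first creating the column: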
df$"Completion Rate" <- df$`Response Rate`
and then replacing the values in that column that should be NA, using the following code:
df <- mutate(df, replace("Completion Rate", Responded == 0, NA, response_df$`Response Rate`))
This produces the following error:
> response_df <- mutate(response_df, replace("Completion Rate", Responded == 0, NA, response_df$`Response Rate`))
Error: Problem with `mutate()` input `..1`.
i `..1 = replace("Completion Rate", Responded == 0, NA, response_df$`Response Rate`)`.
x unused argument (response_df$`Response Rate`)
Run `rlang::last_error()` to see where the error occurred.
Running the additional error-checking code it suggests:
> rlang::last_error()
<error/dplyr:::mutate_error>
Problem with `mutate()` input `..1`.
i `..1 = replace("Completion Rate", Responded == 0, NA, response_df$`Response Rate`)`.
x unused argument (response_df$`Response Rate`)
Backtrace:
1. dplyr::mutate(...)
6. base::.handleSimpleError(...)
7. dplyr:::h(simpleError(msg, call))
> rlang::last_trace()
<error/dplyr:::mutate_error>
Problem with `mutate()` input `..1`.
i `..1 = replace("Completion Rate", Responded == 0, NA, response_df$`Response Rate`)`.
x unused argument (response_df$`Response Rate`)
Backtrace:
x
1. +-dplyr::mutate(...)
2. +-dplyr:::mutate.data.frame(...)
3. | -dplyr:::mutate_cols(.data, ..., caller_env = caller_env())
4. | +-base::withCallingHandlers(...)
5. | -mask$eval_all_mutate(quo)
6. -base::.handleSimpleError(...)
7. -dplyr:::h(simpleError(msg, call))
<error/simpleError>
unused argument (response_df$`Response Rate`)
I tried using 0% and "0%". I tried referencing Completion Rate instead of Response Rate in the "else" argument of replace. I tried = 0 instead of == 0. These gave different errors.
Using ifelse -
library(dplyr)
df %>%
mutate(Completion_Rate = ifelse(Responded == '0%', NA, Response_Rate))
# Responded Response_Rate Completion_Rate
#1 0% 0% <NA>
#2 0% 0% <NA>
#3 0% 0% <NA>
#4 100% 100% 100%
#5 0% 0% <NA>
#6 100% 0% 0%
It is easier to help if you provide the data in a reproducible format -
df <- structure(list(Responded = c("0%", "0%", "0%", "100%", "0%",
"100%"), Response_Rate = c("0%", "0%", "0%", "100%", "0%", "0%"
)), row.names = c(NA, -6L), class = "data.frame")
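Since the question asks specifically about mutate with replace, here is a minimal sketch of how that combination could look on the same sample data. base::replace takes exactly three arguments, replace(x, list, values), which is why the fourth argument in the attempt above is reported as "unused argument". The first argument also has to be the column itself, not the quoted string "Completion Rate", and the comparison has to be against the string "0%" because the column is character. The column names with underscores are taken from the sample data above; stringsAsFactors = FALSE is added here only so the sketch behaves the same on older R versions.

```r
library(dplyr)

# Same sample data as above, rebuilt so the sketch is self-contained.
df <- data.frame(
  Responded     = c("0%", "0%", "0%", "100%", "0%", "100%"),
  Response_Rate = c("0%", "0%", "0%", "100%", "0%", "0%"),
  stringsAsFactors = FALSE
)

# replace(x, list, values):
#   x      - the vector to copy (here Response_Rate)
#   list   - which positions to overwrite (rows where Responded is "0%")
#   values - what to put there (NA)
# Positions that are not selected keep their value from x,
# so no separate "else" argument is needed.
df <- df %>%
  mutate(Completion_Rate = replace(Response_Rate, Responded == "0%", NA))

print(df)
```

This does the whole thing in one step, with no need to create the column first.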
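You can use dplyr from the tidyverse
library(dplyr)
# check.names = FALSE keeps the space in the name; by default data.frame() renames the column to Response.Rate
df <- data.frame(Responded = c(0,0,0,100,0,100),
`Response Rate` = c(0,0,0,100,0,0),
check.names = FALSE)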
print(df)
Responded `Response Rate`
1 0 0
2 0 0
3 0 0
4 100 100
5 0 0
6 100 0
df <- df %>%
mutate(`Completion Rate` = ifelse(Responded==0, NA, `Response Rate`))
print(df)
Responded `Response Rate` `Completion Rate`
1 0 0 NA
2 0 0 NA
3 0 0 NA
4 100 100 100
5 0 0 NA
6 100 0 0
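The numeric version above assumes the columns already hold numbers, whereas the data in the question holds strings such as "0%". If you would rather work with numbers, one possible way to convert is sketched below. The helper name to_number is hypothetical (it is not from dplyr or base R), and the sketch assumes every value is a plain number followed by a single "%".

```r
# Sample data as in the question: percentages stored as strings.
df <- data.frame(
  Responded       = c("0%", "0%", "0%", "100%", "0%", "100%"),
  `Response Rate` = c("0%", "0%", "0%", "100%", "0%", "0%"),
  check.names = FALSE,       # keep the space in "Response Rate"
  stringsAsFactors = FALSE   # keep strings as character on older R versions
)

# Hypothetical helper: drop the literal "%" and convert what is left to numeric.
to_number <- function(x) as.numeric(sub("%", "", x, fixed = TRUE))

df$Responded       <- to_number(df$Responded)
df$`Response Rate` <- to_number(df$`Response Rate`)

str(df)
```

After this conversion the comparison Responded == 0 behaves as expected.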
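Or, with the values as percentage strings:
library(dplyr)
# check.names = FALSE keeps the space in the name; by default data.frame() renames the column to Response.Rate
df <- data.frame(Responded = c('0%','0%','0%','100%','0%','100%'),
`Response Rate` = c('0%','0%','0%','100%','0%','0%'),
check.names = FALSE)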
print(df)
Responded `Response Rate`
1 0% 0%
2 0% 0%
3 0% 0%
4 100% 100%
5 0% 0%
6 100% 0%
df <- df %>%
mutate(`Completion Rate` = ifelse(Responded=='0%', NA, `Response Rate`))
print(df)
Responded `Response Rate` `Completion Rate`
1 0% 0% <NA>
2 0% 0% <NA>
3 0% 0% <NA>
4 100% 100% 100%
5 0% 0% <NA>
6 100% 0% 0%
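As a variation on the ifelse approach, dplyr also provides if_else, which checks that both branches have compatible types. A minimal sketch, assuming the same string data: NA_character_ is used for the missing value so that it matches the character column on any dplyr version.

```r
library(dplyr)

# Same string data as above, rebuilt so the sketch is self-contained.
df <- data.frame(
  Responded       = c("0%", "0%", "0%", "100%", "0%", "100%"),
  `Response Rate` = c("0%", "0%", "0%", "100%", "0%", "0%"),
  check.names = FALSE,       # keep the space in "Response Rate"
  stringsAsFactors = FALSE   # keep strings as character on older R versions
)

# if_else(condition, true, false): both branches must have compatible types,
# so the missing value is written as NA_character_ to match the character column.
df <- df %>%
  mutate(`Completion Rate` = if_else(Responded == "0%", NA_character_, `Response Rate`))

print(df)
```

The result is the same as with ifelse. The stricter type checking mainly helps catch mistakes such as mixing numbers and strings between the two branches.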