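I hope no one is getting sick of me. Below is a completely made-up example of the problem. The error only appears once the data frame contains a mix of numeric and string columns, even when the numeric columns are excluded inside the function (not shown below). The function uses grepl to look for the contents of d in column c and vice versa, and ifelse should then create a new column of "yes" and "no" values. I need ifelse to return "yes" and "no" after the test, but this is the part that gives the error.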
a <- c(5:10)
b <- c(105:110)
c <- c("a","b","c","d","e","f")
d <- c("aa","bc","cd","ff","ee", "gf")
df <- data.frame(a,b,c,d)
newfunction <- function(x, col1, col2, col3, col4){ifelse(((sapply(lapply(x[[col4]], grepl,
x[[col3]]),any)) | (sapply(lapply(x[[col3]], grepl, x[[col4]]),any))), (11 - x[[col1]]), (1 -
x[[col2]]))}
df$new <- apply(df, 1, newfunction, "a", "b", "c", "d")
Error in 11 - x[[col1]] : non-numeric argument to binary operator
On to the question. The best way to solve this is to use the case_when function from the dplyr package. I also used the str_detect function from stringr.
> library(tidyverse)
> library(stringr)
>
> a <- c(5:10)
> b <- c(105:110)
> cc <- c("a","b","c","d","e","f")
> d <- c("aa","bc","cd","ff","ee", "gf")
> df <- data.frame(a, b, cc, d)
>
>
>
> df %>% mutate(case_when((str_detect(cc, d) | str_detect(d, cc)) ~ 'Yes',
+ TRUE ~ 'No'))
a b cc d case_when(...)
1 5 105 a aa Yes
2 6 106 b bc Yes
3 7 107 c cd Yes
4 8 108 d ff No
5 9 109 e ee Yes
6 10 110 f gf Yes
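In the output above the new column ends up with the name case_when(...). A minimal sketch of the same call with an explicit column name follows; the name new and the result variable out are only illustrative choices, and it assumes dplyr and stringr are installed.

```r
library(dplyr)
library(stringr)

df <- data.frame(a = 5:10, b = 105:110,
                 cc = c("a", "b", "c", "d", "e", "f"),
                 d = c("aa", "bc", "cd", "ff", "ee", "gf"))

# Same test as above, but the result is stored in a column called "new"
# (the name is an arbitrary choice for this sketch)
out <- df %>%
  mutate(new = case_when(
    str_detect(cc, d) | str_detect(d, cc) ~ "Yes",
    TRUE ~ "No"
  ))

out$new
```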
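The error is caused by your use of apply, which coerces the data frame to a character matrix (the numeric values are converted to character). For example,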
> apply(df, 2, function(x) x)
a b cc d
[1,] " 5" "105" "a" "aa"
[2,] " 6" "106" "b" "bc"
[3,] " 7" "107" "c" "cd"
[4,] " 8" "108" "d" "ff"
[5,] " 9" "109" "e" "ee"
[6,] "10" "110" "f" "gf"
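To check the coercion yourself, here is a small base R sketch. It rebuilds the example data frame and inspects the types; the variable names m and row_classes are only for illustration.

```r
df <- data.frame(a = 5:10, b = 105:110,
                 cc = c("a", "b", "c", "d", "e", "f"),
                 d = c("aa", "bc", "cd", "ff", "ee", "gf"))

# apply() converts its input with as.matrix(); a mix of numeric and
# string columns becomes a character matrix
m <- as.matrix(df)
is.character(m)        # TRUE

# So inside apply(df, 1, ...), each row arrives as a character vector
row_classes <- apply(df, 1, class)
unique(row_classes)    # "character"

# Working on the columns directly keeps their original types,
# so arithmetic such as 11 - a still works
is.numeric(11 - df$a)  # TRUE
```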
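A few pointers, since you seem to be new to R. First, don't use c as a name; it is a very commonly used function that combines elements into a vector. Second, the way your function is written makes it hard to read. You should break it into several steps, which makes it easier to follow.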
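As one possible way of splitting the logic into steps, here is a sketch that works on whole columns instead of going through apply. The helper name contains_either is hypothetical, and the sketch only covers the "yes"/"no" part of the question, not the arithmetic in the original function.

```r
df <- data.frame(a = 5:10, b = 105:110,
                 cc = c("a", "b", "c", "d", "e", "f"),
                 d = c("aa", "bc", "cd", "ff", "ee", "gf"))

# Hypothetical helper: for each row, is either string found inside the other?
contains_either <- function(x, y) {
  x <- as.character(x)
  y <- as.character(y)
  # Step 1: is x[i] found in y[i]? (grepl's first argument is the pattern)
  x_in_y <- mapply(grepl, x, y, USE.NAMES = FALSE)
  # Step 2: is y[i] found in x[i]?
  y_in_x <- mapply(grepl, y, x, USE.NAMES = FALSE)
  # Step 3: combine both directions
  x_in_y | y_in_x
}

# Step 4: turn the logical result into labels
df$new <- ifelse(contains_either(df$cc, df$d), "Yes", "No")
df$new
```

Because nothing here passes through apply, columns a and b stay numeric and can still be used in arithmetic afterwards.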
You can also vectorize grepl to simplify the code:
Vgrepl <- Vectorize(grepl)
TF <- Vgrepl(pattern=df$c, x=df$d) | Vgrepl(pattern=df$d, x=df$c)
df$comp <- ifelse(TF, "Yes", "No")
df
# a b c d comp
# 1 5 105 a aa Yes
# 2 6 106 b bc Yes
# 3 7 107 c cd Yes
# 4 8 108 d ff No
# 5 9 109 e ee Yes
# 6 10 110 f gf Yes