I am trying to create a case-insensitive function similar to %in%. So far, I have created this:
qs <- function(str, vec, na.rm = TRUE) {
  stopifnot("str must have length 1" = length(str) == 1)
  if (na.rm) {
    # Drop NA entries first; max_count = 1 stops searching after the first hit
    any(stringi::stri_detect_fixed(str, na.omit(vec), max_count = 1, opts_fixed = list(case_insensitive = TRUE)))
  } else {
    any(stringi::stri_detect_fixed(str, vec, max_count = 1, opts_fixed = list(case_insensitive = TRUE)))
  }
}
For my use case I need this to be vectorized, so that I can do:
vecqs <- Vectorize(qs, "str")
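For reference, `Vectorize` wraps the function in a call to `mapply`, so the scalar function still runs once per element of `str`. Below is a minimal sketch of that behaviour. It uses a hypothetical base-R stand-in, `qs_exact`, which does exact case-insensitive matching rather than the stringi detection above, so that it runs without extra packages; the sample data are made up for illustration.

```r
# Hypothetical scalar stand-in for qs(): exact, case-insensitive match of one string.
qs_exact <- function(str, vec) {
  stopifnot("str must have length 1" = length(str) == 1)
  tolower(str) %in% tolower(vec[!is.na(vec)])
}

# Same pattern as in the question: vectorize over the `str` argument only.
vec_qs_exact <- Vectorize(qs_exact, "str")

fruits  <- c("apple", "banana", "cherry", NA)
queries <- c("APPLE", "kiwi")

# One call to qs_exact per element of `queries`, via mapply under the hood.
res_vectorize <- vec_qs_exact(queries, fruits)

# An explicit per-element loop with vapply does the same work.
res_vapply <- vapply(queries, qs_exact, logical(1), vec = fruits)
```

Either way the work is a per-element loop in R, which is why a matcher that is already vectorized over its first argument tends to be the faster route.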
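However, I have read that Vectorize is rather slow. I have also been reading about data.table::chmatch and the fastmatch package. Both implement their own %in%-type functions (%chin% in the case of data.table). These would be great, but I don't know how to make chmatch case-insensitive.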
Based on the comments above, I came up with the relatively simple solution below:
#' Quick Search
#'
#' Quick case-insensitive search of strings in a character vector
#'
#' @param str a character vector: the values to be matched
#' @param vec a character vector: the values to be matched against
#'
#' @details Utilizes \code{data.table::`%chin%`} to rapidly complete a case-insensitive search
#' through a character vector to return a logical vector of string detections.
#' Will always return TRUE or FALSE for each position of \code{str} regardless of NA missing values
#' in either provided vector. NA in \code{str} will never match an NA value in \code{vec}.
#'
#' @return a logical vector of length \code{length(str)}
#'
#' @export
#' @importFrom data.table %chin%
#'
#' @examples
#' x <- c("apple","banana","cherry",NA)
#' "apple" %qsin% x
#' c("APPLE","BANANA","coconut", NA) %qsin% x
#'
`%qsin%` <- function(str, vec) {
  # Lower-case both sides; dropping NA from vec means an NA in str never matches
  tolower(str) %chin% na.omit(tolower(vec))
}
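As a rough check of the behaviour documented above, the same idea can be sketched in base R, swapping `%in%` in for `%chin%` so that no package is required. The operator name `%qsin_base%` is hypothetical and is not part of the solution above; the sketch assumes character input.

```r
# Hypothetical base-R variant: `%in%` replaces data.table's `%chin%`.
`%qsin_base%` <- function(str, vec) {
  # Lower-case both sides, then drop NA from the lookup table so NA never matches.
  tolower(str) %in% na.omit(tolower(vec))
}

x <- c("apple", "banana", "cherry", NA)

single  <- "apple" %qsin_base% x                               # TRUE
several <- c("APPLE", "BANANA", "coconut", NA) %qsin_base% x   # TRUE TRUE FALSE FALSE
```

Both inputs are lower-cased on every call, so if `vec` is large and reused many times it may be worth lower-casing it once beforehand.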