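I am trying to replace values in a list, word, at the indices specified by the list positions, by sampling values that exist in a third list, letters.
Here is an example of my lists: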
word <- c("A","E","C","A","R","O","P")
positions <- c(1,5,3,7)
letters <- c("A","B","C","D","E","F")
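One important detail is that the values in word[positions] must not stay the same after sampling, which could otherwise happen because letters and word contain overlapping values.
The code I am currently using is: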
for (i in 1:length(positions)){
  temp <- word[[positions[i]]]
  word[[positions[i]]] <- sample(letters, 1)
  while (word[[positions[i]]] == temp) {
    word[[positions[i]]] <- sample(letters, 1)
  }
}
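While this works, I realize it is extremely inefficient, since the order in which I change the values in the list does not matter. I have been trying to use the apply family of functions to solve this, but I am having a hard time finding a solution.
Thank you very much for your attention!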
You can do it like this:
word[positions] <- sapply(word[positions],
                          function(w) sample(setdiff(letters, w), 1))
Inside sapply, the current value is always removed from letters first, so sample is guaranteed to draw a different one.
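Also note that letters is a built-in R constant (it contains the lowercase English letters, see ?letters), so it is generally not advisable to use this name for a user-defined variable.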
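To illustrate the naming point, here is a minimal sketch of the same setdiff idea with the pool stored under a different name. The name pool, the copy new_word, and the switch to vapply with an index-based draw are my own choices for the illustration, not part of the question.

```r
# Assumed names: `pool` stands in for the question's `letters`,
# so the built-in constant is not masked.
word      <- c("A", "E", "C", "A", "R", "O", "P")
positions <- c(1, 5, 3, 7)
pool      <- c("A", "B", "C", "D", "E", "F")

set.seed(1)  # only to make the illustration reproducible
new_word <- word
new_word[positions] <- vapply(
  word[positions],
  function(w) {
    candidates <- setdiff(pool, w)                 # drop the current value
    candidates[sample.int(length(candidates), 1)]  # draw one candidate by index
  },
  character(1),       # each call must return exactly one string
  USE.NAMES = FALSE
)

# Every targeted position changed; every other position is untouched.
stopifnot(all(new_word[positions] != word[positions]))
stopifnot(identical(new_word[-positions], word[-positions]))
```

vapply fails loudly if a call returns anything other than a single string, and the built-in letters remains available under its usual name.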
Since the probability of having to resample is small, resampling in a vectorized way will be very efficient.
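rreplace <- function(x, y, i) {
  v <- x  # keep the original values for comparison
  while (length(i)) {
    # redraw all positions that still need a new value, in one call
    x[i] <- sample(y, length(i), replace = TRUE)
    # keep only the positions whose draw equals the original value
    i <- i[v[i] == x[i]]
  }
  x
}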
word <- c("A","E","C","A","R","O","P")
positions <- c(1,5,3,7)
letters <- c("A","B","C","D","E","F")
rreplace(word, letters, positions)
#> [1] "C" "E" "D" "A" "A" "O" "F"
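To get a feel for why few passes are needed: if y has no duplicates, a single draw equals the old value with probability at most 1/length(y), so the set of positions left to redraw shrinks quickly. The sketch below is an instrumented copy of the loop that only counts passes. The name count_rounds and the test data are assumptions made for this illustration.

```r
# Hypothetical helper: same loop as rreplace, but returns the number of passes.
count_rounds <- function(x, y, i) {
  v <- x
  rounds <- 0L
  while (length(i)) {
    rounds <- rounds + 1L
    x[i] <- sample(y, length(i), replace = TRUE)
    i <- i[v[i] == x[i]]  # positions that drew their old value again
  }
  rounds
}

set.seed(1)  # only to make the illustration reproducible
x <- sample(LETTERS, 1e5, replace = TRUE)
y <- LETTERS[1:15]
i <- sample(length(x), 1e4)
n_rounds <- count_rounds(x, y, i)
n_rounds
```

Each pass is one vectorized sample call, so the cost is a handful of calls rather than one call per position.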
A larger example with a benchmark:
word <- sample(LETTERS, 1e5, 1)
letters <- LETTERS[1:15]
positions <- sample(length(word), 1e4)
# check that the correct words were replaced
word2 <- rreplace(word, letters, positions)
all(word[positions] != word2[positions])
#> [1] TRUE
all(word[-positions] == word2[-positions])
#> [1] TRUE
microbenchmark::microbenchmark(rreplace = rreplace(word, letters, positions),
RobertHacken = sapply(word[positions], function(w) sample(setdiff(letters, w), 1)))
#> Unit: milliseconds
#> expr min lq mean median uq max neval
#> rreplace 1.1181 1.33035 1.665195 1.61985 1.8919 3.9958 100
#> RobertHacken 104.9374 145.25685 151.923915 156.94295 165.9491 198.5219 100
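For completeness, the redraw loop can also be avoided altogether by sampling an index and stepping over the slot of the current value. This is a different technique from either answer above and I have not benchmarked it. The name replace_shift is hypothetical, and the sketch assumes the pool has no duplicates and at least two elements.

```r
# Hypothetical helper: draw a replacement without any redraw loop.
# Assumes `pool` has no duplicates and at least two elements.
replace_shift <- function(x, pool, i) {
  k <- length(pool)
  stopifnot(k >= 2L, anyDuplicated(pool) == 0L)
  cur <- match(x[i], pool)                # pool index of the current value, NA if absent
  n_choices <- ifelse(is.na(cur), k, k - 1L)
  # uniform integer in 1..n_choices for each position
  draw <- floor(runif(length(i)) * n_choices) + 1L
  # skip the slot occupied by the current value
  bump <- !is.na(cur) & draw >= cur
  draw[bump] <- draw[bump] + 1L
  x[i] <- pool[draw]
  x
}

set.seed(42)  # only to make the illustration reproducible
w   <- sample(LETTERS, 1000, replace = TRUE)
p   <- sample(length(w), 200)
out <- replace_shift(w, LETTERS[1:15], p)

stopifnot(all(out[p] != w[p]))    # every targeted position changed
stopifnot(all(out[-p] == w[-p]))  # all other positions are untouched
```

Drawing from k - 1 slots and shifting draws at or above the current index by one gives each of the other pool values the same probability, so the result has the same distribution as the setdiff version.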