r - How to check if values in individual rows of a data.table are identical

Suppose I have the following data.table:

dt <- data.table(a = 1:2, b = 1:2, c = c(1, 1))
# dt
#    a b c
# 1: 1 1 1
# 2: 2 2 1

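What is the fastest way to create a fourth column d that indicates whether the pre-existing values in each row are all identical, so that the resulting data.table looks like this?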

# dt
#    a b c              d
# 1: 1 1 1      identical
# 2: 2 2 1  not_identical

I would like to avoid the duplicated function and stick to identical or a similar function, even if that means iterating over the items in each row.

We can group by row, count the distinct values in each row with uniqueN, and turn that count into a logical expression (== 1):

library(data.table)
dt[, d := c("not_identical", "identical")[(uniqueN(unlist(.SD)) == 1) + 1],
   by = 1:nrow(dt)]

Output:

dt
#   a b c             d
#1: 1 1 1     identical
#2: 2 2 1 not_identical
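
The same idea can be spelled out in separate steps, which may make the by-row grouping easier to follow. This is only a sketch: the helper column n_distinct and the vector cols are names introduced here for illustration, and restricting the comparison to columns a, b and c through .SDcols is an assumption (it keeps an already existing d column out of the comparison if the code is run twice).

```r
library(data.table)

dt <- data.table(a = 1:2, b = 1:2, c = c(1, 1))

# Assumption: only columns a, b and c should be compared.
cols <- c("a", "b", "c")

# Group by row number so that .SD holds exactly one row at a time,
# then count the distinct values in that row.
dt[, n_distinct := uniqueN(unlist(.SD)), by = seq_len(nrow(dt)), .SDcols = cols]

# A row counts as "identical" when it contains a single distinct value.
dt[, d := fifelse(n_distinct == 1L, "identical", "not_identical")]

# Drop the illustrative helper column again.
dt[, n_distinct := NULL]
```

fifelse is data.table's own counterpart to ifelse; the indexing trick from the answer above works just as well in its place.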

Another, possibly more efficient, approach is to compare every column against the first column and use rowSums to build the expression:

# The parentheses around the comparison matter: without them, 1 + rowSums(...) > 0 is always TRUE
dt[, d := c("identical", "not_identical")[1 + (rowSums(.SD[[1]] != .SD) > 0)]]
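
To see what the rowSums comparison does, it may help to build the intermediate logical matrix explicitly. The sketch below uses a three-row table (made-up values, not from the question) so that the indexing is exercised beyond the two-row example; differs and cols are names introduced here for illustration, and comparing only a, b and c is an assumption.

```r
library(data.table)

# Three rows: all equal, last column differs, middle column differs.
dt <- data.table(a = 1:3, b = c(1L, 2L, 0L), c = c(1, 1, 3))

cols <- c("a", "b", "c")  # assumption: compare only these columns

# Logical matrix: TRUE where a cell differs from the first column of its row.
# The vector is recycled down each column, so every column is compared
# element by element with column a.
differs <- as.matrix(dt[, ..cols]) != dt[[cols[1]]]

# rowSums counts the differing cells per row; > 0 means "not all equal".
# Adding 1 turns FALSE/TRUE into the positions 1/2 of the label vector.
dt[, d := c("identical", "not_identical")[1 + (rowSums(differs) > 0)]]
```

Because nothing is grouped by row here, the whole comparison is vectorised, which is usually what makes this variant attractive on larger tables.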

Here is another data.table option using var:

dt[, d := ifelse(var(unlist(.SD)) == 0, "identical", "non_identical"), seq(nrow(dt))]

which gives

> dt
   a b c             d
1: 1 1 1     identical
2: 2 2 1 non_identical
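
Since the question asks for something built on identical, and var is meant for numeric data, a row-wise check based on identical could look roughly like the sketch below. all_identical is a hypothetical helper written for this example, not a data.table or base R function, and limiting the comparison to a, b and c is an assumption.

```r
library(data.table)

dt <- data.table(a = 1:2, b = 1:2, c = c(1, 1))

# Hypothetical helper (not part of data.table): TRUE when every element
# of x is identical() to the first one.
all_identical <- function(x) {
  all(vapply(x, identical, logical(1), x[[1]]))
}

# unlist(.SD) coerces the row to one common type first, so the integer 1L
# in column a and the double 1 in column c compare as identical here.
dt[, d := if (all_identical(unlist(.SD))) "identical" else "not_identical",
   by = seq_len(nrow(dt)), .SDcols = c("a", "b", "c")]
```

This iterates over the items of each row, so it is likely to be slower than the vectorised rowSums variant; it is shown only because it matches the stated preference for identical.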
