I am confused by the frank function. The documentation here says:
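Only for lists, data.frames and data.tables. The columns to calculate ranks based on. Do not quote column names. If ... is missing, all columns are considered by default. To sort by a column in descending order prefix a "-", e.g., frank(x, a, -b, c). The -b works when b is of type character as well.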
So I have my data:
structure(list(product = c("Product 1", "Product 1", "Product 1",
"Product 1", "Product 1", "Product 5", "Product 5", "Product 5",
"Product 5", "Product 5"), policyID = c("A738-33", "A738-33",
"A738-33", "A738-33", "A738-33", "A738-33", "A738-33",
"A738-33", "A738-33", "A738-33"), startYear = c(2014,
2015, 2016, 2017, 2018, 2014, 2015, 2016, 2017, 2018), total = c("30000",
"30000", "30000", "30000", "30000", "10000", "10000", "10000",
"10000", "10000"), daily = c("150", "150", "150", "150", "150",
"80", "80", "80", "80", "80")), class = c("data.table", "data.frame"
), row.names = c(NA, -10L), .internal.selfref = <pointer: 0x7feec50126e0>, sorted = "product")
I want to rank this data by the columns total and daily. So I did this:
> setDT(testDT)
> frankv(testDT, totallimit, rbddaily, ties.method="dense")
Error in colnamesInt(x, cols, check_dups = TRUE) :
argument specifying columns specify non existing column(s): cols[1]='30000'
Strangely, when I do use quotes, the exact opposite of what the documentation says, I get a result:
frankv(testDT, cols=c("totallimit", "rbddaily"), ties.method="dense")
I also tried to integrate this into a data.table call, and another strange thing happened: from my 10 rows of data I got 100 rows.
testDT[,.(rank = frankv(testDT, cols=c("limit", "daily"), ties.method="dense")), by = c("policyID", "product", "startYear")]
What am I doing wrong, and how can I fix it? The documentation is not much help; maybe I am missing something...
For frank you should not quote the column names, but for frankv (the function you used) you should:
library(data.table)
frank(testDT, total, daily, ties.method="dense")
[1] 2 2 2 2 2 1 1 1 1 1
frankv(testDT, cols=c("total", "daily"), ties.method="dense")
[1] 2 2 2 2 2 1 1 1 1 1
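As for the 100 rows: inside the grouped call, frankv(testDT, ...) still ranks the whole 10-row table, and it does so once for each of the 10 groups, so every group returns 10 ranks. A minimal sketch of one way around this follows. It uses a small made-up table (dt, with numeric total and daily, is a hypothetical stand-in for your data), and the column names rank and rank_in_product are only illustrative. The idea is to rank once and store the result with :=, or to pass .SD when a rank within each group is really what is wanted.

```r
library(data.table)

# Hypothetical stand-in for the question's data (numeric columns for simplicity)
dt <- data.table(
  product   = rep(c("Product 1", "Product 5"), each = 3),
  startYear = rep(2014:2016, times = 2),
  total     = rep(c(30000, 10000), each = 3),
  daily     = rep(c(150, 80), each = 3)
)

# frank(): unquoted column names
r1 <- frank(dt, total, daily, ties.method = "dense")

# frankv(): column names as a character vector
r2 <- frankv(dt, cols = c("total", "daily"), ties.method = "dense")

# Rank the whole table once and keep one rank per row
dt[, rank := frankv(dt, cols = c("total", "daily"), ties.method = "dense")]

# Rank within each group: .SD holds only the rows of the current group
dt[, rank_in_product := frankv(.SD, cols = c("total", "daily"),
                               ties.method = "dense"),
   by = product]

print(dt)
```

Because := adds a column by reference, the table keeps its original number of rows. In this sketch every row within a product is tied, so rank_in_product is 1 throughout.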