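I need some help adjusting the second code below. Both codes are meant to do the same thing. The first code generates the output table correctly; the second, which uses data.table functions, does not. The second code comes from this question: How to adjust code using data.table functions to fit a specification. For the example in that question the result was correct, but when I test it on my new database it does not give the expected result. The output table of the second code must also be coef = 14 and Result = 1.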
Libraries
library(dplyr)
library(tidyverse)
library(lubridate)
library(data.table)
Database
df1<-structure(list(Id = 8, date1 = structure(1649376000, tzone = "UTC", class = c("POSIXct",
"POSIXt")), date2 = structure(1649376000, tzone = "UTC", class = c("POSIXct",
"POSIXt")), Week = "Friday", DT = "0", Category = "ABC",
GR = 1, DayR1 = 0, DayM000 = 13, DayM001 = 13,
DayM002 = 14, DayM003 = 14, DayM004 = 13, DayM005 = 13, DayM006 = 13,
DayM007 = 12, DayM008 = 12, DayM009 = 12, coef = 14), class = "data.frame", row.names = c(NA,
-1L))
Id date1 date2 Week DT Category GR DayR1 DayM000 DayM001 DayM002 DayM003 DayM004 DayM005 DayM006 DayM007 DayM008 DayM009 coef
1 8 2022-04-08 2022-04-08 Friday 0 ABC 1 0 13 13 14 14 13 13 13 12 12 12 14
First code (the result is correct)
df1%>% mutate(across(starts_with("Day"), ~coef - .),
across(contains("date"), ymd),
datedif = parse_number(as.character(date2-date1)))%>%
rename_with(~str_replace(.,'(?<=[A-Z])0+(?=.)', ""), starts_with('Day')) %>%
rowwise %>%
mutate(Result= if (str_c('DayM', datedif) %in% names(.)) get(str_c('DayM', datedif)) else coef) %>%
ungroup() %>%
select(coef, Result)%>%data.frame()
coef Result
1 14 1
Second code (using data.table functions). The result is wrong
dr_names <- grep("^Day", names(df1), value = TRUE)
date_names <- grep("date", names(df1), value = TRUE)
setDT(df1)[, (dr_names) := lapply(.SD, function(x) coef - x), .SDcols = dr_names
][, (date_names) := lapply(.SD, as.IDate), .SDcols = date_names
][, datedif := date2 - date1]
setnames(df1, dr_names, sub("([A-Z])0+", "\\1", dr_names))
df1[, .(coef, Result = fcoalesce(as.matrix(.SD)[cbind(.I,
match(paste0('DayM', datedif), names(.SD)))], coef)), .SDcols = patterns("^DayM\\d+")]%>%data.frame()
coef Result
1 14 14
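The issue is in the sub call used with setnames. It matches one or more zeros (0+) that follow an uppercase letter ([A-Z]) and drops those zeros in the replacement. It worked on the previous dataset because the column names followed a different pattern. Here, dr_names is: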
> dr_names
[1] "DayR1" "DayM000" "DayM001" "DayM002" "DayM003" "DayM004" "DayM005" "DayM006" "DayM007" "DayM008" "DayM009"
So 'DayM000' becomes DayM, which does not match the DayM0 returned by paste0("DayM", datedif). Changing the pattern so that it keeps the last digit fixes it:
setnames(df1, dr_names, sub("([A-Z])0+(.)", "\\1\\2", dr_names))
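To see the difference between the two patterns in isolation, here is a minimal sketch in base R. It uses a shortened copy of the column names from the question; the variable names old_names, new_names and target are only illustrative and do not appear in the original code.

```r
# Shortened copy of the column names from the question
dr_names <- c("DayR1", "DayM000", "DayM001", "DayM002", "DayM009")

# Original pattern: removes every zero that follows the capital letter
old_names <- sub("([A-Z])0+", "\\1", dr_names)

# Adjusted pattern: the second group keeps the character after the zeros
new_names <- sub("([A-Z])0+(.)", "\\1\\2", dr_names)

# Name the lookup searches for when datedif is 0
target <- paste0("DayM", 0)

print(old_names)            # "DayR1" "DayM"  "DayM1" "DayM2" "DayM9"
print(new_names)            # "DayR1" "DayM0" "DayM1" "DayM2" "DayM9"
print(target %in% old_names) # FALSE
print(target %in% new_names) # TRUE
```

With the original pattern the name DayM0 never exists, so the later match() has nothing to find.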
Now, we run the code
df1[, .(coef, Result = fcoalesce(as.matrix(.SD)[cbind(.I,
match(paste0('DayM', datedif), names(.SD)))], coef)),
.SDcols = patterns("^DayM\\d+")]%>%
data.frame()
coef Result
1 14 1
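The reason a missing name produced Result = 14 rather than an error can be sketched in base R. This is only an illustration under stated assumptions: lookup_by_name is a hypothetical helper, not a data.table function, and ifelse() stands in for fcoalesce(). The values are coef minus the first four DayM columns from the question.

```r
# Hypothetical helper (not part of data.table): pick one column per row by
# name, falling back to `default` when that name is missing.
lookup_by_name <- function(m, col_name, default) {
  rows <- seq_len(nrow(m))
  cols <- match(col_name, colnames(m))    # NA when the name is absent
  picked <- m[cbind(rows, cols)]          # an index row containing NA yields NA
  ifelse(is.na(picked), default, picked)  # base R stand-in for fcoalesce()
}

# coef - x for DayM000..DayM003 in the question: 14 - c(13, 13, 14, 14)
vals <- c(1, 1, 0, 0)
good <- matrix(vals, nrow = 1,
               dimnames = list(NULL, c("DayM0", "DayM1", "DayM2", "DayM3")))
# Names as produced by the original pattern, where DayM000 became DayM
bad <- matrix(vals, nrow = 1,
              dimnames = list(NULL, c("DayM", "DayM1", "DayM2", "DayM3")))

res_good <- lookup_by_name(good, "DayM0", default = 14)
res_bad  <- lookup_by_name(bad,  "DayM0", default = 14)
print(res_good)  # 1
print(res_bad)   # 14
```

When match() returns NA, the matrix lookup returns NA as well, and the fallback silently substitutes coef. That is why the wrong column names showed up as a wrong value instead of an error.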