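This is just a basic loop that does what you want. It isn't particularly efficient, but I can't think of a good way to make it faster with vectorization.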
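overflow <- 0
for (i in seq_len(nrow(d))) {
  if (d$kWh[i] + overflow > 20) {
    # Over the limit: cap at 20 and carry the excess forward
    d$limit_kWh[i] <- 20
    overflow <- d$kWh[i] + overflow - 20
  } else {
    # At or under the limit: release everything carried so far
    d$limit_kWh[i] <- d$kWh[i] + overflow
    overflow <- 0
  }
}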
Essentially, any amount above 20 is stored in the overflow variable, which is updated once per entry.
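To make the carry-over concrete, here is a minimal sketch that wraps the same loop in a function and runs it on a small made-up vector. The function name cap_with_carry, its limit argument, and the sample values are assumptions for illustration; they are not part of the original answer.

```r
# Hypothetical helper: cap each value at `limit`, pushing the excess forward.
cap_with_carry <- function(kwh, limit = 20) {
  out <- numeric(length(kwh))
  overflow <- 0
  for (i in seq_along(kwh)) {
    total <- kwh[i] + overflow
    if (total > limit) {
      out[i] <- limit
      overflow <- total - limit
    } else {
      out[i] <- total
      overflow <- 0
    }
  }
  out
}

# Made-up sample values, not the data from the question
capped <- cap_with_carry(c(12, 25, 30, 5, 3))
print(capped)
# [1] 12 20 20 20  3
```

The excess of 5 from the second value grows to 15 after the third, and all of it is released in the fourth interval (5 + 15 = 20). Note that any overflow still pending after the last row is simply dropped, which is also how the loop above behaves.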
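Actually, here is an approach that is about 2x faster and relies more on vectorization. It involves creating an overflow vector that holds the amount carried over from the previous interval.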
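overflow <- numeric(nrow(d))
# seq_len(n)[-1] is 2..n, and is empty when d has fewer than 2 rows
for (i in seq_len(nrow(d))[-1]) {
  # Excess carried into interval i from interval i - 1
  overflow[i] <- max(d$kWh[i - 1] + overflow[i - 1] - 20, 0)
}
d$limit_kWh <- pmin(d$kWh + overflow, 20)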
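As a rough check that the overflow-vector version produces the same values as the plain loop, the sketch below runs both on the kWh values from the sample data used elsewhere in this thread. The names loop_result, vec_result, and carry are assumptions, and the loop is written in a condensed min()/max() form that should be equivalent to the if/else version.

```r
# kWh values copied from the sample data used elsewhere in this thread
d <- data.frame(kWh = c(12.2, 14.7, 20.2, 30.7, 36.3, 36.7, 30.1, 26.3, 18.1,
                        15.8, 11.4, 10.2, 11.9, 12.3, 9.1, 8.6, 8.3, 10.1))

# Approach 1: plain loop with a scalar carry (condensed if/else)
loop_result <- numeric(nrow(d))
carry <- 0
for (i in seq_len(nrow(d))) {
  total <- d$kWh[i] + carry
  loop_result[i] <- min(total, 20)
  carry <- max(total - 20, 0)
}

# Approach 2: overflow vector, then a single vectorized pmin()
overflow <- numeric(nrow(d))
for (i in seq_len(nrow(d))[-1]) {
  overflow[i] <- max(d$kWh[i - 1] + overflow[i - 1] - 20, 0)
}
vec_result <- pmin(d$kWh + overflow, 20)

# Compare with a numeric tolerance rather than exact equality
same <- isTRUE(all.equal(loop_result, vec_result))
print(same)
```

The recursion itself still needs a loop, because each overflow value depends on the previous one; only the final capping step is vectorized.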
One approach is to use Reduce with accumulate = TRUE. The method is the same as in the answer given by @Noah.
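# Running overflow after each interval, starting from 0
carry <- Reduce(function(carry, kWh) max(0, carry + kWh - 20),
                x$kWh, 0, accumulate = TRUE)
# Drop the last element so each row sees the overflow from the rows before it
x$limit_kWh <- pmin(20, x$kWh + head(carry, -1))
x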
# interval starting kWh limit_kWh
#1 2021-01-01 19:00 12.2 12.2
#2 2021-01-01 19:30 14.7 14.7
#3 2021-01-01 20:00 20.2 20.0
#4 2021-01-01 20:30 30.7 20.0
#5 2021-01-01 21:00 36.3 20.0
#6 2021-01-01 21:30 36.7 20.0
#7 2021-01-01 22:00 30.1 20.0
#8 2021-01-01 22:30 26.3 20.0
#9 2021-01-01 23:00 18.1 20.0
#10 2021-01-01 23:30 15.8 20.0
#11 2021-01-02 00:00 11.4 20.0
#12 2021-01-02 00:30 10.2 20.0
#13 2021-01-02 01:00 11.9 20.0
#14 2021-01-02 01:30 12.3 20.0
#15 2021-01-02 02:00 9.1 20.0
#16 2021-01-02 02:30 8.6 17.7
#17 2021-01-02 03:00 8.3 8.3
#18 2021-01-02 03:30 10.1 10.1
Data:
x <- read.table(header = TRUE, check.names = FALSE,
text = '"interval starting" kWh
"2021-01-01 19:00" 12.2
"2021-01-01 19:30" 14.7
"2021-01-01 20:00" 20.2
"2021-01-01 20:30" 30.7
"2021-01-01 21:00" 36.3
"2021-01-01 21:30" 36.7
"2021-01-01 22:00" 30.1
"2021-01-01 22:30" 26.3
"2021-01-01 23:00" 18.1
"2021-01-01 23:30" 15.8
"2021-01-02 00:00" 11.4
"2021-01-02 00:30" 10.2
"2021-01-02 01:00" 11.9
"2021-01-02 01:30" 12.3
"2021-01-02 02:00" 9.1
"2021-01-02 02:30" 8.6
"2021-01-02 03:00" 8.3
"2021-01-02 03:30" 10.1')
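To see what the Reduce call in this answer accumulates, here is a small sketch on made-up values. The names kwh, limit, carry_after, and carry_in are assumptions for illustration only.

```r
kwh <- c(12, 25, 30, 5, 3)  # made-up values, not the data above
limit <- 20

# With an initial value and accumulate = TRUE, Reduce returns the initial 0
# followed by the running overflow after each element (length(kwh) + 1 values).
carry_after <- Reduce(function(carry, v) max(0, carry + v - limit),
                      kwh, 0, accumulate = TRUE)
print(carry_after)
# [1]  0  0  5 15  0  0

# Dropping the last element aligns each value with the overflow entering it
carry_in <- head(carry_after, -1)
capped <- pmin(limit, kwh + carry_in)
print(capped)
# [1] 12 20 20 20  3
```

The head(..., -1) step is what shifts the accumulated overflow by one position, so that each interval is combined with the excess left over from the intervals before it rather than with its own.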
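I took @Noah's basic logic and put it into a datastep(). It gives the same result and is no more efficient than the for loop, but it is easier to read.
Here is the input data: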
# Input data
dt <- read.table(header = TRUE, text = '
interval_starting kWh
"2021-01-01 19:00" 12.2
"2021-01-01 19:30" 14.7
"2021-01-01 20:00" 20.2
"2021-01-01 20:30" 30.7
"2021-01-01 21:00" 36.3
"2021-01-01 21:30" 36.7
"2021-01-01 22:00" 30.1
"2021-01-01 22:30" 26.3
"2021-01-01 23:00" 18.1
"2021-01-01 23:30" 15.8
"2021-01-02 00:00" 11.4
"2021-01-02 00:30" 10.2
"2021-01-02 01:00" 11.9
"2021-01-02 01:30" 12.3
"2021-01-02 02:00" 9.1
"2021-01-02 02:30" 8.6
"2021-01-02 03:00" 8.3
"2021-01-02 03:30" 10.1')
Here is the data step:
library(libr)
# Run datastep
res <- datastep(dt,
                retain = list(overflow = 0),
                calculate = {limit = 20},
                drop = c("limit", "overflow"),
                {
                  if (kWh + overflow > limit) {
                    limit_kWh <- limit
                    overflow <- kWh + overflow - limit
                  } else {
                    limit_kWh <- kWh + overflow
                    overflow <- 0
                  }
                })
Here are the results:
# View results
res
# interval_starting kWh limit_kWh
# 1 2021-01-01 19:00 12.2 12.2
# 2 2021-01-01 19:30 14.7 14.7
# 3 2021-01-01 20:00 20.2 20.0
# 4 2021-01-01 20:30 30.7 20.0
# 5 2021-01-01 21:00 36.3 20.0
# 6 2021-01-01 21:30 36.7 20.0
# 7 2021-01-01 22:00 30.1 20.0
# 8 2021-01-01 22:30 26.3 20.0
# 9 2021-01-01 23:00 18.1 20.0
# 10 2021-01-01 23:30 15.8 20.0
# 11 2021-01-02 00:00 11.4 20.0
# 12 2021-01-02 00:30 10.2 20.0
# 13 2021-01-02 01:00 11.9 20.0
# 14 2021-01-02 01:30 12.3 20.0
# 15 2021-01-02 02:00 9.1 20.0
# 16 2021-01-02 02:30 8.6 17.7
# 17 2021-01-02 03:00 8.3 8.3
# 18 2021-01-02 03:30 10.1 10.1
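As a quick sanity check on the printed results, the sketch below confirms two properties in base R: no interval exceeds the limit of 20, and the total energy is unchanged. The values are copied from the table above; the variable names are assumptions. The totals only match here because the overflow is back to zero by the last row.

```r
# Input and output columns copied from the printed results above
kwh <- c(12.2, 14.7, 20.2, 30.7, 36.3, 36.7, 30.1, 26.3, 18.1,
         15.8, 11.4, 10.2, 11.9, 12.3, 9.1, 8.6, 8.3, 10.1)
limit_kwh <- c(12.2, 14.7, rep(20, 13), 17.7, 8.3, 10.1)

# No interval should exceed the cap
within_limit <- all(limit_kwh <= 20)

# Capping only moves energy to later intervals, so the totals should agree
# (compared with a tolerance because these are floating-point sums)
total_preserved <- isTRUE(all.equal(sum(kwh), sum(limit_kwh)))

print(c(within_limit = within_limit, total_preserved = total_preserved))
```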