r - Calculate differences in nested data frames and divide a list of matrices by the result



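I split a data frame into two months (June and July). I then nested each of the two resulting data frames by ID. Each nested data frame contains an ID column and a data column.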

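The data column holds a list with the data for that ID, split into three 10-day intervals within the month. For example, for ID A, [[1]] is the first 10 days of the month, [[2]] is the second 10 days, and [[3]] is the third 10 days.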
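To make the interval layout concrete, here is a minimal base R sketch of how a date maps to its 10-day interval. The helper name interval_index is hypothetical and is not part of the original code; it only mirrors the rule that the 31st is folded into the third interval.

```r
# Hypothetical helper: map a date to its 10-day interval within the month.
# Days 1-10 -> 1, days 11-20 -> 2, day 21 through the end of the month -> 3.
interval_index <- function(date) {
  day_of_month <- as.integer(format(as.Date(date), "%d"))
  # Integer division gives 0, 1, 2 or 3 (for the 31st); cap at 2, then shift to 1..3
  pmin((day_of_month - 1L) %/% 10L, 2L) + 1L
}

dates <- as.Date(c("2010-06-01", "2010-06-10", "2010-06-11",
                   "2010-06-21", "2010-07-31"))
interval_index(dates)
```

For these five dates the call returns 1 1 2 3 3, so July 31 lands in the same interval as July 21.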

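For the next step, I want to go through each list for each ID and calculate the differences between the minimum jDate in nested_june and in nested_july, as shown below in n1, n2 and n3. I then combine these differences into a matrix, m1.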

Finally, I have a list l1 that contains two matrices, and I want to divide each matrix in the list by m1.

Is there a more efficient way to calculate the differences and to divide the matrices in the list?

library(lubridate)
library(dplyr)
library(tidyr)
library(purrr)
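# Split one ID's data into 10-day intervals (days 1-10, 11-20, 21 to end of month)
f <- function(data) {
  data %>%
    mutate(
      new = floor_date(date, "10 days"),
      # fold day 31 into the interval that starts on the 21st
      new = if_else(day(new) == 31, new - days(10), new)
    ) %>%
    group_split(new)
}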
ID <- rep(c("A", "B", "C", "D"), 1000)
date <- rep_len(seq(dmy("01-01-2010"), dmy("31-12-2013"), by = "days"), 500)
x <- runif(length(date), min = 60000, max = 80000)
y <- runif(length(date), min = 800000, max = 900000)
# date, x and y (length 500) are recycled to the length of ID (4000)
df <- data.frame(date = date,
                 x = x,
                 y = y,
                 ID)
df$jDate <- julian(as.Date(df$date), origin = as.Date("1970-01-01"))
df$Month <- month(df$date)
df_june <- filter(df, Month == 6)
df_july <- filter(df, Month == 7)
nested_june <- tibble(df_june) %>%
  group_by(ID) %>%
  nest() %>%
  mutate(data = map(data, f))
nested_july <- tibble(df_july) %>%
  group_by(ID) %>%
  nest() %>%
  mutate(data = map(data, f))
# Create list of matrices
t1 <- c(100, 150, 200)
t2 <- c(200, 250, 350)
t3 <- c(300, 350, 400)
mat <- cbind(t1, t2, t3)
t1 <- c(150, 150, 200)
t2 <- c(250, 250, 350)
t3 <- c(350, 350, 400)
mat2 <- cbind(t1, t2, t3)
l1 <- list(list(mat), list(mat2))
## Hoping to get a function for everything below here ##
# Calculate the difference in days from the first day of one interval to the first
# day of the second interval, and repeat with the other intervals.
n1 <- c(min(nested_july[[2]][[1]][[1]]$jDate) - min(nested_june[[2]][[1]][[1]]$jDate),
        min(nested_july[[2]][[1]][[1]]$jDate) - min(nested_june[[2]][[1]][[2]]$jDate),
        min(nested_july[[2]][[1]][[1]]$jDate) - min(nested_june[[2]][[1]][[3]]$jDate))
n2 <- c(min(nested_july[[2]][[1]][[2]]$jDate) - min(nested_june[[2]][[1]][[1]]$jDate),
        min(nested_july[[2]][[1]][[2]]$jDate) - min(nested_june[[2]][[1]][[2]]$jDate),
        min(nested_july[[2]][[1]][[2]]$jDate) - min(nested_june[[2]][[1]][[3]]$jDate))
n3 <- c(min(nested_july[[2]][[1]][[3]]$jDate) - min(nested_june[[2]][[1]][[1]]$jDate),
        min(nested_july[[2]][[1]][[3]]$jDate) - min(nested_june[[2]][[1]][[2]]$jDate),
        min(nested_july[[2]][[1]][[3]]$jDate) - min(nested_june[[2]][[1]][[3]]$jDate))
m1 <- cbind(n1, n2, n3)

# Expected output as matrices
l1[[1]][[1]]/m1
l1[[2]][[1]]/m1

Try using lapply:

lapply(l1, function(sub) {
  sub <- lapply(sub, `/`, m1)
  sub
})

which gives the output

[[1]]
[[1]][[1]]
            t1        t2        t3
[1,]  3.333333  4.761905  6.000000
[2,]  8.333333  8.333333  9.210526
[3,] 20.000000 15.909091 13.333333

[[2]]
[[2]][[1]]
            t1        t2        t3
[1,]  5.000000  5.952381  7.000000
[2,]  8.333333  8.333333  9.210526
[3,] 20.000000 15.909091 13.333333
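Because every element of l1 is itself a list holding one matrix, the nested lapply can also be written with base R's rapply, which applies a function to each non-list leaf and keeps the nesting. The sketch below is only an alternative spelling of the same division; the matrix values are copied from this thread so that the block runs on its own.

```r
# Values copied from the matrices shown in this thread
mat  <- cbind(t1 = c(100, 150, 200), t2 = c(200, 250, 350), t3 = c(300, 350, 400))
mat2 <- cbind(t1 = c(150, 150, 200), t2 = c(250, 250, 350), t3 = c(350, 350, 400))
l1 <- list(list(mat), list(mat2))
m1 <- cbind(n1 = c(30, 18, 10), n2 = c(42, 30, 22), n3 = c(50, 38, 30))

# rapply() visits every non-list leaf (here: each matrix) and, with
# how = "list", returns a result with the same list structure as l1
res <- rapply(l1, function(m) m / m1, how = "list")
res[[1]][[1]]
```

The division is element-wise, so every matrix in l1 needs the same dimensions as m1.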

To create the matrix, we can do

library(tidyr)
library(purrr)
library(dplyr)
m2 <- crossing(i1 = seq_len(ncol(l1[[1]][[1]])),
               i2 = seq_len(ncol(l1[[1]][[1]]))) %>%
  transmute(new = map2_dbl(i1, i2,
                           ~ min(nested_july[[2]][[1]][[.x]]$jDate) -
                             min(nested_june[[2]][[1]][[.y]]$jDate))) %>%
  pull(new) %>%
  matrix(ncol = 3)

Checking


> m2
     [,1] [,2] [,3]
[1,]   30   42   50
[2,]   18   30   38
[3,]   10   22   30

The OP's m1

> m1
     n1 n2 n3
[1,] 30 42 50
[2,] 18 30 38
[3,] 10 22 30
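The same difference matrix can also be built from the two vectors of interval minima with base R's outer. The following is a minimal sketch, not the method used above: diff_matrix and make_part are hypothetical names, and the toy dates are assumptions chosen so that the result reproduces the m1 printed in this thread.

```r
# Hypothetical helper: rows index the June intervals, columns the July
# intervals, so entry [i, j] is min(July interval j) - min(June interval i).
diff_matrix <- function(june_parts, july_parts) {
  june_min <- vapply(june_parts, function(d) min(d$jDate), numeric(1))
  july_min <- vapply(july_parts, function(d) min(d$jDate), numeric(1))
  outer(june_min, july_min, function(jun, jul) jul - jun)
}

# Toy stand-ins for one ID's three 10-day pieces (assumed dates, three rows each)
make_part <- function(start) {
  data.frame(jDate = as.numeric(as.Date(start)) + 0:2)
}
june_parts <- lapply(c("2010-06-01", "2010-06-13", "2010-06-21"), make_part)
july_parts <- lapply(c("2010-07-01", "2010-07-13", "2010-07-21"), make_part)

m <- diff_matrix(june_parts, july_parts)

# Compare with the m1 values printed above, ignoring the column names
m1 <- cbind(n1 = c(30, 18, 10), n2 = c(42, 30, 22), n3 = c(50, 38, 30))
all.equal(m, unname(m1))
```

With the nested tibbles from the question, something like Map(diff_matrix, nested_june$data, nested_july$data) should give one matrix per ID, assuming both tibbles list the IDs in the same order and every ID has data in each interval.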
