R中的 for循环嵌套在另一个循环中



我有一个像这个代码创建的数据集:

data <- data.frame(ID=c(rep("01",3),rep("02",3)), x=c("abc","abc","cde","abc","abc","abc"), t1=c(1,6,9,1,6,11), t2=c(4,7,12,3,10,12))

和另一个由:

创建的
data2 <- data.frame(ID=c(rep("01",12),rep("02",12)), t = rep(1:12,2), x= c(rep(NA,24)))

,我想要得到的是:

data_final <- data.frame(ID=c(rep("01",12),rep("02",12)), t = rep(1:12,2), x= c(rep("abc",4),NA,"abc","abc",NA,rep("cde",4),rep("abc",3),rep(NA,2),rep("abc",7)))

这段代码只适用于一个ID,我不知道如何处理更多ID:

for (i in 1:nrow(data)) {
data2$x[data$t1[i]:data$t2[i]] <- data$x[i]
}

任何想法?

下面是我如何使用tidyverse元包中的tidyr/dplyr

library(tidyverse)
data_expanded <- data %>%
uncount(t2 - t1 + 1, .id = "row") %>%
mutate(t = t1 + row - 1) %>%   # convert to t 
select(ID, t, x)               # only need columns ID, t, and x
data2 %>%
select(-x) %>%                 # we'll add x from data_expanded next
left_join(data_expanded)

首先,将data转换为明确列出与其相关的每个IDt的表。我使用tidyr::uncount来复制t1t2范围内每个t的每一行。

一旦这个表准备好了,它就是一个到原始文件的简单连接。

结果:

Joining, by = c("ID", "t")
ID  t    x
1  01  1  abc
2  01  2  abc
3  01  3  abc
4  01  4  abc
5  01  5 <NA>
6  01  6  abc
7  01  7  abc
8  01  8 <NA>
9  01  9  cde
10 01 10  cde
11 01 11  cde
12 01 12  cde
13 02  1  abc
14 02  2  abc
15 02  3  abc
16 02  4 <NA>
17 02  5 <NA>
18 02  6  abc
19 02  7  abc
20 02  8  abc
21 02  9  abc
22 02 10  abc
23 02 11  abc
24 02 12  abc

如果您想在基本R中使用loop,下面的代码将满足您的要求。

# loop each row of data2
for (i in 1: dim(data2)[1]){
IDi <- data2[i, 1] # id of ith row
IDi_idx <-(IDi == data$ID) # use this IDi to find the index (same ID) in data 
seldf_i <- data[IDi_idx, ] # subset data using IDi_idx
# if data2$x is NA, then check if data2[i, 2] is in the range [t1:t2] by each row of seldf_i; yes assigns according x, no with NA
for (j in 1: dim(seldf_i)[1]){
if (is.na(data2[i, 3])){
data2[i, 3] <- ifelse(data2[i, 2] %in% (seldf_i[j, 3]:seldf_i[j, 4]), 
as.character(seldf_i[j, 2]), NA)}
}
}
data2

data.table选项

data_final <- setDT(data)[
,
.(t = seq(t1, t2)), .(ID, x, seq(nrow(data)))
][
setDT(data2),
on = .(ID, t)
][
,
.(ID, t, x)
][]

> data_final
ID  t    x
1: 01  1  abc
2: 01  2  abc
3: 01  3  abc
4: 01  4  abc
5: 01  5 <NA>
6: 01  6  abc
7: 01  7  abc
8: 01  8 <NA>
9: 01  9  cde
10: 01 10  cde
11: 01 11  cde
12: 01 12  cde
13: 02  1  abc
14: 02  2  abc
15: 02  3  abc
16: 02  4 <NA>
17: 02  5 <NA>
18: 02  6  abc
19: 02  7  abc
20: 02  8  abc
21: 02  9  abc
22: 02 10  abc
23: 02 11  abc
24: 02 12  abc
ID  t    x

最新更新