我有以下示例数据集显示熊是在陆地上还是在海冰上:
Bear.ID Region
1 A Land
2 A Land
3 A Land
4 A Ice
5 A Ice
6 A Ice
7 A Ice
8 B Land
9 B Ice
10 B Land
11 B Land
12 B Ice
13 B Ice
14 B Ice
15 B Ice
我的目标是创建另一个列,显示每只熊离开到海冰上。这个出发日期被定义为熊在冰面上的第一行,而不是熊在陆地上的那一行。
在我的示例数据集中,列是这样的:
Bear.ID Region Departure?
1 A Land Not Departure
2 A Land Not Departure
3 A Land Not Departure
4 A Ice Departure
5 A Ice Not Departure
6 A Ice Not Departure
7 A Ice Not Departure
8 B Land Not Departure
9 B Ice Not Departure
10 B Land Not Departure
11 B Land Not Departure
12 B Ice Departure
13 B Ice Not Departure
14 B Ice Not Departure
15 B Ice Not Departure
我如何在R中做到这一点?"Departure"one_answers"不离开";如果方便的话,也可以用TRUE和FALSE替换。
So I think this might help you, I am checking if my current land is 'ice', the one before is 'land' and the one after is also 'ice'
library(dplyr)
data.frame(
bear = c("A","A","A","B","B","B","B","B"),
region = c("Land","Ice","Ice","Land","Ice","Land","Ice","Ice")
) %>%
group_by(bear) %>%
mutate(
departure = case_when(
region == 'Ice' & lag(region) == "Land" & lead(region) == "Ice" ~ "Departure",
TRUE ~ "Not departure"
)
)
# A tibble: 8 x 3
# Groups: bear [2]
bear region departure
<chr> <chr> <chr>
1 A Land Not departure
2 A Ice Departure
3 A Ice Not departure
4 B Land Not departure
5 B Ice Not departure
6 B Land Not departure
7 B Ice Departure
8 B Ice Not departure
我已经生成了一个示例数据框架。
library(data.table)
library(tidyverse)
id <-c(1, 1, 1, 2, 2)
date <- c("2022-10-01 22:22:01","2022-11-05 22:22:01","2022-08-18 12:48:16","2022-11-19 20:57:44","2022-12-19 20:57:44")
date_1 <- c("2022-11-01 22:22:01","2022-11-02 22:22:01","2022-11-03 12:48:16","2022-11-04 20:57:44","2022-11-05 20:57:44")
date_2 <- c("2022-12-01 22:22:01","2022-12-02 22:22:01","2022-12-03 12:48:16","2022-12-04 20:57:44","2022-12-05 20:57:44")
df <- data.table(id,date,date_1, date_2)
在这一步中,我格式化所有的日期列。
#### format date
df$date <- as.POSIXct(df$date)
df$date_1 <- as.POSIXct(df$date_1)
df$date_2 <- as.POSIXct(df$date_2)
检查日期列是否在时间范围内
#### check if date is in time range
df$Departure_Date <- ifelse(df$date <= df$date_2 & df$date >= df$date_1, "Departure", "Not Departure")
结果如下:
id date date_1 date_2 Departure.Date
1: 1 2022-10-01 22:22:01 2022-11-01 22:22:01 2022-12-01 22:22:01 Not Departure
2: 1 2022-11-05 22:22:01 2022-11-02 22:22:01 2022-12-02 22:22:01 Departure
3: 1 2022-08-18 12:48:16 2022-11-03 12:48:16 2022-12-03 12:48:16 Not Departure
4: 2 2022-11-19 20:57:44 2022-11-04 20:57:44 2022-12-04 20:57:44 Departure
5: 2 2022-12-19 20:57:44 2022-11-05 20:57:44 2022-12-05 20:57:44 Not Departure