r - Aggregate daily data from multiple locations into a single alert output based on a tiered warning system



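I am currently working with a large dataset that records daily data for multiple locations. I would like to aggregate the daily data into a single output that gives the maximum warning level for each day (the categories are Red/Yellow/None).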

Consider the following setup:

location = c(rep("A", 4), rep("B", 4), rep("C", 4), rep("D",4) , rep("E", 4))
date = rep(c("19991230", "19991231", "20000101", "20000102"), 5)
warning = c("Red", "None", "None", "None", "Yellow", "None", "Red", "None", "Yellow", "Yellow", "None", "Yellow", "None", "None", "None", "None", "Yellow", "None", "None", "None")
data = data.frame(location, date, warning)

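I am trying to create a new column that shows "None" if no warning occurred on a given day, and "Yellow" if one or more Yellow warnings occurred, unless one or more "Red" warnings occurred on the same day, in which case the "Red" output takes priority.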

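I have considered aggregating by date, but I am not sure which function to apply. I also tried looping over each date and negating (!) a count of the "None" warnings, to at least narrow things down, but without any luck. Perhaps I need ifelse and a for loop over the dates? My poor attempts so far: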

aggregate(data, by=date, FUN)

data <- data %>%
group_by(date) %>%
mutate(day_warning_type = case_when(
warning != "None" ~ TRUE, TRUE ~ FALSE
)) %>%
ungroup()

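I hope someone can at least point me in the right direction. I have not made much progress so far, as I am struggling to understand how to work with character variables.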

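You are right to use group_by. It is probably simpler to create a second dataset summarized by date and then merge it back into the main dataset. See below: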

library(dplyr)

# Summarize each date based on number of Yellow/Red/None warnings
data_sum <- data %>%
  group_by(date) %>%
  summarize(
    day_warning_none = length(which(warning == "None")),
    day_warning_yellow = length(which(warning == "Yellow")),
    day_warning_red = length(which(warning == "Red"))
  ) %>%
  ungroup() %>%
  # Create a summary measure: Red takes priority over Yellow, Yellow over None
  mutate(
    day_warning = case_when(
      day_warning_red > 0 ~ "Red",
      day_warning_yellow > 0 ~ "Yellow",
      TRUE ~ "None"
    )
  )
head(data_sum)
date     day_warning_none day_warning_yellow day_warning_red day_warning
<fct>               <int>              <int>           <int> <chr>      
1 19991230                1                  3               1 Red        
2 19991231                4                  1               0 Yellow     
3 20000101                4                  0               1 Red        
4 20000102                4                  1               0 Yellow    
# Merge back in
data2 <- left_join(data, data_sum, by = "date") %>%
  arrange(date)
head(data2, 10)
location     date warning day_warning_none day_warning_yellow day_warning_red day_warning
1         A 19991230     Red                1                  3               1         Red
2         B 19991230  Yellow                1                  3               1         Red
3         C 19991230  Yellow                1                  3               1         Red
4         D 19991230    None                1                  3               1         Red
5         E 19991230  Yellow                1                  3               1         Red
6         A 19991231    None                4                  1               0      Yellow
7         B 19991231    None                4                  1               0      Yellow
8         C 19991231  Yellow                4                  1               0      Yellow
9         D 19991231    None                4                  1               0      Yellow
10        E 19991231    None                4                  1               0      Yellow
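
Since the question asks which function to pass to aggregate, the same summarize-then-merge idea can also be sketched in base R. This is only an illustration under stated assumptions: `top_warning`, `day_sum` and `day_warning` are names made up for this example, and the data are rebuilt with `stringsAsFactors = FALSE` so that `warning` is a plain character column.

```r
# Same sample data as in the question, kept as character columns
location <- c(rep("A", 4), rep("B", 4), rep("C", 4), rep("D", 4), rep("E", 4))
date <- rep(c("19991230", "19991231", "20000101", "20000102"), 5)
warning <- c("Red", "None", "None", "None", "Yellow", "None", "Red", "None",
             "Yellow", "Yellow", "None", "Yellow", "None", "None", "None", "None",
             "Yellow", "None", "None", "None")
data <- data.frame(location, date, warning, stringsAsFactors = FALSE)

# Hypothetical helper: return the highest-priority level found in a vector
top_warning <- function(w) {
  if (any(w == "Red")) {
    "Red"
  } else if (any(w == "Yellow")) {
    "Yellow"
  } else {
    "None"
  }
}

# One row per date; 'by' must be a list, and a one-column data frame is one
day_sum <- aggregate(data["warning"], by = data["date"], FUN = top_warning)
names(day_sum)[2] <- "day_warning"

# Attach the daily level to every original row
data2 <- merge(data, day_sum, by = "date")
```

Here `day_sum` has one row per date and `data2` keeps all 20 original rows, each with the level for its date.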

You can count the warnings within each date and create a flag based on those counts:

data %>%
  group_by(date) %>%
  mutate(day_warning_type = case_when(
    sum(warning == "Red") > 0 ~ "Red",
    sum(warning == "Red") == 0 & sum(warning == "Yellow") > 0 ~ "Yellow",
    TRUE ~ "None"
  )) %>%
  ungroup()
# A tibble: 20 x 4
location date     warning day_warning_type
<fct>    <fct>    <fct>   <chr>           
1 A        19991230 Red     Red             
2 A        19991231 None    Yellow          
3 A        20000101 None    Red             
4 A        20000102 None    Yellow          
5 B        19991230 Yellow  Red             
6 B        19991231 None    Yellow          
7 B        20000101 Red     Red             
8 B        20000102 None    Yellow          
9 C        19991230 Yellow  Red             
10 C        19991231 Yellow  Yellow          
11 C        20000101 None    Red             
12 C        20000102 Yellow  Yellow          
13 D        19991230 None    Red             
14 D        19991231 None    Yellow          
15 D        20000101 None    Red             
16 D        20000102 None    Yellow          
17 E        19991230 Yellow  Red             
18 E        19991231 None    Yellow          
19 E        20000101 None    Red             
20 E        20000102 None    Yellow 
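
Because the levels have a natural order (None < Yellow < Red), the priority rule can also be written as a maximum over an ordered factor. The following is a sketch, not part of the answer above: `levels_low_to_high`, `warning_level` and `result` are names chosen for this example, and it assumes dplyr is installed.

```r
library(dplyr)

# Same sample data as in the question, kept as character columns
location <- c(rep("A", 4), rep("B", 4), rep("C", 4), rep("D", 4), rep("E", 4))
date <- rep(c("19991230", "19991231", "20000101", "20000102"), 5)
warning <- c("Red", "None", "None", "None", "Yellow", "None", "Red", "None",
             "Yellow", "Yellow", "None", "Yellow", "None", "None", "None", "None",
             "Yellow", "None", "None", "None")
data <- data.frame(location, date, warning, stringsAsFactors = FALSE)

# Severity order, lowest first (assumed names, chosen for this example)
levels_low_to_high <- c("None", "Yellow", "Red")

result <- data %>%
  # An ordered factor lets max() pick the most severe level
  mutate(warning_level = factor(warning, levels = levels_low_to_high, ordered = TRUE)) %>%
  group_by(date) %>%
  mutate(day_warning_type = as.character(max(warning_level))) %>%
  ungroup()
```

With this approach, adding another level (for example an "Orange" between Yellow and Red) would only require changing `levels_low_to_high`, rather than adding another case_when branch.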
