Counting column changes in daily tidy data



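Each company reports one value per day for category_1 and one for category_2. New companies can join the survey partway through, as one does on December 25. The sample below covers three days, so there are two intervals: December 24-25 and December 25-26.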

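Question: for each category, how many values went up, went down, or stayed the same over the 3 days? For example, in category 1, A goes from 2 to 1, B goes from 3 to 4, and so on.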

Counting by hand gives:

cat1 - Up: 2, Down: 5, No change: 2

cat2 - Up: 6, Down: 2, No change: 1

How can I count the number of up / down / no-change cases in an R script?

library("tidyverse")
d1 <- as.Date("2022-12-24")
d2 <- as.Date("2022-12-25")
d3 <- as.Date("2022-12-26")
df <- tibble(
  company = c(LETTERS[1:4], LETTERS[1:5], LETTERS[1:5]),
  cat1 = c(2, 3, 4, 5, 1, 4, 5, 3, 2, 1, 4, 4, 2, 1),
  cat2 = c(6, 7, 8, 9, 5, 5, 9, 10, 11, 6, 5, 10, 12, 13),
  # as posted, the last block repeats d2; d3 was presumably intended
  date = c(rep(d1, 4), rep(d2, 5), rep(d2, 5))
)
df

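One approach with dplyr, assuming the rows are already ordered by date within each company. Note: I corrected the typo in the third date block (rep(d2, 5)) to use d3.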

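library(dplyr)
df %>%
  group_by(company) %>%
  # change relative to the same company's previous row (NA for its first row)
  mutate(cat1_change = cat1 - lag(cat1), cat2_change = cat2 - lag(cat2)) %>%
  ungroup() %>%
  # summarize() returning several rows is deprecated since dplyr 1.1.0;
  # reframe() is the replacement there
  summarize(type = c("up", "down", "no-change"),
            across(ends_with("change"), ~
              c(sum(.x > 0, na.rm = TRUE), sum(.x < 0, na.rm = TRUE), sum(.x == 0, na.rm = TRUE))))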
# A tibble: 3 × 3
type      cat1_change cat2_change
<chr>           <int>       <int>
1 up                  2           6
2 down                5           2
3 no-change           2           1
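The answer above relies on the rows already being in date order within each company. As a minimal sketch of how that assumption could be dropped, the version below sorts explicitly with arrange() before taking lagged differences. The helper count_changes() and the shuffled copy of the data are illustrative names introduced here, not part of the original answer.

```r
library(dplyr)

# Sample data from the question, with the third date block set to d3
d1 <- as.Date("2022-12-24")
d2 <- as.Date("2022-12-25")
d3 <- as.Date("2022-12-26")
df <- tibble(
  company = c(LETTERS[1:4], LETTERS[1:5], LETTERS[1:5]),
  cat1 = c(2, 3, 4, 5, 1, 4, 5, 3, 2, 1, 4, 4, 2, 1),
  cat2 = c(6, 7, 8, 9, 5, 5, 9, 10, 11, 6, 5, 10, 12, 13),
  date = c(rep(d1, 4), rep(d2, 5), rep(d3, 5))
)

# Shuffle the rows so the input is deliberately out of order
set.seed(1)
shuffled <- df[sample(nrow(df)), ]

# Hypothetical helper: counts of positive, negative and zero changes
count_changes <- function(x) {
  c(sum(x > 0, na.rm = TRUE), sum(x < 0, na.rm = TRUE), sum(x == 0, na.rm = TRUE))
}

changes <- shuffled %>%
  arrange(company, date) %>%   # make the ordering explicit
  group_by(company) %>%
  mutate(across(starts_with("cat"), ~ .x - lag(.x), .names = "{.col}_change")) %>%
  ungroup()

counts <- tibble(
  type = c("up", "down", "no-change"),
  cat1_change = count_changes(changes$cat1_change),
  cat2_change = count_changes(changes$cat2_change)
)
counts
```

Because the sort restores the company/date order, the counts agree with the hand-calculated values from the question even though the input rows were shuffled.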

Data (with the third date corrected to d3):

df <- structure(list(
  company = c("A", "B", "C", "D", "A", "B", "C", "D", "E", "A", "B", "C", "D", "E"),
  cat1 = c(2, 3, 4, 5, 1, 4, 5, 3, 2, 1, 4, 4, 2, 1),
  cat2 = c(6, 7, 8, 9, 5, 5, 9, 10, 11, 6, 5, 10, 12, 13),
  date = structure(c(19350, 19350, 19350, 19350,
    19351, 19351, 19351, 19351, 19351,
    19352, 19352, 19352, 19352, 19352), class = "Date")),
  class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -14L))

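An option with data.table: group by company, loop over the 'cat' columns, take the diff of adjacent elements, convert it to its sign, and turn that into a factor with labels. Then melt to long format and reshape back to 'wide' format with dcast, using length as the aggregation function to count the cases.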

library(data.table)
# function(x) replaces the \(x) lambda whose backslash was lost in the original listing
dcast(
  melt(
    setDT(df)[, lapply(.SD, function(x) factor(sign(diff(x)),
        levels = c(-1, 0, 1), labels = c("down", "no-change", "up"))),
      company, .SDcols = patterns("^cat")],
    id.var = "company", value.name = "type"),
  type ~ paste0(variable, "_change"), length)

Output:

        type cat1_change cat2_change
1:      down           5           2
2: no-change           2           1
3:        up           2           6
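To make the core step of the data.table answer easier to follow, here is a small base R sketch of the sign(diff(x)) idea applied to a single company. It uses company C's cat1 values from the sample data (4, 5, 4); the variable names are illustrative only.

```r
# cat1 values for company C on the three dates, taken from the sample data
x <- c(4, 5, 4)

steps <- diff(x)          # day-over-day differences between adjacent values
direction <- sign(steps)  # 1 for an increase, -1 for a decrease, 0 for no change

# Fixing the levels up front keeps all three categories, even unused ones
labelled <- factor(direction,
                   levels = c(-1, 0, 1),
                   labels = c("down", "no-change", "up"))

# table() counts every level, including those that never occur
counts <- table(labelled)
counts
```

Declaring all three levels explicitly matters here: without it, a category that never occurs for some company would be dropped instead of being counted as zero.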
