R - Aggregate and collapse a vector while maintaining order

I have a data frame as follows:

+------+-----+----------+
| from | to  | priority |
+------+-----+----------+
|    1 |   8 |        1 |
|    2 |   6 |        1 |
|    3 |   4 |        1 |
|    4 |   5 |        3 |
|    5 |   6 |        4 |
|    6 |   2 |        5 |
|    7 |   8 |        2 |
|    4 |   3 |        5 |
|    2 |   1 |        1 |
|    6 |   6 |        4 |
|    1 |   7 |        5 |
|    8 |   4 |        6 |
|    9 |   5 |        3 |
+------+-----+----------+

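My goal is to group the "to" column by the "from" column, but once a value has already appeared in either column, I do not want to consider it any further. In addition, the total priority should be the sum of the priorities within each group.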

The resulting data frame would therefore look like this:

+------+------+----------------+
| from |  to  | Total Priority |
+------+------+----------------+
|    1 | 8, 7 |              6 |
|    2 |    6 |              1 |
|    3 |    4 |              1 |
|    9 |    5 |              3 |
+------+------+----------------+

Also, I would like the grouping to keep the same order as the original table.

I was able to collapse the "to" column by "from" using the splitstackshape package, as shown below:

library(splitstackshape)
cSplit(df, 'to', sep = ',', direction = 'long')[
  , .(to = toString(unique(to))), by = from]

This does introduce duplicate values, though. Is there a way to get the desired result using some other package?

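Using the data frame DF shown reproducibly in the Note at the end, sort it by from to give DF2, then iterate through its rows and remove any row that is a duplicate: a row whose to value already appears in the to or from of an earlier kept row, or whose from value already appears in an earlier to. We need a loop here because each removal depends on the previous removals. Finally, summarize the result.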

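library(dplyr)

# arrange() keeps tied rows in their original relative order
DF2 <- arrange(DF, from)

i <- 1
while (i <= nrow(DF2)) {
  ix <- seq_len(i - 1)  # rows before i; all of them have been kept
  # row i is a duplicate if its `to` was already seen in either column,
  # or if its `from` was already seen as a `to`
  dup <- with(DF2, (to[i] %in% c(to[ix], from[ix])) | (from[i] %in% to[ix]))
  # after a deletion the next row slides into position i, so only advance on keep
  if (dup) DF2 <- DF2[-i, ] else i <- i + 1
}

DF2 %>%
  group_by(from) %>%
  summarize(to = toString(to), priority = sum(priority)) %>%
  ungroup()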

giving:

# A tibble: 4 x 3
   from to    priority
  <int> <chr>    <int>
1     1 8, 7         6
2     2 6            1
3     3 4            1
4     9 5            3
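
The row-by-row filter described above can also be sketched in base R, marking rows in a logical mask instead of deleting them from the data frame inside the loop. This is only an illustrative variant, not the answer's own code: the names `keep`, `kept`, and `res` are made up here, and the data frame is rebuilt inline so that the block runs on its own.

```r
# Same data as in the question, built inline so the sketch is self-contained.
DF <- data.frame(
  from     = c(1, 2, 3, 4, 5, 6, 7, 4, 2, 6, 1, 8, 9),
  to       = c(8, 6, 4, 5, 6, 2, 8, 3, 1, 6, 7, 4, 5),
  priority = c(1, 1, 1, 3, 4, 5, 2, 5, 1, 4, 5, 6, 3)
)

# order() leaves ties in their original relative order.
DF2 <- DF[order(DF$from), ]

# keep[i] becomes TRUE when row i survives the duplicate check.
keep <- logical(nrow(DF2))
for (i in seq_len(nrow(DF2))) {
  prev      <- which(keep)        # rows kept so far, all before row i
  seen_to   <- DF2$to[prev]
  seen_from <- DF2$from[prev]
  dup <- DF2$to[i] %in% c(seen_to, seen_from) || DF2$from[i] %in% seen_to
  keep[i] <- !dup
}
kept <- DF2[keep, ]

# Collapse the surviving rows per `from`; tapply() returns groups in sorted
# order of `from`, which matches unique(kept$from) because DF2 is sorted.
res <- data.frame(
  from     = unique(kept$from),
  to       = as.vector(tapply(kept$to, kept$from, toString)),
  priority = as.vector(tapply(kept$priority, kept$from, sum))
)
print(res)
```

On this data the sketch yields the same four groups as the tibble above (from 1, 2, 3, and 9). The mask avoids reallocating the data frame on every deletion, which may matter for larger inputs.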

Note

Lines <- "from | to  | priority
1 |   8 |        1
2 |   6 |        1
3 |   4 |        1
4 |   5 |        3
5 |   6 |        4
6 |   2 |        5
7 |   8 |        2
4 |   3 |        5
2 |   1 |        1
6 |   6 |        4
1 |   7 |        5
8 |   4 |        6
9 |   5 |        3"
DF <- read.table(text = Lines, header = TRUE, sep = "|", strip.white = TRUE)

It is not entirely clear how you are trying to create the groups, but this should at least get you into the right ballpark:

library(tidyverse)

df <- tribble(
  ~from, ~to, ~priority,
  1, 8, 1,
  2, 6, 1,
  3, 4, 1,
  4, 5, 3,
  5, 6, 4,
  6, 2, 5,
  7, 8, 2,
  4, 3, 5,
  2, 1, 1,
  6, 6, 4,
  1, 7, 5,
  8, 4, 6,
  9, 5, 3
)

# plain per-group collapse; no duplicate filtering is applied here
df %>%
  group_by(from) %>%
  summarise(to = toString(to),
            `Total Priority` = sum(priority, na.rm = TRUE))

Your result will be:

# A tibble: 9 x 3
   from to    `Total Priority`
  <dbl> <chr>            <dbl>
1     1 8, 7                 6
2     2 6, 1                 2
3     3 4                    1
4     4 5, 3                 8
5     5 6                    4
6     6 2, 6                 9
7     7 8                    2
8     8 4                    6
9     9 5                    3
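
Grouping this way returns the groups sorted by from. In the question's data each from value first appears in ascending order, so sorted order and original order coincide. If they did not, one possible way to keep first-appearance order is to group on a factor whose levels follow `unique(from)`. The sketch below uses base R and a small made-up data frame (`toy`, `first_seen`, `grp`, and `out` are all hypothetical names, not part of the question or the answers).

```r
# Hypothetical toy data in which `from` does not first appear in sorted order.
toy <- data.frame(
  from     = c(3, 1, 3, 2, 1),
  to       = c(4, 8, 9, 6, 7),
  priority = c(1, 1, 2, 1, 5)
)

# Levels follow first appearance (3, 1, 2) rather than numeric order.
first_seen <- unique(toy$from)
grp <- factor(toy$from, levels = first_seen)

# tapply() returns one value per factor level, in level order.
out <- data.frame(
  from           = first_seen,
  to             = as.vector(tapply(toy$to, grp, toString)),
  total_priority = as.vector(tapply(toy$priority, grp, sum))
)
print(out)
```

Here the rows of `out` come back in the order 3, 1, 2, with the to values of each group joined in their original row order.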
