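I want to rearrange my data according to a score variable value and an additional group variable. However, depending on the group, the sort order should be either descending or ascending. The groups consist of test scores (higher is better) and processing time (lower is better).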
library(dplyr)

df <- data.frame(id = rep(1:4, 4),
value = rnorm(16, 5),
group = c(paste0("test", 1:3), "time0"))
df$value[seq(4,16, 4)] <- 1:4
> df %>% group_by(group) %>% arrange(group, desc(value))
# A tibble: 16 x 3
# Groups: group [4]
id value group
<int> <dbl> <fct>
1 3 6.06 test1
2 4 4.69 test1
3 1 4.32 test1
4 2 3.56 test1
5 4 5.96 test2
6 1 5.96 test2
7 3 4.43 test2
8 2 3.86 test2
9 3 6.28 test3
10 4 5.55 test3
11 2 4.59 test3
12 1 3.53 test3
13 4 4 time0
14 3 3 time0
15 2 2 time0
16 1 1 time0
The desired output looks like this:
id value group
<int> <dbl> <fct>
1 3 6.06 test1
2 4 4.69 test1
3 1 4.32 test1
4 2 3.56 test1
5 4 5.96 test2
6 1 5.96 test2
7 3 4.43 test2
8 2 3.86 test2
9 3 6.28 test3
10 4 5.55 test3
11 2 4.59 test3
12 1 3.53 test3
13 1 1 time0
14 2 2 time0
15 3 3 time0
16 4 4 time0
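I tried using arrange_if but couldn't figure it out. Any help is much appreciated.
Thanks for the answers so far, they are equally helpful!
Edit for clarification: this differs from this question, because the sorting is not only based on multiple columns but also depends on a characteristic within a column.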
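This sorts the rows in the test groups in descending order and the rows in the time groups in ascending order. If you want the opposite, just swap the -1 and the 1.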
df %>%
arrange(group, value*ifelse(grepl('time', group), 1, -1))
# id value group
# 1 1 6.358680 test1
# 2 1 6.100025 test1
# 3 1 4.844204 test1
# 4 1 3.622940 test1
# 5 2 5.763176 test2
# 6 2 4.897212 test2
# 7 2 4.585005 test2
# 8 2 3.529248 test2
# 9 3 5.387672 test3
# 10 3 4.835476 test3
# 11 3 4.605710 test3
# 12 3 4.521850 test3
# 13 4 1.000000 time0
# 14 4 2.000000 time0
# 15 4 3.000000 time0
# 16 4 4.000000 time0
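The sign trick above can be checked on a small fixed data frame, which avoids the randomness of rnorm. This is only a minimal sketch, assuming dplyr is installed; `toy` and `sorted` are illustrative names and the values are made up, none of them come from the original post.

```r
library(dplyr)

# Small fixed data set (hypothetical values) so the result is reproducible
toy <- data.frame(
  id    = c(1, 2, 3, 1, 2, 3),
  value = c(4.2, 6.1, 5.0, 30, 10, 20),
  group = c("test1", "test1", "test1", "time0", "time0", "time0"),
  stringsAsFactors = FALSE
)

# Multiply by -1 for test groups (descending) and by 1 for time groups (ascending)
sorted <- toy %>%
  arrange(group, value * ifelse(grepl("time", group), 1, -1))

sorted$value
# test1 rows come out as 6.1, 5.0, 4.2 and time0 rows as 10, 20, 30
```

Because group is the first sort key, the multiplier only has to put values in the right order within each group; it never mixes rows across groups.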
Here is another option that works when value is a character column:
df <- data.frame(id = rep(1:4, 4),
value = rnorm(16, 5),
group = c(paste0("test", 1:3), "time0"))
set.seed(2019)
df$value <- sample(letters, nrow(df), replace = TRUE)
df %>%
arrange(group, rank(value)*ifelse(grepl('time', group), 1, -1))
# id value group
# 1 1 u test1
# 2 1 f test1
# 3 1 c test1
# 4 1 b test1
# 5 2 s test2
# 6 2 p test2
# 7 2 f test2
# 8 2 b test2
# 9 3 v test3
# 10 3 u test3
# 11 3 s test3
# 12 3 h test3
# 13 4 a time0
# 14 4 q time0
# 15 4 q time0
# 16 4 r time0
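The reason rank() helps here is that a character value cannot be negated directly, while its rank can. A minimal sketch on fixed letters, assuming dplyr is installed; `toy` and `sorted` are illustrative names that are not part of the original answer.

```r
library(dplyr)

# Hypothetical character scores; fixed so the ordering is easy to verify by eye
toy <- data.frame(
  value = c("b", "d", "a", "c", "k", "e"),
  group = c("test1", "test1", "test1", "time0", "time0", "time0"),
  stringsAsFactors = FALSE
)

# rank() maps the letters to numbers (a = 1, b = 2, ...), which can then be
# multiplied by -1 to reverse the order inside the test groups
sorted <- toy %>%
  arrange(group, rank(value) * ifelse(grepl("time", group), 1, -1))

sorted$value
# test1 rows come out as d, b, a and time0 rows as c, e, k
```

Note that rank() is computed over the whole column rather than per group. That is still fine, since only the relative order inside each group matters once group is the first sort key.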
We can use filter to exclude the 'time0' group, arrange the rest of the dataset, and then bind_rows it with the other group:
library(dplyr)
df %>%
filter(group != 'time0') %>%
arrange(group, desc(value)) %>%
bind_rows(., df %>%
filter(group == 'time0') %>%
arrange(value))
# id value group
#1 3 6.06 test1
#2 4 4.69 test1
#3 1 4.32 test1
#4 2 3.56 test1
#5 4 5.96 test2
#6 1 5.96 test2
#7 3 4.43 test2
#8 2 3.86 test2
#9 3 6.28 test3
#10 4 5.55 test3
#11 2 4.59 test3
#12 1 3.53 test3
#13 1 1.00 time0
#14 2 2.00 time0
#15 3 3.00 time0
#16 4 4.00 time0
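The same split, sort, and recombine idea can also be sketched in base R, which may be handy when there are several time groups rather than a single 'time0'. This is an illustration under assumptions, not the answer's own method: `sort_group` is a hypothetical helper, and `toy` uses made-up values.

```r
# Hypothetical fixed data; no packages required
toy <- data.frame(
  id    = c(1, 2, 3, 1, 2, 3),
  value = c(4.2, 6.1, 5.0, 30, 10, 20),
  group = c("test1", "test1", "test1", "time0", "time0", "time0"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: sort one group's rows, ascending only for "time" groups
sort_group <- function(d) {
  ascending <- grepl("time", d$group[1])
  d[order(d$value, decreasing = !ascending), ]
}

# split() by group, sort each piece, then stack the pieces back together
pieces <- lapply(split(toy, toy$group), sort_group)
sorted <- do.call(rbind, pieces)
rownames(sorted) <- NULL

sorted
```

Here the direction is decided once per group from its name, so any group whose name contains "time" is sorted ascending and every other group descending.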
Also, if value can be non-numeric:
df %>%
arrange(group, desc(as.numeric(value)), is.na(as.numeric(value)))
Data
df <- structure(list(id = c(3L, 4L, 1L, 2L, 4L, 1L, 3L, 2L, 3L, 4L,
2L, 1L, 4L, 3L, 2L, 1L), value = c(6.06, 4.69, 4.32, 3.56, 5.96,
5.96, 4.43, 3.86, 6.28, 5.55, 4.59, 3.53, 4, 3, 2, 1), group = c("test1",
"test1", "test1", "test1", "test2", "test2", "test2", "test2",
"test3", "test3", "test3", "test3", "time0", "time0", "time0",
"time0")), class = "data.frame", row.names = c("1", "2", "3",
"4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15",
"16"))