I have two possible types of data set:
test6 <- data.frame(S=c("B","Z","B","Z","B","Z","B","B","B","Z","Z","Z"),w=c(1,1.2,1.3,2,0.9,0.95,1,1.5,1,1.1,0.8,1.3))
test5 <- data.frame(S=c("B","Z","B","Z","B","Z","B","B","Z","Z"),w=c(1,1.2,1.3,2,0.9,0.95,1,1.5,1.1,0.8))
I want to order them to get the following final result. For test6:
S w
1 B 1.00
3 B 1.30
5 B 0.90
2 Z 1.20
4 Z 2.00
6 Z 0.95
7 B 1.00
8 B 1.50
9 B 1.00
10 Z 1.10
11 Z 0.80
12 Z 1.30
For test5:
S w
1 B 1.00
3 B 1.30
5 B 0.90
2 Z 1.20
4 Z 2.00
7 B 1.00
8 B 1.50
6 Z 0.95
9 Z 1.10
10 Z 0.80
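So in the test6 case the rows alternate in blocks: first 3 B, then 3 Z, then 3 B, then 3 Z, and so on. In the test5 case the blocks are 3 B, 2 Z, 2 B, 3 Z. I found one way to do it: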
library(groupdata2)
fold(test6, k = 2, method = "n_dist", cat_col = "S")
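It creates groups, which I can then sort to get this result, but that only works in the test6 case, where each S class has the same count in every group. Does anyone have a better, simpler idea? Thanks in advance!
(Partial answer.)
For alternating blocks of three, it's not too hard: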
ind <- ave(rep(1, nrow(test6)), test6$S, FUN = function(z) (seq_along(z)-1) %/% 3)
ind
# [1] 0 0 0 0 0 0 1 1 1 1 1 1
test6[order(ind, test6$S),]
# S w
# 1 B 1.00
# 3 B 1.30
# 5 B 0.90
# 2 Z 1.20
# 4 Z 2.00
# 6 Z 0.95
# 7 B 1.00
# 8 B 1.50
# 9 B 1.00
# 10 Z 1.10
# 11 Z 0.80
# 12 Z 1.30
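For test5, the same approach gets close, but the order of the 3/2 groupings differs (it gives 3 B, 3 Z, 2 B, 2 Z instead of 3 B, 2 Z, 2 B, 3 Z):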
ind <- ave(rep(1, nrow(test5)), test5$S, FUN = function(z) (seq_along(z)-1) %/% 3)
ind
# [1] 0 0 0 0 0 0 1 1 1 1
test5[order(ind, test5$S),]
# S w
# 1 B 1.00
# 3 B 1.30
# 5 B 0.90
# 2 Z 1.20
# 4 Z 2.00
# 6 Z 0.95
# 7 B 1.00
# 8 B 1.50
# 9 Z 1.10
# 10 Z 0.80
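To see why test5 comes out differently, it may help to split the ave() call into two steps. This is only an illustrative sketch; the names pos and ind are chosen here for the example. It shows that the index numbers blocks of three within each S group independently, so both B and Z are cut as 3 then 2.

```r
test5 <- data.frame(
  S = c("B","Z","B","Z","B","Z","B","B","Z","Z"),
  w = c(1, 1.2, 1.3, 2, 0.9, 0.95, 1, 1.5, 1.1, 0.8)
)

# Running position of each row within its own S group (1, 2, 3, ...).
pos <- ave(seq_len(nrow(test5)), test5$S, FUN = seq_along)

# Integer division by 3 maps positions 1-3 to block 0 and 4-6 to block 1.
ind <- (pos - 1) %/% 3

# Rows per block for each level of S.
table(test5$S, ind)
```

Since Z is also split as 3 then 2, this key alone cannot produce the 2 Z followed by 3 Z that the desired test5 result needs.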
You can combine cumsum with rep to get numbers that can be used in order.
i <- test6$S == "B"
x <- integer(length(i))
x[i] <- cumsum(rep(c(2,0,0), length.out=sum(i))) - 1
x[!i] <- cumsum(rep(c(2,0,0), length.out=sum(!i)))
test6[order(x),]
# S w
#1 B 1.00
#3 B 1.30
#5 B 0.90
#2 Z 1.20
#4 Z 2.00
#6 Z 0.95
#7 B 1.00
#8 B 1.50
#9 B 1.00
#10 Z 1.10
#11 Z 0.80
#12 Z 1.30
i <- test5$S == "B"
x <- integer(length(i))
x[i] <- cumsum(rep(c(2,0,0,2,0), length.out=sum(i))) - 1
x[!i] <- cumsum(rep(c(2,0,2,0,0), length.out=sum(!i)))
test5[order(x),]
# S w
#1 B 1.00
#3 B 1.30
#5 B 0.90
#2 Z 1.20
#4 Z 2.00
#7 B 1.00
#8 B 1.50
#6 Z 0.95
#9 Z 1.10
#10 Z 0.80