How to sort a data frame by one column in descending order and another in ascending order

I have a data frame that looks like this:

    P1  P2  P3  T1  T2  T3  I1  I2
1   2   3   5   52  43  61  6   "b"
2   6   4   3   72  NA  59  1   "a"
3   1   5   6   55  48  60  6   "f"
4   2   4   4   65  64  58  2   "b"

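I want to sort it by I1 in descending order and, for rows with the same value of I1, by I2 in ascending order, which gives the rows in the order 1 3 4 2. But the order function seems to have only a single decreasing argument, which is TRUE or FALSE for all of the sorting vectors at once. How do I get the sort right?

I used this code to produce the desired output. Is this what you were after?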

rum <- read.table(textConnection("P1  P2  P3  T1  T2  T3  I1  I2
2   3   5   52  43  61  6   b
6   4   3   72  NA  59  1   a
1   5   6   55  48  60  6   f
2   4   4   65  64  58  2   b"), header = TRUE)
rum$I2 <- as.character(rum$I2)
rum[order(rum$I1, rev(rum$I2), decreasing = TRUE), ]
  P1 P2 P3 T1 T2 T3 I1 I2
1  2  3  5 52 43 61  6  b
3  1  5  6 55 48 60  6  f
4  2  4  4 65 64 58  2  b
2  6  4  3 72 NA 59  1  a

I use rank:

rum <- read.table(textConnection("P1  P2  P3  T1  T2  T3  I1  I2
2   3   5   52  43  61  6   b
6   4   3   72  NA  59  1   a
1   5   6   55  48  60  6   f
2   4   4   65  64  58  2   b
1   5   6   55  48  60  6   c"), header = TRUE)
> rum[order(rum$I1, -rank(rum$I2), decreasing = TRUE), ]
  P1 P2 P3 T1 T2 T3 I1 I2
1  2  3  5 52 43 61  6  b
5  1  5  6 55 48 60  6  c
3  1  5  6 55 48 60  6  f
4  2  4  4 65 64 58  2  b
2  6  4  3 72 NA 59  1  a

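I'm afraid Roman Luštrik's answer is wrong. It works on this input only by chance. Consider, for example, its output on a very similar input (with one extra row, like the original row 3 but with "c" in the I2 column):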

rum <- read.table(textConnection("P1  P2  P3  T1  T2  T3  I1  I2
2   3   5   52  43  61  6   b
6   4   3   72  NA  59  1   a
1   5   6   55  48  60  6   f
2   4   4   65  64  58  2   b
1   5   6   55  48  60  6   c"), header = TRUE)
rum$I2 <- as.character(rum$I2)
rum[order(rum$I1, rev(rum$I2), decreasing = TRUE), ]
  P1 P2 P3 T1 T2 T3 I1 I2
3  1  5  6 55 48 60  6  f
1  2  3  5 52 43 61  6  b
5  1  5  6 55 48 60  6  c
4  2  4  4 65 64 58  2  b
2  6  4  3 72 NA 59  1  a

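This is not the desired result: the first three values of I2 are f b c rather than b c f, which is what you would expect given that the secondary sort is on I2 in ascending order.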
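The failure is easier to see with the sort key printed next to the column it is supposed to represent. The sketch below uses only base R and rebuilds just the I1 and I2 columns of the five-row example; the names key_used and broken are illustrative only.

```r
# I1 and I2 from the five-row example above
rum <- data.frame(
  I1 = c(6, 1, 6, 2, 6),
  I2 = c("b", "a", "f", "b", "c"),
  stringsAsFactors = FALSE
)

# rev() flips positions: row 1 is sorted by row 5's value, row 2 by row 4's, ...
data.frame(row = seq_len(nrow(rum)), I2 = rum$I2, key_used = rev(rum$I2))

# So the order within I1 == 6 is driven by other rows' values, not by I2 itself
broken <- rum[order(rum$I1, rev(rum$I2), decreasing = TRUE), ]
broken$I2   # "f" "b" "c" "b" "a"
```

With the original four rows the flipped key happens to produce the right order, which is why the mistake is easy to miss.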

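To get the reverse order of I2, you want the large values to become small and vice versa. For numeric values, multiplying by -1 does it, but for characters it is a bit trickier. A general solution for characters/strings is to go through a factor: reverse the levels (so that large values become small and small values become large) and convert the factor back to character: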

rum <- read.table(textConnection("P1  P2  P3  T1  T2  T3  I1  I2
2   3   5   52  43  61  6   b
6   4   3   72  NA  59  1   a
1   5   6   55  48  60  6   f
2   4   4   65  64  58  2   b
1   5   6   55  48  60  6   c"), header = TRUE)
f=factor(rum$I2)
levels(f) = rev(levels(f))
rum[order(rum$I1, as.character(f), decreasing = TRUE), ]
  P1 P2 P3 T1 T2 T3 I1 I2
1  2  3  5 52 43 61  6  b
5  1  5  6 55 48 60  6  c
3  1  5  6 55 48 60  6  f
4  2  4  4 65 64 58  2  b
2  6  4  3 72 NA 59  1  a

Let df be a data frame with two fields, A and B.

Case 1: your fields A and B are both numeric

df[order(df[,1], df[,2]), ]    # sorts fields A and B in ascending order
df[order(df[,1], -df[,2]), ]   # sorts field A in ascending and field B in descending order
Priority is given to A.

Case 2: field A or B is non-numeric, i.e. a factor or character

In our case, suppose B is character and we want to sort it in reverse order
df[order(df[,1], -as.numeric(as.factor(df[,2]))), ]   # sorts field A (numeric) in ascending and field B (character) in descending order
Priority is given to A.

The idea is that you can apply the minus sign inside the order function only to numeric values. So to sort character strings in descending order, you have to coerce them to numeric values first.
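As a quick illustration of Case 2, here is a minimal sketch on a made-up data frame (the values of A and B are hypothetical, chosen only to make the result easy to check by eye):

```r
# Hypothetical data: A is numeric, B is character
df <- data.frame(
  A = c(2, 1, 2, 1),
  B = c("x", "y", "z", "w"),
  stringsAsFactors = FALSE
)

# as.factor() assigns integer codes in sorted level order: w = 1, x = 2, y = 3, z = 4
codes <- as.numeric(as.factor(df$B))

# Negating the codes sorts B in descending order within each value of A
sorted <- df[order(df$A, -codes), ]
sorted$B   # "y" "w" "z" "x"
```

The codes only need to preserve the ordering of the strings, so any order-preserving numeric proxy would work just as well here.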

    library(dplyr)
    # Suppose you want to sort column 'c' in descending order and column 'd' in ascending order; the data frame is named df.
    # A single arrange() call sorts by desc(c) first and breaks ties by d.
    # (Calling arrange() twice, once per column, would make the last column the primary key.)
    df <- arrange(df, desc(c), d)

order() performs a stable sort by default, so we can sort twice: first by the secondary key, then by the primary key

rum1 <- rum[order(rum$I2, decreasing = FALSE),]
rum2 <- rum1[order(rum1$I1, decreasing = TRUE),]
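To make the two passes concrete, here is a small self-contained sketch on the I1 and I2 columns of the five-row example used earlier in the thread (only those two columns are rebuilt here, for brevity):

```r
rum <- data.frame(
  I1 = c(6, 1, 6, 2, 6),
  I2 = c("b", "a", "f", "b", "c"),
  stringsAsFactors = FALSE
)

# Pass 1: secondary key, ascending
rum1 <- rum[order(rum$I2, decreasing = FALSE), ]

# Pass 2: primary key, descending; rows tied on I1 keep their order from pass 1
rum2 <- rum1[order(rum1$I1, decreasing = TRUE), ]

rum2$I2          # "b" "c" "f" "b" "a"
rownames(rum2)   # "1" "5" "3" "4" "2"
```

The order of the passes matters: the last sort applied determines the primary key, so the secondary key has to go first.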

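Shorthand (unary minus works only on numeric vectors, so the character column I2 is converted with xtfrm() first):

rum[order(rum$I1, -xtfrm(rum$I2), decreasing = TRUE), ]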

rum[order(rum$T1, -rum$T2 ), ]

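The correct way to do it (a vector-valued decreasing is supported only by the radix method, so it is safest to request that method explicitly):

rum[order(rum$T1, rum$T2, decreasing = c(TRUE, FALSE), method = "radix"), ]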
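Applied to the columns from the question, a minimal sketch could look like the following. It rebuilds only I1 and I2 from the five-row example, and it passes method = "radix" explicitly because, with a character column, order() would otherwise not pick the radix method on its own.

```r
rum <- data.frame(
  I1 = c(6, 1, 6, 2, 6),
  I2 = c("b", "a", "f", "b", "c"),
  stringsAsFactors = FALSE
)

# One logical per sort key: I1 descending, I2 ascending
idx <- order(rum$I1, rum$I2, decreasing = c(TRUE, FALSE), method = "radix")
sorted <- rum[idx, ]

sorted$I1   # 6 6 6 2 1
sorted$I2   # "b" "c" "f" "b" "a"
```

One thing to keep in mind is that the radix method compares strings as in the C locale, which can differ from the locale-aware ordering of the other methods when the data contains mixed case or non-ASCII characters.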

In @dudusan's example, you could also reverse the order of I1 and then sort in ascending order:

> rum <- read.table(textConnection("P1  P2  P3  T1  T2  T3  I1  I2
+   2   3   5   52  43  61  6   b
+   6   4   3   72  NA  59  1   a
+   1   5   6   55  48  60  6   f
+   2   4   4   65  64  58  2   b
+   1   5   6   55  48  60  6   c"), header = TRUE)
> f=factor(rum$I1)   
> levels(f) <- sort(levels(f), decreasing = TRUE)
> rum[order(as.character(f), rum$I2), ]
  P1 P2 P3 T1 T2 T3 I1 I2
1  2  3  5 52 43 61  6  b
5  1  5  6 55 48 60  6  c
3  1  5  6 55 48 60  6  f
4  2  4  4 65 64 58  2  b
2  6  4  3 72 NA 59  1  a
>

This seems a bit shorter, and you don't reverse the order of I2 twice.

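You can use the excellent dplyr package, which has a function called arrange. You just pass the data frame and the columns to sort by, in the order of precedence you choose. The default is ascending order; if you want descending order, wrap the column in desc.

rum <- read.table(textConnection("P1 P2 P3 T1 T2 T3 I1 I2
2 3 5 52 43 61 6 b
6 4 3 72 NA 59 1 a
1 5 6 55 48 60 6 f
2 4 4 65 64 58 2 b"), header = TRUE)

library(dplyr)
arrange(rum, desc(I1), I2)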

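In general, xtfrm() is the generic function for obtaining a numeric vector that sorts in the same way as the given input vector. Sorting in decreasing order can then be done by sorting on the negated value of xtfrm(). (This is exactly how, for example, dplyr's desc() is implemented.)

For example, with the data from the question: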

df <- read.table(text = "
P1  P2  P3  T1  T2  T3  I1  I2
2   3   5   52  43  61  6   b
6   4   3   72  NA  59  1   a
1   5   6   55  48  60  6   f
2   4   4   65  64  58  2   b
", header = TRUE)
df[order(-xtfrm(df$I1), df$I2), ]
#>   P1 P2 P3 T1 T2 T3 I1 I2
#> 1  2  3  5 52 43 61  6  b
#> 3  1  5  6 55 48 60  6  f
#> 4  2  4  4 65 64 58  2  b
#> 2  6  4  3 72 NA 59  1  a

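This approach can be generalized into a base R function for sorting data frames that also accepts a vector-valued decreasing argument. From my answer to this recent question: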

sortdf <- function(x, by = colnames(x), decreasing = FALSE) {
  x[do.call(order, Map(sortproxy, x[by], decreasing)), , drop = FALSE]
}
sortproxy <- function(x, decreasing = FALSE) {
  as.integer((-1)^as.logical(decreasing)) * xtfrm(x)
}

For the current example data, we (of course) get:

sortdf(df, by = c("I1", "I2"), decreasing = c(TRUE, FALSE))
#>   P1 P2 P3 T1 T2 T3 I1 I2
#> 1  2  3  5 52 43 61  6  b
#> 3  1  5  6 55 48 60  6  f
#> 4  2  4  4 65 64 58  2  b
#> 2  6  4  3 72 NA 59  1  a
