How to sort a data frame by part of a string (or by the first word)



I'm very new to R and am teaching myself the basic operations.

I would like to produce the following:

County   Population
ACounty, Alabama   106242
BCounty, Alabama   362845
ACounty, Texas   242342
BCounty, Texas   293729

I tried:

df<-df %>% arrange(County)
view(df)

which ended up as:

County   Population
ACounty, Alabama   106242
ACounty, Texas   242342
BCounty, Alabama   362845
BCounty, Texas   293729

You can separate the column into county and state, arrange the data by state, and then unite the two parts again.

library(dplyr)
library(tidyr)
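df %>%
  # split on the comma plus any following whitespace (the backslash must be doubled in an R string)
  separate(County, c('County', 'State'), sep = ",\\s*") %>%
  arrange(State) %>%
  # rejoin with ", " so the values keep their original form
  unite(County, County, State, sep = ", ")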

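In base R, you can keep only the state information by removing everything up to and including the comma, and then use order to arrange the data by state.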

df[order(sub('.*,', '', df$County)), ]
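To make the sort key in the base R one-liner visible, here is a minimal sketch that builds the key step by step. It recreates the example data inline; the names `state`, `county`, and `sorted` are only illustrative, and the use of `trimws` and of the county name as a tie-breaker are additions for clarity, not part of the answer above.

```r
# Example data from the question, recreated here so the sketch runs on its own
df <- data.frame(
  County = c("ACounty, Alabama", "ACounty, Texas",
             "BCounty, Alabama", "BCounty, Texas"),
  Population = c(106242L, 242342L, 362845L, 293729L)
)

# Drop everything up to and including the comma, then trim the leading space
state <- trimws(sub(".*,", "", df$County))

# Drop the comma and everything after it to get the county name
county <- sub(",.*", "", df$County)

# Order by state first, then by county within each state
sorted <- df[order(state, county), ]
sorted
```

Passing two keys to `order` keeps the counties alphabetical within each state even when the input rows arrive in a different order.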

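We can do this without separating or uniting, by sorting on the part of the string before the comma (the first word, that is, the county name):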

library(dplyr)
library(stringr)
df1 %>% 
arrange(str_remove(County, ",.*"))
#            County Population
#1 ACounty, Alabama     106242
#2   ACounty, Texas     242342
#3 BCounty, Alabama     362845
#4   BCounty, Texas     293729
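The same idea can also be pointed at the state instead of the county. The following is a sketch under assumptions, not the answerer's code: it removes everything up to the comma so that the remaining state name becomes the sort key, and uses the full County value as a tie-breaker. The names `counties` and `by_state` are hypothetical, and the data is recreated inline so the block is self-contained.

```r
library(dplyr)
library(stringr)

# Example data from the question, recreated inline
counties <- data.frame(
  County = c("ACounty, Alabama", "ACounty, Texas",
             "BCounty, Alabama", "BCounty, Texas"),
  Population = c(106242L, 242342L, 362845L, 293729L)
)

# Sort key: the text after the comma (the state); ties are broken by County
by_state <- counties %>%
  arrange(str_remove(County, "^.*,\\s*"), County)

by_state
```

Because the key is computed inside `arrange`, the County column itself is left unchanged.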

Data

df1 <- structure(list(County = c("ACounty, Alabama", "BCounty, Alabama", 
"ACounty, Texas", "BCounty, Texas"), Population = c(106242L, 
362845L, 242342L, 293729L)), class = "data.frame", row.names = c(NA, 
-4L))
