r - Dplyr: Subset numbered Variables easy

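I have a data frame containing variables with numbered names, such as 'dtx1', 'dtx2' (...) 'dtx20'. I want to select a subset of them with dplyr. How can I select all of these variables without writing out every name, as in new_df <- select(old_df, dtx1, dtx2, (...), dtx20)? I tried several searches here and on Google, but I probably don't have the right vocabulary.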

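If you know where the range you want to subset starts and ends, you can use something like this (the minus sign drops the range; omit it to keep the range instead):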

DF <- dplyr::select(DF, -c(dtxN:dtxM))  # N and M being the first and last numbers

If you don't know exactly which ones you want to eliminate, but they all share a common name, borrowing from @Mateusz1981:

DF <- DF[, -grep("dtx", colnames(DF))]
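
One caveat with negative grep() indices: if no column name matches, grep() returns an empty vector, and indexing with -integer(0) drops every column instead of none. A minimal sketch of a logical-index variant that avoids this is below. The data frame DF and its column names are made up for illustration, and the pattern is anchored with "^" as an extra assumption so that only names starting with dtx match.

```r
# Hypothetical example data; column names are illustrative only.
DF <- data.frame(foo = 1, dtx1 = 2, dtx2 = 3, bar = 4)

# grepl() returns one TRUE/FALSE per column, so negating it is safe
# even when nothing matches (all columns are kept in that case).
is_dtx <- grepl("^dtx", colnames(DF))

dropped <- DF[, !is_dtx, drop = FALSE]  # everything except the dtx columns
kept    <- DF[, is_dtx, drop = FALSE]   # only the dtx columns

colnames(dropped)
colnames(kept)
```

drop = FALSE keeps the result a data frame even if only one column is left.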

Suppose we have a dummy data frame old_df:

  dtx1 dtx20 dtx d1tx
1    0     0   0    1
2    1     2   0    2

If you only want to keep the columns whose names end in a number, you can do this with dplyr:

library(dplyr)
new_df <- select(old_df, matches("[0-9]+$"))

Output:

  dtx1 dtx20
1    0     0
2    1     2

It simply matches any run of digits at the end of a column name.
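
If the prefix matters as well, the pattern can be anchored at both ends. The sketch below rebuilds the dummy data frame so that it runs on its own and adds one hypothetical column, id2, which is not in the original example, to show the difference between the two patterns.

```r
library(dplyr)

# The dummy data frame from above, plus a hypothetical column `id2`
# (added here only to show what the looser pattern also picks up).
old_df2 <- data.frame(dtx1 = c(0, 1), dtx20 = c(0, 2), dtx = c(0, 0),
                      d1tx = c(1, 2), id2 = c(7, 8))

# Digits at the end only: also matches `id2`.
loose <- select(old_df2, matches("[0-9]+$"))

# "^dtx" pins the prefix, "[0-9]+$" requires digits at the end.
strict <- select(old_df2, matches("^dtx[0-9]+$"))

names(loose)
names(strict)
```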

If you want all variables whose names contain a specific string, you can also use contains:

new_df = old_df %>% 
    select(contains("dtx"))

dplyr::num_range() may be a good choice, as may dplyr::starts_with(), depending on what you want to keep.

df1 <- data.frame(foo=1,dtx1 = 2, dtx2 = 3, bar = 4, dtx3 = 5, dtx4 = 6)
df1
#   foo dtx1 dtx2 bar dtx3 dtx4
# 1   1    2    3   4    5    6
library(dplyr)
select(df1, num_range("dtx",1:3))
#   dtx1 dtx2 dtx3
# 1    2    3    5
select(df1, starts_with("dtx"))
#   dtx1 dtx2 dtx3 dtx4
# 1    2    3    5    6

This is less safe, because the range dtx1:dtx4 is taken by column position and so also picks up bar:

select(df1, dtx1:dtx4)
#   dtx1 dtx2 bar dtx3 dtx4
# 1    2    3   4    5    6
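
Two ways to select the same four columns by name rather than by position are sketched below, assuming the same df1 (rebuilt here so the snippet is self-contained). The variable names wanted, by_name and by_range are illustrative.

```r
library(dplyr)

df1 <- data.frame(foo = 1, dtx1 = 2, dtx2 = 3, bar = 4, dtx3 = 5, dtx4 = 6)

# Build the names explicitly; all_of() raises an error if any is missing.
wanted <- paste0("dtx", 1:4)
by_name <- select(df1, all_of(wanted))

# num_range() gives the same result here, but skips names that do not exist.
by_range <- select(df1, num_range("dtx", 1:4))

names(by_name)
names(by_range)
```

Which one fits depends on whether a missing column should be an error (all_of) or be ignored (num_range).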
