R - Subset a data frame by a vector of column names plus constant column names



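This is surely easy, but for the life of me I can't find the right syntax.

I want to keep all "ID_" columns, however many there are and whatever numbers are appended, and keep the other columns by their constant names.

Something like the following command, which does not work (the data is re-created on every run):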

###Does not work, but shows what I am trying to do
testdf1 <- df1[,c(paste(idvec, collapse="','"),"ConstantNames_YESwant")]

Re-created data:

rand <- sample(1:2, 1)
if(rand==1){
  df1 <- data.frame(
    ID_0=0,
    ID_1=1,
    ID_2=11,
    ID_3=111,
    LotsOfColumnsWithVariousNames_NOwant="unwanted_data",
    ConstantNames_YESwant="wanted_data",
    stringsAsFactors = FALSE
  )
  desired.df1 <- data.frame(
    ID_0=0,
    ID_1=1,
    ID_2=11,
    ID_3=111,
    ConstantNames_YESwant="wanted_data",
    stringsAsFactors = FALSE
  )
}
if(rand==2){
  df1 <- data.frame(
    ID_0=0,
    ID_1=1,
    LotsOfColumnsWithVariousNames_NOwant="unwanted_data",
    ConstantNames_YESwant="wanted_data",
    stringsAsFactors = FALSE
  )
  desired.df1 <- data.frame(
    ID_0=0,
    ID_1=1,
    ConstantNames_YESwant="wanted_data",
    stringsAsFactors = FALSE
  )
}

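Is this what you want? Either of the two calls below should work.

library(tidyverse)
# matches() takes a regular expression; "^ID_" anchors it to the start of the name
df1 %>% 
  select(matches("^ID_"), ConstantNames_YESwant)
# starts_with() takes a literal prefix
df1 %>% 
  select(starts_with("ID_"), ConstantNames_YESwant)
# Output when rand == 2:
#   ID_0 ID_1 ConstantNames_YESwant
# 1    0    1           wanted_data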

In base R, you can do

# Get all the ID columns ("^ID_" only matches names that start with "ID_")
idvec <- grep("^ID_", colnames(df1), value = TRUE)
# Select the ID columns and the constant names you want
df1[c(idvec, "ConstantNames_YESwant")]
# Output when rand == 2:
#   ID_0 ID_1 ConstantNames_YESwant
# 1    0    1           wanted_data
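
For what it's worth, the attempt in the question most likely fails because `paste(..., collapse = "','")` glues all the ID names into a single string, which is then looked up as one (nonexistent) column name. Below is a minimal base R sketch of the same idea wrapped in a function. The helper name `select_id_cols` and its arguments are made up for illustration, and the sample data is just the `rand == 2` case from the question.

```r
# Hypothetical helper (not from the original answers): keep every column whose
# name starts with `prefix`, plus the columns listed in `keep`.
select_id_cols <- function(df, prefix = "ID_", keep = character()) {
  id_cols <- names(df)[startsWith(names(df), prefix)]
  # drop = FALSE keeps the result a data frame even if only one column is selected
  df[, c(id_cols, keep), drop = FALSE]
}

# Sample data: the rand == 2 case from the question
df1 <- data.frame(
  ID_0 = 0,
  ID_1 = 1,
  LotsOfColumnsWithVariousNames_NOwant = "unwanted_data",
  ConstantNames_YESwant = "wanted_data",
  stringsAsFactors = FALSE
)

# Why the original attempt fails: collapse= yields ONE string, not a vector of names
collapsed <- paste(c("ID_0", "ID_1"), collapse = "','")
length(collapsed)  # a character vector of length 1

# c() simply concatenates the name vectors, which is what `[` needs
result <- select_id_cols(df1, keep = "ConstantNames_YESwant")
names(result)
```

`startsWith()` does a literal prefix comparison, so there is no regular expression to get wrong; it requires R 3.3.0 or later.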
