I have a table that looks like this:
df1 <- data.frame(
  "seqid" = c("12", "12", "13", "12", "12", "15"),
  "source" = c("star", "star", "star", "star", "star", "star"),
  "type" = c("CDS", "CDS", "CDS", "intron", "CDS", "intron"),
  "start" = c("15", "21", "23", "35", "45", "60"),
  "end" = c("70", "80", "86", "45", "67", "88"),
  "attributes" = c("ENSOCUT00000011013", "ENSOCUT00000064484",
                   "ENSOCUT00000013302", "ENSOCUT00000010968",
                   "ENSOCUT00000010968", "ENSOCUT00000060283"),
  stringsAsFactors = FALSE, check.names = FALSE)
seqid | source | type | start | end | attributes
---|---|---|---|---|---
12 | star | CDS | 15 | 70 | ENSOCUT00000011013
12 | star | CDS | 21 | 80 | ENSOCUT00000064484
12 | star | CDS | 23 | 86 | ENSOCUT00000013302
12 | star | intron | 35 | 45 | ENSOCUT00000010968
12 | star | CDS | 45 | 67 | ENSOCUT00000010968
12 | star | intron | 60 | 88 | ENSOCUT00000060283
df1[c(1,2,3,5),]
In general, you select rows and columns by number inside the square brackets of the data frame df:
df[rows_selected_go_here, columns_selected_go_here]
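As a quick illustration of that bracket syntax, here is a minimal sketch on a made-up data frame. The name `demo` and its values are hypothetical and only serve to show the indexing forms:

```r
# Hypothetical demo data, not the data from the question
demo <- data.frame(
  id    = c("a", "b", "c", "d"),
  type  = c("CDS", "intron", "CDS", "CDS"),
  start = c(15, 35, 45, 60),
  stringsAsFactors = FALSE
)

rows_1_3   <- demo[c(1, 3), ]                  # rows 1 and 3, all columns
cols_named <- demo[, c("id", "start")]         # all rows, two columns selected by name
type_vec   <- demo[2:3, "type"]                # rows 2-3 of one column, returned as a vector
type_df    <- demo[2:3, "type", drop = FALSE]  # same selection, kept as a data frame
```

Leaving the row or column position empty means "keep everything" along that dimension, and `drop = FALSE` is worth adding whenever a single-column result should stay a data frame.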
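I assume you only need those entries from df1 where type (a string) equals CDS:
library(tidyverse)
# Add a helper column that flags the rows to keep (0 = drop, 1 = keep)
df1 <- mutate(df1, TOBINCL = 0)
# Mark rows whose type is exactly "CDS", ignoring case
df1$TOBINCL[grepl("^CDS$", df1$type, ignore.case = TRUE)] <- 1
# Keep only the flagged rows
mynewdf <- df1[df1$TOBINCL == 1, ]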
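The same CDS-only selection can also be written without the TOBINCL helper column, using a logical condition directly in the row position. This is a sketch in base R on a reduced copy of the question's data. The name `cds_demo` is an assumption made for illustration, and unlike the `grepl()` call above, `==` is case-sensitive:

```r
# Reduced copy of the question's data (two columns only, for brevity)
cds_demo <- data.frame(
  type = c("CDS", "CDS", "CDS", "intron", "CDS", "intron"),
  attributes = c("ENSOCUT00000011013", "ENSOCUT00000064484",
                 "ENSOCUT00000013302", "ENSOCUT00000010968",
                 "ENSOCUT00000010968", "ENSOCUT00000060283"),
  stringsAsFactors = FALSE
)

# Logical index in the row position; the empty column position keeps all columns
cds_base <- cds_demo[cds_demo$type == "CDS", ]

# Equivalent selection with subset(), which also drops rows where the condition is NA
cds_subset <- subset(cds_demo, type == "CDS")
```

Both results keep the original row names 1, 2, 3 and 5, so they match the rows picked by number with `c(1,2,3,5)`.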