How to select a range of rows that starts and ends at rows identified by given text

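I have a data frame that looks like the one below. I want to clean df so that it keeps only a range of rows: starting at the row where column 1 reads "country" and ending two rows before the row where column 1 reads "end". I need this because I later have to bind this df with other dfs built from the same kind of sheet for other periods, and the row range differs from sheet to sheet.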

Column A	Column B	Column C
-	-	-
country	number	year
china	1	2018
japan	2	2019
usa	3	2019

start_position <- which(df[,1]=="country")
end_position <- which(df[,1]=="end")
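# Only the third line needs to change. The original attempt was
#   df <- df[df(start_position:(end_position-2)),]
# which wraps the row index in df( ... ), so R tries to call df as a function.
# Drop that wrapper and index with the range directly: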
df <- df[ start_position:(end_position-2),]
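
To try the corrected lines on their own, here is a minimal sketch with a made-up sheet. The sample values, the blank row before "end", and the name `trimmed` are assumptions for illustration, not taken from the asker's real data; the `stopifnot()` guard is an optional extra for sheets where a marker is missing or repeated.

```r
# Hypothetical sample sheet; the blank row before "end" is assumed so that
# "two rows before end" lands on the last data row.
df <- data.frame(
  ColumnA = c("-", "country", "china", "japan", "usa", "", "end"),
  ColumnB = c("-", "number", "1", "2", "3", "", ""),
  ColumnC = c("-", "year", "2018", "2019", "2019", "", ""),
  stringsAsFactors = FALSE
)

# Row numbers of the two marker rows in the first column
start_position <- which(df[, 1] == "country")
end_position <- which(df[, 1] == "end")

# Optional guard: each marker should appear exactly once
stopifnot(length(start_position) == 1, length(end_position) == 1)

# Keep everything from "country" up to two rows before "end"
trimmed <- df[start_position:(end_position - 2), ]
trimmed
```

With this sample, `trimmed` holds the "country" row followed by the three data rows.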

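Here you go. This version keeps only the rows between "country" and "end", and uses the "country" row as the column names: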


library(tibble)
df <- tribble(
~ColumnA, ~ColumnB, ~ColumnC,
"-", "-", "-",
"country", "number", "year",
"china", "1", "2018",
"japan", "2", "2019",
"usa", "3", "2019",
"end", "", ""
)
names_idx <- which(df[, 1] == "country")
end_idx <- which(df[, 1] == "end")
out <- df[(names_idx + 1):(end_idx - 1), ]
colnames(out) <- as.vector(as.matrix(df)[names_idx, ])
out

# A tibble: 3 × 3
country number year 
<chr>   <chr>  <chr>
1 china   1      2018 
2 japan   2      2019 
3 usa     3      2019
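
Since the question mentions binding several sheets whose ranges differ, the same steps could be wrapped in a function and applied to each sheet before binding. The following is only a sketch under stated assumptions: `extract_block()`, `sheet_2018` and `sheet_2019` are hypothetical names and sample data, each sheet is assumed to contain exactly one "country" row and one "end" row, and "end" is assumed to follow the last data row directly, as in the example above.

```r
# extract_block() is a hypothetical helper, not part of any package.
extract_block <- function(sheet, start_text = "country", end_text = "end") {
  first_col <- as.character(sheet[[1]])
  names_idx <- which(first_col == start_text)
  end_idx <- which(first_col == end_text)
  # Assumption: each marker occurs once and there is at least one data row
  stopifnot(length(names_idx) == 1, length(end_idx) == 1,
            end_idx > names_idx + 1)
  out <- as.data.frame(sheet[(names_idx + 1):(end_idx - 1), ],
                       stringsAsFactors = FALSE)
  # Promote the marker row to column names
  colnames(out) <- unname(vapply(sheet[names_idx, ], as.character, character(1)))
  rownames(out) <- NULL
  out
}

# Two made-up sheets whose data blocks start at different rows
sheet_2018 <- data.frame(
  A = c("-", "country", "china", "japan", "end"),
  B = c("-", "number", "1", "2", ""),
  C = c("-", "year", "2018", "2018", ""),
  stringsAsFactors = FALSE
)
sheet_2019 <- data.frame(
  A = c("-", "-", "country", "usa", "end"),
  B = c("-", "-", "number", "3", ""),
  C = c("-", "-", "year", "2019", ""),
  stringsAsFactors = FALSE
)

# Clean every sheet the same way, then stack the results
combined <- do.call(rbind, lapply(list(sheet_2018, sheet_2019), extract_block))
combined
```

Because the marker rows are located per sheet, the position of the block can vary freely between sheets. All columns stay character here; converting `number` and `year` to numeric would be a separate step.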
