I have a text file containing many different output sections from a research analysis. The text file looks like this...
Zone 1
Dist. Time Amb. Time Ster. Time Vert. Vert. Zone Zone
Tr.(cm) Amb. Cnts. Ster. Cnts. Rest. Cnts. Time Entries Time
======= ============ ====== ============ ====== ============ ====== ============ ========== ============
626.29 000:00:29.90 480 000:00:05.25 52 000:00:24.85 11 000:00:11.75 1 000:01:00.00
489.99 000:00:23.20 401 000:00:07.30 75 000:00:29.45 5 000:00:11.65 0 000:01:00.00
-----------------------------------------------------------------------------------------------------
Zone Totals
Dist. Time Amb. Time Ster. Time Vert. Vert. Zone Zone
Tr.(cm) Amb. Cnts. Ster. Cnts. Rest. Cnts. Time Entries Time
======= ============ ====== ============ ====== ============ ====== ============ ========== ============
5661.08 000:04:39.30 4360 000:00:55.35 572 000:04:25.35 81 000:02:23.85 1 000:10:00.00
======= ============ ====== ============ ====== ============ ====== ============ ==========
-----------------------------------------------------------------------------------------------------
Block Summary
-------------
Dist. Time Amb. Time Ster. Time Vert. Vert. Zone
Trav.(cm) Amb. Cnts. Ster. Cnts. Rest. Cnts. Time Entries
========== ============ ====== ============ ====== ============ ====== ============ ==========
626.29 000:00:29.90 480 000:00:05.25 52 000:00:24.85 11 000:00:11.75 1
489.99 000:00:23.20 401 000:00:07.30 75 000:00:29.45 5 000:00:11.65 0
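How can I grep only the Zone Totals section? More specifically, I only want the "Dist. Tr." number from the "Zone Totals" section, but I would be happy to get the whole section and then trim lines as needed.
I was thinking of something like this...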
dist_move = apply(data.frame(grep("Totals",dat)+1, grep("Block",dat)-2),1,function(x) (dat[x[1]:x[2]]))
but it just grabs all of the lines.
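Assuming the file created in the Note at the end, read it in, find the Zone Totals line, and read the first number on the 4th line after it (in the file as shown, three header lines separate the heading from the data row; adjust the offset if your file has blank lines in between). This uses no packages and works whether the file has one or several Zone Totals sections.
L <- trimws(readLines("test-file.dat"))
scan(text = sub(" .*", "", L[grep("Zone Totals", L) + 4]), quiet = TRUE)
## [1] 5661.08
Or this slightly shorter variation:
L <- readLines("test-file.dat")
read.table(text = L[grep("Zone Totals", L) + 4])[[1]]
## [1] 5661.08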
Note
Lines <- "Zone 1
Dist. Time Amb. Time Ster. Time Vert. Vert. Zone Zone
Tr.(cm) Amb. Cnts. Ster. Cnts. Rest. Cnts. Time Entries Time
======= ============ ====== ============ ====== ============ ====== ============ ========== ============
626.29 000:00:29.90 480 000:00:05.25 52 000:00:24.85 11 000:00:11.75 1 000:01:00.00
489.99 000:00:23.20 401 000:00:07.30 75 000:00:29.45 5 000:00:11.65 0 000:01:00.00
-----------------------------------------------------------------------------------------------------
Zone Totals
Dist. Time Amb. Time Ster. Time Vert. Vert. Zone Zone
Tr.(cm) Amb. Cnts. Ster. Cnts. Rest. Cnts. Time Entries Time
======= ============ ====== ============ ====== ============ ====== ============ ========== ============
5661.08 000:04:39.30 4360 000:00:55.35 572 000:04:25.35 81 000:02:23.85 1 000:10:00.00
======= ============ ====== ============ ====== ============ ====== ============ ==========
-----------------------------------------------------------------------------------------------------
Block Summary
-------------
Dist. Time Amb. Time Ster. Time Vert. Vert. Zone
Trav.(cm) Amb. Cnts. Ster. Cnts. Rest. Cnts. Time Entries
========== ============ ====== ============ ====== ============ ====== ============ ==========
626.29 000:00:29.90 480 000:00:05.25 52 000:00:24.85 11 000:00:11.75 1
489.99 000:00:23.20 401 000:00:07.30 75 000:00:29.45 5 000:00:11.65 0
"
cat(Lines, file = "test-file.dat")
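If the number of header lines between the heading and the data row is not guaranteed, one option is to avoid a fixed offset and instead take the first line after each "Zone Totals" heading that starts with a digit. The following is only a sketch under that assumption: the function name `first_dist_after_totals` and the abbreviated two-section `report` vector (including the second value, 1234.50) are made up for illustration and are not part of the original file.

```r
# Hypothetical helper: for every "Zone Totals" heading, return the first
# number on the next line that begins with a digit.
first_dist_after_totals <- function(lines) {
  lines   <- trimws(lines)
  starts  <- grep("Zone Totals", lines, fixed = TRUE)
  is_data <- grepl("^[0-9]", lines)
  vapply(starts, function(s) {
    # index of the first data row located after heading s
    idx <- which(is_data & seq_along(lines) > s)[1]
    as.numeric(sub(" .*", "", lines[idx]))
  }, numeric(1))
}

# Abbreviated, made-up report with two "Zone Totals" sections
report <- c(
  "Zone Totals",
  "Dist.   Time",
  "Tr.(cm) Amb.",
  "======= ============",
  "5661.08 000:04:39.30",
  "-----",
  "Zone Totals",
  "Dist.   Time",
  "Tr.(cm) Amb.",
  "======= ============",
  "1234.50 000:01:02.00",
  "-----"
)

dist_tr <- first_dist_after_totals(report)
dist_tr
```

With a real file, `readLines("test-file.dat")` could be passed in place of `report`. A heading with no data row after it yields `NA` for that section.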
A slightly more general approach (for the case where the "Zone Totals" section contains more than one row), using stringr:
library(stringr)
# Split the text into lines
lines <- unlist(strsplit(myText, "\n"))
# Find the bounds of the target section: the "Zone Totals" heading and
# the first dashed separator that follows it
sectStart <- str_which(lines, "Zone Totals")
sectStop <- str_which(lines[seq(sectStart + 1, length(lines))],
                      "-----")[1] + sectStart
# Keep the rows that start with a digit and extract their first entry
dist_move <- str_subset(lines[seq(sectStart, sectStop)], "^[[:digit:]]") %>%
  str_extract("^[[:digit:]]+\\.?[[:digit:]]*")
Note
myText <-
"Zone 1
Dist. Time Amb. Time Ster. Time Vert. Vert. Zone Zone
Tr.(cm) Amb. Cnts. Ster. Cnts. Rest. Cnts. Time Entries Time
======= ============ ====== ============ ====== ============ ====== ============ ========== ============
626.29 000:00:29.90 480 000:00:05.25 52 000:00:24.85 11 000:00:11.75 1 000:01:00.00
489.99 000:00:23.20 401 000:00:07.30 75 000:00:29.45 5 000:00:11.65 0 000:01:00.00
-----------------------------------------------------------------------------------------------------
Zone Totals
Dist. Time Amb. Time Ster. Time Vert. Vert. Zone Zone
Tr.(cm) Amb. Cnts. Ster. Cnts. Rest. Cnts. Time Entries Time
======= ============ ====== ============ ====== ============ ====== ============ ========== ============
5661.08 000:04:39.30 4360 000:00:55.35 572 000:04:25.35 81 000:02:23.85 1 000:10:00.00
======= ============ ====== ============ ====== ============ ====== ============ ==========
-----------------------------------------------------------------------------------------------------
Block Summary
-------------
Dist. Time Amb. Time Ster. Time Vert. Vert. Zone
Trav.(cm) Amb. Cnts. Ster. Cnts. Rest. Cnts. Time Entries
========== ============ ====== ============ ====== ============ ====== ============ ==========
626.29 000:00:29.90 480 000:00:05.25 52 000:00:24.85 11 000:00:11.75 1
489.99 000:00:23.20 401 000:00:07.30 75 000:00:29.45 5 000:00:11.65 0"
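Since the question also mentions being happy with the whole section, the same bounding idea (heading, then the first dashed separator after it) can be sketched in base R so that every data row of the section comes back as a data frame. This is an illustrative sketch, not part of either answer: `zone_totals_rows` is a hypothetical name, and `sample_lines` is an abbreviated two-column stand-in for the real report.

```r
# Hypothetical helper: return all data rows of the first "Zone Totals"
# section as a data frame (one column per whitespace-separated field).
zone_totals_rows <- function(lines) {
  lines  <- trimws(lines)
  start  <- grep("Zone Totals", lines, fixed = TRUE)[1]
  dashes <- grep("^-----", lines)
  # the first dashed separator after the heading closes the section
  stop_at <- dashes[dashes > start][1]
  stopifnot(!is.na(start), !is.na(stop_at))
  section <- lines[(start + 1):(stop_at - 1)]
  # data rows are the ones that begin with a digit
  rows <- section[grepl("^[0-9]", section)]
  read.table(text = rows, header = FALSE, stringsAsFactors = FALSE)
}

# Abbreviated stand-in for the report (only the first two columns)
sample_lines <- c(
  "Zone Totals",
  "Dist.   Time",
  "Tr.(cm) Amb.",
  "======= ============",
  "5661.08 000:04:39.30",
  "======= ============",
  "-----------------------",
  "Block Summary",
  "-------------",
  "626.29 000:00:29.90"
)

totals <- zone_totals_rows(sample_lines)
totals[[1]]  # the "Dist. Tr." column
```

Returning a data frame keeps the other columns available for later trimming, while `totals[[1]]` gives just the distance values.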