R - Get the string after whitespace

I have a statistics file in the following format.

---------- Begin Simulation Statistics ----------
simSeconds                                   0.000500                       # Number of seconds simulated (Second)
simTicks                                    500000000                       # Number of ticks simulated (Tick)
finalTick                                   500000000                       # Number of ticks from beginning of simulation (restored from checkpoints and never reset) (Tick)
simFreq                                  1000000000000                       # The number of ticks per simulated second ((Tick/Second))
hostSeconds                                      7.36                       # Real time elapsed on the host (Second)
hostTickRate                                 67905191                       # The number of ticks simulated per host second (ticks/s) ((Tick/Second))
hostMemory                                     658020                       # Number of bytes of host memory used (Byte)
simInsts                                       956628                       # Number of instructions simulated (Count)
simOps                                        1634485                       # Number of ops (including micro ops) simulated (Count)
hostInstRate                                   129917                       # Simulator instruction rate (inst/s) ((Count/Second))
hostOpRate                                     221975                       # Simulator op (including micro ops) rate (op/s) ((Count/Second))
system.clk_domain.clock                          1000                       # Clock period in ticks (Tick)
system.cpu.numCycles                          1000001                       # Number of cpu cycles simulated (Cycle)
system.cpu.numWorkItemsStarted                      0                       # Number of work items this cpu started (Count)
system.cpu.numWorkItemsCompleted                    0                       # Number of work items this cpu completed (Count)
system.cpu.instsAdded                         1824678                       # Number of instructions added to the IQ (excludes non-spec) (Count)
system.cpu.nonSpecInstsAdded                       85                       # Number of non-speculative instructions added to the IQ (Count)
system.cpu.instsIssued                        1774419                       # Number of instructions issued (Count)
system.cpu.squashedInstsIssued                    817                       # Number of squashed instructions issued (Count)
system.cpu.squashedInstsExamined               190251                       # Number of squashed instructions iterated over during squash; mainly for profiling (Count)
system.cpu.squashedOperandsExamined            226242                       # Number of squashed operands that are examined and possibly removed from graph (Count)
system.cpu.squashedNonSpecRemoved                  37                       # Number of squashed non-spec instructions that were removed (Count)
system.cpu.numIssuedDist::samples              914073                       # Number of insts issued each cycle
system.cpu.numIssuedDist::mean               1.941222                       # Number of insts issued each cycle
system.cpu.numIssuedDist::stdev              2.095850                       # Number of insts issued each cycle
system.cpu.numIssuedDist::underflows                0      0.00%      0.00% # Number of insts issued each cycle
system.cpu.numIssuedDist::0                    403267     44.12%     44.12% # Number of insts issued each cycle
system.cpu.numIssuedDist::1                     63660      6.96%     51.08% # Number of insts issued each cycle

I read it line by line. For each line, I want to use str_extract to get the name of the stat and then its first value.

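For the former I use str_extract(line, "(\\S+)"). For the latter I tried str_extract(line, "\\s+(\\S+)"), but the result includes the whitespace before the value. How can I change the pattern so that the whitespace is not part of the match?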

Also, is there a more "elegant" way to achieve the same goal?

Use

str_extract(test3, "(?<!\\S|^)\\S+")

See the regex proof.
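As a minimal sketch of how the pattern behaves, the snippet below applies it to two lines copied from the stats file in the question. The variable names stat_name and first_value are illustrative only, and the "^\\S+" pattern for the stat name is an assumption that is equivalent to the question's "(\\S+)" when a line starts with the name.

```r
library(stringr)

# Two sample lines copied from the stats file in the question
test3 <- c(
  "simSeconds                                   0.000500                       # Number of seconds simulated (Second)",
  "system.cpu.numIssuedDist::0                    403267     44.12%     44.12% # Number of insts issued each cycle"
)

# Stat name: the run of non-whitespace characters at the start of the line
stat_name <- str_extract(test3, "^\\S+")

# First value: the first non-whitespace run that is neither at the start of
# the string nor preceded by another non-whitespace character
first_value <- str_extract(test3, "(?<!\\S|^)\\S+")

print(stat_name)
print(first_value)
```

Because the lookbehind is zero-width, the leading whitespace is checked but never becomes part of the match, which is what the "\\s+(\\S+)" attempt was missing.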

Explanation

--------------------------------------------------------------------------------
  (?<!                     look behind to see if there is not:
--------------------------------------------------------------------------------
    \S                       non-whitespace (all but \n, \r, \t, \f,
                             and " ")
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    ^                        the beginning of the string
--------------------------------------------------------------------------------
  )                        end of look-behind
--------------------------------------------------------------------------------
  \S+                      non-whitespace (all but \n, \r, \t, \f,
                           and " ") (1 or more times (matching the
                           most amount possible))
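On the question of a more "elegant" approach, one possible alternative (a sketch, not part of the answer above) is to capture the name and the first value in a single pass with str_match and two capture groups. The names lines, m, and stats are hypothetical, and the sketch assumes every line starts with the stat name followed by whitespace and a numeric value.

```r
library(stringr)

# Two sample lines copied from the stats file in the question
lines <- c(
  "simTicks                                    500000000                       # Number of ticks simulated (Tick)",
  "hostSeconds                                      7.36                       # Real time elapsed on the host (Second)"
)

# str_match returns a character matrix: column 1 is the full match,
# columns 2 and 3 hold the two capture groups (name and first value)
m <- str_match(lines, "^(\\S+)\\s+(\\S+)")

stats <- data.frame(
  name  = m[, 2],
  value = as.numeric(m[, 3]),
  stringsAsFactors = FALSE
)
print(stats)
```

Lines that do not follow the name/value layout, such as the "Begin Simulation Statistics" header, would produce NA in the value column, so they are best filtered out before or after matching.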
