What is the Python equivalent of the Linux command grep -A?



How can I print n lines after a matched string in a file using Python?

Linux Command grep

 abc@xyz:~/Desktop$ grep -A 10 'foo' bar.txt
      foo
      <shippingcost>
        <amount>3.19</amount>
        <currency>EUR</currency>
      </shippingcost>
      <shippingtype>Normal</shippingtype>
      <quality>GOOD</quality> 
      <unlimitedquantity>false</unlimitedquantity>
      <isrsl>N</isrsl> 
      <stock>1</stock>

This command prints the 10 lines after the matched string 'foo' in the file bar.txt.

How can I do the same thing in Python?

What I have tried:

import re
with open("bar.txt") as origin_file:
    for line in origin_file:
        line = re.findall(r'foo', line)
        if line:
            print line

The Python code above gives the following output:

abc@xyz:~/Desktop$ python grep.py
['foo']

File objects (such as origin_file) are iterators. Not only can you use

for line in origin_file:

but you can also use next(origin_file) to get the next item from the iterator. In fact, you can call next on the iterator from inside the for loop:

import re
# Python 2
with open("bar.txt") as origin_file:
    for line in origin_file:
        if re.search(r'foo', line):
            print line,
            for i in range(10):
                print next(origin_file),
# in Python 3, `print` is a function not a statement
# so the code would have to be changed to something like
# with open("bar.txt") as origin_file:
#     for line in origin_file:
#         if re.search(r'foo', line):
#             print(line, end='')
#             for i in range(10):
#                 print(next(origin_file), end='')
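To see why calling next inside the loop works, here is a small Python 3 sketch. It uses an in-memory io.StringIO in place of bar.txt, and the sample text and variable names are made up for illustration. The for loop and next() draw from the same iterator, so lines consumed by next() are never seen by the loop again.

```python
import io

# Hypothetical sample data standing in for bar.txt
sample = io.StringIO("a\nfoo\nb\nc\nd\n")

seen_by_loop = []
taken_by_next = []
for line in sample:
    seen_by_loop.append(line.strip())
    if "foo" in line:
        # Pull the following two lines straight from the same iterator
        for _ in range(2):
            taken_by_next.append(next(sample).strip())

print(seen_by_loop)   # ['a', 'foo', 'd']
print(taken_by_next)  # ['b', 'c']
```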

If fewer than 10 lines follow the last foo, the code above raises a StopIteration error. To handle that possibility, you can use itertools.islice to slice off at most 10 items from the iterator:

import re
import itertools as IT
with open("bar.txt") as origin_file:
    for line in origin_file:
        if re.search(r'foo', line):
            print line, 
            for line in IT.islice(origin_file, 10):
                print line,

Now the code ends gracefully (without raising a StopIteration exception), even if there are fewer than 10 lines after foo.
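For Python 3, the same islice idea could be wrapped in a small generator along these lines. This is only a sketch: the function name grep_after and the sample text are assumptions for illustration, not part of the answer above.

```python
import io
import itertools
import re


def grep_after(lines, pattern, n):
    """Yield each line matching `pattern` plus up to `n` lines after it."""
    lines = iter(lines)
    regex = re.compile(pattern)
    for line in lines:
        if regex.search(line):
            yield line
            # islice stops quietly if fewer than n lines remain
            yield from itertools.islice(lines, n)


# Hypothetical sample data standing in for bar.txt
sample = io.StringIO("x\nfoo\n1\n2\n3\nfoo\n4\n")
result = list(grep_after(sample, r"foo", 2))
print("".join(result), end="")
```

With a real file, this would be called as grep_after(origin_file, r'foo', 10) inside the with block. Note that lines consumed as context are not checked for a new match, so the behaviour differs from grep -A when matches sit closer together than n lines.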

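That is because you reassign line to the result of re.findall, so you print the list of matches rather than the line you read from the file object. Change it to: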

import re
with open("bar.txt") as origin_file:
    for line in origin_file.readlines():
        # keep the match result in its own variable so `line` is not overwritten
        found = re.findall(r'foo', line)
        if found:
            print line
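The corrected loop above prints only the matching lines. One possible way to add the trailing context as well, without calling next() on the file, is to keep a counter of context lines still owed. The Python 3 sketch below assumes a hypothetical helper name grep_context and made-up sample text. Because every line is tested, a match inside the context window restarts the count.

```python
import io
import re


def grep_context(lines, pattern, after=10):
    """Return matching lines, each followed by up to `after` context lines."""
    regex = re.compile(pattern)
    remaining = 0  # context lines still to emit
    out = []
    for line in lines:
        if regex.search(line):
            out.append(line)
            remaining = after  # a new match restarts the context window
        elif remaining > 0:
            out.append(line)
            remaining -= 1
    return out


# Hypothetical sample data standing in for bar.txt
sample = io.StringIO("foo\na\nfoo\nb\nc\nd\n")
print(grep_context(sample, r"foo", after=2))
```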
