Python Pyparsing Located vs locatedExpr

I am migrating some patterns from pyparsing version 2 to pyparsing version 3.

Contents of the sample file I am parsing:

this is a sample page to test parsing
line 0001 line 1
line 0002 line 2

The pattern used to build the parser:

p.Literal('line ') + p.Regex(r'(?P<abc>\d+)') + p.SkipTo(p.LineEnd().suppress())

When I use locatedExpr from version 2, I get the following output:

{'locn_start': 55, 'abc': '0002', 'value': ['line ', '0002', ' line 2'], 'locn_end': 71}

When I use Located from version 3 with the same pattern, I get the following output:

{'locn_start': 55, 'value': {'abc': '0002'}, 'locn_end': 71}

However, if I remove the named capture group from the pattern, as in the example below:

p.Literal('line ') + p.Regex(r'\d+') + p.SkipTo(p.LineEnd().suppress())

I get the same output as with locatedExpr:

{'locn_start': 55, 'value': ['line ', '0002', ' line 2'], 'locn_end': 71}

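However, I like having grouped information in my parse results, so I would like to know whether anyone can explain the difference between Located and locatedExpr.

In all of the cases above I am using parse_with_tabs.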

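I think you may be using as_dict() to view the contents of the parsed results. as_dict() does not show unnamed elements in the results, so once value contains a named item, its unnamed tokens disappear from the dict. Use the dump() method instead. Pyparsing's run_tests method uses dump() to display the parsed results: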

import pyparsing as p

tests = """
line 0002 line 2
"""

# Pattern with a named regex group
parser = p.Literal('line ') + p.Regex(r'(?P<abc>\d+)') + p.SkipTo(p.LineEnd().suppress())
p.Located(parser).run_tests(tests)
p.locatedExpr(parser).run_tests(tests)

# Same pattern without the named group
parser = p.Literal('line ') + p.Regex(r'\d+') + p.SkipTo(p.LineEnd().suppress())
p.Located(parser).run_tests(tests)
p.locatedExpr(parser).run_tests(tests)

This prints:

line 0002 line 2
[0, ['line ', '0002', 'line 2'], 16]
- locn_end: 16
- locn_start: 0
- value: ['line ', '0002', 'line 2']
  - abc: '0002'
[0]:
  0
[1]:
  ['line ', '0002', 'line 2']
  - abc: '0002'
[2]:
  16

line 0002 line 2
[[0, 'line ', '0002', 'line 2', 16]]
[0]:
  [0, 'line ', '0002', 'line 2', 16]
  - abc: '0002'
  - locn_end: 16
  - locn_start: 0
  - value: ['line ', '0002', 'line 2']

line 0002 line 2
[0, ['line ', '0002', 'line 2'], 16]
- locn_end: 16
- locn_start: 0
- value: ['line ', '0002', 'line 2']
[0]:
  0
[1]:
  ['line ', '0002', 'line 2']
[2]:
  16

line 0002 line 2
[[0, 'line ', '0002', 'line 2', 16]]
[0]:
  [0, 'line ', '0002', 'line 2', 16]
  - locn_end: 16
  - locn_start: 0
  - value: ['line ', '0002', 'line 2']

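The new Located class is more consistent in how it reports the parsed value, regardless of whether it contains any named items or regex groups: the matched tokens, together with any names they carry, always stay nested under value, whereas locatedExpr places a name such as abc alongside locn_start and locn_end.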
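To keep working with the grouped information under Located, the named group can be read through value. The following is a minimal sketch, assuming pyparsing 3 and the single sample line from the question rather than the full file; the variable names are only illustrative.

```python
import pyparsing as p

# Same pattern as in the question, including the named regex group "abc".
parser = (
    p.Literal("line ")
    + p.Regex(r"(?P<abc>\d+)")
    + p.SkipTo(p.LineEnd().suppress())
)

result = p.Located(parser).parse_string("line 0002 line 2")

# Located nests the matched tokens, and any names they carry, under "value".
print(result["locn_start"], result["locn_end"])
print(result["value"]["abc"])
print(result["value"].as_list())

# as_dict() keeps only named items, so the unnamed tokens drop out of "value".
print(result.as_dict())
```

Reading result["value"].as_list() alongside result["value"]["abc"] gives both the full token list and the named group, which as_dict() alone cannot show.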
