Strange early-EOF termination of a Forward() parser element



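After carefully reading and debugging several of the Forward() examples that ship with pyparsing, I stitched a few of their features together for the ISC Bind9/DHCP configuration files I need to parse:

  • Pushing/popping the "!" symbol onto exprStack
  • Forward()
  • Reusing pyparsing_common.ipv4_address

There is an EBNF (detailed on the Zytrax page, http://www.zytrax.com/books/dns/ch7/address_match_list.html) that I am struggling with: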

address_match_list = element ; [ element; ... ]
element = [!] (ip [/prefix] | key key-name | "acl_name" | { address_match_list } )

My latest (and best, though still failing) draft is:

element = Forward()
element <<= (
    # Hide the exclamation so we can do deeper parse cleaner w/o clutter of '!'
    (0, None) * Word('!') +
    # Might be nice to do a bit of lookahead for '.', ':', 'key', and '"'
    # | is matchFirst, not matchLongest
    # ^ is matchLongest
    (
        ZeroOrMore(
            (
                # Typical pattern "1.2.3.4/24;"
                (
                    Combine(
                        pyparsing_common.ipv4_address + '/' + Word(nums, max=3)
                    ) + ';'
                ) ^                                        # Start: '999.999.999.999/99'
                # Typical pattern "2.3.4.5;"
                (pyparsing_common.ipv4_address + ';') ^    # Start: '999.999.999.999'
                # Typical pattern "3210::1;"
                (pyparsing_common.ipv6_address + ';') ^    # Start: 'XXXX:'
                (Keyword('key') + Word(alphanums, max=63) + ';')
                # Start: 'key <key-varname>'
            )
        ) ^
        # Typical pattern "{ 1.2.3.4; };"
        ZeroOrMore('{' - element + '}' + ';')
    ).setParseAction(pushFirst)
).setParseAction(pushExclamation)

I ran element.runTests():

element.runTests('2.2.2.2; { 3.3.3.3; };')
2.2.2.2; { 3.3.3.3; };
^
FAIL: Expected end of text, found '{'  (at char 9), (line:1, col:10)

The unexpected "Expected end of text" right after the first element matches is what stops the whole parser.
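For context, runTests() parses each line with parseAll=True by default, so any input left over once the expression stops matching is reported as "Expected end of text". A minimal sketch of that behaviour, using a deliberately simplified one-element parser (one_element and text are illustrative names, not part of the grammar in this question):

```python
from pyparsing import ParseException, Suppress, pyparsing_common

# Illustrative stand-in: exactly one IPv4 address followed by ';'.
one_element = pyparsing_common.ipv4_address + Suppress(';')

text = '2.2.2.2; 3.3.3.3;'

# Without parseAll the parser stops quietly after the first match
# and ignores the rest of the input.
print(one_element.parseString(text).asList())

# With parseAll=True (the runTests() default) the leftover input
# is reported as a ParseException.
try:
    one_element.parseString(text, parseAll=True)
except ParseException as err:
    print('FAIL:', err)
```

So the message does not mean the input ended early; it means the expression finished matching before the input did.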

Here is a standalone snippet that demonstrates the problem:

#!/usr/bin/env python3
# EBNF detailed at http://www.zytrax.com/books/dns/ch7/address_match_list.html
from pyparsing import *
exprStack = []
def pushFirst(strg, loc, toks):
    exprStack.append(toks[0])
def pushExclamation(strg, loc, toks):
    for t in toks:
        if t == '!':
            exprStack.append('!')
        else:
            break
# Address_Match_List (AML)
# This AML combo is ordered very carefully so that longest pattern are tried firstly
#
# EBNF reiterated here:
#
#    address_match_list = element ; [ element; ... ]
#
#    element = [!] (ip [/prefix] | key key-name | "acl_name" | { address_match_list } )
#
element = Forward()
element <<= (
    # Hide the exclamation so we can do deeper parse cleaner w/o clutter of '!'
    (0, None) * Word('!') +
    # Might be nice to do a bit of lookahead for '.', ':', 'key', and '"'
    # | is matchFirst, not matchLongest
    # ^ is matchLongest
    (
        ZeroOrMore(
            (
                # Typical pattern "1.2.3.4/24;"
                (
                    Combine(
                        pyparsing_common.ipv4_address + '/' + Word(nums, max=3)
                    ) + ';'
                ) ^                                        # Start: '999.999.999.999/99'
                # Typical pattern "2.3.4.5;"
                (pyparsing_common.ipv4_address + ';') ^    # Start: '999.999.999.999'
                # Typical pattern "3210::1;"
                (pyparsing_common.ipv6_address + ';') ^    # Start: 'XXXX:'
                (
                    Keyword('key') + Word(alphanums, max=63) + ';'
                )                                          # Start: 'key <key-variable-name>'
            )
        ) ^
        # Typical pattern "{ 1.2.3.4; };"
        ZeroOrMore('{' + element + '}' + ';')
    ).setParseAction(pushFirst)
).setParseAction(pushExclamation)
element.setName('"element ;"')
element.setDebug()
result = element.runTests("""
123.123.123.123;
!210.210.210.210;
{ 234.234.234.234 };
2.2.2.2; { 3.3.3.3; };
{ 4.4.4.4; }; 5.5.5.5;
{ 6.6.6.6; 7.7.7.7; }; 8.8.8.8;
!{ 9.9.9.9; 10.10.10.10; };
12.12.12.12; !13.13.13.13;
14.14.14.14/15; 16.16.16.16; key MySha512Key;
17.17.17.17/18; { 19.19.19.19; }; key YourSha512Key; }
""")
import pprint
pp = pprint.PrettyPrinter(indent=4)
print("Result: ")
pp.pprint(result)

Test run with valid syntax

Full element.runTests() output:


123.123.123.123;
['123.123.123.123', ';']
!210.210.210.210;
['!', '210.210.210.210', ';']
{ 234.234.234.234 };
^
FAIL: exception raised in parse action  (at char 0), (line:1, col:1)
2.2.2.2; { 3.3.3.3; };
^
FAIL: Expected end of text, found '{'  (at char 9), (line:1, col:10)
{ 4.4.4.4; }; 5.5.5.5;
^
FAIL: Expected end of text, found '5'  (at char 14), (line:1, col:15)
{ 6.6.6.6; 7.7.7.7; }; 8.8.8.8;
^
FAIL: Expected end of text, found '8'  (at char 23), (line:1, col:24)
!{ 9.9.9.9; 10.10.10.10; };
['!', '{', '9.9.9.9', ';', '10.10.10.10', ';', '}', ';']
12.12.12.12; !13.13.13.13;
^
FAIL: Expected end of text, found '!'  (at char 13), (line:1, col:14)
14.14.14.14/15; 16.16.16.16; key MySha512Key;
['14.14.14.14/15', ';', '16.16.16.16', ';', 'key', 'MySha512Key', ';']
17.17.17.17/18; { 19.19.19.19; }; key YourSha512Key; }
^
FAIL: Expected end of text, found '{'  (at char 16), (line:1, col:17)

The pretty-printed result is:

Result: 
(   False,
[   ('123.123.123.123;', (['123.123.123.123', ';'], {})),
('!210.210.210.210;', (['!', '210.210.210.210', ';'], {})),
(   '{ 234.234.234.234 };',
exception raised in parse action  (at char 0), (line:1, col:1)),
(   '2.2.2.2; { 3.3.3.3; };',
Expected end of text, found '{'  (at char 9), (line:1, col:10)),
(   '{ 4.4.4.4; }; 5.5.5.5;',
Expected end of text, found '5'  (at char 14), (line:1, col:15)),
(   '{ 6.6.6.6; 7.7.7.7; }; 8.8.8.8;',
Expected end of text, found '8'  (at char 23), (line:1, col:24)),
(   '!{ 9.9.9.9; 10.10.10.10; };',
(['!', '{', '9.9.9.9', ';', '10.10.10.10', ';', '}', ';'], {})),
(   '12.12.12.12; !13.13.13.13;',
Expected end of text, found '!'  (at char 13), (line:1, col:14)),
(   '14.14.14.14/15; 16.16.16.16; key MySha512Key;',
(['14.14.14.14/15', ';', '16.16.16.16', ';', 'key', 'MySha512Key', ';'], {})),
(   '17.17.17.17/18; { 19.19.19.19; }; key YourSha512Key; }',
Expected end of text, found '{'  (at char 16), (line:1, col:17))])
Process finished with exit code 0

I am still slowly debugging the 234.234.234.234; and 3.3.3.3; cases, so I am hoping someone will glance at this and say "there it is" while I work through it.
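One likely source of the "exception raised in parse action" failures: pyparsing reports an IndexError raised inside a parse action under exactly that message, and pushFirst indexes toks[0] even when the ZeroOrMore it is attached to matched nothing. A minimal sketch of the effect and of a guard, assuming a simplified digits-only grammar (push_first_guarded, push_first_unguarded and the two parsers are hypothetical names):

```python
from pyparsing import ParseException, Word, ZeroOrMore, nums

stack = []

def push_first_unguarded(strg, loc, toks):
    # Raises IndexError when ZeroOrMore matched zero items.
    stack.append(toks[0])

def push_first_guarded(strg, loc, toks):
    # Only push when there is something to push.
    if toks:
        stack.append(toks[0])

unguarded = ZeroOrMore(Word(nums)).setParseAction(push_first_unguarded)
guarded = ZeroOrMore(Word(nums)).setParseAction(push_first_guarded)

# 'abc' contains no digits, so ZeroOrMore matches an empty token list.
try:
    unguarded.parseString('abc')
except ParseException as err:
    print('unguarded:', err)

print('guarded:', guarded.parseString('abc').asList())
print('guarded:', guarded.parseString('7 8').asList(), 'stack =', stack)
```

The guard only hides the symptom; an element that is allowed to match nothing at all is usually a sign that the repetition sits at the wrong level of the grammar.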

Test run with intentionally failing syntax

UPDATE: added test code with intentionally invalid content:

result = element.runTests("""
20
!
key;
21;
{ 23 };
{ 24.24.24.24 };
{ 25.25.25.25; }
26.26.26.26
27.27.27.27; key
28.28.28.28; { key }
29.29.29.29, 30.30.30.30;
{ 31.31.31.31; 32.32.32.32; }
{ 33.33.33.33; 34.34.34.34; }; 35;
""", failureTests=True)
print("Result of failed contents: ")
pp.pprint(result)

Test run of the failing content (pretty-printed):

Result of failed contents: 
(   True,
[   ('20', exception raised in parse action  (at char 0), (line:1, col:1)),
('!', exception raised in parse action  (at char 0), (line:1, col:1)),
(   'key;',
exception raised in parse action  (at char 0), (line:1, col:1)),
('21;', exception raised in parse action  (at char 0), (line:1, col:1)),
(   '{ 23 };',
exception raised in parse action  (at char 0), (line:1, col:1)),
(   '{ 24.24.24.24 };',
exception raised in parse action  (at char 0), (line:1, col:1)),
(   '{ 25.25.25.25; }',
exception raised in parse action  (at char 0), (line:1, col:1)),
(   '26.26.26.26',
exception raised in parse action  (at char 0), (line:1, col:1)),
(   '27.27.27.27; key',
Expected end of text, found 'k'  (at char 13), (line:1, col:14)),
(   '28.28.28.28; { key }',
Expected end of text, found '{'  (at char 13), (line:1, col:14)),
(   '29.29.29.29, 30.30.30.30;',
exception raised in parse action  (at char 0), (line:1, col:1)),
(   '{ 31.31.31.31; 32.32.32.32; }',
exception raised in parse action  (at char 0), (line:1, col:1)),
(   '{ 33.33.33.33; 34.34.34.34; }; 35;',
Expected end of text, found '3'  (at char 31), (line:1, col:32))])
Process finished with exit code 0
Match "element ;" at loc 0(1,1)
Exception raised:exception raised in parse action  (at char 0), (line:1, col:1)
20
^
FAIL: exception raised in parse action  (at char 0), (line:1, col:1)
Match "element ;" at loc 0(1,1)
Exception raised:exception raised in parse action  (at char 0), (line:1, col:1)
!
^
FAIL: exception raised in parse action  (at char 0), (line:1, col:1)
Match "element ;" at loc 0(1,1)
Exception raised:exception raised in parse action  (at char 0), (line:1, col:1)
key;
^
FAIL: exception raised in parse action  (at char 0), (line:1, col:1)
Match "element ;" at loc 0(1,1)
Exception raised:exception raised in parse action  (at char 0), (line:1, col:1)
21;
^
FAIL: exception raised in parse action  (at char 0), (line:1, col:1)
Match "element ;" at loc 0(1,1)
Match "element ;" at loc 1(1,2)
Matched "element ;" -> []
Exception raised:exception raised in parse action  (at char 0), (line:1, col:1)
{ 23 };
^
FAIL: exception raised in parse action  (at char 0), (line:1, col:1)
Match "element ;" at loc 0(1,1)
Match "element ;" at loc 1(1,2)
Matched "element ;" -> []
Exception raised:exception raised in parse action  (at char 0), (line:1, col:1)
{ 24.24.24.24 };
^
FAIL: exception raised in parse action  (at char 0), (line:1, col:1)
Match "element ;" at loc 0(1,1)
Match "element ;" at loc 1(1,2)
Matched "element ;" -> ['25.25.25.25', ';']
Exception raised:exception raised in parse action  (at char 0), (line:1, col:1)
{ 25.25.25.25; }
^
FAIL: exception raised in parse action  (at char 0), (line:1, col:1)
Match "element ;" at loc 0(1,1)
Exception raised:exception raised in parse action  (at char 0), (line:1, col:1)
26.26.26.26
^
FAIL: exception raised in parse action  (at char 0), (line:1, col:1)
Match "element ;" at loc 0(1,1)
Matched "element ;" -> ['27.27.27.27', ';']
27.27.27.27; key
^
FAIL: Expected end of text, found 'k'  (at char 13), (line:1, col:14)
Match "element ;" at loc 0(1,1)
Matched "element ;" -> ['28.28.28.28', ';']
28.28.28.28; { key }
^
FAIL: Expected end of text, found '{'  (at char 13), (line:1, col:14)
Match "element ;" at loc 0(1,1)
Exception raised:exception raised in parse action  (at char 0), (line:1, col:1)
29.29.29.29, 30.30.30.30;
^
FAIL: exception raised in parse action  (at char 0), (line:1, col:1)
Match "element ;" at loc 0(1,1)
Match "element ;" at loc 1(1,2)
Matched "element ;" -> ['31.31.31.31', ';', '32.32.32.32', ';']
Exception raised:exception raised in parse action  (at char 0), (line:1, col:1)
{ 31.31.31.31; 32.32.32.32; }
^
FAIL: exception raised in parse action  (at char 0), (line:1, col:1)
Match "element ;" at loc 0(1,1)
Match "element ;" at loc 1(1,2)
Matched "element ;" -> ['33.33.33.33', ';', '34.34.34.34', ';']
Match "element ;" at loc 1(1,2)
Matched "element ;" -> ['33.33.33.33', ';', '34.34.34.34', ';']
Matched "element ;" -> ['{', '33.33.33.33', ';', '34.34.34.34', ';', '}', ';']
{ 33.33.33.33; 34.34.34.34; }; 35;
^
FAIL: Expected end of text, found '3'  (at char 31), (line:1, col:

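UPDATE: Following the answer provided by Paul McG, I updated the snippet with his suggestions.

Before going further, I found two more mistakes across the two test runs (valid and invalid syntax); both were in the valid-syntax test input (a missing ';' inside the braces of the third line and a stray trailing '}' on the last line). I have updated the test snippet to: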

result = element.runTests("""
123.123.123.123;
!210.210.210.210;
{ 234.234.234.234; };
2.2.2.2; { 3.3.3.3; };
{ 4.4.4.4; }; 5.5.5.5;
{ 6.6.6.6; 7.7.7.7; }; 8.8.8.8;
!{ 9.9.9.9; 10.10.10.10; };
12.12.12.12; !13.13.13.13;
14.14.14.14/15; 16.16.16.16; key MySha512Key;
17.17.17.17/18; { 19.19.19.19; }; key YourSha512Key;
""")
print("Result of valid contents: ")
pp.pprint(result)

Now the test results have narrowed down to a single failing case:

Result of valid contents: 
(   False,
[   ('123.123.123.123;', (['123.123.123.123', ';'], {})),
('!210.210.210.210;', (['!', '210.210.210.210', ';'], {})),
(   '{ 234.234.234.234; };',
([(['{', '234.234.234.234', ';', '}', ';'], {})], {})),
(   '2.2.2.2; { 3.3.3.3; };',
(['2.2.2.2', ';', (['{', '3.3.3.3', ';', '}', ';'], {})], {})),
(   '{ 4.4.4.4; }; 5.5.5.5;',
([(['{', '4.4.4.4', ';', '}', ';'], {}), '5.5.5.5', ';'], {})),
(   '{ 6.6.6.6; 7.7.7.7; }; 8.8.8.8;',
([(['{', '6.6.6.6', ';', '7.7.7.7', ';', '}', ';'], {}), '8.8.8.8', ';'], {})),
(   '!{ 9.9.9.9; 10.10.10.10; };',
(['!', (['{', '9.9.9.9', ';', '10.10.10.10', ';', '}', ';'], {})], {})),
(   '12.12.12.12; !13.13.13.13;',
Expected end of text, found '!'  (at char 13), (line:1, col:14)),
(   '14.14.14.14/15; 16.16.16.16; key MySha512Key;',
(['14.14.14.14/15', ';', '16.16.16.16', ';', 'key', 'MySha512Key', ';'], {})),
(   '17.17.17.17/18; { 19.19.19.19; }; key YourSha512Key;',
(['17.17.17.17/18', ';', (['{', '19.19.19.19', ';', '}', ';'], {}), 'key', 'YourSha512Key', ';'], {}))])

This is a big step forward.

I noticed the following key changes:

  • delimitedList() was introduced
  • the alternatives were merged into a single ZeroOrMore inside the Forward()
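The second change matters because, in the first draft, the two ZeroOrMore expressions were joined with ^, so the parser could match a run of simple elements or a run of braced elements, but never a mix of both. A reduced sketch of that difference, with single-letter literals standing in for the two kinds of element (an illustration only, not the grammar itself):

```python
from pyparsing import Literal, ParseException, ZeroOrMore

a = Literal('a')   # stands in for a simple element such as "1.2.3.4;"
b = Literal('b')   # stands in for a braced element such as "{ ... };"

either_run = ZeroOrMore(a) ^ ZeroOrMore(b)   # a run of a's OR a run of b's
mixed_run = ZeroOrMore(a ^ b)                # any mix of a's and b's

print(mixed_run.parseString('aab', parseAll=True).asList())

try:
    # The longest alternative is the run of a's; the trailing 'b' is left over.
    either_run.parseString('aab', parseAll=True)
except ParseException as err:
    print('FAIL:', err)
```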

That leaves one bug, related to an exclamation mark on an element that is not the first one in the list.

import pprint
pp = pprint.PrettyPrinter(indent=4)
result = element.runTests("""
12.12.12.12; !13.13.13.13;
""")
print("Result of valid contents: ")
pp.pprint(result)

The test result is:

Match "element ;" at loc 0(1,1)
Matched "element ;" -> ['12.12.12.12', ';']
12.12.12.12; !13.13.13.13;
^
FAIL: Expected end of text, found '!'  (at char 13), (line:1, col:14)
Result of valid contents: 
(   False,
[   (   '12.12.12.12; !13.13.13.13;',
Expected end of text, found '!'  (at char 13), (line:1, col:14))])
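The trace shows that the "!" prefix is only tried once, in front of the whole repetition, so a "!" on a later element is never consumed. A minimal sketch of the two placements, using digits plus ';' as a simplified stand-in for an address (item, bang, prefix_outside and prefix_inside are hypothetical names):

```python
from pyparsing import Literal, Optional, ParseException, Word, ZeroOrMore, nums

# Simplified stand-in for one address element: digits followed by ';'.
item = Word(nums) + Literal(';')
bang = Literal('!')

# '!' is allowed once, in front of the whole list.
prefix_outside = Optional(bang) + ZeroOrMore(item)
# '!' is allowed in front of every item.
prefix_inside = ZeroOrMore(Optional(bang) + item)

text = '12; !13;'

print(prefix_inside.parseString(text, parseAll=True).asList())

try:
    prefix_outside.parseString(text, parseAll=True)
except ParseException as err:
    print('FAIL:', err)
```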

Final run of the working solution

In the final test code I took Paul McG's suggestion and moved the exclamation-mark parser element inside the ZeroOrMore, as shown here:

# Address_Match_List (AML)
# This AML combo is ordered very carefully so that longest pattern are tried firstly
#
# EBNF reiterated here:
#
#    address_match_list = element ; [ element; ... ]
#
#    element = [!] (ip [/prefix] | key key-name | "acl_name" | { address_match_list } )
#
element = Forward()
element <<= (
    # Might be nice to do a bit of lookahead for '.', ':', 'key', and '"'
    # | is matchFirst, not matchLongest
    # ^ is matchLongest
    ZeroOrMore(
        # Hide the exclamation so we can do deeper parse cleaner w/o clutter of '!'
        (0, None) * Word('!') +
        (
            (
                (Combine(pyparsing_common.ipv4_address + '/' + Word(nums, max=3)) + ';')
                ^ (pyparsing_common.ipv4_address + ';')
                ^ (pyparsing_common.ipv6_address + ';')
                ^ (Keyword('key') + Word(alphanums, max=63) + ';')
                ^ Keyword('acl_name')
            ).setParseAction(pushFirst)
            ^ Group('{' - delimitedList(element, delim=';') + '}' + ';')
        )
    )
).setParseAction(pushExclamation)
element.setName('"element ;"')
element.setDebug()
import pprint
pp = pprint.PrettyPrinter(indent=4)
result = element.runTests("""
123.123.123.123;
!210.210.210.210;
{ 234.234.234.234; };
2.2.2.2; { 3.3.3.3; };
{ 4.4.4.4; }; 5.5.5.5;
{ 6.6.6.6; 7.7.7.7; }; 8.8.8.8;
!{ 9.9.9.9; 10.10.10.10; };
12.12.12.12; !13.13.13.13;
14.14.14.14/15; 16.16.16.16; key MySha512Key;
17.17.17.17/18; { 19.19.19.19; }; key YourSha512Key;
""")
print("Result of valid contents: ")
pp.pprint(result)

Running the test above gives the following results for the valid-syntax content:

Result of valid contents: 
(   True,
[   ('123.123.123.123;', (['123.123.123.123', ';'], {})),
('!210.210.210.210;', (['!', '210.210.210.210', ';'], {})),
(   '{ 234.234.234.234; };',
([(['{', '234.234.234.234', ';', '}', ';'], {})], {})),
(   '2.2.2.2; { 3.3.3.3; };',
(['2.2.2.2', ';', (['{', '3.3.3.3', ';', '}', ';'], {})], {})),
(   '{ 4.4.4.4; }; 5.5.5.5;',
([(['{', '4.4.4.4', ';', '}', ';'], {}), '5.5.5.5', ';'], {})),
(   '{ 6.6.6.6; 7.7.7.7; }; 8.8.8.8;',
([(['{', '6.6.6.6', ';', '7.7.7.7', ';', '}', ';'], {}), '8.8.8.8', ';'], {})),
(   '!{ 9.9.9.9; 10.10.10.10; };',
(['!', (['{', '9.9.9.9', ';', '10.10.10.10', ';', '}', ';'], {})], {})),
(   '12.12.12.12; !13.13.13.13;',
(['12.12.12.12', ';', '!', '13.13.13.13', ';'], {})),
(   '14.14.14.14/15; 16.16.16.16; key MySha512Key;',
(['14.14.14.14/15', ';', '16.16.16.16', ';', 'key', 'MySha512Key', ';'], {})),
(   '17.17.17.17/18; { 19.19.19.19; }; key YourSha512Key;',
(['17.17.17.17/18', ';', (['{', '19.19.19.19', ';', '}', ';'], {}), 'key', 'YourSha512Key', ';'], {}))])

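Wow. The answer below solved it. I need to dig a bit more so that I can better summarize the "why".

Now it should be smooth sailing to fill in the rest of the ISC-style configuration.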

This might get you closer, but I'm not sure it handles the stack part correctly.

element = Forward()
element <<= (
    # Hide the exclamation so we can do deeper parse cleaner w/o clutter of '!'
    (0, None) * Word('!') +
    # Might be nice to do a bit of lookahead for '.', ':', 'key', and '"'
    # | is matchFirst, not matchLongest
    # ^ is matchLongest
    ZeroOrMore(
        (
            (Combine(pyparsing_common.ipv4_address + '/' + Word(nums, max=3)) + ';')
            ^ (pyparsing_common.ipv4_address + ';')
            ^ (pyparsing_common.ipv6_address + ';')
            ^ (Keyword('key') + Word(alphanums, max=63) + ';')
            ^ Keyword('acl_name')
        ).setParseAction(pushFirst)
        ^ Group('{' - delimitedList(element, delim=';') + '}' + ';')
    )
).setParseAction(pushExclamation)

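I have started formatting my long expressions with the operator at the start of the next line, which feels more readable to me. I guessed that you would want the elements inside {} kept in their own subgroup, so I wrapped them in Group. If you want to cut the clutter, it looks like all of those semicolons could be suppressed, provided you structure your results appropriately.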
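A rough sketch of what suppressing the punctuation could look like, under some assumptions: the parse actions and the "acl_name" alternative are left out, the list rule is split out under the hypothetical name aml, and the '!' prefix is a single Optional rather than a repeated Word. It is meant as an illustration of the idea, not a drop-in replacement.

```python
from pyparsing import (Combine, Forward, Group, Keyword, Optional, Suppress,
                       Word, ZeroOrMore, alphanums, nums, pyparsing_common)

# Punctuation is matched but kept out of the results.
SEMI, LBRACE, RBRACE = map(Suppress, ';{}')

ip4 = pyparsing_common.ipv4_address
ip4_prefix = Combine(ip4 + '/' + Word(nums, max=3))
key_ref = Group(Keyword('key') + Word(alphanums, max=63))

element = Forward()
# address_match_list = element ; [ element; ... ]
aml = ZeroOrMore(element + SEMI)
# element = [!] (ip [/prefix] | key key-name | { address_match_list })
element <<= Optional('!') + (
    ip4_prefix
    | ip4
    | pyparsing_common.ipv6_address
    | key_ref
    | Group(LBRACE + aml + RBRACE)
)

sample = '17.17.17.17/18; { 19.19.19.19; }; key YourSha512Key;'
print(aml.parseString(sample, parseAll=True).asList())
```

With the semicolons and braces suppressed, the nesting of the Group results alone carries the structure, which tends to make later processing simpler.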
