Why does TextX ignore \n in string literals but not in regular expressions?



TL;DR: This issue will be fixed in version 3.0 of TextX. The workaround is to use regex matches for escaped (\) characters such as \n.

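Full question: Using TextX, I am parsing a home-grown markup language in which paragraphs and line breaks are significant. I think I am missing something fundamental when trying to match newlines: why do "\n" and "\n\n" not work, while their regex counterparts /\n/ and /\n\n/ do?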

Note: whitespace is redefined at the parser level to exclude \n, using ws=" \t".

import textx as tx

grammar = r"""
Root:
    content*=Content
;
Content:
    Text | ParagraphBreak | LineBreak
;
ParagraphBreak:
    paragraphbreak="\n\n"
    // paragraphbreak=/\n\n/
;
LineBreak:
    linebreak="\n"  // Will cause parsing error
    // linebreak=/\n/  // Will parse fine
;
Text[noskipws]:  // All text valid
    text=/[^\n]*/
;
"""

# Redefine whitespace so that newlines are not skipped
parser = tx.metamodel_from_str(grammar, ws=" \t")
source = "Line.\nBreak.\n\n"
parsed_source = parser.model_from_str(source)
print(parsed_source.content)

When I run the code above on my system, using

  • Python 3.10.1
  • Poetry version 1.1.12, from poetry.lock:
    • [[package]] name = "arpeggio", version = "1.10.2" ..., python-versions = "*"
    • [[package]] name = "textx", version = "2.3.0" ..., python-versions = "*", [package.dependencies] Arpeggio = ">1.9.0"

I get the following result:

With path root: /Users/[redacted]/Library/Caches/pypoetry/virtualenvs

File ".../[redacted]-py3.10/lib/python3.10/site-packages/textx/model.py", line 291, in _parse
return self.parser_model.parse(self)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 291, in parse
result = self._parse(parser)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 370, in _parse
result = e.parse(parser)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 789, in parse
result = self._parse(parser)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 945, in _parse
parser._nm_raise(self, c_pos, parser)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 1718, in _nm_raise
raise self.nm
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 485, in _parse
result = p(parser)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 291, in parse
result = self._parse(parser)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 423, in _parse
parser._nm_raise(self, c_pos, parser)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 1718, in _nm_raise
raise self.nm
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 409, in _parse
result = e.parse(parser)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 291, in parse
result = self._parse(parser)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 370, in _parse
result = e.parse(parser)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 291, in parse
result = self._parse(parser)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 370, in _parse
result = e.parse(parser)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 789, in parse
result = self._parse(parser)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 898, in _parse
parser._nm_raise(self, c_pos, parser)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 1718, in _nm_raise
raise self.nm
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 409, in _parse
result = e.parse(parser)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 291, in parse
result = self._parse(parser)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 370, in _parse
result = e.parse(parser)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 291, in parse
result = self._parse(parser)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 370, in _parse
result = e.parse(parser)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 789, in parse
result = self._parse(parser)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 898, in _parse
parser._nm_raise(self, c_pos, parser)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 1718, in _nm_raise
raise self.nm
arpeggio.NoMatch: Expected '\n\n' or '\n' or EOF at position (1, 6) => 'Line.* Break.  '.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/[redacted]/scratchpad/TextX/linebreaks.py", line 31, in <module>
parsed_source = parser.model_from_str(source)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/textx/metamodel.py", line 615, in model_from_str
model = self._parser_blueprint.clone().get_model_from_str(
File ".../[redacted]-py3.10/lib/python3.10/site-packages/textx/model.py", line 332, in get_model_from_str
self.parse(model_str, file_name=file_name)
File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 1516, in parse
self.parse_tree = self._parse()
File ".../[redacted]-py3.10/lib/python3.10/site-packages/textx/model.py", line 294, in _parse
raise TextXSyntaxError(message=text(e),
textx.exceptions.TextXSyntaxError: None:1:6: error: Expected '\n\n' or '\n' or EOF at position (1, 6) => 'Line.* Break.  '.

I expected the same result as with the regex version, namely:

[<textx:Text instance at 0x10129bc40>, <textx:LineBreak instance at 0x101298040>, <textx:Text instance at 0x101298130>, <textx:ParagraphBreak instance at 0x10129aec0>]

This is an issue that has been resolved in the current development version. See this textX issue.

The fix will be part of the upcoming textX 3.0 release.
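To see why the two forms can behave differently, here is a small stand-alone sketch using only Python's standard library. It does not call textX. It assumes, as the behaviour in the question suggests, that a string match in textX 2.x compares the grammar text verbatim without processing escape sequences, while a regex match hands the pattern to the regex engine. The variable names are purely illustrative.

```python
import re

# Inside a raw string (like the r"""...""" grammar), backslash-n stays as
# two characters: a backslash followed by the letter n.
raw_literal = r"\n"
newline = "\n"

source = "Line.\nBreak.\n\n"

# Verbatim comparison without escape processing (the assumed behaviour of a
# string match): it looks for a backslash followed by 'n', which the source
# does not contain.
literal_found = raw_literal in source

# The regex engine interprets the two-character pattern \n as a newline.
regex_found = re.search(raw_literal, source) is not None

print(len(raw_literal), len(newline))  # 2 1
print(literal_found, regex_found)      # False True
```

Under that assumption, this is consistent with /\n/ parsing fine while "\n" fails at position (1, 6), which is the first newline in the source.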
