C# ANTLR4 DefaultErrorStrategy or custom error listener does not catch unrecognized characters



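Strangely, DefaultErrorStrategy does nothing to catch unrecognized characters in the input stream. I tried a custom error strategy, a custom error listener, and BailErrorStrategy, with no luck.

My grammar: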

grammar Polynomial;
parse           : canonical EOF
                ;
canonical       : polynomial+                                     #canonicalPolynom
                | polynomial+ EQUAL polynomial+                   #equality
                ;
polynomial      : SIGN? '(' (polynomial)* ')'                     #parens
                | monomial                                        #monom
                ;
monomial        : SIGN? coefficient? VAR ('^' INT)?               #addend
                | SIGN? coefficient                               #number
                ;
coefficient             : INT | DEC;
INT                     : ('0'..'9')+;
DEC                     : INT '.' INT;
VAR                     : [a-z]+;
SIGN                    : '+' | '-';
EQUAL                   : '=';
WHITESPACE              : (' '|'\t')+ -> skip;

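I am feeding it the input 23*44=12@1234

I expect my parser to throw a mismatched token exception, or any kind of exception, for the characters * and @, which are not defined in my grammar.

Instead, my parser simply skips * and @ and walks the tree as if they were not there.

My handler function, where I invoke the lexer, the parser, and so on: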

private static (IParseTree tree, string parseErrorMessage) TryParseExpression(string expression)
{
    ICharStream stream = CharStreams.fromstring(expression);
    ITokenSource lexer = new PolynomialLexer(stream);
    ITokenStream tokens = new CommonTokenStream(lexer);
    PolynomialParser parser = new PolynomialParser(tokens);
    //parser.ErrorHandler = new PolynomialErrorStrategy(); -> I tried custom error strategy
    //parser.RemoveErrorListeners();
    //parser.AddErrorListener(new PolynomialErrorListener()); -> I tried custom error listener
    parser.BuildParseTree = true;
    try
    {
        var tree = parser.canonical();
        return (tree, string.Empty);
    }
    catch (RecognitionException re)
    {
        return (null, re.Message);
    }
    catch (ParseCanceledException pce)
    {
        return (null, pce.Message);
    }
}

I tried adding a custom error listener.

public class PolynomialErrorListener : BaseErrorListener
{
    private const string Eof = "EOF";

    public override void SyntaxError(TextWriter output, IRecognizer recognizer, IToken offendingSymbol, int line, int charPositionInLine, string msg,
        RecognitionException e)
    {
        if (msg.Contains(Eof))
        {
            throw new ParseCanceledException($"{GetSyntaxErrorHeader(charPositionInLine)}. Missing an expression after '=' sign");
        }
        if (e is NoViableAltException || e is InputMismatchException)
        {
            throw new ParseCanceledException($"{GetSyntaxErrorHeader(charPositionInLine)}. Probably, not closed operator");
        }
        throw new ParseCanceledException($"{GetSyntaxErrorHeader(charPositionInLine)}. {msg}");
    }

    private static string GetSyntaxErrorHeader(int errorPosition)
    {
        return $"Expression is invalid. Input is not valid at {--errorPosition} position";
    }
}

After that, I tried implementing a custom error strategy.

public class PolynomialErrorStrategy : DefaultErrorStrategy
{
    public override void ReportError(Parser recognizer, RecognitionException e)
    {
        throw e;
    }

    public override void Recover(Parser recognizer, RecognitionException e)
    {
        for (ParserRuleContext context = recognizer.Context; context != null; context = (ParserRuleContext) context.Parent) {
            context.exception = e;
        }
        throw new ParseCanceledException(e);
    }

    public override IToken RecoverInline(Parser recognizer)
    {
        InputMismatchException e = new InputMismatchException(recognizer);
        for (ParserRuleContext context = recognizer.Context; context != null; context = (ParserRuleContext) context.Parent) {
            context.exception = e;
        }
        throw new ParseCanceledException(e);
    }

    protected override void ReportInputMismatch(Parser recognizer, InputMismatchException e)
    {
        string msg = "mismatched input " + GetTokenErrorDisplay(e.OffendingToken);
        // msg += " expecting one of " + e.GetExpectedTokens().ToString(recognizer.());
        RecognitionException ex = new RecognitionException(msg, recognizer, recognizer.InputStream, recognizer.Context);
        throw ex;
    }

    protected override void ReportMissingToken(Parser recognizer)
    {
        BeginErrorCondition(recognizer);
        IToken token = recognizer.CurrentToken;
        IntervalSet expecting = GetExpectedTokens(recognizer);
        string msg = "missing " + expecting.ToString() + " at " + GetTokenErrorDisplay(token);
        throw new RecognitionException(msg, recognizer, recognizer.InputStream, recognizer.Context);
    }
}

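Is there a flag I forgot to set on the parser, or is my grammar incorrect?

Interestingly, I use the ANTLR plugin in my IDE, and when I test my grammar there, the plugin correctly reports line 1:2 token recognition error at: '*'

Full source code: https://github.com/EvgeniyZ/PolynomialCanonicForm

I am using the ANTLR 4.8 complete .jar

Edit

I tried adding this rule to the grammar: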

parse           : canonical EOF
;

Still no luck.

What happens if you do this:

parse
: canonical EOF
;

and invoke that rule:

var tree = parser.parse();

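By adding the EOF token (end of input), you force the parser to consume all tokens, which should produce an error whenever the parser cannot handle them properly.

"Interestingly, I use the ANTLR plugin in my IDE, and when I test my grammar there, the plugin correctly reports line 1:2 token recognition error at: '*'"

That is what the lexer emits on the std.err stream. The lexer only reports this warning and then carries on. It drops those characters, so they never reach the parser, and the parser's error strategy and error listeners never see them. If you add the following lines at the end of your lexer rules: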

// Fallback rule: matches any single character if not matched by another lexer rule
UNKNOWN : . ;

then the * and @ characters will be sent to the parser as UNKNOWN tokens, which will then cause recognition errors.
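Because the warning comes from the lexer, another option is to attach an error listener to the lexer itself rather than to the parser. The following is only a minimal sketch, not a tested solution. It assumes the C# runtime's lexer listener interface is IAntlrErrorListener<int> and that its SyntaxError method takes the same TextWriter-first parameter list as the parser listener shown in the question. ThrowingLexerErrorListener and LexerSetup.CreateStrictLexer are hypothetical names introduced here for illustration.

```csharp
using System.IO;
using Antlr4.Runtime;
using Antlr4.Runtime.Misc;

// Hypothetical listener: turns every lexer "token recognition error"
// into an exception instead of a message on stderr.
// Assumption: lexer listeners are typed on int (a character code),
// whereas parser listeners are typed on IToken.
public class ThrowingLexerErrorListener : IAntlrErrorListener<int>
{
    public void SyntaxError(TextWriter output, IRecognizer recognizer, int offendingSymbol,
        int line, int charPositionInLine, string msg, RecognitionException e)
    {
        throw new ParseCanceledException($"line {line}:{charPositionInLine} {msg}");
    }
}

public static class LexerSetup
{
    // Hypothetical helper. The lexer is kept as PolynomialLexer (not ITokenSource)
    // so that the listener methods inherited from the recognizer base class are reachable.
    public static PolynomialLexer CreateStrictLexer(string expression)
    {
        var lexer = new PolynomialLexer(CharStreams.fromstring(expression));
        lexer.RemoveErrorListeners();                              // drop the default console listener
        lexer.AddErrorListener(new ThrowingLexerErrorListener());  // fail fast on unknown characters
        return lexer;
    }
}
```

With this in place, TryParseExpression could build its CommonTokenStream from LexerSetup.CreateStrictLexer(expression). The existing catch block for ParseCanceledException should then receive the error for an input such as 23*44=12@1234, since the token stream pulls tokens from the lexer while the parser runs.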
