Different token names for non-parameterized and parameterized statements, or how to jump back to a previous token using RuleLexer

How can I get different token names for the following examples:

#someNameAttribute //where #someNameAttribute should be assigned to IDENTIFIER lexer rule
#someNameAttribute("2a3a796e-9870-4b88-9f2d-383eb9566613", 10) // where #someNameAttribute should be assigned to PARAMETERIZED_IDENTIFIER, since it is followed by parentheses

This is my current grammar (but it always assigns IDENTIFIER):

grammar Rule;
ruleExpression
: identifierExpression EOF | parameterizedIdentifierExpression EOF
;
identifierExpression
: IDENTIFIER
;
parameterizedIdentifierExpression
: PIDENTIFIER LPAREN UUID DELIMETER NUMERIC RPAREN
;
DELIMETER           : ',';
LPAREN              : '(';
RPAREN              : ')';
UUID                : '"'[0-9a-fA-F]+'-'[0-9a-fA-F]+'-'[1-5][0-9a-fA-F]+'-'[89abAB][0-9a-fA-F]+'-'[0-9a-fA-F]+'"';
NUMERIC             : [0-9]+ ( '.' [0-9]+ )? ;
IDENTIFIER          : '#' [a-zA-Z$_] [a-zA-Z$_0-9]*;
// PARAMETERIZED_IDENTIFIER         : { behind(LPAREN) }? IDENTIFIER;  // Tried to use a semantic predicate but no luck. I may have used it the wrong way
WS                  : [ \r\t\u000C\n]+ -> skip;

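Alternatively, if there is some way to check from Java code whether the next token after #someNameAttribute is a parenthesis, I would be glad to hear how to do that. I tried this approach as well: RuleLexer.nextToken() lets me inspect the next token, but I cannot jump back to the previous token to continue with the whole statement, so I start losing tokens.

How can I predict which token name will be assigned, or how can I jump back to a previous token using RuleLexer from Java code?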

Try something like this (works for the Java target only):

grammar Rule;
any           : .*? EOF;
LPAREN        : '(';
RPAREN        : ')';
UUID          : '"'[0-9a-fA-F]+'-'[0-9a-fA-F]+'-'[1-5][0-9a-fA-F]+'-'[89abAB][0-9a-fA-F]+'-'[0-9a-fA-F]+'"';
NUMERIC       : [0-9]+ ( '.' [0-9]+ )? ;
PIDENTIFIER   : IDENTIFIER {_input.LA(1) == '('}?;
IDENTIFIER    : '#' [a-zA-Z$_] [a-zA-Z$_0-9]*;
WS            : [ \r\t\u000C\n]+ -> skip;
OTHER         : . ;

If whitespace is allowed between the identifier and the (, do the following:

grammar Rule;
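@lexer::members {
  // Scans ahead past whitespace and reports whether the next
  // non-whitespace character is an opening parenthesis.
  boolean spacesAndOpenParenAhead() {
    for (int i = 1; ; i++) {
      char ch = (char)_input.LA(i);
      if (ch == '(') {
        return true;
      }
      else if (ch != ' ' && ch != '\t' && ch != '\r' && ch != '\n') {
        return false;
      }
    }
  }
}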
...
PIDENTIFIER         : IDENTIFIER {spacesAndOpenParenAhead()}?;
IDENTIFIER          : '#' [a-zA-Z$_] [a-zA-Z$_0-9]*;
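
To try the lookahead logic outside of ANTLR, here is a minimal stand-alone sketch. It assumes a plain String in place of the lexer's _input, and the class name LookaheadSketch and the explicit position argument are made up for illustration; they are not part of the ANTLR API. Unlike the predicate above, it bounds the loop by the string length instead of relying on the EOF value returned by LA.

```java
// Stand-alone sketch of the predicate's lookahead, using a String in place of
// ANTLR's CharStream. The class and method signature are illustrative only.
class LookaheadSketch {

    // Returns true if, starting at index pos, only spaces, tabs, or line
    // breaks separate the current position from an opening parenthesis.
    static boolean spacesAndOpenParenAhead(String input, int pos) {
        for (int i = pos; i < input.length(); i++) {
            char ch = input.charAt(i);
            if (ch == '(') {
                return true;
            }
            if (ch != ' ' && ch != '\t' && ch != '\r' && ch != '\n') {
                return false;
            }
        }
        return false; // reached the end of the input without seeing '('
    }

    public static void main(String[] args) {
        String plain = "#someNameAttribute\n#other";
        String spaced = "#someNameAttribute  (10)";
        int end = "#someNameAttribute".length(); // position right after the identifier
        System.out.println(spacesAndOpenParenAhead(plain, end));  // prints false
        System.out.println(spacesAndOpenParenAhead(spaced, end)); // prints true
    }
}
```

Because line breaks count as whitespace here, an identifier followed by a newline and then a ( would also be treated as parameterized; drop '\r' and '\n' from the check if that is not wanted.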

When I run the code below against both example grammars:

import org.antlr.v4.runtime.*;

public class Main {
  public static void main(String[] args) throws Exception {
    String source = "#someNameAttribute\n" +
        "#someNameAttribute(\"2a3a796e-9870-4b88-9f2d-383eb9566613\", 10)";
    RuleLexer lexer = new RuleLexer(CharStreams.fromString(source));
    CommonTokenStream stream = new CommonTokenStream(lexer);
    stream.fill(); // tokenize the whole input up front
    for (Token t : stream.getTokens()) {
      // Print each token's type name and text, showing line breaks as \n
      System.out.printf("%-20s `%s`%n",
          RuleLexer.VOCABULARY.getDisplayName(t.getType()),
          t.getText().replace("\n", "\\n"));
    }
  }
}

the following is printed to my console:

IDENTIFIER           `#someNameAttribute`
PIDENTIFIER          `#someNameAttribute`
'('                  `(`
UUID                 `"2a3a796e-9870-4b88-9f2d-383eb9566613"`
OTHER                `,`
NUMERIC              `10`
')'                  `)`
