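I'm trying to create an ANTLR grammar for an HL7-derived language. HL7 has a feature where the first few bytes of the input itself define all the delimiters used in the message. For example, MSH|^~\& specifies the delimiters in this order: field separator |, component separator ^, repetition separator ~, escape character \, and subcomponent separator &.
Is it possible to write an ANTLR grammar that doesn't hard-code these tokens?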
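To make the header layout concrete, here is a minimal plain-Java sketch of how the five delimiters could be read from the start of a message. The class `Hl7Delimiters` and its method `fromMessage` are hypothetical names made up for this illustration; they are not part of ANTLR or of any HL7 library.

```java
// Hypothetical helper for illustration only: reads the five encoding
// characters that directly follow the "MSH" segment name.
final class Hl7Delimiters {
    final char field;
    final char component;
    final char repetition;
    final char escape;
    final char subcomponent;

    private Hl7Delimiters(String header) {
        // Indices 0..2 hold "MSH"; indices 3..7 hold the delimiters, in order.
        this.field = header.charAt(3);
        this.component = header.charAt(4);
        this.repetition = header.charAt(5);
        this.escape = header.charAt(6);
        this.subcomponent = header.charAt(7);
    }

    static Hl7Delimiters fromMessage(String message) {
        if (message == null || message.length() < 8 || !message.startsWith("MSH")) {
            throw new IllegalArgumentException(
                    "message must start with MSH followed by five encoding characters");
        }
        return new Hl7Delimiters(message);
    }

    public static void main(String[] args) {
        // The backslash has to be escaped inside a Java string literal.
        Hl7Delimiters d = Hl7Delimiters.fromMessage("MSH|^~\\&|ADT1|");
        System.out.println("field=" + d.field + " component=" + d.component
                + " repetition=" + d.repetition + " escape=" + d.escape
                + " subcomponent=" + d.subcomponent);
    }
}
```

Nothing in this sketch depends on the characters being |^~\&; a message that starts with MSH#*!?$ would yield # as the field separator, which is exactly why hard-coded tokens are a problem.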
As Kaby76 hinted in the comments: yes, it's possible with some predicate voodoo:
lexer grammar HL7Lexer;

@members {
  private char fieldSeparator;
  private char componentSeparator;
  private char repetitionSeparator;
  private char escapeSeparator;
  private char subcomponentSeparator;
  private boolean separatorsInitialised = false;

  // chars is the text of the MSH token: 'MSH' plus five characters,
  // so indices 3..7 hold the separators.
  private void setEncodingChars(String chars) {
    this.fieldSeparator = chars.charAt(3);
    this.componentSeparator = chars.charAt(4);
    this.repetitionSeparator = chars.charAt(5);
    this.escapeSeparator = chars.charAt(6);
    this.subcomponentSeparator = chars.charAt(7);
    this.separatorsInitialised = true;
  }

  private boolean isEncodingCharAhead() {
    // Until the MSH header has been seen, report true so that OTHER
    // cannot match and swallow the header.
    if (!this.separatorsInitialised) {
      return true;
    }

    char ch = (char)this._input.LA(1);

    return ch == this.fieldSeparator || ch == this.componentSeparator
        || ch == this.repetitionSeparator || ch == this.escapeSeparator
        || ch == this.subcomponentSeparator;
  }
}

MSH
 : 'MSH' . . . . . {this.setEncodingChars(getText());}
 ;

FIELD_SEP
 : {this._input.LA(1) == this.fieldSeparator}? .
 ;

COMPONENT_SEP
 : {this._input.LA(1) == this.componentSeparator}? .
 ;

REPETITION_SEP
 : {this._input.LA(1) == this.repetitionSeparator}? .
 ;

ESCAPE_SEP
 : {this._input.LA(1) == this.escapeSeparator}? .
 ;

SUBCOMPONENT_SEP
 : {this._input.LA(1) == this.subcomponentSeparator}? .
 ;

OTHER
 : ( {!this.isEncodingCharAhead()}? . )+
 ;
This lexer grammar can be tested with the input MSH|^~\&|ADT1|GOOD HEALTH HOSPITAL|GHH LAB, INC.|GOOD HEALTH HOSPITAL|198808181126|SECURITY|ADT^A01^ADT_A01|MSG00001|P|2.8|| as follows:
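import org.antlr.v4.runtime.CharStreams;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.Token;

// The backslash must be escaped inside a Java string literal.
String message = "MSH|^~\\&|ADT1|GOOD HEALTH HOSPITAL|GHH LAB, INC.|GOOD HEALTH HOSPITAL|198808181126|SECURITY|ADT^A01^ADT_A01|MSG00001|P|2.8||";
HL7Lexer lexer = new HL7Lexer(CharStreams.fromString(message));
CommonTokenStream stream = new CommonTokenStream(lexer);
stream.fill();

// Print each token's symbolic name and text, making line breaks visible.
for (Token t : stream.getTokens()) {
  System.out.printf("%-20s '%s'\n",
      HL7Lexer.VOCABULARY.getSymbolicName(t.getType()),
      t.getText().replace("\n", "\\n"));
}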
This creates the following tokens:
MSH                  'MSH|^~\&'
FIELD_SEP            '|'
OTHER                'ADT1'
FIELD_SEP            '|'
OTHER                'GOOD HEALTH HOSPITAL'
FIELD_SEP            '|'
OTHER                'GHH LAB, INC.'
FIELD_SEP            '|'
OTHER                'GOOD HEALTH HOSPITAL'
FIELD_SEP            '|'
OTHER                '198808181126'
FIELD_SEP            '|'
OTHER                'SECURITY'
FIELD_SEP            '|'
OTHER                'ADT'
COMPONENT_SEP        '^'
OTHER                'A01'
COMPONENT_SEP        '^'
OTHER                'ADT_A01'
FIELD_SEP            '|'
OTHER                'MSG00001'
FIELD_SEP            '|'
OTHER                'P'
FIELD_SEP            '|'
OTHER                '2.8'
FIELD_SEP            '|'
FIELD_SEP            '|'
EOF                  '<EOF>'
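To see the same idea without generating a lexer, the sketch below mimics the tokenization in plain Java. It is not the ANTLR-generated HL7Lexer and does not use the ANTLR runtime; the class `Hl7TokenSketch`, its `tokenize` method, and the "TYPE:text" output format are assumptions made for this illustration. It uses a message with non-standard delimiters to show that nothing is hard-coded.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only: a hand-written stand-in for the lexer logic.
final class Hl7TokenSketch {
    private static final String[] NAMES = {
        "FIELD_SEP", "COMPONENT_SEP", "REPETITION_SEP", "ESCAPE_SEP", "SUBCOMPONENT_SEP"
    };

    // Returns tokens as "TYPE:text" strings (a format chosen for this sketch).
    static List<String> tokenize(String message) {
        if (message.length() < 8 || !message.startsWith("MSH")) {
            throw new IllegalArgumentException(
                    "message must start with MSH followed by five encoding characters");
        }
        // The five characters after "MSH" are the separators, in rule order.
        String separators = message.substring(3, 8);
        List<String> tokens = new ArrayList<>();
        tokens.add("MSH:" + message.substring(0, 8));

        StringBuilder other = new StringBuilder();
        for (int i = 8; i < message.length(); i++) {
            char ch = message.charAt(i);
            // indexOf returns the first match, so an earlier separator wins if
            // two separators share a character, like the first lexer rule would.
            int index = separators.indexOf(ch);
            if (index < 0) {
                other.append(ch); // extend the current OTHER run
                continue;
            }
            if (other.length() > 0) {
                tokens.add("OTHER:" + other);
                other.setLength(0);
            }
            tokens.add(NAMES[index] + ":" + ch);
        }
        if (other.length() > 0) {
            tokens.add("OTHER:" + other);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Non-standard delimiters: '#' separates fields, '*' separates components.
        for (String token : tokenize("MSH#*!?$#ADT*A01#")) {
            System.out.println(token);
        }
    }
}
```

For the input MSH#*!?$#ADT*A01# this prints MSH:MSH#*!?$, FIELD_SEP:#, OTHER:ADT, COMPONENT_SEP:*, OTHER:A01 and FIELD_SEP:#, one per line. The grammar above should produce the corresponding token types for that input, since its predicates compare against whatever characters the MSH rule captured.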