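I have a very simple tokenizer built on StreamTokenizer that splits mathematical expressions into their individual components (see below). The problem is that if the expression contains a variable named T_1, it is split into [T, _, 1], whereas I would like it returned as [T_1].
I tried using a variable to check whether the last character was an underscore and, if so, appending the underscore to the element at list.size()-1, but that seems like a very clunky and inefficient solution. Is there a better way to do this? Thanks.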
StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(s));
tokenizer.ordinaryChar('-'); // Don't parse minus as part of numbers.
tokenizer.ordinaryChar('/'); // Treat slash as an operator (by default it starts a comment).
List<String> tokBuf = new ArrayList<String>();
while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) // While not at the end of the input
{
    switch (tokenizer.ttype) // Switch based on the type of token
    {
        case StreamTokenizer.TT_NUMBER: // Number
            tokBuf.add(String.valueOf(tokenizer.nval));
            break;
        case StreamTokenizer.TT_WORD: // Word
            tokBuf.add(tokenizer.sval);
            break;
        case '_': // Underscore: my attempted workaround
            tokBuf.add(tokBuf.size()-1, tokenizer.sval);
            break;
        default: // Operator
            tokBuf.add(String.valueOf((char) tokenizer.ttype));
    }
}
return tokBuf;
This should do what you want:
tokenizer.wordChars('_', '_');
This makes _ a word character, so it is recognized as part of a word rather than returned as a separate token.
Addendum:
This builds and runs:
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class TokenizerDemo {
    public static void main(String args[]) throws Exception {
        String s = "abc_xyz abc 123 1 + 1";
        StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(s));
        tokenizer.ordinaryChar('-'); // Don't parse minus as part of numbers.
        tokenizer.ordinaryChar('/'); // Treat slash as an operator (by default it starts a comment).
        tokenizer.wordChars('_', '_'); // Treat underscore as part of a word.
        List<String> tokBuf = new ArrayList<String>();
        while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) // While not at the end of the input
        {
            switch (tokenizer.ttype) // Switch based on the type of token
            {
                case StreamTokenizer.TT_NUMBER: // Number
                    tokBuf.add(String.valueOf(tokenizer.nval));
                    break;
                case StreamTokenizer.TT_WORD: // Word
                    tokBuf.add(tokenizer.sval);
                    break;
                default: // Operator
                    tokBuf.add(String.valueOf((char) tokenizer.ttype));
            }
        }
        System.out.println(tokBuf);
    }
}
Output:
[abc_xyz, abc, 123.0, 1.0, +, 1.0]
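To tie this back to the input from the question, here is a minimal sketch that tokenizes "T_1 + 2" twice, once with the default character table and once with the underscore registered as a word character. The class name `UnderscoreComparison`, the `tokenize` helper, and its boolean flag are illustrative names and not part of the original code.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class UnderscoreComparison {
    // Hypothetical helper: same loop as above, with the underscore setting made optional.
    static List<String> tokenize(String s, boolean underscoreIsWordChar) {
        StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(s));
        tokenizer.ordinaryChar('-');
        tokenizer.ordinaryChar('/');
        if (underscoreIsWordChar) {
            tokenizer.wordChars('_', '_'); // underscore now continues a word
        }
        List<String> tokens = new ArrayList<>();
        try {
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                switch (tokenizer.ttype) {
                    case StreamTokenizer.TT_NUMBER:
                        tokens.add(String.valueOf(tokenizer.nval));
                        break;
                    case StreamTokenizer.TT_WORD:
                        tokens.add(tokenizer.sval);
                        break;
                    default:
                        tokens.add(String.valueOf((char) tokenizer.ttype));
                }
            }
        } catch (IOException e) {
            // Not expected when reading from a StringReader; rethrown unchecked for brevity.
            throw new UncheckedIOException(e);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("T_1 + 2", false)); // prints [T, _, 1.0, +, 2.0]
        System.out.println(tokenize("T_1 + 2", true));  // prints [T_1, +, 2.0]
    }
}
```

The digit after the underscore stays attached because, once a word has started, StreamTokenizer keeps consuming both word characters and digits.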
StringTokenizer might be a better fit. If so, here is how you would use it:
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;
public class Solution {
    public static void main(String args[]) throws Exception {
        StringTokenizer tokenizer = new StringTokenizer("T_1 1 * bar");
        List<String> tokBuf = new ArrayList<String>();
        while (tokenizer.hasMoreTokens()) // While tokens remain
        {
            tokBuf.add(tokenizer.nextToken());
        }
        System.out.println(tokBuf);
    }
}
This prints:
[T_1, 1, *, bar]
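Note that the default StringTokenizer splits only on whitespace, so an expression written without spaces, such as T_1*(x+2), would come back as a single token. A rough sketch of one way around that, assuming the operators of interest are + - * / and parentheses: pass them as delimiters and ask for the delimiters to be returned as tokens. The class name `DelimiterTokenizerSketch`, the `tokenize` helper, and the delimiter set are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class DelimiterTokenizerSketch {
    // Hypothetical helper: the delimiter set is an assumption, adjust it to your grammar.
    static List<String> tokenize(String expr) {
        // Third argument 'true' makes the tokenizer return delimiters as tokens too.
        StringTokenizer tokenizer = new StringTokenizer(expr, " +-*/()", true);
        List<String> tokens = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            String token = tokenizer.nextToken();
            if (!token.trim().isEmpty()) { // drop the space delimiters
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("T_1*(x + 2)")); // prints [T_1, *, (, x, +, 2, )]
    }
}
```

Unlike the StreamTokenizer version, this leaves numbers as the strings they were written as, and it does not distinguish a unary minus from a binary one.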