String:
The car has <ex id="3"/><g id="4">attributes</g><g id="5">, such as weight and color
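Using the regular expression (<.*?>)
I can get tags such as <ex id="3"/>
and <g id="4">.
But how do we remove all the plain-text parts of the sentence, so that the final string contains only the tags, like <ex id="3"/><g id="4"></g><g id="5">?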
Q: Remove everything except the tags from a sentence (strip all non-tag text).
The code below builds a new string that contains only the desired tags.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CompactLine {
    public static void main(String[] args) {
        String line = "The car has <ex id=\"3\"/><g id=\"4\">attributes</g><g id=\"5\">, such as weight and color";
        // Reluctant match: from each '<' up to the nearest following '>'.
        String regex = "(<.*?>)";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(line);
        StringBuilder compactline = new StringBuilder();
        // Append every tag in order; the text between tags is never copied.
        while (matcher.find()) {
            compactline.append(matcher.group());
        }
        System.out.println("Original Line : " + line);
        System.out.println("Compact Line : " + compactline);
    }
}
Output:
Original Line : The car has <ex id="3"/><g id="4">attributes</g><g id="5">, such as weight and color
Compact Line : <ex id="3"/><g id="4"></g><g id="5">
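The same loop can be wrapped in a reusable method. The following is only a minimal sketch: the class name TagExtractor and the method extractTags are hypothetical names introduced here, and the pattern <[^>]*> is swapped in for <.*?> as an assumption, not something taken from the answer. On the sample line both patterns find the same tags; the negated character class additionally matches a tag that spans a line break, which . does not do by default.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagExtractor {
    // Assumption: a "tag" is a '<', any run of characters other than '>', then '>'.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    // Hypothetical helper: returns the tags of the input concatenated in order.
    public static String extractTags(String line) {
        Matcher matcher = TAG.matcher(line);
        StringBuilder tags = new StringBuilder();
        while (matcher.find()) {
            tags.append(matcher.group());
        }
        return tags.toString();
    }

    public static void main(String[] args) {
        String line = "The car has <ex id=\"3\"/><g id=\"4\">attributes</g><g id=\"5\">, such as weight and color";
        // Prints: <ex id="3"/><g id="4"></g><g id="5">
        System.out.println(extractTags(line));
    }
}
```

This is regex-based matching, not XML parsing, so a '>' inside an attribute value would end the match early.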
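// Quotes inside the literal must be escaped, and the literal needs its closing quote.
String segment = "The car has <ex id=\"3\"/><g id=\"4\">attributes</g><g id=\"5\">, such as weight and color";
String regex = "(<.*?>)";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(segment);
// Print each tag the pattern finds, one per line.
while (matcher.find()) {
    System.out.println("matcher =" + matcher.group());
}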
Output:
matcher =<ex id="3"/>
matcher =<g id="4">
matcher =</g>
matcher =<g id="5">
matcher =</g>
matcher =<g id="6">
matcher =</g>
matcher =<g id="7">
matcher =</g>
After trying @anubhava's suggestion, this snippet works and retrieves only the tags present in the string.
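For completeness, the non-tag text can also be removed in a single replaceAll call instead of a find loop. This is a hedged sketch and not necessarily what was suggested in the thread: the class name TagOnlyReplace and the method keepTags are hypothetical, and the pattern assumes that every '<' in the input opens a tag that is closed by '>'.

```java
public class TagOnlyReplace {
    // Hypothetical helper. The pattern matches either a tag (captured as group 1)
    // or a run of non-tag text. Replacing each match with "$1" keeps the tags and
    // drops the text, because a group that did not take part in the match
    // contributes nothing to the replacement.
    public static String keepTags(String line) {
        return line.replaceAll("(<[^>]*>)|[^<]+", "$1");
    }

    public static void main(String[] args) {
        String line = "The car has <ex id=\"3\"/><g id=\"4\">attributes</g><g id=\"5\">, such as weight and color";
        // Prints: <ex id="3"/><g id="4"></g><g id="5">
        System.out.println(keepTags(line));
    }
}
```

The find loop is easier to extend, for example to collect the tags into a list, while the replaceAll form is shorter when only the compacted string is needed.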