Including the delimiter when performing a substring operation



How can I include the delimiter when performing a substring operation?

That is, given a string message that looks like this:

<nutrition>
<daily-values>
<total-fat units="g">65</total-fat>
<saturated-fat units="g">20</saturated-fat>
<cholesterol units="mg">300</cholesterol>
<sodium units="mg">2400</sodium>
<carb units="g">300</carb>
<fiber units="g">25</fiber>
<protein units="g">50</protein>
</daily-values>
</nutrition>
<food>
<name>Avocado Dip</name>
<mfr>Sunnydale</mfr>
<serving units="g">29</serving>
<calories total="110" fat="100"/>
<total-fat>11</total-fat>
<saturated-fat>3</saturated-fat>
<cholesterol>5</cholesterol>
<sodium>210</sodium>
<carb>2</carb>
<fiber>0</fiber>
<protein>1</protein>
<vitamins>
<a>0</a>
<c>0</c>
</vitamins>
<minerals>
<ca>0</ca>
<fe>0</fe>
</minerals>
</food>

then

message = message.substring(message.indexOf("<food>"), message.indexOf("</food>"));

returns

<food>
<name>Avocado Dip</name>
<mfr>Sunnydale</mfr>
<serving units="g">29</serving>
<calories total="110" fat="100"/>
<total-fat>11</total-fat>
<saturated-fat>3</saturated-fat>
<cholesterol>5</cholesterol>
<sodium>210</sodium>
<carb>2</carb>
<fiber>0</fiber>
<protein>1</protein>
<vitamins>
<a>0</a>
<c>0</c>
</vitamins>
<minerals>
<ca>0</ca>
<fe>0</fe>
</minerals>

If I don't know what surrounds this part of the XML file, how can I make the result keep the final </food> tag?
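One direct way to keep the closing tag, sketched here under the assumption that the string contains a single <food> element: `substring` treats its end index as exclusive, so adding the length of the closing delimiter to that index includes it in the result. The names `SubstringInclusive` and `substringInclusive` are hypothetical and chosen only for illustration.

```java
public class SubstringInclusive {

    /**
     * Returns the part of text that runs from the first occurrence of start
     * through the end of the next occurrence of end, or null if either
     * delimiter is missing. (Hypothetical helper, not part of the question.)
     */
    static String substringInclusive(String text, String start, String end) {
        int from = text.indexOf(start);
        if (from < 0) {
            return null;
        }
        // Search for the closing delimiter only after the opening one.
        int to = text.indexOf(end, from + start.length());
        if (to < 0) {
            return null;
        }
        // The end index of substring is exclusive, so add the delimiter's length.
        return text.substring(from, to + end.length());
    }

    public static void main(String[] args) {
        String message = "<nutrition></nutrition>\n"
                + "<food>\n<name>Avocado Dip</name>\n</food>\n";
        System.out.println(substringInclusive(message, "<food>", "</food>"));
    }
}
```

This works regardless of what precedes or follows the element, but it only finds the first <food>...</food> pair and does not understand nesting or XML syntax.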

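Here is a solution that uses javax.xml. It is meant to handle documents that contain more than one <food> element. To handle that case correctly, you need to:

  1. Deserialize the XML into an org.w3c.dom.Document
  2. Extract the list of <food> nodes as an org.w3c.dom.NodeList
  3. Serialize the result back to a string

Here is a simplified example: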

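import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.junit.Test; // assumes JUnit 4; the original does not name the test framework

public class FoodXmlTest { // hypothetical class name; the original shows only the members

    private static final String XML =
            "<?xml version = \"1.0\" encoding = \"UTF-8\"?>\n"
            + "<message>\n"
            + "  <food>\n"
            + "    <name>A</name>\n"
            + "  </food>\n"
            + "  <food>\n"
            + "    <name>B</name>\n"
            + "  </food>\n"
            + "</message>\n";

    @Test
    public void xpath() throws Exception {
        // Deserialize the XML string into a DOM document
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        Document document;
        try (InputStream in = new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8))) {
            document = factory.newDocumentBuilder().parse(in);
        }
        // Select every <food> element, wherever it appears in the document
        XPath xPath = XPathFactory.newInstance().newXPath();
        XPathExpression expr = xPath.compile("//food");
        NodeList nodeList = (NodeList) expr.evaluate(document, XPathConstants.NODESET);
        for (int i = 0; i < nodeList.getLength(); i++) {
            Node node = nodeList.item(i);
            System.out.println(node.getNodeName() + ": " + node.getTextContent().trim());
        }
        // Serialize: copy the first <food> element into a new document and print it
        Document exportDoc = factory.newDocumentBuilder().newDocument();
        Node exportNode = exportDoc.importNode(nodeList.item(0), true);
        exportDoc.appendChild(exportNode);
        String content = serialize(exportDoc);
        System.out.println(content);
    }

    private static String serialize(Document doc) throws TransformerException {
        DOMSource domSource = new DOMSource(doc);
        StringWriter writer = new StringWriter();
        StreamResult result = new StreamResult(writer);
        TransformerFactory tf = TransformerFactory.newInstance();
        Transformer transformer = tf.newTransformer();
        // Indent the output and leave out the <?xml ...?> declaration
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        transformer.transform(domSource, result);
        return writer.toString();
    }
}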

The first output shows that all <food> elements were deserialized correctly:

food: A
food: B

The second output shows the first element serialized back to a string:

<food>
<name>A</name>
</food>
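The example above serializes only the first match. As a rough sketch of how every <food> element could be returned as its own string, the variant below loops over all matches. It makes a few substitutions that are assumptions rather than part of the answer: `FoodExtractor` and `extractFoods` are hypothetical names, `getElementsByTagName` stands in for the XPath expression, and each element is passed directly to `DOMSource` instead of being imported into a new document first.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FoodExtractor {

    /** Returns one serialized string per <food> element (hypothetical helper). */
    static List<String> extractFoods(String xml) {
        try {
            // Deserialize the XML string into a DOM document
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // Collect every <food> element in document order
            NodeList foods = document.getElementsByTagName("food");

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            List<String> result = new ArrayList<>();
            for (int i = 0; i < foods.getLength(); i++) {
                // Serialize the subtree rooted at this element, tags included
                StringWriter writer = new StringWriter();
                transformer.transform(new DOMSource(foods.item(i)), new StreamResult(writer));
                result.add(writer.toString());
            }
            return result;
        } catch (Exception e) {
            // Wrapped to keep the sketch short; real code may prefer checked exceptions
            throw new IllegalStateException("Could not extract <food> elements", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<message><food><name>A</name></food>"
                + "<food><name>B</name></food></message>";
        for (String food : extractFoods(xml)) {
            System.out.println(food);
        }
    }
}
```

Because the parser produces the strings, each one carries its own opening and closing <food> tags, no matter what surrounds the element in the input.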
