Java: how to find and replace attribute values in XML

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/28837786/

Time: 2020-08-11 06:55:33  Source: igfitidea

How to find and replace an attribute value in a XML

Tags: java, xml, dom

Asked by Enzo

I am building an "XML scanner" in Java that finds attribute values starting with "!Here:". Each such value contains instructions for a later replacement. For example, I have an XML file filled with records like:

<bean value="!Here:String:HashKey"></bean>

How can I find and replace these attribute values, knowing only that they start with "!Here:"?

Accepted answer by T.Gounelle

To modify element or attribute values in an XML file while still respecting the XML structure, you need to use an XML parser. It's a bit more involved than just String#replace()...

Given an example XML like:

<?xml version="1.0" encoding="UTF-8"?>
<beans> 
    <bean id="exampleBean" class="examples.ExampleBean">
        <!-- setter injection using -->
        <property name="beanTwo" ref="anotherBean"/>
        <property name="integerProperty" value="!Here:Integer:Foo"/>
    </bean>
    <bean id="anotherBean" class="examples.AnotherBean">
        <property name="stringProperty" value="!Here:String:Bar"/>
    </bean>
</beans>

To change the two !Here markers, you need to:

  1. load the file into a DOM Document,
  2. select the wanted nodes with XPath. Here I search for all nodes in the document with an attribute value that contains the string !Here. The XPath expression is //*[contains(@value, '!Here')].
  3. apply the transformation you want to each selected node. Here I just replace !Here with What?.
  4. save the modified DOM Document into a new file.

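import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// The class name is illustrative; the original snippet did not show its enclosing class.
public class ReplaceMarkers {
    static String inputFile = "./beans.xml";
    static String outputFile = "./beans_new.xml";

    public static void main(String[] args) throws Exception {
        // 1- Build the doc from the XML file
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(new File(inputFile));

        // 2- Locate the node(s) with xpath
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate("//*[contains(@value, '!Here')]",
                                                   doc, XPathConstants.NODESET);

        // 3- Make the change on the selected nodes
        for (int idx = 0; idx < nodes.getLength(); idx++) {
            Node value = nodes.item(idx).getAttributes().getNamedItem("value");
            String val = value.getNodeValue();
            // replace() is literal; replaceAll() would treat "!Here" as a regex
            value.setNodeValue(val.replace("!Here", "What?"));
        }

        // 4- Save the result to a new XML doc
        Transformer xformer = TransformerFactory.newInstance().newTransformer();
        xformer.transform(new DOMSource(doc), new StreamResult(new File(outputFile)));
    }
}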

The resulting XML file is:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<beans> 
    <bean class="examples.ExampleBean" id="exampleBean">
        <!-- setter injection using -->
        <property name="beanTwo" ref="anotherBean"/>
        <property name="integerProperty" value="What?:Integer:Foo"/>
    </bean>
    <bean class="examples.AnotherBean" id="anotherBean">
        <property name="stringProperty" value="What?:String:Bar"/>
    </bean>
</beans>
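
The question asks for values that start with "!Here:", while the expression above matches values that merely contain "!Here". XPath 1.0 also offers starts-with(), which could be used for the stricter match. The following is only a minimal sketch under stated assumptions: the class StartsWithDemo and its methods replacePrefix and parse are hypothetical names, not part of the original answer, and the XML is parsed from an in-memory string so the example is self-contained.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not from the original answer.
class StartsWithDemo {

    // Replaces the prefix in every "value" attribute that starts with it
    // and returns how many attributes were changed.
    static int replacePrefix(Document doc, String prefix, String replacement) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // starts-with() only matches at the beginning of the attribute value.
        // Assumes the prefix contains no single quote, since it is inlined in the expression.
        String expr = "//*[starts-with(@value, '" + prefix + "')]";
        NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
        for (int i = 0; i < nodes.getLength(); i++) {
            Element element = (Element) nodes.item(i);
            String old = element.getAttribute("value");
            element.setAttribute("value", replacement + old.substring(prefix.length()));
        }
        return nodes.getLength();
    }

    // Parses an XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<beans>"
                + "<property name=\"a\" value=\"!Here:Integer:Foo\"/>"
                + "<property name=\"b\" value=\"see !Here:String:Bar\"/>"
                + "</beans>");
        int changed = replacePrefix(doc, "!Here", "What?");
        System.out.println(changed + " attribute(s) changed");
    }
}
```

In this sketch only the first property is rewritten, because the second value has the marker in the middle rather than at the start.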

Answer by SHoko

We have some alternatives to this in Java.

  • First, JAXP (it has been bundled with Java since version 1.4).

Let's assume we need to change the attribute customer to false in this XML:

<?xml version="1.0" encoding="UTF-8"?>
<notification id="5">
   <to customer="true">[email protected]</to>
   <from>[email protected]</from>
</notification>

With JAXP (this implementation is based on T.Gounelle's sample) we could do this:

// resourcePath, attribute, oldValue and newValue are assumed to be String parameters of the enclosing method
// Load the document
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
Document input = factory.newDocumentBuilder().parse(resourcePath);
// Select the node(s) with XPath
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xpath.evaluate(String.format("//*[contains(@%s, '%s')]", attribute, oldValue), input, XPathConstants.NODESET);
// Update the selected nodes (here, we use the Stream API, but we can use a for loop too)
IntStream
    .range(0, nodes.getLength())
    .mapToObj(i -> (Element) nodes.item(i))
    .forEach(value -> value.setAttribute(attribute, newValue));
// Get the result as a String (the variable is renamed so it does not clash with the factory above)
TransformerFactory transformerFactory = TransformerFactory.newInstance();
transformerFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
Transformer xformer = transformerFactory.newTransformer();
xformer.setOutputProperty(OutputKeys.INDENT, "yes");
Writer output = new StringWriter();
xformer.transform(new DOMSource(input), new StreamResult(output));
String result = output.toString();

Note that, to guard the DocumentBuilderFactory class against XML external entity (XXE) processing, we configure the XMLConstants.FEATURE_SECURE_PROCESSING feature and disallow DOCTYPE declarations. It is good practice to configure this whenever we parse untrusted XML files. See the OWASP guide for additional information.
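
To see what the parser hardening does in practice, the following minimal sketch builds a factory with the same two features as the JAXP snippet and checks whether a document with a DOCTYPE declaration is accepted. The class HardenedParserDemo and its methods hardenedFactory and parses are hypothetical names introduced for illustration, and the behaviour assumes a Xerces-based parser such as the JDK default, which understands the disallow-doctype-decl feature.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class; the feature URI is the one used in the JAXP snippet.
class HardenedParserDemo {

    // Builds a factory configured like the one in the JAXP snippet.
    static DocumentBuilderFactory hardenedFactory() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Rejects any document that contains a DOCTYPE declaration.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        return factory;
    }

    // Returns true if the XML was parsed, false if the parser rejected it.
    static boolean parses(String xml) throws Exception {
        try {
            hardenedFactory().newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException rejected) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String plain = "<to customer=\"true\">someone</to>";
        String withDoctype = "<!DOCTYPE to [<!ENTITY xxe SYSTEM \"file:///etc/passwd\">]>"
                + "<to customer=\"true\">&xxe;</to>";
        System.out.println("plain parses: " + parses(plain));
        System.out.println("DOCTYPE parses: " + parses(withDoctype));
    }
}
```

Rejecting the DOCTYPE outright is the strictest option; if the documents you process legitimately need a DTD, this particular setting would have to be relaxed.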

  • Another alternative is dom4j. It's an open-source framework for processing XML that is integrated with XPath and fully supports DOM, SAX, JAXP and Java platform features such as Java Collections.

We need to add the following dependencies to our pom.xml to use it:

<dependency>
    <groupId>org.dom4j</groupId>
    <artifactId>dom4j</artifactId>
    <version>2.1.1</version>
</dependency>
<dependency>
    <groupId>jaxen</groupId>
    <artifactId>jaxen</artifactId>
    <version>1.2.0</version>
</dependency>

The implementation is very similar to the JAXP equivalent:

// Create the reader and set the features that prevent XXE before reading anything
SAXReader xmlReader = new SAXReader();
xmlReader.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
xmlReader.setFeature("http://xml.org/sax/features/external-general-entities", false);
xmlReader.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
// Load the document
Document input = xmlReader.read(resourcePath);
// Select the nodes
String expr = String.format("//*[contains(@%s, '%s')]", attribute, oldValue);
XPath xpath = DocumentHelper.createXPath(expr);
List<Node> nodes = xpath.selectNodes(input);
// Update the selected nodes (nodes is a java.util.List, so we use size() and get())
IntStream
    .range(0, nodes.size())
    .mapToObj(i -> (Element) nodes.get(i))
    .forEach(value -> value.addAttribute(attribute, newValue));
// We can get the representation as String in the same way as the previous JAXP snippet.

Note that, despite its name, addAttribute replaces the attribute if one already exists with the given name; otherwise it adds it. We can find the details in the dom4j Javadoc.

  • Another nice alternative is jOOX, a library whose API is inspired by jQuery.

We need to add the following dependencies to our pom.xml to use jOOX.

For use with Java 9+:

<dependency>
    <groupId>org.jooq</groupId>
    <artifactId>joox</artifactId>
    <version>1.6.2</version>
</dependency>

For use with Java 6+:

<dependency>
    <groupId>org.jooq</groupId>
    <artifactId>joox-java-6</artifactId>
    <version>1.6.2</version>
</dependency>

We can implement our attribute changer like this:

// Load the document
DocumentBuilder builder = JOOX.builder();
Document input = builder.parse(resourcePath);
Match $ = $(input);
// Select the nodes
$
    .find("to") // We can use an XPath expression too.
    .get()
    .forEach(e -> e.setAttribute(attribute, newValue));
// Get the String representation
$.toString();

As we can see in this sample, the syntax is less verbose than in the JAXP and dom4j samples.

I compared the 3 implementations with JMH and I got the following results:

| Benchmark                         | Mode | Cnt | Score | Error   | Units |
|-----------------------------------|------|-----|-------|---------|-------|
| AttributeBenchMark.dom4jBenchmark | avgt | 5   | 0.167 | ± 0.050 | ms/op |
| AttributeBenchMark.jaxpBenchmark  | avgt | 5   | 0.185 | ± 0.047 | ms/op |
| AttributeBenchMark.jooxBenchmark  | avgt | 5   | 0.307 | ± 0.110 | ms/op |

I put the examples here if you need to take a look.

Answer by Georgii Zykov

Gounelle's answer is correct; however, it relies on the fact that you know the attribute name in advance.

If you want to find all attributes based only on their value, use this expression for xpath:

NodeList attributes = (NodeList) xpath.evaluate(
    "//*/@*[contains(. , '!Here')]",
    doc,
    XPathConstants.NODESET
);

Here, you select all attributes by setting //*/@*. Then you can set a condition like I mentioned above.

By the way, if you search for a single attribute, you can use Attr instead of Node:

Attr attribute = (Attr) xpath.evaluate(
    "//*/@*[contains(. , '!Here')]",
    doc,
    XPathConstants.NODE
);

attribute.setValue("What!");
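
Putting the NODESET query and the Attr cast together, a loop over every matching attribute might look like the sketch below. It is illustrative only: the class AnyAttributeDemo and its methods replaceInAllAttributes and parse are hypothetical names, and the sample XML is an in-memory string made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not from the original answer.
class AnyAttributeDemo {

    // Replaces the marker in every attribute value that contains it,
    // whatever the attribute is called, and returns the number of matches.
    static int replaceInAllAttributes(Document doc, String marker, String replacement) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Assumes the marker contains no single quote, since it is inlined in the expression.
        NodeList attributes = (NodeList) xpath.evaluate(
                "//*/@*[contains(., '" + marker + "')]", doc, XPathConstants.NODESET);
        for (int i = 0; i < attributes.getLength(); i++) {
            Attr attribute = (Attr) attributes.item(i);
            // String.replace is a literal (non-regex) replacement.
            attribute.setValue(attribute.getValue().replace(marker, replacement));
        }
        return attributes.getLength();
    }

    // Parses an XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<beans>"
                + "<bean value=\"!Here:String:Bar\" ref=\"!Here:Integer:Foo\"/>"
                + "<bean id=\"plain\"/>"
                + "</beans>");
        System.out.println(replaceInAllAttributes(doc, "!Here", "What?") + " attribute(s) changed");
    }
}
```

Here both value and ref on the first bean are rewritten, while the id attribute on the second bean is left alone because it does not contain the marker.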

If you want to find attributes by particular value, use

"//*/@*[ . = '!Here:String:HashKey' ]"

If you search for an attribute using a numeric comparison, for instance if you had

<bean value="999"></bean>
<bean value="1337"></bean>

then you could select the attribute of the second bean by setting the expression to

"//*/@*[ . > 1000]"
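
As a rough check of the exact-match and numeric predicates, the sketch below evaluates both against the two beans. It is an illustration under assumptions: the class NumericCompareDemo and the method attributeValues are hypothetical names, and the beans are wrapped in a made-up <beans> root element (with one extra non-numeric attribute) so that the string is a well-formed document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not from the original answer.
class NumericCompareDemo {

    // Returns the values of all attribute nodes selected by the XPath expression.
    static List<String> attributeValues(String xml, String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList attrs = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < attrs.getLength(); i++) {
            values.add(attrs.item(i).getNodeValue());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<beans>"
                + "<bean value=\"999\"></bean>"
                + "<bean value=\"1337\"></bean>"
                + "<bean name=\"not-a-number\"></bean>"
                + "</beans>";
        // Numeric comparison: non-numeric values convert to NaN and never match.
        System.out.println(attributeValues(xml, "//*/@*[ . > 1000]"));   // prints [1337]
        // Exact string match on the attribute value.
        System.out.println(attributeValues(xml, "//*/@*[ . = '999' ]")); // prints [999]
    }
}
```

Note that these expressions select the attribute nodes themselves; to get the owning element, one option is to cast the result to Attr and call getOwnerElement().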