java 如何使用java将一个XML文件拆分成多个XML文件
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/29166170/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
how to split an XML file into multiple XML files using java
提问by Din Ionu? Valentin
I'm using XML files in Java for the first time and i need some help. I am trying to split an XML file to multiple XML files using Java
我是第一次在 Java 中使用 XML 文件,我需要一些帮助。我正在尝试使用 Java 将一个 XML 文件拆分为多个 XML 文件
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<products>
<product>
<description>Sony 54.6" (Diag) Xbr Hx929 Internet Tv</description>
<gtin>00027242816657</gtin>
<price>2999.99</price>
<orderId>2343</orderId>
<supplier>Sony</supplier>
</product>
<product>
<description>Apple iPad 2 with Wi-Fi 16GB - iOS 5 - Black
</description>
<gtin>00885909464517</gtin>
<price>399.0</price>
<orderId>2343</orderId>
<supplier>Apple</supplier>
</product>
<product>
<description>Sony NWZ-E464 8GB E Series Walkman Video MP3 Player Blue
</description>
<gtin>00027242831438</gtin>
<price>91.99</price>
<orderId>2343</orderId>
<supplier>Sony</supplier>
</product>
<product>
<description>Apple MacBook Air A 11.6" Mac OS X v10.7 Lion MacBook
</description>
<gtin>00885909464043</gtin>
<price>1149.0</price>
<orderId>2344</orderId>
<supplier>Apple</supplier>
</product>
<product>
<description>Panasonic TC-L47E50 47" Smart TV Viera E50 Series LED
HDTV</description>
<gtin>00885170076471</gtin>
<price>999.99</price>
<orderId>2344</orderId>
<supplier>Panasonic</supplier>
</product>
</products>
and I'm trying to get three XML documents like:
我正在尝试获取三个 XML 文档,例如:
<?xml version="1.0" encoding="UTF-8"?>
<products>
<product>
<description>Sony 54.6" (Diag) Xbr Hx929 Internet Tv</description>
<gtin>00027242816657</gtin>
<price currency="USD">2999.99</price>
<orderid>2343</orderid>
</product>
<product>
<description>Sony NWZ-E464 8GB E Series Walkman Video MP3 Player Blue</description>
<gtin>00027242831438</gtin>
<price currency="USD">91.99</price>
<orderid>2343</orderid>
</product>
</products>
one for each supplier. How can I receive it? Any help on this will be great.
每个供应商一个。我怎样才能收到它?对此的任何帮助都会很棒。
采纳答案by slux83
Make sure you change the path in "inputFile" to your file and also the output part:
确保将“inputFile”中的路径更改为您的文件以及输出部分:
StreamResult result = new StreamResult(new File("C:\xmls\" + supplier.trim() + ".xml"));
Here your code.
这是你的代码。
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class ExtractXml
{
/**
* @param args
*/
public static void main(String[] args) throws Exception
{
String inputFile = "resources/products.xml";
File xmlFile = new File(inputFile);
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(xmlFile);
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true); // never forget this!
XPathFactory xfactory = XPathFactory.newInstance();
XPath xpath = xfactory.newXPath();
XPathExpression allProductsExpression = xpath.compile("//product/supplier/text()");
NodeList productNodes = (NodeList) allProductsExpression.evaluate(doc, XPathConstants.NODESET);
//Save all the products
List<String> suppliers = new ArrayList<String>();
for (int i=0; i<productNodes.getLength(); ++i)
{
Node productName = productNodes.item(i);
System.out.println(productName.getTextContent());
suppliers.add(productName.getTextContent());
}
//Now we create the split XMLs
for (String supplier : suppliers)
{
String xpathQuery = "/products/product[supplier='" + supplier + "']";
xpath = xfactory.newXPath();
XPathExpression query = xpath.compile(xpathQuery);
NodeList productNodesFiltered = (NodeList) query.evaluate(doc, XPathConstants.NODESET);
System.out.println("Found " + productNodesFiltered.getLength() +
" product(s) for supplier " + supplier);
//We store the new XML file in supplierName.xml e.g. Sony.xml
Document suppXml = dBuilder.newDocument();
//we have to recreate the root node <products>
Element root = suppXml.createElement("products");
suppXml.appendChild(root);
for (int i=0; i<productNodesFiltered.getLength(); ++i)
{
Node productNode = productNodesFiltered.item(i);
//we append a product (cloned) to the new file
Node clonedNode = productNode.cloneNode(true);
suppXml.adoptNode(clonedNode); //We adopt the orphan :)
root.appendChild(clonedNode);
}
//At the end, we save the file XML on disk
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
DOMSource source = new DOMSource(suppXml);
StreamResult result = new StreamResult(new File("resources/" + supplier.trim() + ".xml"));
transformer.transform(source, result);
System.out.println("Done for " + supplier);
}
}
}
回答by VikasBhat
Consider this xml
考虑这个xml
<?xml version="1.0"?>
<SSNExportDocument xmlns="urn:com:ssn:schema:export:SSNExportFormat.xsd" Version="0.1" DocumentID="b482350d-62bb-41be-b792-8a9fe3884601-1" ExportID="b482350d-62bb-41be-b792-8a9fe3884601" JobID="464" RunID="3532468" CreationTime="2019-04-16T02:20:01.332-04:00" StartTime="2019-04-15T20:20:00.000-04:00" EndTime="2019-04-16T02:20:00.000-04:00">
<MeterData MeterName="MUNI1-11459398" UtilDeviceID="11459398" MacID="00:12:01:fae:fe:00:d5:fc">
<RegisterData StartTime="2019-04-15T20:00:00.000-04:00" EndTime="2019-04-15T20:00:00.000-04:00" NumberReads="1">
<RegisterRead ReadTime="2019-04-15T20:00:00.000-04:00" GatewayCollectedTime="2019-04-16T01:40:06.214-04:00" RegisterReadSource="REG_SRC_TYPE_EO_CURR_READ" Season="-1">
<Tier Number="0">
<Register Number="1" Summation="5949.1000" SummationUOM="GAL"/>
</Tier>
</RegisterRead>
</RegisterData>
</MeterData>
<MeterData MeterName="MUNI4-11460365" UtilDeviceID="11460365" MacID="00:11:01:bc:fe:00:d3:f9">
<RegisterData StartTime="2019-04-15T20:00:00.000-04:00" EndTime="2019-04-15T20:00:00.000-04:00" NumberReads="1">
<RegisterRead ReadTime="2019-04-15T20:00:00.000-04:00" GatewayCollectedTime="2019-04-16T01:40:11.082-04:00" RegisterReadSource="REG_SRC_TYPE_EO_CURR_READ" Season="-1">
<Tier Number="0">
<Register Number="1" Summation="136349.9000" SummationUOM="GAL"/>
</Tier>
</RegisterRead>
</RegisterData>
</MeterData>
We can use JAXB which converts your xml tags to objects. Then we can play around with them.
我们可以使用 JAXB 将您的 xml 标签转换为对象。然后我们就可以和他们一起玩了。
File xmlFile = new File("input.xml");
jaxbContext = JAXBContext.newInstance(SSNExportDocument.class);
Unmarshaller jaxbUnmarshaller = jaxbContext.createUnmarshaller();
SSNExportDocument ssnExpDoc = (SSNExportDocument) jaxbUnmarshaller.unmarshal(xmlFile);
MeterData mD = new MeterData();
Map<String, List<MeterData>> meterMapper = new HashMap<String, List<MeterData>>(); // Phantom Reference
for (MeterData mData : ssnExpDoc.getMeterData()) {
String meterFullName = mData.getMeterName();
String[] splitMeterName = meterFullName.split("-");
List<MeterData> _meterDataList = meterMapper.get(splitMeterName[0]);// o(1)
if (_meterDataList == null) {
_meterDataList = new ArrayList<>();
_meterDataList.add(mData);
meterMapper.put(splitMeterName[0], _meterDataList);
_meterDataList = null;
} else {
_meterDataList.add(mData);
}
}
meterMapper contains tag names against list of objects
meterMapper 包含针对对象列表的标签名称
Then Marshall the contents using
然后使用 Marshall 将内容编组
JAXBContext jaxbContext = JAXBContext.newInstance(SSNExportDocument.class);
// Create Marshaller
Marshaller jaxbMarshaller = jaxbContext.createMarshaller();
// Required formatting??
jaxbMarshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);
jaxbMarshaller.setProperty(Marshaller.JAXB_FRAGMENT, Boolean.TRUE);
//jaxbMarshaller.setProperty("com.sun.xml.bind.xmlDeclaration", Boolean.FALSE);
// Print XML String to Console
StringWriter sw = new StringWriter();
// Write XML to StringWriter
jaxbMarshaller.marshal(employee, sw);
// Verify XML Content
String xmlContent = sw.toString();
System.out.println(xmlContent);
回答by slux83
You can have a look here to see how to parse a XML document using DOM, in Java: DOM XML Parser Example
您可以在此处查看如何在 Java 中使用 DOM 解析 XML 文档: DOM XML Parser Example
Here, how to write the new XML file(s): Create XML file using java
在这里,如何编写新的 XML 文件: 使用 java 创建 XML 文件
In addition you could study XPath to easily select your nodes: Java Xpath expression
此外,您可以学习 XPath 以轻松选择您的节点:Java Xpath 表达式
If the performances are not your goal, first of all, once you load your DOM and your Xpath, you can retrieve all the suppliers you have in your xml document using the following XPath query
如果性能不是您的目标,首先,一旦您加载了 DOM 和 Xpath,您就可以使用以下 XPath 查询检索您的 xml 文档中的所有供应商
//supplier/text()
you will get something like that:
你会得到这样的东西:
Text='Sony'
Text='Apple'
Text='Sony'
Text='Apple'
Text='Panasonic'
Then I will put those results in a ArraryList or whatever. The second step will be the iteration of that collection, and for each item query the XML input document in order to extract all the nodes with a particular supplier:
然后我会将这些结果放入 ArraryList 或其他任何内容中。第二步将是该集合的迭代,并为每个项目查询 XML 输入文档以提取具有特定供应商的所有节点:
/products/product[supplier='Sony']
of course in java you will have to build the last xpath query in a dynamic way:
当然,在 Java 中,您必须以动态方式构建最后一个 xpath 查询:
String xpathQuery = "/products/product/[supplier='" + currentValue + "']
After that, you will get the list of nodes which match the supplier you specified. The next step would be constructing the new XML DOM and save it on a file.
之后,您将获得与您指定的供应商匹配的节点列表。下一步是构建新的 XML DOM 并将其保存在文件中。
回答by Selva
DOM parser will consume more memory. I prefer to use SAX parser to read XML and write .
DOM 解析器会消耗更多内存。我更喜欢使用 SAX 解析器来读取 XML 和编写 .
回答by Christoph Burmeister
I like the approach of Xmappr (https://code.google.com/p/xmappr/) where you can use simple annotations:
我喜欢 Xmappr ( https://code.google.com/p/xmappr/) 的方法,您可以在其中使用简单的注释:
first the root-element Products which simply holds a list of Product-instances
首先是根元素产品,它只包含产品实例列表
@RootElement
public class Products {
@Element
public List<Product> product;
}
Then the Product-class
然后是产品类
@RootElement
public class Product {
@Element
public String description;
@Element
public String supplier;
@Element
public String gtin;
@Element
public String price;
@Element
public String orderId;
}
And then you simply fetch the Product-instances from the Products:
然后您只需从产品中获取产品实例:
public static void main(String[] args) throws FileNotFoundException {
Reader reader = new FileReader("test.xml");
Xmappr xm = new Xmappr(Products.class);
Products products = (Products) xm.fromXML(reader);
// fetch list of products
List<Product> listOfProducts = products.product;
// do sth with the products in the list
for (Product product : listOfProducts) {
System.out.println(product.description);
}
}
And then you can do whatever you want with the products (e.g. sorting them according the supplier and put them out to an xml-file)
然后你可以对产品做任何你想做的事情(例如,根据供应商对它们进行分类并将它们放到一个 xml 文件中)
回答by Florian Schaetz
An alternative to Dom would be, if you have the Schema (XSD) for your XML dialect, JAXB.
Dom 的替代方案是,如果您有 XML 方言的架构 (XSD),JAXB。