Java: What's the best way to validate an XML file against an XSD file?
Notice: this page reproduces a popular StackOverflow question and its answers. The content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Original: http://stackoverflow.com/questions/15732/
Asked by Jeff
I'm generating some XML files that need to conform to an XSD file that was given to me. What's the best way to verify that they conform?
Accepted answer by McDowell
The Java runtime library supports validation. Last time I checked, this was the Apache Xerces parser under the covers. You should probably use a javax.xml.validation.Validator.
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.*;
import java.net.URL;
import org.xml.sax.SAXException;
import java.io.File; // needed for new File("web.xml") below
import java.io.IOException;
...
// placeholder URL: replace host and port with real values
URL schemaFile = new URL("http://host:port/filename.xsd");
// webapp example xsd:
// URL schemaFile = new URL("http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd");
// local file example:
// File schemaFile = new File("/location/to/localfile.xsd"); // etc.
Source xmlFile = new StreamSource(new File("web.xml"));
SchemaFactory schemaFactory = SchemaFactory
    .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
try {
    Schema schema = schemaFactory.newSchema(schemaFile);
    Validator validator = schema.newValidator();
    validator.validate(xmlFile);
    System.out.println(xmlFile.getSystemId() + " is valid");
} catch (SAXException e) {
    System.out.println(xmlFile.getSystemId() + " is NOT valid reason:" + e);
} catch (IOException e) {
    // the XML file could not be read; report it instead of swallowing it
    System.out.println(xmlFile.getSystemId() + " could not be read: " + e);
}
The schema factory constant is the string http://www.w3.org/2001/XMLSchema which defines XSDs. The above code validates a WAR deployment descriptor against the URL http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd but you could just as easily validate against a local file.
You should not use the DOMParser to validate a document (unless your goal is to create a document object model anyway). It starts creating DOM objects as it parses the document, which is wasteful if you aren't going to use them.
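To illustrate the streaming approach recommended above, here is a minimal sketch that validates XML text against XSD text held in memory, without building a DOM tree. The class name, the `isValid` helper and the tiny `note` schema are assumptions made up for this example; only the `javax.xml.validation` calls come from the answer.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class InMemoryXsdCheck {
    // Hypothetical schema: a <note> element containing a single <to> string.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='note'><xs:complexType><xs:sequence>"
      + "<xs:element name='to' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    /** Returns true if the XML text conforms to the XSD text. */
    public static boolean isValid(String xsd, String xml) {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            // The document is streamed through the validator; no DOM objects are created.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Thrown for a malformed or schema-invalid document (or a broken schema).
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(XSD, "<note><to>Jeff</to></note>"));
        System.out.println(isValid(XSD, "<note><from>Jeff</from></note>"));
    }
}
```

For files on disk, `new StreamSource(new File(...))` can replace the `StringReader` sources, as in the answer's own code.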
Answer by SCdF
Here's how to do it using Xerces2. There is a tutorial for this here (requires signup).
Original attribution: blatantly copied from here:
import org.apache.xerces.parsers.DOMParser;

public class SchemaTest {
    public static void main(String[] args) {
        try {
            DOMParser parser = new DOMParser();
            parser.setFeature("http://xml.org/sax/features/validation", true);
            // Xerces2 feature that turns on XSD (rather than DTD-only) validation
            parser.setFeature("http://apache.org/xml/features/validation/schema", true);
            parser.setProperty(
                "http://apache.org/xml/properties/schema/external-noNamespaceSchemaLocation",
                "memory.xsd");
            // ErrorChecker is not shown in the original; it is assumed to be
            // a class implementing org.xml.sax.ErrorHandler
            ErrorChecker errors = new ErrorChecker();
            parser.setErrorHandler(errors);
            parser.parse("memory.xml");
        } catch (Exception e) {
            System.out.println("Problem parsing the file: " + e);
        }
    }
}
Answer by Adam
Are you looking for a tool or a library?
As far as libraries go, pretty much the de facto standard is Xerces2, which has both C++ and Java versions.
Be forewarned though, it is a heavyweight solution. But then again, validating XML against XSD files is a rather heavyweight problem.
As for a tool to do this for you, XMLFox seems to be a decent freeware solution, but not having used it personally I can't say for sure.
Answer by KnomDeGuerre
I had to validate an XML against XSD just one time, so I tried XMLFox. I found it to be very confusing and weird. The help instructions didn't seem to match the interface.
I ended up using LiquidXML Studio 2008 (v6) which was much easier to use and more immediately familiar (the UI is very similar to Visual Basic 2008 Express, which I use frequently). The drawback: the validation capability is not in the free version, so I had to use the 30 day trial.
Answer by Todd
If you are generating XML files programmatically, you may want to look at the XMLBeans library. Using a command line tool, XMLBeans will automatically generate and package up a set of Java objects based on an XSD. You can then use these objects to build an XML document based on this schema.
It has built-in support for schema validation, and can convert Java objects to an XML document and vice versa.
Castor and JAXB are other Java libraries that serve a similar purpose to XMLBeans.
Answer by StaxMan
One more answer: since you said you need to validate files you are generating (writing), you might want to validate content while you are writing, instead of first writing, then reading back for validation. You can probably do that with the JDK API for XML validation if you use a SAX-based writer: if so, just link in the validator by calling 'Validator.validate(source, result)', where source comes from your writer, and result is where output needs to go.
Alternatively, if you use StAX for writing content (or a library that uses or can use StAX), Woodstox can also directly support validation when using XMLStreamWriter. Here's a blog entry showing how that is done:
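The 'Validator.validate(source, result)' idea can be sketched as follows. This is only an illustration under assumptions: the SAX events here come from a parser reading a string (standing in for a SAX-based writer), the downstream consumer is a hypothetical `ElementCounter` rather than a real output, and the `note` schema is invented for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ValidatingPipe {
    // Hypothetical schema: a <note> element containing a single <to> string.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='note'><xs:complexType><xs:sequence>"
      + "<xs:element name='to' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    /** Downstream consumer; stands in for whatever finally writes the output. */
    static class ElementCounter extends DefaultHandler {
        int elements = 0;

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes atts) {
            elements++;
        }
    }

    /**
     * Validates the SAX event stream and forwards it to the sink.
     * Returns the number of elements forwarded, or -1 if validation failed.
     */
    public static int pipe(String xsd, String xml) {
        ElementCounter sink = new ElementCounter();
        try {
            Validator validator = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)))
                .newValidator();
            // A SAXSource must be paired with a SAXResult (or null).
            validator.validate(
                new SAXSource(new InputSource(new StringReader(xml))),
                new SAXResult(sink));
            return sink.elements;
        } catch (SAXException e) {
            return -1;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(pipe(XSD, "<note><to>Jeff</to></note>"));
        System.out.println(pipe(XSD, "<note><oops/></note>"));
    }
}
```

Note that the sink may already have received some events by the time an error is detected, so a real pipeline would need to discard partial output on failure.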
Answer by chickeninabiscuit
We build our project using ant, so we can use the schemavalidate task to check our config files:
<schemavalidate>
    <fileset dir="${configdir}" includes="**/*.xml" />
</schemavalidate>
Now naughty config files will fail our build!
Answer by juwens
If you have a Linux machine, you could use the free command-line tool SAXCount. I found this very useful.
SAXCount -f -s -n my.xml
It validates against DTD and XSD. It took 5 s for a 50 MB file.
In Debian squeeze it is located in the package "libxerces-c-samples".
The definition of the DTD and XSD has to be in the XML! You can't configure them separately.
Answer by Paulo Fidalgo
Using Java 7 you can follow the documentation provided in the package description.
import java.io.File;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;
...
// parse an XML document into a DOM tree
DocumentBuilder parser = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document document = parser.parse(new File("instance.xml"));

// create a SchemaFactory capable of understanding WXS schemas
SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);

// load a WXS schema, represented by a Schema instance
Source schemaFile = new StreamSource(new File("mySchema.xsd"));
Schema schema = factory.newSchema(schemaFile);

// create a Validator instance, which can be used to validate an instance document
Validator validator = schema.newValidator();

// validate the DOM tree
try {
    validator.validate(new DOMSource(document));
} catch (SAXException e) {
    // instance document is invalid!
}
Answer by rogerdpack
Since this is a popular question, I will point out that Java can also validate against "referred to" XSDs, for instance if the .xml file itself specifies XSDs in the header, using xsi:schemaLocation or xsi:noNamespaceSchemaLocation (or xsi for particular namespaces), e.g.:
<document xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="http://www.example.com/document.xsd">
...
or schemaLocation (always a list of namespace to xsd mappings)
<document xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.example.com/my_namespace http://www.example.com/document.xsd">
...
The other answers work here as well, because the .xsd files "map" to the namespaces declared in the .xml file: each .xsd declares a namespace, and if it matches up with the namespace in the .xml file, you're good. But sometimes it's convenient to be able to have a custom resolver...
From the javadocs: "If you create a schema without specifying a URL, file, or source, then the Java language creates one that looks in the document being validated to find the schema it should use. For example:"
SchemaFactory factory = SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");
Schema schema = factory.newSchema();
and this works for multiple namespaces, etc. The problem with this approach is that the schema location referenced in the document is probably a network location, so by default it will go out and hit the network on each and every validation, which is not always optimal.
Here's an example that validates an XML file against any XSDs it references (even if it has to pull them from the network):
import java.io.FileInputStream;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
...
public static void verifyValidatesInternalXsd(String filename) throws Exception {
    // try-with-resources closes the stream even if validation fails
    try (InputStream xmlStream = new FileInputStream(filename)) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);
        factory.setNamespaceAware(true);
        factory.setAttribute("http://java.sun.com/xml/jaxp/properties/schemaLanguage",
            "http://www.w3.org/2001/XMLSchema");
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setErrorHandler(new RaiseOnErrorHandler());
        builder.parse(new InputSource(xmlStream));
    }
}

// Turns every warning, error and fatal error into an unchecked exception
public static class RaiseOnErrorHandler implements ErrorHandler {
    public void warning(SAXParseException e) throws SAXException {
        throw new RuntimeException(e);
    }
    public void error(SAXParseException e) throws SAXException {
        throw new RuntimeException(e);
    }
    public void fatalError(SAXParseException e) throws SAXException {
        throw new RuntimeException(e);
    }
}
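The handler above stops at the first problem. As a variation (not part of the original answer), an ErrorHandler can record problems instead of throwing, so that one run reports as much as possible. The sketch below uses a `Validator` with an in-memory schema to stay self-contained; the class names, the `findProblems` helper and the `note` schema are assumptions for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class CollectingValidation {
    // Hypothetical schema: <note> with a <to> string followed by a <count> integer.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='note'><xs:complexType><xs:sequence>"
      + "<xs:element name='to' type='xs:string'/>"
      + "<xs:element name='count' type='xs:int'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    /** Records warnings and errors; only fatal (well-formedness) errors abort. */
    static class Collector implements ErrorHandler {
        final List<String> problems = new ArrayList<>();

        private void record(String level, SAXParseException e) {
            problems.add(level + " at line " + e.getLineNumber() + ": " + e.getMessage());
        }

        @Override
        public void warning(SAXParseException e) {
            record("warning", e);
        }

        @Override
        public void error(SAXParseException e) {
            record("error", e);
        }

        @Override
        public void fatalError(SAXParseException e) throws SAXException {
            record("fatal", e);
            throw e; // parsing cannot sensibly continue
        }
    }

    /** Returns all problems found; an empty list means the document is valid. */
    public static List<String> findProblems(String xsd, String xml) {
        Collector collector = new Collector();
        try {
            Validator validator = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)))
                .newValidator();
            validator.setErrorHandler(collector);
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Fatal errors are already recorded; a broken schema is not, so add it here.
            if (collector.problems.isEmpty()) {
                collector.problems.add("fatal: " + e.getMessage());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return collector.problems;
    }

    public static void main(String[] args) {
        String bad = "<note><to>Jeff</to><count>abc</count><extra/></note>";
        findProblems(XSD, bad).forEach(System.out::println);
    }
}
```

Whether to throw or collect is a design choice: throwing suits build-time checks that should fail fast, while collecting suits tools that report problems to a person.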
You can avoid pulling referenced XSDs from the network, even though the XML files reference URLs, by specifying the XSD manually (see some other answers here) or by using an "XML catalog" style resolver. Spring apparently can also intercept the URL requests to serve local files for validations. Or you can set your own via setResourceResolver, e.g.:
Source xmlFile = new StreamSource(xmlFileLocation);
SchemaFactory schemaFactory = SchemaFactory
    .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = schemaFactory.newSchema();
Validator validator = schema.newValidator();
validator.setResourceResolver(new LSResourceResolver() {
    @Override
    public LSInput resolveResource(String type, String namespaceURI,
            String publicId, String systemId, String baseURI) {
        InputSource is = new InputSource(
            getClass().getResourceAsStream(
                "some_local_file_in_the_jar.xsd"));
        // or lookup by URI, etc...
        return new Input(is); // for class Input see
                              // https://stackoverflow.com/a/2342859/32453
    }
});
validator.validate(xmlFile);
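The snippet above relies on an `Input` class defined in another answer. As one possible alternative (an assumption, not the author's method), the JDK's DOM Load/Save implementation can create the `LSInput` for you. The sketch below serves an in-memory XSD for whatever location the document references; the class name, the `isValid` helper and the `note` schema are invented for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;
import org.xml.sax.SAXException;

public class LocalSchemaResolver {
    // Hypothetical schema: a <note> element containing a single <to> string.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='note'><xs:complexType><xs:sequence>"
      + "<xs:element name='to' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    /** Validates xml against the schema it references, served from localXsd. */
    public static boolean isValid(String xml, String localXsd) {
        try {
            DOMImplementationLS ls = (DOMImplementationLS)
                DOMImplementationRegistry.newInstance().getDOMImplementation("LS");

            // Answers every schema lookup with the local copy instead of the network.
            LSResourceResolver resolver =
                (type, namespaceURI, publicId, systemId, baseURI) -> {
                    LSInput input = ls.createLSInput();
                    input.setPublicId(publicId);
                    input.setSystemId(systemId);
                    input.setBaseURI(baseURI);
                    input.setCharacterStream(new StringReader(localXsd));
                    return input;
                };

            // newSchema() with no arguments: schemas are located via hints in the document.
            Validator validator = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema()
                .newValidator();
            validator.setResourceResolver(resolver);
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException | ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<note xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'"
          + " xsi:noNamespaceSchemaLocation='http://www.example.com/document.xsd'>"
          + "<to>Jeff</to></note>";
        System.out.println(isValid(xml, XSD));
    }
}
```

A real resolver would normally inspect `systemId` or `namespaceURI` and return `null` for locations it does not know, which lets the default lookup proceed.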
See also here for another tutorial.
I believe the default is to use DOM parsing. You can do something similar with a validating SAX parser as well: saxReader.setEntityResolver(your_resolver_here);
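A SAX-based variant might look like the sketch below. It is an illustration under assumptions: the schema is attached with `SAXParserFactory.setSchema`, the entity resolver shown simply serves empty content for any external entity (a hypothetical policy to keep the parser off the network), and the class name and `note` schema are invented. With a precompiled schema attached this way, the entity resolver only affects DTDs and external entities, not the XSD itself.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SaxSchemaCheck {
    // Hypothetical schema: a <note> element containing a single <to> string.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='note'><xs:complexType><xs:sequence>"
      + "<xs:element name='to' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    /** Returns true if xml is well-formed and valid against xsd. */
    public static boolean isValid(String xsd, String xml) {
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));

            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // required for XSD validation
            factory.setSchema(schema);

            XMLReader saxReader = factory.newSAXParser().getXMLReader();
            // Hypothetical policy: serve empty content for any external entity.
            saxReader.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader("")));
            // Without an ErrorHandler, validation errors would be silently ignored.
            saxReader.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e;
                }
            });
            saxReader.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(XSD, "<note><to>Jeff</to></note>"));
        System.out.println(isValid(XSD, "<note><from>Jeff</from></note>"));
    }
}
```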