Java: make DocumentBuilder.parse ignore DTD references
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/155101/
Make DocumentBuilder.parse ignore DTD references
Asked by joe
When I parse my XML file (variable f) with this method, I get an error:
C:\Documents and Settings\joe\Desktop\aicpcudev\OnlineModule\map.dtd (The system cannot find the path specified)
I know I do not have the DTD, nor do I need it. How can I parse this File object into a Document object while ignoring DTD reference errors?
private static Document getDoc(File f, String docId) throws Exception {
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = dbf.newDocumentBuilder();
    Document doc = db.parse(f);
    return doc;
}
Accepted answer by toolkit
A similar approach to the one suggested by @anjanb:
builder.setEntityResolver(new EntityResolver() {
    @Override
    public InputSource resolveEntity(String publicId, String systemId)
            throws SAXException, IOException {
        if (systemId.contains("foo.dtd")) {
            return new InputSource(new StringReader(""));
        } else {
            return null;
        }
    }
});
I found that simply returning an empty InputSource for every entity, without checking the system ID, worked just as well.
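As a minimal sketch of that simpler variant, the resolver below answers every external entity request with an empty InputSource. The class name, method name, and sample XML are made up for illustration and are not part of the original answer.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for this illustration only.
final class EmptyDtdResolverExample {

    // Parses XML text and answers every external entity request (including the
    // external DTD) with an empty stream, so the referenced file is never opened.
    static Document parseIgnoringDtd(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        // EntityResolver has a single method, so a lambda can implement it.
        db.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
        return db.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // map.dtd does not need to exist for this to parse.
        String xml = "<!DOCTYPE map SYSTEM \"map.dtd\"><map><entry>1</entry></map>";
        Document doc = parseIgnoringDtd(xml);
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

Note that this also blanks out any other external entity the document references, so it only suits documents that do not rely on entities declared in the DTD.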
Answer by Edward Z. Yang
"I know I do not have the dtd, nor do I need it."
I am suspicious of this statement; does your document contain any entity references? If so, you definitely need the DTD.
Anyway, the usual way of preventing this from happening is using an XML catalog to define a local path for "map.dtd".
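One way to sketch the catalog approach is with the javax.xml.catalog API, which assumes Java 9 or later. Everything specific here is an assumption made for illustration: the system identifier http://example.invalid/dtds/map.dtd, the greeting entity, the temporary files, and the class and method names. A real setup would point the catalog at an actual local copy of map.dtd.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;
import javax.xml.catalog.CatalogResolver;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for this illustration only.
final class CatalogExample {

    // Hypothetical system identifier; the document's DOCTYPE must use the same value.
    static final String SYSTEM_ID = "http://example.invalid/dtds/map.dtd";

    static Document parseWithCatalog(String xml) throws Exception {
        // A local stand-in for the DTD; it declares the entity the document uses.
        Path dtd = Files.createTempFile("map", ".dtd");
        Files.write(dtd, "<!ENTITY greeting \"hello\">".getBytes(StandardCharsets.UTF_8));

        // An OASIS XML catalog that maps the system identifier to the local copy.
        Path catalog = Files.createTempFile("catalog", ".xml");
        String catalogXml =
            "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">"
            + "<system systemId=\"" + SYSTEM_ID + "\" uri=\"" + dtd.toUri() + "\"/>"
            + "</catalog>";
        Files.write(catalog, catalogXml.getBytes(StandardCharsets.UTF_8));

        // CatalogResolver implements EntityResolver, so the builder can use it directly.
        CatalogResolver resolver =
            CatalogManager.catalogResolver(CatalogFeatures.defaults(), catalog.toUri());

        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        db.setEntityResolver(resolver);
        return db.parse(new InputSource(new StringReader(xml)));
    }
}
```

Unlike the empty-resolver trick, this keeps entity references working, because the parser still reads a real DTD, just from a local path.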
Answer by anjanb
Here's another user who had the same issue: http://forums.sun.com/thread.jspa?threadID=284209&forumID=34
User ddssot on that post says:
myDocumentBuilder.setEntityResolver(new EntityResolver() {
    public InputSource resolveEntity(java.lang.String publicId, java.lang.String systemId)
            throws SAXException, java.io.IOException
    {
        // publicId is null when the DOCTYPE has no PUBLIC identifier,
        // so compare from the constant to avoid a NullPointerException
        if ("--myDTDpublicID--".equals(publicId))
            // this deactivates the open office DTD
            return new InputSource(new ByteArrayInputStream("<?xml version='1.0' encoding='UTF-8'?>".getBytes()));
        else return null;
    }
});
The user further mentions "As you can see, when the parser hits the DTD, the entity resolver is called. I recognize my DTD with its specific ID and return an empty XML doc instead of the real DTD, stopping all validation..."
Hope this helps.
Answer by jt.
Try setting features on the DocumentBuilderFactory:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setValidating(false);
dbf.setNamespaceAware(true);
dbf.setFeature("http://xml.org/sax/features/namespaces", false);
dbf.setFeature("http://xml.org/sax/features/validation", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
DocumentBuilder db = dbf.newDocumentBuilder();
...
Ultimately, I think the options are specific to the parser implementation. Here is some documentation for Xerces2 if that helps.
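For a quick check of whether a given parser honours the key feature, a small self-contained sketch follows. It assumes a Xerces-derived implementation such as the JDK's built-in parser; the class and method names are hypothetical, and only the load-external-dtd feature is set because that is the one that stops the parser from opening the DTD file.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for this illustration only.
final class NoExternalDtdExample {

    static Document parseWithoutLoadingDtd(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setValidating(false);
        // Xerces-specific feature: do not fetch the external DTD when not validating.
        // setFeature throws ParserConfigurationException if the parser does not support it.
        dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        DocumentBuilder db = dbf.newDocumentBuilder();
        return db.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // The DTD named here is never opened, so it does not have to exist.
        String xml = "<!DOCTYPE map SYSTEM \"map.dtd\"><map><entry>1</entry></map>";
        Document doc = parseWithoutLoadingDtd(xml);
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

If the factory in use is not Xerces-based, the setFeature call is the place where this would fail, which makes the implementation dependence visible early.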
Answer by Peter J
I had a case where the DTD file was in the jar file along with the XML. I solved it based on the examples here, as follows:
DocumentBuilder db = dbf.newDocumentBuilder();
db.setEntityResolver(new EntityResolver() {
    public InputSource resolveEntity(String publicId, String systemId) throws SAXException, IOException {
        if (systemId.contains("doc.dtd")) {
            InputStream dtdStream = MyClass.class
                    .getResourceAsStream("/my/package/doc.dtd");
            return new InputSource(dtdStream);
        } else {
            return null;
        }
    }
});
Answer by Shoaib Khan
Source XML (With DTD)
<!DOCTYPE MYSERVICE SYSTEM "./MYSERVICE.DTD">
<MYACCSERVICE>
    <REQ_PAYLOAD>
        <ACCOUNT>1234567890</ACCOUNT>
        <BRANCH>001</BRANCH>
        <CURRENCY>USD</CURRENCY>
        <TRANS_REFERENCE>201611100000777</TRANS_REFERENCE>
    </REQ_PAYLOAD>
</MYACCSERVICE>
Java DOM implementation that accepts the above XML as a String and removes the DTD declaration
public Document removeDTDFromXML(String payload) throws Exception {
    System.out.println("### Payload received in XMlDTDRemover: " + payload);
    Document doc = null;
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    try {
        dbf.setValidating(false);
        dbf.setNamespaceAware(true);
        dbf.setFeature("http://xml.org/sax/features/namespaces", false);
        dbf.setFeature("http://xml.org/sax/features/validation", false);
        dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
        dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        DocumentBuilder db = dbf.newDocumentBuilder();
        InputSource is = new InputSource();
        is.setCharacterStream(new StringReader(payload));
        doc = db.parse(is);
    } catch (ParserConfigurationException e) {
        System.out.println("Parse Error: " + e.getMessage());
        return null;
    } catch (SAXException e) {
        System.out.println("SAX Error: " + e.getMessage());
        return null;
    } catch (IOException e) {
        System.out.println("IO Error: " + e.getMessage());
        return null;
    }
    return doc;
}
Destination XML (Without DTD)
<MYACCSERVICE>
    <REQ_PAYLOAD>
        <ACCOUNT>1234567890</ACCOUNT>
        <BRANCH>001</BRANCH>
        <CURRENCY>USD</CURRENCY>
        <TRANS_REFERENCE>201611100000777</TRANS_REFERENCE>
    </REQ_PAYLOAD>
</MYACCSERVICE>
Answer by McCoy
I'm working with SonarQube, and SonarLint for Eclipse showed me the warning "Untrusted XML should be parsed without resolving external data (squid:S2755)".
I managed to solve it using:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
// If you can't completely disable DTDs, then at least do the following:
// Xerces 1 - http://xerces.apache.org/xerces-j/features.html#external-general-entities
// Xerces 2 - http://xerces.apache.org/xerces2-j/features.html#external-general-entities
// JDK7+ - http://xml.org/sax/features/external-general-entities
factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
// Xerces 1 - http://xerces.apache.org/xerces-j/features.html#external-parameter-entities
// Xerces 2 - http://xerces.apache.org/xerces2-j/features.html#external-parameter-entities
// JDK7+ - http://xml.org/sax/features/external-parameter-entities
factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
// Disable external DTDs as well
factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
// and these as well, per Timothy Morgan's 2014 paper: "XML Schema, DTD, and Entity Attacks"
factory.setXIncludeAware(false);
factory.setExpandEntityReferences(false);
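To see how these settings behave on a document that carries a DOCTYPE, here is a rough sketch, assuming a Xerces-derived parser such as the JDK's built-in one. The class name, method names, and the allowDoctype switch are hypothetical additions for illustration. With disallow-doctype-decl enabled, the parser rejects any document containing a DOCTYPE; leaving it off while keeping the other settings lets such a document parse without its external DTD being loaded.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for this illustration only.
final class HardenedParserExample {

    // allowDoctype is an illustrative switch, not part of any library API.
    static DocumentBuilderFactory newFactory(boolean allowDoctype)
            throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        if (!allowDoctype) {
            // Strictest option: any DOCTYPE declaration makes parsing fail.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        }
        // Fallback hardening: never resolve external entities or load the external DTD.
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory;
    }

    static Document parse(String xml, boolean allowDoctype) throws Exception {
        DocumentBuilder db = newFactory(allowDoctype).newDocumentBuilder();
        return db.parse(new InputSource(new StringReader(xml)));
    }
}
```

For the original question, where the file does contain a DOCTYPE that should simply be ignored, the allowDoctype = true path is the one that applies; the strict path is meant for input that should never contain a DOCTYPE at all.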