Parsing XML files with Java and spaces in file path

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/1132082/

Date: 2020-10-29 15:18:03  Source: igfitidea

Tags: java, xml, spaces, filepath

Asked by glmxndr

I have files on my file system, on Windows XP. I want to parse them using Java (JRE 1.6).

Problem is, I don't understand how Java and Xerces work together when the file path has spaces in it.

If the file has no spaces in its path, all works fine.

If there are spaces, I may have this kind of trouble, even if I call the parser with a FileInputStream instance:

java.net.UnknownHostException: .
    at java.net.PlainSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at sun.net.NetworkClient.doConnect(Unknown Source)
    at sun.net.NetworkClient.openServer(Unknown Source)
    at sun.net.ftp.FtpClient.openServer(Unknown Source)
    at sun.net.ftp.FtpClient.openServer(Unknown Source)
    at sun.net.www.protocol.ftp.FtpURLConnection.connect(Unknown Source)
    at sun.net.www.protocol.ftp.FtpURLConnection.getInputStream(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
    at javax.xml.parsers.DocumentBuilder.parse(Unknown Source)

(sun.net.ftp.FtpClient.openServer??? Wtf ?)

Or else this kind of trouble:

java.net.MalformedURLException: unknown protocol: d
    at java.net.URL.<init>(Unknown Source)
    at java.net.URL.<init>(Unknown Source)
    at java.net.URL.<init>(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)

(It says "unknown protocol: d" because, I guess, the file is on the D drive.)
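
The guess about the drive letter can be checked in isolation. The sketch below is only an illustration, not the asker's code: the path is hypothetical and `protocolOf` is a helper name invented here. It hands a Windows-style path straight to java.net.URL, the same constructor that appears in the stack trace, and reports how the text before the first colon is interpreted.

```java
import java.net.MalformedURLException;
import java.net.URL;

class DriveLetterUrlDemo {
    // Returns the URL's protocol, or the exception message if the string is not a usable URL.
    static String protocolOf(String spec) {
        try {
            return new URL(spec).getProtocol();
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Hypothetical path: everything before the first ':' is read as the URL scheme.
        System.out.println(protocolOf("d:/folder/folder with space/file.dtd"));
        // The same location written as a file URI has a scheme Java knows.
        System.out.println(protocolOf("file:///d:/folder/folder%20with%20space/file.dtd"));
    }
}
```

If the parser ends up with a plain Windows path where it expects a system identifier, this would explain the second stack trace.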

Does anyone have a clue why this happens, and how to work around the problem? I tried to supply my own EntityResolver, but my log tells me it is not even called before the crash.



EDIT:

Here is the code calling the parser.

public Document fileToDom(File file) throws ProcessException {
    Document doc = null;
    try {
        DocumentBuilderFactory db = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = db.newDocumentBuilder();
        if (this.errorHandler != null) {
            builder.setErrorHandler(this.errorHandler);
        } else {
            builder.setErrorHandler(new DefaultHandler());
        }
        FileInputStream test = new FileInputStream(file);
        doc = builder.parse(test);
        ...
    } catch (Exception e) {...}
    ...
}


For the moment I find myself forced to remove the DOCTYPE before parsing, which removes all the problems but also the DTD validation... Not so great a solution.

Answer by Jon Skeet

Are you just using DocumentBuilder.parse(filename)?

If so, that's failing because it expects a URI. Open a FileInputStream to the file, and then pass that to DocumentBuilder.parse(InputStream).
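
A minimal sketch of that suggestion, under stated assumptions: the class name, the helper names and the temporary "folder with space" directory are made up for illustration, and the sample document deliberately has no DOCTYPE. A bare stream carries no base URI, so a relative DTD reference would still have nothing to resolve against, which may be what the asker is running into.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class StreamParseDemo {
    // Parses through a stream, so the parser never sees the path as a URI string.
    static Document parseViaStream(File file) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try (InputStream in = new FileInputStream(file)) {
            return builder.parse(in);
        }
    }

    // Writes a DOCTYPE-free sample into a temporary directory whose name contains spaces.
    static File writeSample() throws Exception {
        File dir = Files.createTempDirectory("folder with space").toFile();
        File file = new File(dir, "file.xml");
        Files.write(file.toPath(), "<root/>".getBytes(StandardCharsets.UTF_8));
        return file;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseViaStream(writeSample());
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```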

Answer by Dan Fleet

It looks like it's trying to connect to a URL in the doctype header so it can download it in order to validate the document against the downloaded DTD.
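
If fetching the DTD is not wanted at all, one possible workaround (an assumption, not something this answer proposes) is to switch off external DTD loading. The feature URI below is specific to Xerces-based parsers such as the one bundled with the JDK; other JAXP implementations may reject it. The class and method names are illustrative, and as with removing the DOCTYPE, this gives up DTD validation.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class SkipExternalDtdDemo {
    // Xerces-specific feature; not part of the JAXP standard.
    static final String LOAD_EXTERNAL_DTD =
            "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Parses the XML text without trying to fetch the DTD named in its DOCTYPE.
    static Document parseWithoutDtd(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false);
        factory.setFeature(LOAD_EXTERNAL_DTD, false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        // "missing.dtd" does not exist; with the feature off the parser should not look for it.
        String xml = "<!DOCTYPE root SYSTEM \"missing.dtd\"><root/>";
        System.out.println(parseWithoutDtd(xml).getDocumentElement().getTagName());
    }
}
```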

Answer by victor hugo

Try this URI style:

file:///d:/folder/folder%20with%20space/file.xml
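
Rather than writing such a URI by hand, java.io.File can produce one. This is a small sketch with a hypothetical relative path and an invented helper name; note that File.toURI() emits the short "file:/..." form rather than "file:///...", which denotes the same location.

```java
import java.io.File;

class FileUriDemo {
    // Converts a file to an absolute file: URI string, percent-encoding spaces as %20.
    static String toFileUri(File file) {
        return file.toURI().toString();
    }

    public static void main(String[] args) {
        // Hypothetical relative path; it is resolved against the current working directory.
        System.out.println(toFileUri(new File("folder with space", "file.xml")));
    }
}
```

The resulting string can be passed wherever the parser expects a URI or system identifier.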

Answer by Sabeer Ebrahim

Try this.

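InputSource is = new InputSource();
// Note: StringReader takes a String, so 'test' must hold the XML text here,
// not the FileInputStream from the question.
is.setCharacterStream(new StringReader(test));
doc = builder.parse(is);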

instead of just parsing 'test' directly.
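
A variation on the InputSource idea, sketched under assumptions rather than taken from this answer: the source wraps the file's byte stream and is also given a system ID built with File.toURI(), so the parser has a base URI for a relative DTD reference. The class name, helper names, file names and DTD content are all made up for illustration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class InputSourceSystemIdDemo {
    // Parses the file's bytes while telling the parser where the document lives.
    static Document parseWithBase(File file) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try (InputStream in = new FileInputStream(file)) {
            InputSource source = new InputSource(in);
            // The system ID acts as the base URI for relative references such as the DTD.
            source.setSystemId(file.toURI().toString());
            return builder.parse(source);
        }
    }

    // Writes doc.xml plus a sibling doc.dtd into a temporary directory with spaces in its name.
    static File writeSample() throws Exception {
        File dir = Files.createTempDirectory("folder with space").toFile();
        String dtd = "<!ELEMENT root EMPTY>\n<!ATTLIST root kind CDATA \"demo\">";
        Files.write(new File(dir, "doc.dtd").toPath(), dtd.getBytes(StandardCharsets.UTF_8));
        File xml = new File(dir, "doc.xml");
        String content = "<!DOCTYPE root SYSTEM \"doc.dtd\">\n<root/>";
        Files.write(xml.toPath(), content.getBytes(StandardCharsets.UTF_8));
        return xml;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseWithBase(writeSample());
        // The attribute default comes from the DTD, which shows the DTD was actually read.
        System.out.println(doc.getDocumentElement().getAttribute("kind"));
    }
}
```

DocumentBuilder.parse(InputStream, String systemId) is a shorter route to the same effect when no other InputSource settings are needed.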
