Java: Why can't my DOM parser read UTF-8?

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/16400136/

Date: 2020-10-31 22:51:33  Source: igfitidea

Why can't my DOM parser read UTF-8?

Tags: java, parsing, dom

Asked by ivanz

My DOM parser can't load an XML file when the file contains UTF-8 characters. I am aware that I have to instruct the parser to read UTF-8, but I don't know how to do that in my code. Here it is:

File xmlFile = new File(fileName);
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(xmlFile);
doc.getDocumentElement().normalize();

I am aware that there is a setEncoding() method, but I don't know where to put it in my code.

Answered by Rajesh Mbm

Try this. It worked for me:

        InputStream inputStream= new FileInputStream(completeFileName);
        Reader reader = new InputStreamReader(inputStream,"UTF-8");
        InputSource is = new InputSource(reader);
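        // Note: setEncoding() has no effect once a character stream (Reader) is supplied;
        // the InputStreamReader above already decodes the bytes as UTF-8.
        is.setEncoding("UTF-8");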

        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
        Document doc = dBuilder.parse(is);
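
The same approach can be written as a self-contained program. This is only a minimal sketch, not a definitive implementation: the class name `Utf8DomExample`, the helper `parseUtf8`, and the temporary sample file are assumptions made for illustration. It uses try-with-resources so the stream is closed even if parsing fails, and `StandardCharsets.UTF_8` instead of the string `"UTF-8"`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class Utf8DomExample {

    // Hypothetical helper: parses the file at xmlPath, decoding its bytes as UTF-8
    // regardless of the platform default charset.
    static Document parseUtf8(Path xmlPath)
            throws IOException, ParserConfigurationException, SAXException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // try-with-resources closes the stream and reader even if parsing throws
        try (InputStream in = Files.newInputStream(xmlPath);
             Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            Document doc = builder.parse(new InputSource(reader));
            doc.getDocumentElement().normalize();
            return doc;
        }
    }

    public static void main(String[] args) throws Exception {
        // Sample data (an assumption for the demo), written as UTF-8 bytes to a temp file.
        // Unicode escapes keep this source file itself pure ASCII.
        Path tmp = Files.createTempFile("sample", ".xml");
        Files.write(tmp, "<name>\u017Darko \u0160imi\u0107</name>".getBytes(StandardCharsets.UTF_8));
        try {
            Document doc = parseUtf8(tmp);
            System.out.println(doc.getDocumentElement().getTextContent());
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Whether the printed text looks right still depends on the encoding of the console that displays it.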

Answered by Eugene

Try using a Reader and passing the encoding as a parameter:

InputStream inputStream = new FileInputStream(fileName);
documentBuilder.parse(new InputSource(new InputStreamReader(inputStream, "UTF-8")));
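
It may help to see what changes when the parser receives a Reader rather than raw bytes. The sketch below is an illustration under stated assumptions, not part of the original answer: the class `ReaderVersusBytes` and its helpers are hypothetical names, and the documents are built in memory. Given a byte stream, the parser chooses the encoding itself, here from the XML declaration. Given a Reader, the charset passed to `InputStreamReader` decides, so it has to match the actual bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ReaderVersusBytes {

    // Builds a tiny XML document that declares the given charset and is encoded with it.
    static byte[] encode(String text, Charset charset) {
        String xml = "<?xml version=\"1.0\" encoding=\"" + charset.name() + "\"?><t>" + text + "</t>";
        return xml.getBytes(charset);
    }

    // Byte stream: the parser detects the encoding itself (here from the declaration).
    static String textFromBytes(byte[] xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(xml));
        return doc.getDocumentElement().getTextContent();
    }

    // Character stream: the Reader decodes with the charset we pass in.
    static String textFromReader(byte[] xml, Charset charset) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        InputSource source = new InputSource(
                new InputStreamReader(new ByteArrayInputStream(xml), charset));
        return builder.parse(source).getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String text = "\u00E9\u00FC"; // two accented letters, written as escapes
        byte[] utf8 = encode(text, StandardCharsets.UTF_8);
        byte[] latin1 = encode(text, StandardCharsets.ISO_8859_1);

        // Both byte streams are decoded according to their own declaration.
        System.out.println(textFromBytes(utf8).equals(text));
        System.out.println(textFromBytes(latin1).equals(text));
        // A Reader with the matching charset recovers the text as well.
        System.out.println(textFromReader(utf8, StandardCharsets.UTF_8).equals(text));
        // A Reader with the wrong charset garbles it, whatever the declaration says.
        System.out.println(textFromReader(utf8, StandardCharsets.ISO_8859_1).equals(text));
    }
}
```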

Answered by john-salib

I used what Eugene did up there and changed it a little.

DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();

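FileInputStream in = new FileInputStream(new File("XML.xml"));
// Note: the second argument of parse(InputStream, String) is a system ID (base URI),
// not an encoding, so the encoding is set on an InputSource instead.
InputSource source = new InputSource(in);
source.setEncoding("UTF-8");
Document doc = dBuilder.parse(source);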

Although the file will be read as UTF-8, printing to the Eclipse console will not show any non-ASCII ('UTF-8') characters unless the Java file is saved as UTF-8, or at least that is what happened to me.
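
One way to take the platform default out of the picture when printing is to wrap the output in a PrintStream with an explicit encoding. This is a minimal sketch under assumptions: the class `Utf8ConsoleExample` and the helper `utf8Printer` are hypothetical names, and it only controls which bytes the program writes. The console (for example the Eclipse run configuration) must still be set to interpret those bytes as UTF-8.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8ConsoleExample {

    // Hypothetical helper: wraps any output stream in an auto-flushing PrintStream
    // that encodes characters as UTF-8 instead of the platform default.
    static PrintStream utf8Printer(OutputStream target) throws UnsupportedEncodingException {
        return new PrintStream(target, true, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        PrintStream out = utf8Printer(System.out);
        // Unicode escapes keep this source file pure ASCII, so the encoding
        // the .java file is saved in does not affect the string literal.
        out.println("\u017Darko \u0160imi\u0107");
    }
}
```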