Java: Why can't my DOM parser read UTF-8?
Notice: this page is a translation of a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/16400136/
Why can't my DOM parser read UTF-8?
Asked by ivanz
My DOM parser can't load a file when the XML file contains UTF-8 characters. I am aware that I have to tell the parser to read UTF-8, but I don't know how to do that in my code. Here it is:
File xmlFile = new File(fileName);
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(xmlFile);
doc.getDocumentElement().normalize();
I am aware that there is a setEncoding() method, but I don't know where to put it in my code...
Answered by Rajesh Mbm
Try this. It worked for me:
InputStream inputStream = new FileInputStream(completeFileName);
Reader reader = new InputStreamReader(inputStream,"UTF-8");
InputSource is = new InputSource(reader);
is.setEncoding("UTF-8");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(is);
Answered by Eugene
Try using a Reader and pass the encoding as a parameter:
InputStream inputStream = new FileInputStream(fileName);
documentBuilder.parse(new InputSource(new InputStreamReader(inputStream, "UTF-8")));
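The Reader-based approach in the answers above can be put together as a self-contained program. This is only a minimal sketch: the class name `Utf8DomExample`, the helper `parseUtf8`, and the temporary sample file are assumptions made for illustration, not part of the original answers. Because the parser is handed characters rather than raw bytes, the decoding is done by the `InputStreamReader`, not by the parser.

```java
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class Utf8DomExample {

    // Hypothetical helper: decodes the byte stream as UTF-8 and parses the resulting characters.
    static Document parseUtf8(InputStream in) throws Exception {
        Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(reader));
        doc.getDocumentElement().normalize();
        return doc;
    }

    public static void main(String[] args) throws Exception {
        // Sample file written here only so the sketch runs on its own.
        // Unicode escapes keep the example independent of the source file's encoding.
        Path file = Files.createTempFile("sample", ".xml");
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>\u017Di\u017Eek</name>";
        Files.write(file, xml.getBytes(StandardCharsets.UTF_8));
        try (InputStream in = Files.newInputStream(file)) {
            Document doc = parseUtf8(in);
            // Print the element name and text length rather than the text itself,
            // so the output does not depend on the console's encoding.
            System.out.println(doc.getDocumentElement().getTagName());
            System.out.println(doc.getDocumentElement().getTextContent().length());
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Using `StandardCharsets.UTF_8` instead of the string `"UTF-8"` avoids the checked `UnsupportedEncodingException`; either form selects the same charset.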
Answered by john-salib
I took Eugene's approach above and changed it a little.
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
FileInputStream in = new FileInputStream(new File("XML.xml"));
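// DocumentBuilder.parse(InputStream, String) treats its second argument as a system ID,
// not an encoding, so the encoding is declared on an InputSource instead.
InputSource source = new InputSource(in);
source.setEncoding("UTF-8");
Document doc = dBuilder.parse(source);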
Although the file will be read as UTF-8, if you print the result in the Eclipse console it won't show any 'UTF-8' characters unless the Java file is saved as 'UTF-8', or at least that is what happened to me.
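On the console point above: what appears on screen also depends on how the output bytes are encoded and on the console's own encoding setting. The following is a minimal sketch, assuming the console (for example Eclipse's) is itself configured for UTF-8. The class name `Utf8ConsoleExample` and the helper `utf8Stream` are hypothetical, and the string uses Unicode escapes so the example does not depend on how the source file is saved.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class Utf8ConsoleExample {

    // Hypothetical helper: wraps any output stream in a PrintStream that encodes text as UTF-8.
    static PrintStream utf8Stream(OutputStream out) throws UnsupportedEncodingException {
        return new PrintStream(out, true, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        PrintStream out = utf8Stream(System.out);
        // Whether this displays correctly still depends on the console's encoding (assumed UTF-8 here).
        out.println("\u017Di\u017Eek");
    }
}
```

If the characters still appear garbled, the console encoding is the likely remaining mismatch, since the bytes written by this stream are UTF-8 regardless of the platform default.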