Java - Reading a csv file line by line - stuck with weird non-existent characters being read!
Notice: this page is taken from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/2877928/
Asked by rockit
Hello fellow Java developers. I'm having a very strange issue.
I'm trying to read a csv file line by line. I'm at the point where I'm just testing out the reading of the lines. Only, each time I read a line, the line contains square characters between each character of text. I even saved the file as a txt file in WordPad and Notepad, with no change.
Thus I must be doing something stupid...
I have a csv file, a standard csv file: yes, a text file with commas in it. I try to read a line of text, but the text is all f-ed up and I cannot find the phrase within the text.
Any advice? Code below.
    //open csv
    File filReadMe = new File(strRoot + "data2.csv");
    BufferedReader brReadMe = new BufferedReader
        (new InputStreamReader(new FileInputStream(filReadMe)));
    String strLine = brReadMe.readLine();
    //for all lines
    while (strLine != null){
        //if line contains "(see also"
        if (strLine.toLowerCase().contains("(see also")){
            //write line from "(see also" to ")"
            int iBegin = strLine.toLowerCase().indexOf("(see also");
            String strTemp = strLine.substring(iBegin);
            int iLittleEnd = strTemp.indexOf(")");
            System.out.println(strLine.substring(iBegin, iBegin + iLittleEnd));
        }
        //update line
        strLine = brReadMe.readLine();
    } //end while
    brReadMe.close();
Answered by mdma
I can only think that this is an inconsistent character encoding. Open the file in Notepad, choose Save As, and select UTF-8 in the drop-down for "Encoding". Then add "UTF-8" as a second parameter to InputStreamReader, e.g.
    BufferedReader brReadMe = new BufferedReader
        (new InputStreamReader(new FileInputStream(filReadMe), "UTF-8"));
That should sort out any inconsistencies with encoding.
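To make the encoding point concrete, here is a minimal sketch, not the answerer's code. It assumes the file was saved as UTF-16LE, which is only a guess: the thread never confirms the actual encoding, but that encoding stores a zero byte after every ASCII character, which could show up as squares when the bytes are decoded with a single-byte charset. The class name CsvCharsetDemo, the helper findSeeAlso, and the sample data are all hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class CsvCharsetDemo {

    // Hypothetical helper: collects every "(see also ..." fragment in the file,
    // decoding the bytes with the charset the caller names explicitly.
    static List<String> findSeeAlso(Path file, Charset charset) throws IOException {
        List<String> matches = new ArrayList<>();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader reader = Files.newBufferedReader(file, charset)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int begin = line.toLowerCase(Locale.ROOT).indexOf("(see also");
                if (begin < 0) {
                    continue; // phrase not on this line
                }
                int end = line.indexOf(')', begin);
                if (end < 0) {
                    continue; // no closing parenthesis, nothing to extract
                }
                // Like the question's code, stop just before the ")"
                matches.add(line.substring(begin, end));
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        // Assumed sample data, written as UTF-16LE to imitate the suspected file
        Path file = Files.createTempFile("data2", ".csv");
        Files.write(file,
                "Apple (See also Pear),1\nPlum,2\n".getBytes(StandardCharsets.UTF_16LE));

        // Matching charset: the phrase is found
        System.out.println(findSeeAlso(file, StandardCharsets.UTF_16LE));  // prints [(See also Pear]

        // Mismatched single-byte charset: a NUL sits between all characters,
        // so the search never matches
        System.out.println(findSeeAlso(file, StandardCharsets.ISO_8859_1)); // prints []

        Files.delete(file);
    }
}
```

If the file is re-saved as UTF-8 as suggested above, passing StandardCharsets.UTF_8 to the same helper should work. One caveat: Notepad may write a byte order mark at the start of a UTF-8 file, and Java's UTF-8 decoder does not strip it, so the first line can begin with the character U+FEFF.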

