Java: Invalid char between encapsulated token and delimiter in Apache Commons CSV library
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/26729799/
Asked by Santhosh Sridhar
I am getting the following error while parsing a CSV file with the Apache Commons CSV library.
Exception in thread "main" java.io.IOException: (line 2) invalid char between encapsulated token and delimiter
at org.apache.commons.csv.Lexer.parseEncapsulatedToken(Lexer.java:275)
at org.apache.commons.csv.Lexer.nextToken(Lexer.java:152)
at org.apache.commons.csv.CSVParser.nextRecord(CSVParser.java:450)
at org.apache.commons.csv.CSVParser.getRecords(CSVParser.java:327)
at parse.csv.file.CSVFileParser.main(CSVFileParser.java:29)
What does this error mean?
Answered by Steve Siebert
That line in the CSV file contains an invalid character between one of your cells and the end of the line, the end of the file, or the next cell. A very common cause is failing to escape your encapsulating character (the character used to "wrap" each cell, so that the CSV parser knows where a cell (token) starts and ends).
Answered by Santhosh Sridhar
I found the solution to the problem. One of my CSV files has an attribute as follows: "attribute with nested "quote" "
The parser fails because of the nested quote in the attribute.
To avoid this problem, escape the nested quote as follows: "attribute with nested """"quote"""" "
This is one way to solve the problem.
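As a rough illustration of the escaping rule (a sketch, not part of the original answer): in the common CSV convention that CSVFormat.DEFAULT follows, a quote character inside a quoted field is written as two quote characters. A field written with four consecutive quotes therefore parses without error but yields two literal quotes in the value. The class name CsvQuoting and the helper quoteField below are hypothetical and use only the JDK.

```java
class CsvQuoting {
    // Hypothetical helper: wraps a raw value in double quotes and
    // doubles every embedded double quote so a CSV parser can read it back.
    static String quoteField(String raw) {
        return "\"" + raw.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        String raw = "attribute with nested \"quote\" ";
        // Prints: "attribute with nested ""quote"" "
        System.out.println(quoteField(raw));
    }
}
```

If you control the program that writes the file, applying this kind of escaping on output is usually preferable to relaxing the parser on input.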
Answered by Cuga
We ran into this same error with data containing quotes in otherwise unquoted input, for example:
some cell|this "cell" caused issues|other data
It was hard to find, but Apache's docs mention the withQuote() method, which can take null as a value. Passing null disables quote handling, so quote characters in the data are read literally.
We were getting the exact same error message, and this (thankfully) ended up fixing the issue for us.
Answered by Anand
We ran into this issue when we had embedded quotes in our data.
0,"020"1,"BS:5252525 ORDER:99999"4
The solution we applied was: CSVFormat csvFileFormat = CSVFormat.DEFAULT.withQuote(null);
@Cuga's tip helped us resolve this. Thanks, @Cuga.
The full code is:
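import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

// The class name is illustrative; the original answer showed only the main method.
public class CSVFileParser {
    public static void main(String[] args) throws IOException {
        // Passing null to withQuote() disables quote handling, so embedded quotes are read literally.
        CSVFormat csvFileFormat = CSVFormat.DEFAULT.withQuote(null);
        String fileName = "test.csv";
        // try-with-resources closes the reader and the parser even if parsing fails.
        try (Reader fileReader = new FileReader(fileName);
             CSVParser csvFileParser = new CSVParser(fileReader, csvFileFormat)) {
            for (CSVRecord csvRecord : csvFileParser.getRecords()) {
                System.out.println(csvRecord);
            }
        }
    }
}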
The result is:
CSVRecord [comment=null, mapping=null, recordNumber=1, values=[0, "020"1, "BS:5252525 ORDER:99999"4]]
Answered by Alan47
I ran into this issue when I forgot to call .withNullString("") on my CSVFormat. Basically, this exception occurs when one of the following is true:
- your quote symbol is wrong
- your null string representation is wrong
- your column separator char is wrong
Make sure you know the details of your format. Also, some programs write a leading byte-order mark (for example, Excel writes \uFEFF) to denote the encoding of the file. This can also trip up your parser.
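One possible way to guard against a leading byte-order mark is to drop it before the parser sees the input. The sketch below uses only the JDK; the class name BomSkipper and the helper skipBom are hypothetical, and it assumes the underlying reader already decodes the file with the correct charset (for example UTF-8), so that the mark shows up as the single character U+FEFF.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.StringReader;

class BomSkipper {
    // Hypothetical helper: returns a reader positioned just after a leading
    // byte-order mark, or at the very start if there is no mark.
    static Reader skipBom(Reader in) throws IOException {
        PushbackReader reader = new PushbackReader(in, 1);
        int first = reader.read();
        if (first != -1 && first != '\uFEFF') {
            reader.unread(first); // not a byte-order mark, so put the character back
        }
        return reader;
    }

    public static void main(String[] args) throws IOException {
        // Sample input with a leading mark, standing in for a file exported by a spreadsheet tool.
        Reader cleaned = skipBom(new StringReader("\uFEFFid,name\n1,Ann\n"));
        StringBuilder text = new StringBuilder();
        for (int c = cleaned.read(); c != -1; c = cleaned.read()) {
            text.append((char) c);
        }
        System.out.print(text);
    }
}
```

The returned reader can then be handed to the parser in place of the raw FileReader, for example new CSVParser(BomSkipper.skipBom(fileReader), csvFileFormat).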