Java Scanner newline recognition

Notice: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/5918896/

Date: 2020-10-30 13:26:45  Source: igfitidea

Tags: java, newline, java.util.scanner

Asked by Anthony

I can't find the documentation that specifies how a Scanner treats newline patterns by default. I want to read a file line by line and have the scanner be able to handle \r, \n or \r\n line endings regardless of the system the program is actually running on.

If I declare a scanner like so:

Scanner scanner = new Scanner(reader);

what is the default behaviour? Will it handle all three kinds as described above or do I have to tell it explicitly to do it?

Answered by David

Looking at the source code for Sun JDK 1.6, the pattern used is "\r\n|[\n\r\u2028\u2029\u0085]"

which matches either "\r\n" or any single one of \n, \r, or the Unicode characters U+2028 (line separator), U+2029 (paragraph separator), and U+0085 (next line).

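As a quick check of the behaviour described above, here is a minimal sketch that feeds a string with mixed line endings to a default Scanner. The class name ScannerLineEndings and the helper readLines are made up for illustration, and the Scanner reads from a String rather than from the reader in the question. It assumes your JDK uses the same pattern as the JDK 1.6 source quoted above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ScannerLineEndings {
    // Hypothetical helper: collects every line a default Scanner reports.
    static List<String> readLines(String text) {
        List<String> lines = new ArrayList<>();
        // No delimiter or line pattern is configured explicitly here.
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // One \n, one \r and one \r\n ending in the same input.
        String mixed = "unix\nold-mac\rwindows\r\nlast";
        System.out.println(readLines(mixed)); // prints [unix, old-mac, windows, last]
    }
}
```

If this holds on your JDK, nothing has to be configured explicitly for the three endings in the question.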

Answered by Stephen C

It is not documented (in Java 1.6) but the JDK code uses this regex to match a line break:

"\r\n|[\n\r\u2028\u2029\u0085]"

Here's a link to the source code: http://cr.openjdk.java.net/~briangoetz/7012540/webrev/src/share/classes/java/util/Scanner.java.html

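To see what the regex itself does, independent of Scanner, the sketch below compiles the quoted pattern and splits a string with it. The class LinebreakPattern and the method splitLines are hypothetical names used only for this illustration. Note that "\r\n" is the first alternative, so a Windows line ending is consumed as one separator rather than two.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class LinebreakPattern {
    // The pattern string quoted in the answer, compiled as an ordinary regex.
    static final Pattern LINE_BREAK =
            Pattern.compile("\r\n|[\n\r\u2028\u2029\u0085]");

    // Hypothetical helper: splits text on any separator the pattern matches.
    static String[] splitLines(String text) {
        return LINE_BREAK.split(text);
    }

    public static void main(String[] args) {
        // "\r\n" is tried first, so no empty string appears between a and b.
        System.out.println(Arrays.toString(splitLines("a\r\nb\nc\rd")));
        // prints [a, b, c, d]
    }
}
```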

IMO, this ought to be specified, since Scanner's behavior with respect to line separators differs from (for example) BufferedReader's. (I've lodged a bug report ...)

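One way to observe the difference mentioned above is to run both classes over input that contains U+2028. BufferedReader.readLine is documented to end a line only at \n, \r, or \r\n, while the Scanner pattern quoted above also includes the Unicode separators. This is a sketch under that assumption; the class LineSeparatorComparison and its helper methods are illustrative names, not part of either answer.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class LineSeparatorComparison {
    // Reads lines with a default Scanner.
    static List<String> withScanner(String text) {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }

    // Reads lines with BufferedReader.readLine.
    static List<String> withBufferedReader(String text) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            // Not expected for an in-memory StringReader.
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // U+2028 (line separator) sits between the two words.
        String text = "first\u2028second";
        System.out.println("Scanner lines: " + withScanner(text).size());
        System.out.println("BufferedReader lines: " + withBufferedReader(text).size());
    }
}
```

With the pattern quoted above, the Scanner should report two lines and the BufferedReader one, so the two classes are not interchangeable for input that may contain Unicode separators.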