Original question: http://stackoverflow.com/questions/1981497/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Java Scanner question
Asked by Razvi
How do you set the delimiter for a Scanner to either ; or a new line?
I tried:
Scanner.useDelimiter(Pattern.compile("(\n)|;"));
But it doesn't work.
Answered by Powerlord
As a general rule, in patterns, you need to double the \.
So, try
Scanner.useDelimiter(Pattern.compile("(\\n)|;"));
or
Scanner.useDelimiter(Pattern.compile("[\\n;]"));
Edit: If \r\n is the problem, you might want to try this:
Scanner.useDelimiter(Pattern.compile("[\\r\\n;]+"));
which matches one or more of \r, \n, and ;.
Note: I haven't tried these.
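Since the answer itself is untested, here is a minimal sketch that exercises the last pattern with doubled backslashes. The class name, the `tokens` helper, and the sample input are made up for illustration. Note that `useDelimiter` is an instance method, so the sketch calls it on a Scanner object rather than on the class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

class SemicolonOrNewlineDemo {
    // Collects every token the scanner produces for the given delimiter regex.
    static List<String> tokens(String input, String delimiterRegex) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(Pattern.compile(delimiterRegex));
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical input mixing ';', "\r\n" and "\n" separators.
        String input = "a;b\r\nc\n;d";
        System.out.println(tokens(input, "[\\r\\n;]+")); // prints [a, b, c, d]
    }
}
```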
Answered by Alan Moore
As you've discovered, you needed to look for DOS/network style \r\n (CRLF) line separators instead of the Unix style \n (LF only). But what if the text contains both? That happens a lot; in fact, when I view the source of this very page I see both varieties.
You should get in the habit of looking for both kinds of separator, as well as the older Mac style \r (CR only). Here's one way to do that:
\r?\n|\r
Plugging that into your sample code you get:
scanner.useDelimiter(";|\r?\n|\r");
This is assuming you want to match exactly one newline or semicolon at a time. If you want to match one or more, you can do this instead:
scanner.useDelimiter("[;\r\n]+");
Notice, too, how I passed in a regex string instead of a Pattern; all regexes get cached automatically, so pre-compiling the regex doesn't get you any performance gain.
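To make the difference between the two delimiters concrete, the sketch below runs both against the same input, passing each regex as a plain string as the answer does. The class name, the `tokens` helper, and the sample input are assumptions made up for this example. It only shows tokenization and does not measure the caching claim.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class DelimiterComparison {
    // Returns every token Scanner yields for the given delimiter regex,
    // passed as a plain string via useDelimiter(String).
    static List<String> tokens(String input, String delimiterRegex) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(delimiterRegex);
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical input: CRLF, a doubled semicolon, a lone CR, and a lone LF.
        String input = "1\r\n2;3;;4\r5\n6";

        // Exactly one separator per match: the doubled ';' yields an empty token.
        System.out.println(tokens(input, ";|\r?\n|\r")); // prints [1, 2, 3, , 4, 5, 6]

        // One or more separators per match: runs collapse, so no empty tokens.
        System.out.println(tokens(input, "[;\r\n]+"));   // prints [1, 2, 3, 4, 5, 6]
    }
}
```

Which behavior is preferable depends on whether an empty field between two semicolons is meaningful in your data.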
Answered by Joshua McKinnon
Looking at the OP's comment, it looks like it was a different line ending (\r\n or CRLF) that was the problem.
Here's my answer, which would handle multiple semicolons and line endings in either format (may or may not be desired):
Scanner.useDelimiter(Pattern.compile("([\n;]|(\r\n))+"));
e.g. an input file that looks like this:
1
2;3;;4
5
would result in 1,2,3,4,5
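The example above can be reproduced with a short sketch. The class name and helper are hypothetical, and the sample string assumes the file uses \r\n after the first line and \n after the second, so that both formats are exercised.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

class MixedLineEndings {
    // The delimiter from the answer: runs of ';', "\n", or "\r\n".
    static final Pattern DELIMITER = Pattern.compile("([\n;]|(\r\n))+");

    // Collects all tokens found in the input using DELIMITER.
    static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(DELIMITER);
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The sample file contents, with assumed line endings.
        System.out.println(tokens("1\r\n2;3;;4\n5")); // prints [1, 2, 3, 4, 5]
    }
}
```

One thing to keep in mind: this pattern does not treat a lone \r as a separator, so input such as "4\r5" would come back as a single token.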
I tried both \n and \\n, and both worked in my case. I agree that if you need a literal backslash you have to double it, because it is an escape character. It just so happens that in this case "\n" produces the desired character with or without the extra '\'.
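The reason both spellings work can be shown directly. In "\n" the Java compiler already turns the escape into a single line-feed character, which the regex engine matches literally; in "\\n" the engine receives a backslash followed by n and interprets that as a line feed itself. The sketch below is only an illustration, with made-up names.

```java
import java.util.regex.Pattern;

class NewlineEscapeDemo {
    // The Java compiler turns "\n" into one line-feed character.
    static final String COMPILER_ESCAPE = "\n";
    // "\\n" is two characters, backslash and 'n'; the regex engine reads them as a line feed.
    static final String REGEX_ESCAPE = "\\n";

    // True if the regex matches a string consisting of a single line feed.
    static boolean matchesLineFeed(String regex) {
        return Pattern.compile(regex).matcher("\n").matches();
    }

    public static void main(String[] args) {
        System.out.println(COMPILER_ESCAPE.length());          // prints 1
        System.out.println(REGEX_ESCAPE.length());             // prints 2
        System.out.println(matchesLineFeed(COMPILER_ESCAPE));  // prints true
        System.out.println(matchesLineFeed(REGEX_ESCAPE));     // prints true
    }
}
```

This coincidence only holds for escapes that Java string literals and the regex syntax share, such as \n, \r, and \t. A regex-only escape such as \d has to be written "\\d", because "\d" is not a valid Java string escape.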