Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/14432290/
Java regular expression escaped commas
Asked by CodeKingPlusPlus
I have a CSV file that I would like to use the String split() method on. I want each element of the array returned by split() to be one of the comma-separated values in the CSV. However, there are other commas in the CSV file.
Fortunately, these other commas are escaped like '\,'
I am having trouble getting the right regex for the split() method. I want to split on commas that are not preceded by the escape character.
My current code is:
String[] columns = new String[CONST];
columns = someString.split("*^\,*");
To me this says: split by a comma but the character before the comma must not be the escape character. Any number of characters before or after the comma are allowed.
- How do I get the correct regular expression?
Answered by Adrian Shum
First, the comma has no special meaning in the position where you are using it, so you can omit the escape.
The biggest problem in your regex is that * on its own has no meaning. * means any number of occurrences of the previous token.
So the regex should be
.*,.*
(I think escaping the comma should still be fine: .*\,.* )
Then, as for usage: you are using the regex in String.split(). String.split() expects a regex that describes the delimiter, so you should pass only , as the regex. Using .*,.* as the "delimiter" will give you unexpected results (try it and see).
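To make the point about delimiters concrete, here is a minimal sketch comparing the two patterns. The class and method names are made up for illustration; only String.split() comes from the discussion above.

```java
import java.util.Arrays;

class SplitDelimiterDemo {
    // The regex describes only the delimiter, so every comma splits the line.
    static String[] splitOnComma(String line) {
        return line.split(",");
    }

    // The whole line matches ".*,.*", so the entire input is treated as one delimiter.
    static String[] splitOnWholeLinePattern(String line) {
        return line.split(".*,.*");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnComma("a,b,c")));            // [a, b, c]
        System.out.println(Arrays.toString(splitOnWholeLinePattern("a,b,c"))); // []
    }
}
```

With the whole-line pattern nothing is left on either side of the "delimiter", and split() drops trailing empty strings, so the result is an empty array.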
Answered by EngineerWithJava54321
Since I hit this page on a search, I will answer the question as stated and, for completeness, give the correct pattern:
columns = someString.split("[^\\\\],");
Note that you need 4 escape characters because you need 2 escape characters to create 1 escape character in a string. In other words, "\\" creates the string \ . So "\\\\" creates the string \\, which escapes the escape in the regex to create the char \ in the regex. Therefore you need 4 escape characters in a string to create one in a regex. The brackets and the caret are one way to express negation (specifically for a single character).
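The two levels of escaping can be checked directly. This is a small sketch with hypothetical names (BackslashLevels, matchesSingleBackslash); it only uses String.length() and Pattern.matches().

```java
import java.util.regex.Pattern;

class BackslashLevels {
    // Java source "\\\\" -> a 2-character string \\ -> a regex matching ONE literal backslash.
    static final String REGEX_FOR_ONE_BACKSLASH = "\\\\";

    // True only when the input is exactly one backslash character.
    static boolean matchesSingleBackslash(String input) {
        return Pattern.matches(REGEX_FOR_ONE_BACKSLASH, input);
    }

    public static void main(String[] args) {
        System.out.println("\\".length());                    // 1
        System.out.println(REGEX_FOR_ONE_BACKSLASH.length()); // 2
        System.out.println(matchesSingleBackslash("\\"));     // true
    }
}
```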
You can also surround CSV entries that you don't want to split with quotes. Then use the following solution: Java: splitting a comma-separated string but ignoring commas in quotes.
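For the quoted variant, one commonly used pattern is a lookahead that accepts a comma only when an even number of double quotes follows it. The sketch below assumes balanced quotes and no escaped quotes inside fields; the class and method names are illustrative, not taken from the linked answer.

```java
import java.util.Arrays;

class QuotedCsvSplit {
    // A comma followed by an even number of double quotes up to the end of the line,
    // i.e. a comma that sits outside any quoted field (assumes balanced quotes).
    static final String COMMA_OUTSIDE_QUOTES = ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    static String[] splitOutsideQuotes(String line) {
        // Limit -1 keeps trailing empty fields instead of dropping them.
        return line.split(COMMA_OUTSIDE_QUOTES, -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOutsideQuotes("a,\"b,c\",d"))); // [a, "b,c", d]
    }
}
```

The quotes stay in the returned fields, so a caller would still need to strip them.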
My personal preference would be to use split over a 3rd party parser because of the environment I code in.
Answered by user1133275
The correct way is to use a parser (to deal with \\, and \, and ,), but a simple regex can work:
jshell> "a,b".split("(?<!\\\\),")
==> String[2] { "a", "b" }
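The jshell line above uses input without any escaped comma. A sketch of a negative lookbehind on input that does contain one might look like this; the names are hypothetical, and it also shows why a parser is the more robust route.

```java
import java.util.Arrays;

class EscapedCommaSplit {
    // Regex (?<!\\), : a comma that is NOT immediately preceded by a backslash.
    static final String UNESCAPED_COMMA = "(?<!\\\\),";

    static String[] splitOnUnescapedCommas(String line) {
        // Limit -1 keeps trailing empty fields.
        return line.split(UNESCAPED_COMMA, -1);
    }

    public static void main(String[] args) {
        // Input text: a\,b,c  -> the escaped comma stays inside the first field.
        System.out.println(Arrays.toString(splitOnUnescapedCommas("a\\,b,c")));
        // Input text: a\\,b  -> the backslash itself is escaped, so the comma should
        // split, but the lookbehind only sees "backslash before comma" and does not.
        System.out.println(splitOnUnescapedCommas("a\\\\,b").length); // 1
    }
}
```

The backslash is left in the field (a\,b), so unescaping would be a separate step.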
How to test things that don't work:
jshell> "a,b".split("[^\\\\],")
==> String[2] { "", "b" }
and
jshell> "a,b".split("*^\\,*")
| java.util.regex.PatternSyntaxException thrown: Dangling meta character '*' near index 0
*^\,*
^
| at Pattern.error (Pattern.java:1997)
| at Pattern.sequence (Pattern.java:2172)
| at Pattern.expr (Pattern.java:2038)
| at Pattern.compile (Pattern.java:1760)
| at Pattern.<init> (Pattern.java:1409)
| at Pattern.compile (Pattern.java:1065)
| at String.split (String.java:2307)
| at String.split (String.java:2354)
| at (#6:1)
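For the parser route mentioned at the start of this answer, a minimal hand-written sketch could look like the following. It assumes one escape rule that the question does not spell out: a backslash makes the next character literal. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class EscapedCsvParser {
    // Assumed rule: a backslash makes the next character literal (so \, is a comma
    // inside a field and \\ is a single backslash); an unescaped comma ends the field.
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean escaped = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (escaped) {
                current.append(c);   // take the escaped character as-is
                escaped = false;
            } else if (c == '\\') {
                escaped = true;      // remember the escape, do not emit the backslash
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (escaped) {
            current.append('\\');    // a trailing lone backslash is kept literally
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parseLine("a\\,b,c"));  // [a,b, c]
        System.out.println(parseLine("a\\\\,b"));  // [a\, b]
    }
}
```

Unlike the regex approaches, this handles an escaped backslash directly before a comma, and it returns the fields already unescaped.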