Split strings in Java by words
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/20728050/
Asked by user2095165
How can I split the following string into an array of words?
That's the code
into
array
0 That
1 s
2 the
3 code
I tried something like this:
String str = "That's the code";
String[] strs = str.split("\'");
for (String sstr : strs) {
    System.out.println(sstr);
}
But the output is:
That
s the code
Accepted answer by Kevin Bowersox
To split specifically on whitespace and the apostrophe:
public class Split {
    public static void main(String[] args) {
        // "\\s" in the Java literal is the regex \s (any whitespace character)
        String[] tokens = "That's the code".split("[\\s']");
        for (String s : tokens) {
            System.out.println(s);
        }
    }
}
Or, to split on any non-word character:
public class Split {
    public static void main(String[] args) {
        // "\\W" in the Java literal is the regex \W (any non-word character)
        String[] tokens = "That's the code".split("[\\W]");
        for (String s : tokens) {
            System.out.println(s);
        }
    }
}
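The two patterns above give the same result for the question's input, but they differ on other punctuation. Here is a minimal sketch of that difference; the class name, method names and the sample string "That's the code-base" are made up for illustration and are not from the original answer.

```java
import java.util.Arrays;

public class SeparatorDemo {
    // Splits only on whitespace and the apostrophe.
    static String[] splitOnSpaceAndApostrophe(String s) {
        return s.split("[\\s']");
    }

    // Splits on any single non-word character (anything outside [a-zA-Z_0-9]).
    static String[] splitOnNonWord(String s) {
        return s.split("\\W");
    }

    public static void main(String[] args) {
        String input = "That's the code-base"; // hypothetical sample input
        // The hyphen is kept by the first pattern and treated as a separator by the second.
        System.out.println(Arrays.toString(splitOnSpaceAndApostrophe(input))); // [That, s, the, code-base]
        System.out.println(Arrays.toString(splitOnNonWord(input)));            // [That, s, the, code, base]
    }
}
```

Which one you want depends on whether characters such as hyphens should count as part of a word.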
Answer by Maroun
You can split on non-word characters:
String str = "That's the code";
String[] splitted = str.split("[\\W]");
For your input, the output will be:
That
s
the
code
Answer by Szymon
You can split with a regex that matches either of two kinds of character, a quote or whitespace:
String[] strs = str.split("['\\s]");
Answer by Pshemo
split uses regex, and in regex ' is not a special character, so you don't need to escape it with \. To represent whitespace you can use \s (which in a Java String needs to be written as "\\s"). Also, to create a set of characters you can use the "OR" operator |, like a|b|c|d, or just use a character class [abcd], which means exactly the same as (a|b|c|d).
To keep things simple, you can use
String[] strs = str.split("'| ");
or
String[] strs = str.split("'|\\s"); // to include all whitespace characters
or
String[] strs = str.split("['\\s]"); // equivalent to "'|\\s"
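To make the comparison between these three patterns concrete, here is a small sketch. The class name, method names and the tab-containing sample string are assumptions for the demo, not part of the original answer. It checks that the alternation and the character class agree, and that the plain-space version misses a tab.

```java
import java.util.Arrays;

public class AlternationDemo {
    // Apostrophe or a literal space only.
    static String[] splitOnQuoteOrSpace(String s) {
        return s.split("'| ");
    }

    // Apostrophe or any whitespace, written as an alternation.
    static String[] splitWithAlternation(String s) {
        return s.split("'|\\s");
    }

    // Apostrophe or any whitespace, written as a character class.
    static String[] splitWithCharClass(String s) {
        return s.split("['\\s]");
    }

    public static void main(String[] args) {
        String input = "That's\tthe code"; // hypothetical input: a tab after "s"
        // The alternation and the character class produce identical arrays.
        System.out.println(Arrays.equals(splitWithAlternation(input), splitWithCharClass(input))); // true
        // The literal-space pattern does not split on the tab, so it yields one token fewer.
        System.out.println(splitOnQuoteOrSpace(input).length);  // 3
        System.out.println(splitWithAlternation(input).length); // 4
    }
}
```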
Answer by Tareq Salah
If you want to split on non-alphabetic characters:
String str = "That's the code";
String[] strs = str.split("\\P{Alpha}+");
for (String sstr : strs) {
    System.out.println(sstr);
}
\P{Alpha} matches any non-alphabetic character; Alpha is one of the POSIX character classes, which are very useful. The + indicates that we should split on any continuous run of such characters.
The output will be:
That
s
the
code
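The effect of the + is easier to see on input with consecutive separators. The following is only a sketch: the class name, method names and the sample string (two spaces in a row, plus a comma followed by a space) are made up for illustration.

```java
import java.util.Arrays;

public class QuantifierDemo {
    // Splits on every single non-alphabetic character.
    static String[] splitSingle(String s) {
        return s.split("\\P{Alpha}");
    }

    // Splits on each run of one or more non-alphabetic characters.
    static String[] splitRuns(String s) {
        return s.split("\\P{Alpha}+");
    }

    public static void main(String[] args) {
        String input = "That's  the, code"; // hypothetical input with adjacent separators
        // Without +, an empty string appears between two adjacent separators.
        System.out.println(Arrays.toString(splitSingle(input))); // [That, s, , the, , code]
        // With +, each run of separators counts as one split point.
        System.out.println(Arrays.toString(splitRuns(input)));   // [That, s, the, code]
    }
}
```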
Answer by umanganiello
You should first replace the ' with " " (blank space), using str.replaceAll("'", " "), and then you can split the string on the blank space separator, using str.split(" "). You could alternatively use a regular expression to split on ' OR space.
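The replace-then-split approach described in this answer could look roughly like the sketch below. The class and method names are hypothetical. Only replaceAll and split come from the answer.

```java
import java.util.Arrays;

public class ReplaceThenSplit {
    // Turns every apostrophe into a space, then splits on single spaces.
    static String[] splitWords(String str) {
        String normalized = str.replaceAll("'", " "); // "That's the code" -> "That s the code"
        return normalized.split(" ");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitWords("That's the code"))); // [That, s, the, code]
    }
}
```

Since the apostrophe has no special meaning in a regex, str.replace("'", " ") would work here as well.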
Answer by Keerthivasan
You can use OR (|) in the regular expression:
public static void main(String[] args) {
    String str = "That's the code";
    // Split on an apostrophe OR any whitespace character
    String[] strs = str.split("'|\\s");
    for (String sstr : strs) {
        System.out.println(sstr);
    }
}
The string will be split on a single quote (') or a whitespace character. The single quote doesn't need to be escaped. The output would be:
run:
That
s
the
code
BUILD SUCCESSFUL (total time: 0 seconds)
Answer by Pierre C
The best solution I've found for splitting by words when your string contains accented letters is:
String[] listeMots = phrase.split("\\P{L}+");
For instance, if your String is
String phrase = "Salut mon homme, comment ça va aujourd'hui? Ce sera Noël puis Paques bientôt.";
Then you will get the following words (enclosed in quotes and comma-separated for clarity):
"Salut", "mon", "homme", "comment", "ça", "va", "aujourd", "hui", "Ce",
"sera", "Noël", "puis", "Paques", "bientôt".
Hope this helps!
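To show why \P{L} matters for accented text, here is a small comparison with the \W pattern used in other answers. It is a sketch only: the class name, method names and the shortened sample phrase are assumptions, and the accented letters are written as Unicode escapes so the source file's encoding does not affect the result.

```java
import java.util.Arrays;

public class AccentSplitDemo {
    // By default \W means "not in [a-zA-Z_0-9]", so accented letters count as separators.
    static String[] splitOnNonWord(String s) {
        return s.split("\\W+");
    }

    // \P{L} means "not a Unicode letter", so accented letters stay inside words.
    static String[] splitOnNonLetters(String s) {
        return s.split("\\P{L}+");
    }

    public static void main(String[] args) {
        // Hypothetical sample: "Ce sera Noël bientôt"
        String phrase = "Ce sera No\u00EBl bient\u00F4t";
        System.out.println(Arrays.toString(splitOnNonWord(phrase)));    // words are cut at the accented letters
        System.out.println(Arrays.toString(splitOnNonLetters(phrase))); // accented words stay whole
    }
}
```

With the first pattern the accented words are cut into pieces such as "No" and "l", while the second keeps each word in one piece.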