Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original: http://stackoverflow.com/questions/20728050/

Time: 2020-08-13 03:49:13  Source: igfitidea

Split Strings in java by words

java

Asked by user2095165

How can I split the following string into an array?

That's the code

into

array
0 That
1 s
2 the
3 code

I tried something like this:

String str = "That's the code";
String[] strs = str.split("\'");
for (String sstr : strs) {
    System.out.println(sstr);
}

But the output is:

That
s the code

Accepted answer by Kevin Bowersox

To split specifically on whitespace and the apostrophe:

public class Split {
    public static void main(String[] args) {
        String[] tokens = "That's the code".split("[\\s']");
        for(String s:tokens){
            System.out.println(s);
        }
    }
}

Or, to split on any non-word character:

public class Split {
    public static void main(String[] args) {
        String[] tokens = "That's the code".split("[\\W]");
        for(String s:tokens){
            System.out.println(s);
        }
    }
}
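
The two patterns above give the same result on the sample sentence but can differ on other input. Here is a minimal sketch of where they diverge, assuming a hypothetical class name `SplitCompare` and a sample sentence that is not from the original question. The character class `[\s']` leaves a hyphen in place, while `\W` also splits on it.

```java
import java.util.Arrays;

// Hypothetical demo class; the sample sentence is an assumption for illustration.
class SplitCompare {
    // Split only on whitespace characters and the apostrophe.
    static String[] onSpaceOrApostrophe(String text) {
        return text.split("[\\s']");
    }

    // Split on any character that is not an ASCII letter, digit, or underscore.
    static String[] onNonWord(String text) {
        return text.split("\\W");
    }

    public static void main(String[] args) {
        String text = "it's a well-known fact";
        // prints [it, s, a, well-known, fact]
        System.out.println(Arrays.toString(onSpaceOrApostrophe(text)));
        // prints [it, s, a, well, known, fact]
        System.out.println(Arrays.toString(onNonWord(text)));
    }
}
```

Which one fits depends on whether characters such as hyphens should count as part of a word.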

Answer by Maroun

You can split on non-word characters:

String str = "That's the code";
String[] splitted = str.split("[\\W]");

For your input, the output will be:

That
s
the
code

Answer by Szymon

You can split by a regex that matches either of two kinds of character, an apostrophe or a whitespace character:

String[] strs = str.split("['\\s]");

Answer by Pshemo

split uses regex, and in regex ' is not a special character, so you don't need to escape it with \. To represent whitespace you can use \s (which in a Java String literal needs to be written as "\\s"). To match any one of several characters you can use the "OR" operator |, as in a|b|c|d, or just use a character class [abcd], which matches exactly the same thing as (a|b|c|d).

To keep things simple you can use

String[] strs = str.split("'| ");

or

String[] strs = str.split("'|\\s"); // to include all whitespace characters

or

String[] strs = str.split("['\\s]"); // equivalent of "'|\\s"
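
To make the difference between these variants concrete, here is a small sketch, assuming a hypothetical class `SplitAlternation` and an input containing a tab that is not part of the original question. The literal-space pattern does not split on the tab, while the two `\s` forms do and agree with each other.

```java
import java.util.Arrays;

// Hypothetical demo class; the tabbed input is an assumption for illustration.
class SplitAlternation {
    static final String SPACE_ONLY = "'| ";      // apostrophe OR a literal space
    static final String ALTERNATION = "'|\\s";   // apostrophe OR any whitespace
    static final String CHAR_CLASS = "['\\s]";   // same set written as a character class

    public static void main(String[] args) {
        String tabbed = "That's\tthe code";
        // The tab is not a literal space, so "s" and "the" stay in one token here.
        System.out.println(tabbed.split(SPACE_ONLY).length);   // prints 3
        // \s covers the tab as well, so both forms yield four tokens.
        System.out.println(tabbed.split(ALTERNATION).length);  // prints 4
        System.out.println(Arrays.equals(tabbed.split(ALTERNATION),
                                         tabbed.split(CHAR_CLASS))); // prints true
    }
}
```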

Answer by Tareq Salah

If you want to split on non-alphabetic characters:

String str = "That's the code";
String[] strs = str.split("\\P{Alpha}+");
for (String sstr : strs) {
    System.out.println(sstr);
}

\P{Alpha} matches any non-alphabetic character; it is the negation of the POSIX character class \p{Alpha}, and such classes are very useful. The + means that a run of consecutive non-alphabetic characters is treated as a single delimiter.

The output will be:

That
s
the
code
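
The effect of the + is easier to see on input with adjacent delimiters. Here is a minimal sketch, assuming a hypothetical class `SplitRuns` and a sample string with a double space and a comma that is not from the original question. Without the +, each delimiter splits on its own and empty strings appear between neighbouring delimiters.

```java
import java.util.Arrays;

// Hypothetical demo class; the input with repeated delimiters is an assumption.
class SplitRuns {
    // Each non-alphabetic character is its own delimiter.
    static String[] withoutPlus(String text) {
        return text.split("\\P{Alpha}");
    }

    // A run of non-alphabetic characters counts as one delimiter.
    static String[] withPlus(String text) {
        return text.split("\\P{Alpha}+");
    }

    public static void main(String[] args) {
        String text = "That's  the, code"; // two spaces, then a comma followed by a space
        // prints [That, s, , the, , code]
        System.out.println(Arrays.toString(withoutPlus(text)));
        // prints [That, s, the, code]
        System.out.println(Arrays.toString(withPlus(text)));
    }
}
```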

Answer by umanganiello

You should first replace the ' with " " (a blank space) using str.replaceAll("'", " "), and then split the string on the space separator using str.split(" "). Alternatively, you could use a regular expression to split on ' OR space.
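
The replace-then-split approach can be sketched as follows, assuming a hypothetical class `ReplaceThenSplit` and helper name `splitWords`. The second input is not from the original question; it is there to show a limitation of splitting on a single space.

```java
import java.util.Arrays;

// Hypothetical demo class and helper name, for illustration only.
class ReplaceThenSplit {
    // Turn every apostrophe into a space, then split on single spaces.
    static String[] splitWords(String str) {
        return str.replaceAll("'", " ").split(" ");
    }

    public static void main(String[] args) {
        // prints [That, s, the, code]
        System.out.println(Arrays.toString(splitWords("That's the code")));
        // An apostrophe next to a space becomes two spaces, which leaves empty tokens:
        // prints [rock, , n, , roll]
        System.out.println(Arrays.toString(splitWords("rock 'n' roll")));
    }
}
```

If empty tokens are unwanted, splitting on `" +"` instead of `" "` is one way to avoid them.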

Answer by Keerthivasan

You can use OR in a regular expression:

public static void main(String[] args) {
    String str = "That's the code";
    String[] strs = str.split("'|\\s");
    for (String sstr : strs) {
        System.out.println(sstr);
    }
}

The string will be split on a single quote (') or a whitespace character. The single quote doesn't need to be escaped. The output would be:

run:
That
s
the
code
BUILD SUCCESSFUL (total time: 0 seconds)

Answer by Pierre C

The best solution I've found for splitting into words when the string contains accented letters is:

String[] listeMots = phrase.split("\\P{L}+");

For instance, if your String is

String phrase = "Salut mon homme, comment ça va aujourd'hui? Ce sera Noël puis Paques bientôt.";

Then you will get the following words (enclosed in quotes and comma-separated for clarity):

"Salut", "mon", "homme", "comment", "ça", "va", "aujourd", "hui", "Ce",
"sera", "Noël", "puis", "Paques", "bientôt".

Hope this helps!
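
To show why \P{L} matters for accented text, here is a short sketch, assuming a hypothetical class `SplitAccented` and a shortened two-word input. By default, Java's \W treats only ASCII letters, digits, and underscore as word characters, so it cuts words at accented letters, whereas \P{L} splits only on characters that are not Unicode letters. The accented letters are written as Unicode escapes so the example does not depend on the source file encoding.

```java
import java.util.Arrays;

// Hypothetical demo class; the short input is an assumption for illustration.
class SplitAccented {
    // ASCII-only notion of a word character (default behaviour of \W).
    static String[] onNonWord(String text) {
        return text.split("\\W+");
    }

    // Unicode-aware: anything that is not a letter is a delimiter.
    static String[] onNonLetter(String text) {
        return text.split("\\P{L}+");
    }

    public static void main(String[] args) {
        String text = "No\u00EBl bient\u00F4t"; // "Noël bientôt"
        // The accented letters act as delimiters: [No, l, bient, t]
        System.out.println(Arrays.toString(onNonWord(text)));
        // The words stay whole: two tokens, "Noël" and "bientôt"
        System.out.println(Arrays.toString(onNonLetter(text)));
    }
}
```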
