Java - Split String by Number and Letters

Notice: this page is a translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/36423633/

Time: 2020-11-03 01:26:12  Source: igfitidea

java regex string split

Asked by Azazel

So I have, for example, a string such as this C3H20IO

What I wanna do is split this string so I get the following:

Array1 = {C,H,I,O}
Array2 = {3,20,1,1}

The 1 as the third element of Array2 indicates that the I element appears as a single atom. Same for O. That is actually the part I am struggling with.

This is a chemical formula, so I need to separate the elements according to their names and the number of atoms of each.

Accepted answer by Maljam

You could try this approach:

String formula = "C3H20IO";

//insert "1" at each atom-atom boundary and after a trailing atom
formula = formula.replaceAll("(?<=[A-Z])(?=[A-Z])|(?<=[a-z])(?=[A-Z])|(?<=\\D)$", "1");

//split at each letter-digit or digit-letter boundary
String regex = "(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)";
String[] atoms = formula.split(regex);

Output:

atoms: [C, 3, H, 20, I, 1, O, 1]

Now all even indices (0, 2, 4...) hold atoms and the odd ones hold the associated counts:

String[] a = new String[ atoms.length/2 ];
int[] n = new int[ atoms.length/2 ];

for(int i = 0 ; i < a.length ; i++) {
    a[i] = atoms[i*2];
    n[i] = Integer.parseInt(atoms[i*2+1]);
}

Output:

a: [C, H, I, O]
n: [3, 20, 1, 1]
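The two regex steps above can be wrapped into one helper for experimenting with other inputs. This is only a sketch: the class name `FormulaSplit` and the method `tokens` are made up for illustration, and it assumes every element symbol is one capital letter optionally followed by lowercase letters.

```java
import java.util.Arrays;

class FormulaSplit {
    /** Returns tokens alternating symbol, count, symbol, count, ... (hypothetical helper). */
    static String[] tokens(String formula) {
        // Make implicit counts explicit: insert "1" between two adjacent symbols
        // and after a symbol that ends the string.
        String explicit = formula.replaceAll(
                "(?<=[A-Z])(?=[A-Z])|(?<=[a-z])(?=[A-Z])|(?<=\\D)$", "1");
        // Split wherever a non-digit meets a digit, in either order.
        return explicit.split("(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokens("C3H20IO")));   // [C, 3, H, 20, I, 1, O, 1]
        System.out.println(Arrays.toString(tokens("C3H20ZnO2"))); // [C, 3, H, 20, Zn, 1, O, 2]
    }
}
```

The second call suggests that two-letter symbols such as Zn also come out as a single token, because the lowercase-to-uppercase alternative in the first regex covers that boundary.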

Answer by Alexander

You can use a regular expression to slide over your input using the Matcher.find() method.

Here is a rough example of what it may look like:

    String input = "C3H20IO";

    List<String> array1 = new ArrayList<String>();
    List<Integer> array2 = new ArrayList<Integer>();

    Pattern pattern = Pattern.compile("([A-Z][a-z]*)([0-9]*)");
    Matcher matcher = pattern.matcher(input);               
    while(matcher.find()){
        array1.add(matcher.group(1));

        String atomAmount = matcher.group(2);
        int atomAmountInt = 1;
        if((atomAmount != null) && (!atomAmount.isEmpty())){
            atomAmountInt = Integer.valueOf(atomAmount);
        }
        array2.add(atomAmountInt);
    }

I know, the conversion from List to Array is missing, but it should give you an idea of how to approach your problem.
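For completeness, here is one way the missing List-to-array step could look. It is a sketch, not the answerer's code: the class `FormulaArrays` and the methods `symbols` and `counts` are names invented here, and the same pattern as above is assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FormulaArrays {
    // One element symbol (capital plus optional lowercase letters), then an optional count.
    private static final Pattern ATOM = Pattern.compile("([A-Z][a-z]*)([0-9]*)");

    /** Returns the element symbols in order of appearance. */
    static String[] symbols(String formula) {
        List<String> result = new ArrayList<>();
        Matcher m = ATOM.matcher(formula);
        while (m.find()) {
            result.add(m.group(1));
        }
        // List<String> -> String[]
        return result.toArray(new String[0]);
    }

    /** Returns the atom counts, using 1 where no digits follow the symbol. */
    static int[] counts(String formula) {
        List<Integer> result = new ArrayList<>();
        Matcher m = ATOM.matcher(formula);
        while (m.find()) {
            String digits = m.group(2);
            result.add(digits.isEmpty() ? 1 : Integer.parseInt(digits));
        }
        // List<Integer> -> int[] (unboxing each value)
        return result.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(symbols("C3H20IO"))); // [C, H, I, O]
        System.out.println(Arrays.toString(counts("C3H20IO")));  // [3, 20, 1, 1]
    }
}
```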

Answer by mmuzahid

An approach without regex, with the data stored in ArrayLists:

String s = "C3H20IO";

char chem = '-';
String val = "";
boolean isFirst = true;
List<Character> chemList = new ArrayList<Character>();
List<Integer> weightList = new ArrayList<Integer>();
// Each letter is treated as its own element, so two-letter symbols such as "Fe" are not supported
for (char c : s.toCharArray()) {
    if (Character.isLetter(c)) {
        if (!isFirst) {
            // a new letter closes the previous element; no digits means a count of 1
            chemList.add(chem);
            weightList.add(Integer.valueOf(val.equals("") ? "1" : val));
            val = "";
        }
        chem = c;
    } else if (Character.isDigit(c)) {
        val += c;
    }
    isFirst = false;
}
// store the last element, which no following letter has closed
chemList.add(chem);
weightList.add(Integer.valueOf(val.equals("") ? "1" : val));

System.out.println(chemList);
System.out.println(weightList);

OUTPUT:

[C, H, I, O]
[3, 20, 1, 1]
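The loop above keeps each element in a single char. If two-letter symbols are needed, the same regex-free idea might be extended as sketched below. The class `FormulaScanner` and its methods are hypothetical names, and the sketch assumes that a capital letter always starts a new element and that lowercase letters belong to the element before them.

```java
import java.util.ArrayList;
import java.util.List;

class FormulaScanner {
    /** Fills symbols and counts by scanning the formula once, without regex. */
    static void scan(String formula, List<String> symbols, List<Integer> counts) {
        StringBuilder symbol = new StringBuilder();
        StringBuilder digits = new StringBuilder();
        for (char c : formula.toCharArray()) {
            if (Character.isUpperCase(c)) {
                // A capital letter starts a new element: store the previous one first.
                flush(symbol, digits, symbols, counts);
                symbol.append(c);
            } else if (Character.isLowerCase(c)) {
                symbol.append(c);
            } else if (Character.isDigit(c)) {
                digits.append(c);
            }
        }
        // Store the element that was still open when the input ended.
        flush(symbol, digits, symbols, counts);
    }

    private static void flush(StringBuilder symbol, StringBuilder digits,
                              List<String> symbols, List<Integer> counts) {
        if (symbol.length() == 0) {
            return; // nothing collected yet
        }
        symbols.add(symbol.toString());
        counts.add(digits.length() == 0 ? 1 : Integer.parseInt(digits.toString()));
        symbol.setLength(0);
        digits.setLength(0);
    }

    public static void main(String[] args) {
        List<String> symbols = new ArrayList<>();
        List<Integer> counts = new ArrayList<>();
        scan("C3H20ZnO2", symbols, counts);
        System.out.println(symbols); // [C, H, Zn, O]
        System.out.println(counts);  // [3, 20, 1, 2]
    }
}
```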

Answer by anaxin

This works assuming each element starts with a capital letter, i.e. if you have "Fe" you don't represent it in String as "FE". Basically, you split the string on each capital letter then split each new string by letters and numbers, adding "1" if the new split contains no numbers.

        String s = "C3H20IO";
        List<String> letters = new ArrayList<>();
        List<String> numbers = new ArrayList<>();

        String[] arr = s.split("(?=\\p{Upper})");  // [C3, H20, I, O]
        for (String str : arr) {  //[C, 3]:[H, 20]:[I]:[O]
            String[] temp = str.split("(?=\\d)", 2);
            letters.add(temp[0]);
            if (temp.length == 1) {
                numbers.add("1");
            } else {
                numbers.add(temp[1]);
            }
        }
        System.out.println(letters); //[C, H, I, O]
        System.out.println(numbers); //[3, 20, 1, 1]

Answer by Shree Krishna

I did it as follows:

ArrayList<Integer> integerCharacters = new ArrayList<>();
ArrayList<String> stringCharacters = new ArrayList<>();

String value = "C3H20IO"; //Your value
//Split numeric and strings; adjacent letters such as "IO" stay in one piece
String[] strSplitted = value.split("(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)");

for(int i=0; i<strSplitted.length; i++){

    if (Character.isLetter(strSplitted[i].charAt(0))){
        stringCharacters.add(strSplitted[i]); //If string then add to strings array
    }
    else{
        integerCharacters.add(Integer.parseInt(strSplitted[i])); //else add to integer array
    }
}

Answer by rock321987

Is this good? (Not using split)

String line = "C3H20ZnO2ABCD";
String pattern = "([A-Z][a-z]*)(((?=[A-Z][a-z]*|$))|\\d+)";

Pattern r = Pattern.compile(pattern);

Matcher m = r.matcher(line);

while (m.find()) {
    System.out.print(m.group(1));
    if (m.group(2).length() == 0) {
        System.out.println(" 1");
    } else {
        System.out.println(" " + m.group(2));
    }
}

Answer by Thesoham24

Make a for loop over the length of the input and add the following conditions:

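// "input" stands for the formula string (name assumed here)
for (int i = 0; i < input.length(); i++) {
    char c = input.charAt(i);
    if (Character.isDigit(c)) {
        // add it to the number array
    }
    if (Character.isLetter(c)) {
        // add it to the character array
    }
}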

Answer by Alex Salauyou

I suggest splitting by uppercase letter using zero-width lookahead regex (to extract items like C12, O2, Si), then split each item into element and its numeric weight:

List<String> elements = new ArrayList<>();
List<Integer> weights = new ArrayList<>();

String[] items = "C6H12Si6OH".split("(?=[A-Z])");  // [C6, H12, Si6, O, H]
for (String item : items) {
    String[] pair = item.split("(?=[0-9])", 2);    // e.g. H12 => [H, 12], O => [O]
    elements.add(pair[0]);
    weights.add(pair.length > 1 ? Integer.parseInt(pair[1]) : 1);
}
System.out.println(elements);  // [C, H, Si, O, H]
System.out.println(weights);   // [6, 12, 6, 1, 1]
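In the example above H appears twice (H12 and a trailing H). If per-element totals are wanted, which the question does not ask for, the same two splits could feed a map instead of two lists. A minimal sketch, with `FormulaTotals` and `totals` as made-up names:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FormulaTotals {
    /** Sums the weights of repeated elements, keeping first-appearance order. */
    static Map<String, Integer> totals(String formula) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String item : formula.split("(?=[A-Z])")) {     // e.g. [C6, H12, Si6, O, H]
            String[] pair = item.split("(?=[0-9])", 2);      // e.g. H12 => [H, 12], O => [O]
            int weight = pair.length > 1 ? Integer.parseInt(pair[1]) : 1;
            // Add to the running total for this element, starting from the weight itself.
            result.merge(pair[0], weight, Integer::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(totals("C6H12Si6OH")); // {C=6, H=13, Si=6, O=1}
    }
}
```

A LinkedHashMap is used here only so the printed order follows the formula; a plain HashMap would work if order does not matter.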

Answer by abyversin

You can use two patterns:

  • [0-9]
  • [a-zA-Z]

Split the string twice, once by each of them.

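String test = "C3H20O"; // example input (assumed; the original snippet does not define test)
// Runs of digits separate the letters: [C, H, O]
List<String> letters = Arrays.asList(test.split("[0-9]+"));
// Runs of letters separate the numbers; drop the empty leading piece: [3, 20]
List<String> numbers = Arrays.stream(test.split("[a-zA-Z]+"))
            .filter(s -> !s.equals(""))
            .collect(Collectors.toCollection(ArrayList::new));

// Only a missing count on the last element is covered; adjacent symbols such as "IO" are not separated
if (letters.size() != numbers.size()) {
    numbers.add("1");
}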

Answer by Karthika

You can split the string by using a regular expression like (?<=\D)(?=\d). Try this:

String alphanum= "abcd1234";
String[] part = alphanum.split("(?<=\\D)(?=\\d)");
System.out.println(part[0]);
System.out.println(part[1]);

will output

abcd
1234