Notice: this page reproduces a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/2356251/

string tokenizer in Java

java, string, token

Asked by ASD

I have a text file that contains data separated by '|'. I need to get each field (separated by '|') and process it. A line of the text file looks like this:

ABC|DEF||FGHT

I am using StringTokenizer (JDK 1.4) to get each field value. The problem is that I should get an empty string after DEF. However, I am not getting the empty field between DEF and FGHT.

My result should be ABC,DEF,"",FGHT but I am getting ABC,DEF,FGHT.

Accepted answer by Desintegr

From the StringTokenizer documentation:

StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.

The following code should work:

String s = "ABC|DEF||FGHT";
// '|' is a regex metacharacter, so it must be escaped as "\\|" in a Java string literal
String[] r = s.split("\\|");
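
As a minimal sketch of how this behaves on the sample line, the class below splits it and prints the fields. The names `SplitDemo` and `fields` are made up for illustration and do not come from the answer. One point worth flagging: the one-argument `split` discards trailing empty strings, so the sketch passes a negative limit on the assumption that empty fields at the end of a line matter too.

```java
import java.util.Arrays;

// Illustrative sketch; SplitDemo and fields() are hypothetical names, not from the answer.
class SplitDemo {

    // Splits on a literal '|'. The limit -1 keeps trailing empty fields,
    // which the one-argument split(String) would discard.
    static String[] fields(String line) {
        return line.split("\\|", -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(fields("ABC|DEF||FGHT"))); // [ABC, DEF, , FGHT]
        System.out.println(fields("ABC|DEF||").length);               // 4
        System.out.println("ABC|DEF||".split("\\|").length);          // 2
    }
}
```

If trailing empty fields can never occur in the file, the plain `s.split("\\|")` from the answer is enough.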

Answer by Omry Yadan

You can use the constructor that takes an extra 'returnDelims' boolean and pass true to it. This way you will receive the delimiters as tokens, which allows you to detect this condition.

Alternatively, you can implement your own string tokenizer that does what you need; it's not that hard.
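
To give a rough idea of what "your own tokenizer" could look like, here is a minimal sketch that walks the string with `indexOf` and keeps empty fields. The class name `SimpleTokenizer` and the method `tokenize` are hypothetical, and the sketch assumes a single-character delimiter. It also uses generics, so it needs Java 5 or later; on JDK 1.4 a raw `List` would have to be used instead.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; SimpleTokenizer is a hypothetical name. Uses generics (Java 5+).
class SimpleTokenizer {

    // Splits text on a single delimiter character, keeping empty fields.
    static List<String> tokenize(String text, char delim) {
        List<String> tokens = new ArrayList<String>();
        int start = 0;
        int end;
        while ((end = text.indexOf(delim, start)) != -1) {
            tokens.add(text.substring(start, end)); // "" when two delimiters are adjacent
            start = end + 1;
        }
        tokens.add(text.substring(start)); // last field, possibly empty
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("ABC|DEF||FGHT", '|')); // [ABC, DEF, , FGHT]
    }
}
```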

Answer by Ryan Emerle

StringTokenizer ignores empty elements. Consider using String.split, which is also available in 1.4.

From the javadocs:

StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.
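
The difference this answer describes can be seen by counting fields both ways. The following is only a small sketch; the class `EmptyFieldDemo` and its two helper methods are names invented for the illustration.

```java
import java.util.StringTokenizer;

// Illustrative sketch; EmptyFieldDemo and its methods are hypothetical names.
class EmptyFieldDemo {

    // Number of tokens StringTokenizer reports (empty fields are skipped).
    static int tokenizerCount(String line) {
        return new StringTokenizer(line, "|").countTokens();
    }

    // Number of fields String.split reports (empty fields in the middle are kept).
    static int splitCount(String line) {
        return line.split("\\|").length;
    }

    public static void main(String[] args) {
        String line = "ABC|DEF||FGHT";
        System.out.println(tokenizerCount(line)); // 3
        System.out.println(splitCount(line));     // 4
    }
}
```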

Answer by sfussenegger

Use the returnDelims flag and check for two subsequent occurrences of the delimiter:

String str = "ABC|DEF||FGHT";
String delim = "|";
StringTokenizer tok = new StringTokenizer(str, delim, true);

boolean expectDelim = false;
while (tok.hasMoreTokens()) {
    String token = tok.nextToken();
    if (delim.equals(token)) {
        if (expectDelim) {
            expectDelim = false;
            continue;
        } else {
            // unexpected delim means empty token
            token = null;
        }
    }

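    System.out.println(token);
    // a delimiter reported as an empty token (null) has already been consumed,
    // so only expect a delimiter after a real token; this also handles "|||"
    expectDelim = (token != null);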
}

This prints:

ABC
DEF
null
FGHT

The API isn't pretty and is therefore considered legacy (i.e. "almost obsolete"). Use it only where pattern matching is too expensive (which should only be the case for extremely long strings) or where an API expects an Enumeration.

If you switch to String.split(String), make sure to quote the delimiter, either manually ("\\|") or automatically using string.split(Pattern.quote(delim)).
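
A short sketch of why the quoting matters: `split` treats its argument as a regular expression, so an unquoted `|` or `.` does not act as a literal delimiter. The class `QuoteDemo` and the helper `splitLiteral` are hypothetical names used only for this illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Illustrative sketch; QuoteDemo and splitLiteral() are hypothetical names.
class QuoteDemo {

    // Treats delim as literal text, whatever regex metacharacters it contains.
    static String[] splitLiteral(String text, String delim) {
        return text.split(Pattern.quote(delim));
    }

    public static void main(String[] args) {
        String str = "ABC|DEF||FGHT";

        // Quoted automatically and escaped manually: both give the four fields.
        System.out.println(Arrays.toString(splitLiteral(str, "|"))); // [ABC, DEF, , FGHT]
        System.out.println(Arrays.toString(str.split("\\|")));       // [ABC, DEF, , FGHT]

        // Unquoted, "|" is an empty regex alternation and does not yield the four fields.
        System.out.println(str.split("|").length == 4);              // false

        // The same applies to other metacharacters such as '.'.
        System.out.println(Arrays.toString(splitLiteral("a.b", "."))); // [a, b]
    }
}
```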

Answer by Ashik ali

package com.java.String;

import java.util.StringTokenizer;

public class StringWordReverse {

    public static void main(String[] kam) {
        String s;
        String sReversed = "";
        System.out.println("Enter a string to reverse");
        s = "THIS IS ASHIK SKLAB";
        StringTokenizer st = new StringTokenizer(s);


        while (st.hasMoreTokens()) {
            sReversed = st.nextToken() + " " + sReversed;
        }

        System.out.println("Original string is : " + s);
        System.out.println("Reversed string is : " + sReversed);

    }
}

Output:

Enter a string to reverse
Original string is : THIS IS ASHIK SKLAB
Reversed string is : SKLAB ASHIK IS THIS

Answer by Hariharan Sathya Narayanan

Here is another way to solve this problem:

String str = "ABC|DEF||FGHT";
StringTokenizer s = new StringTokenizer(str, "|", true);
String currentToken = "", previousToken = "";

while (s.hasMoreTokens()) {
    // Get the current token from the tokenized string
    currentToken = s.nextToken();

    // Check for the empty token in between ||
    if (currentToken.equals("|") && previousToken.equals("|")) {
        // We denote the empty token by printing null on the screen
        System.out.println("null");
    } else {
        // We only print the tokens, not the delimiters
        if (!currentToken.equals("|"))
            System.out.println(currentToken);
    }

    previousToken = currentToken;
}

Answer by Justin Gorny

Here is a way to split a string into tokens (a token is one or more letters):

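import java.util.Scanner;
import java.util.StringTokenizer;

// Main is a placeholder class name so the snippet compiles on its own
public class Main {
    public static void main(String[] args) {
        Scanner scan = new Scanner(System.in);
        String s = scan.nextLine();
        // Replace every non-letter with a space so only letters form tokens
        s = s.replaceAll("[^A-Za-z]", " ");
        StringTokenizer arr = new StringTokenizer(s, " ");
        // Print the number of tokens, then each token on its own line
        int n = arr.countTokens();
        System.out.println(n);
        while (arr.hasMoreTokens()) {
            System.out.println(arr.nextToken());
        }
        scan.close();
    }
}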
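
Since the StringTokenizer documentation quoted earlier points to java.util.regex, the same "one or more letters" tokens can also be collected with a `Matcher`. This is only a sketch of that alternative, not the answer's method. The class `LetterTokens` and the method `letterTokens` are hypothetical names, and the code uses generics (Java 5+).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch; LetterTokens is a hypothetical name. Uses generics (Java 5+).
class LetterTokens {

    // A token is a maximal run of ASCII letters.
    private static final Pattern WORD = Pattern.compile("[A-Za-z]+");

    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<String>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<String> tokens = letterTokens("Hello, world! It's 2020.");
        System.out.println(tokens.size()); // 5
        System.out.println(tokens);        // [Hello, world, It, s]
    }
}
```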