Disclaimer: This page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/70405/

Does C# have a String Tokenizer like Java's?

Asked by andrewrk

I'm doing simple string input parsing and I need a string tokenizer. I am new to C# but have programmed in Java, and it seems natural that C# should have a string tokenizer. Does it? Where is it? How do I use it?

Accepted answer by Davy Landman

You could use the String.Split method.

using System;

class ExampleClass
{
    public ExampleClass()
    {
        string exampleString = "there is a cat";
        // Split the string on spaces, which separates it into individual words.
        string[] words = exampleString.Split(' ');
        // Prints each word on its own line: there, is, a, cat.
        foreach (string word in words)
        {
            Console.WriteLine(word);
        }
    }
}

For more information, see Sam Allen's article about splitting strings in C# (performance, Regex).

Answer by Steve Morgan

I think the nearest equivalent in the .NET Framework is

string.Split()

Answer by Tim Jarvis

The Split method of a string is what you need. In fact, Java's StringTokenizer is a legacy class whose use is discouraged in new code (though it is not formally deprecated); its documentation recommends String's split method instead.

Answer by Davy Landman

For complex splitting, you could use a regex to build a match collection.
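
As a rough illustration of the match-collection approach, the sketch below treats each regex match as a token instead of splitting on delimiters. The pattern, the `Tokenize` helper, and the sample input are assumptions made up for this example; they are not from the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

static class MatchTokenizerSketch
{
    // Hypothetical helper: each run of word characters, or any single
    // non-space symbol, becomes one token.
    public static string[] Tokenize(string input)
    {
        MatchCollection matches = Regex.Matches(input, @"\w+|[^\w\s]");
        var tokens = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            tokens[i] = matches[i].Value;
        }
        return tokens;
    }

    static void Main()
    {
        // Prints x, =, 42, +, y and ; on separate lines.
        foreach (string token in Tokenize("x = 42 + y;"))
        {
            Console.WriteLine(token);
        }
    }
}
```

Matching tokens rather than splitting on separators tends to be easier when the separators themselves are irregular, since only the shape of a valid token has to be described.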

Answer by Paul Shannon

If you are using C# 3.0 (.NET Framework 3.5) or later, you could write an extension method on System.String that does the splitting you need. You can then use the syntax:

string.SplitByMyTokens();

More info and a useful example from Microsoft: http://msdn.microsoft.com/en-us/library/bb383977.aspx
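
A minimal sketch of what such an extension method might look like. The name `SplitByMyTokens` comes from the answer; the delimiter set and the sample input are assumptions chosen only for illustration.

```csharp
using System;

static class StringTokenExtensions
{
    // Hypothetical delimiter set; replace with whatever your input needs.
    private static readonly char[] MyTokens = { ' ', ',', ';' };

    // The "this" modifier makes the method callable on any string instance.
    public static string[] SplitByMyTokens(this string source)
    {
        return source.Split(MyTokens, StringSplitOptions.RemoveEmptyEntries);
    }
}

static class Program
{
    static void Main()
    {
        string[] parts = "alpha, beta;gamma".SplitByMyTokens();
        Console.WriteLine(string.Join("|", parts)); // alpha|beta|gamma
    }
}
```

Extension methods must live in a non-generic static class, and callers need the containing namespace in scope for the method to resolve.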

Answer by Paul Shannon

Use Regex.Split(string, "#|#");
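
One thing worth checking with a pattern like this: in a regular expression `|` means alternation, so `"#|#"` matches a single `#` rather than the three-character sequence. The sketch below contrasts the two readings; the sample input is made up, and whether the answer intended the literal delimiter is an assumption.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexSplitSketch
{
    static void Main()
    {
        string input = "a#|#b"; // hypothetical sample input

        // Unescaped: '|' means "or", so the pattern matches a single '#'.
        string[] onHash = Regex.Split(input, "#|#");
        Console.WriteLine(string.Join(",", onHash)); // a,|,b

        // Escaped: the whole three-character sequence is the delimiter.
        string[] onLiteral = Regex.Split(input, Regex.Escape("#|#"));
        Console.WriteLine(string.Join(",", onLiteral)); // a,b
    }
}
```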

Answer by Musa

Read this: the Split function has an overload that takes an array of separators. http://msdn.microsoft.com/en-us/library/system.stringsplitoptions.aspx

Answer by demongolem

I just want to highlight the power of C#'s Split method and give a more detailed comparison, particularly for someone coming from a Java background.

Java's StringTokenizer takes its delimiters as a set of single characters. C#'s Split likewise accepts multiple delimiters, and it also has overloads that take whole strings as delimiters, which makes regular expressions less necessary (although if you need regex, use regex by all means!). Take this, for example:

str.Split(new char[] { ' ', '.', '?' })

This splits on three different delimiters and returns an array of tokens. We can also remove empty entries by passing a second parameter to the call above:

str.Split(new char[] { ' ', '.', '?' }, StringSplitOptions.RemoveEmptyEntries)
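
To make the effect of the second parameter concrete, here is a small sketch that runs both calls on the same input and compares the number of tokens. The sample sentence is an assumption chosen so that two delimiters sit next to each other.

```csharp
using System;

static class SplitOptionsSketch
{
    static void Main()
    {
        string str = "Hello world. How are you?"; // hypothetical sample input
        char[] delimiters = { ' ', '.', '?' };

        // ". " and the trailing '?' each produce an empty entry.
        string[] withEmpties = str.Split(delimiters);
        string[] withoutEmpties = str.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);

        Console.WriteLine(withEmpties.Length);    // 7
        Console.WriteLine(withoutEmpties.Length); // 5
    }
}
```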

One thing Java's StringTokenizer has that I believe C# lacks (Java 7, at least, has this feature) is the ability to keep the delimiters as tokens. C#'s Split discards the delimiters. This could matter in, say, some NLP applications, but for more general-purpose applications it might not be a problem.
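
If the delimiters are needed as tokens, one possible workaround (an assumption of this note, not something the answer proposes) is Regex.Split with a capturing group, since Regex.Split includes captured text in its result. The `Tokenize` helper and sample input below are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class KeepDelimitersSketch
{
    // Hypothetical helper: splits on space, '.', or '?' and keeps each
    // delimiter as its own token.
    public static string[] Tokenize(string input)
    {
        // The parentheses form a capturing group; Regex.Split includes
        // captured text in the returned array.
        return Regex.Split(input, @"([ .?])")
                    .Where(token => token.Length > 0) // drop empty entries between adjacent delimiters
                    .ToArray();
    }

    static void Main()
    {
        Console.WriteLine(string.Join("|", Tokenize("Is it ok? Yes.")));
        // Is| |it| |ok|?| |Yes|.
    }
}
```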

Answer by Skyler

_words = new List<string>(YourText.ToLower().Trim('\n', '\r').Split(' ').
            Select(x => new string(x.Where(Char.IsLetter).ToArray()))); 

Or

_words = new List<string>(YourText.Trim('\n', '\r').Split(' ').
            Select(x => new string(x.Where(Char.IsLetterOrDigit).ToArray()))); 

Answer by neronovs

The method most similar to Java's is:

Regex.Split(string, pattern);

where

  • string - the text you need to split
  • pattern - a regular expression, given as a string, that matches the separators between tokens
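
A short sketch of the call described above, assuming a comma-separated input with stray whitespace. Both the sample text and the pattern are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class PatternSplitSketch
{
    static void Main()
    {
        string text = "red ,green,  blue"; // hypothetical sample input
        string pattern = @"\s*,\s*";       // a comma with optional surrounding whitespace

        string[] parts = Regex.Split(text, pattern);
        Console.WriteLine(string.Join("|", parts)); // red|green|blue
    }
}
```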