C# string.substring vs string.take

Notice: This page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/15406072/
Date: 2020-08-10 16:42:21  Source: igfitidea

Tags: c#, substring

Asked by Williams

If you only want part of a string, the Substring method is the usual choice. Its drawback is that you must first check the length of the string to avoid errors. For example, suppose you want to save data into a database and cut a value off at its first 20 characters.

If you call temp.Substring(0, 20) but temp only holds 10 characters, an ArgumentOutOfRangeException is thrown.
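To make the failure concrete, here is a minimal sketch of the length check described above. The class and method names (SubstringGuardDemo, SubstringThrows, Truncate) are made up for illustration and are not part of the original question.

```csharp
using System;

public static class SubstringGuardDemo
{
    // Returns true when Substring(0, length) throws because the
    // requested length runs past the end of the string.
    public static bool SubstringThrows(string value, int length)
    {
        try
        {
            value.Substring(0, length);
            return false;
        }
        catch (ArgumentOutOfRangeException)
        {
            return true;
        }
    }

    // Clamps the requested length to the actual length, so a short
    // input is returned unchanged instead of causing an exception.
    public static string Truncate(string value, int maxLength)
    {
        return value.Substring(0, Math.Min(value.Length, maxLength));
    }
}
```

With a 10-character string, SubstringGuardDemo.SubstringThrows(temp, 20) reports the exception, while SubstringGuardDemo.Truncate(temp, 20) simply returns all 10 characters.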

I see two solutions:

  1. Test the length, and call Substring only if needed
  2. Use the LINQ extension method Take

        string temp = "1234567890";
        var data = new string(temp.Take(20).ToArray());
        // data now holds "1234567890"

Is there any disadvantage in terms of speed or memory use when one uses the Take method? The benefit is that you do not have to write all those if statements.

Accepted answer by Matthew Watson

If you find yourself doing this a lot, why not write an extension method?

For example:

using System;

namespace Demo
{
    public static class Program
    {
        public static void Main(string[] args)
        {
            Console.WriteLine("123456789".Left(5));
            Console.WriteLine("123456789".Left(15));
        }
    }

    public static class StringExt
    {
        public static string Left(this string @this, int count)
        {
            if (@this.Length <= count)
            {
                return @this;
            }
            else
            {
                return @this.Substring(0, count);
            }
        }
    }
}
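The Left method above throws a NullReferenceException for a null string and passes a negative count straight through to Substring. If that matters in your code, a slightly more defensive variant could look like the sketch below. The name LeftOrAll and the choice to return null for null input are assumptions made for this example, not part of the original answer.

```csharp
using System;

public static class StringPrefixExtensions
{
    // Hypothetical variant of Left: null in, null out; a negative count is rejected.
    public static string LeftOrAll(this string value, int count)
    {
        if (count < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(count), "count must not be negative.");
        }

        // Strings that are null or already short enough are returned as-is.
        if (value == null || value.Length <= count)
        {
            return value;
        }

        return value.Substring(0, count);
    }
}
```

Whether null should be returned, replaced by an empty string, or rejected is a design choice that depends on the calling code.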

Answer by Henk Holterman

"Is there any disadvantage in terms of speed or memory use when one uses the Take method?"

Yes. Take() involves creating an IEnumerator<char> first and then, for each char, jumping through the hoops of MoveNext(), yield return; and so on. Also note the ToArray call and the string constructor.
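To illustrate the per-character work described above, here is a simplified stand-in for Take written as an iterator method. It is a sketch for explanation only: TakeChars and FirstChars are hypothetical names, and the real Enumerable.Take implementation differs in its details.

```csharp
using System;
using System.Collections.Generic;

public static class TakeSketch
{
    // Simplified stand-in for Enumerable.Take (not the real implementation).
    // The compiler turns this method into an enumerator object, so every
    // character costs a MoveNext() call on both this iterator and the source.
    public static IEnumerable<char> TakeChars(IEnumerable<char> source, int count)
    {
        if (count <= 0)
        {
            yield break;
        }

        int taken = 0;
        foreach (char c in source) // uses the source's MoveNext()/Current
        {
            yield return c;
            if (++taken == count)
            {
                yield break;
            }
        }
    }

    // Mirrors new string(temp.Take(n).ToArray()): enumerate into a list,
    // copy the list to an array, then copy the array into a new string.
    public static string FirstChars(string value, int count)
    {
        List<char> buffer = new List<char>(TakeChars(value, count));
        return new string(buffer.ToArray());
    }
}
```

Substring, by contrast, copies the requested range directly, without an enumerator or an intermediate array.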

Not an issue for small numbers of strings but in a large loop the specialized string functions are a lot better.

Answer by Tim Schmelter

The Take extension method does not create a substring; it returns a query which can be used to create a Char[] (ToArray) or a List<Char> (ToList). But what you actually want is the substring.

Then you need other methods as well:

string  data = new string(temp.Take(20).ToArray());

This implicitly uses a foreach to enumerate the chars and creates a new char[] (which might allocate more than needed, due to the doubling algorithm). Finally, a new string is created from the char[].
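The doubling behaviour mentioned above can be sketched as follows. This is an illustration of the general technique, assuming a starting capacity of 4. ToArrayByDoubling is a hypothetical helper, and the actual ToArray implementation varies between .NET versions.

```csharp
using System;
using System.Collections.Generic;

public static class ToArraySketch
{
    // Illustrative only: collects a sequence of unknown length into an array
    // by doubling the buffer whenever it fills up, then trimming at the end.
    public static char[] ToArrayByDoubling(IEnumerable<char> source)
    {
        char[] buffer = new char[4]; // assumed starting capacity
        int count = 0;

        foreach (char c in source)
        {
            if (count == buffer.Length)
            {
                // Allocates a new array of twice the size and copies the old one.
                Array.Resize(ref buffer, buffer.Length * 2);
            }
            buffer[count++] = c;
        }

        // One more allocation and copy to produce an array of the exact size.
        Array.Resize(ref buffer, count);
        return buffer;
    }
}
```

For a 20-character prefix, this approach allocates buffers of 4, 8, 16 and 32 chars plus the final 20-char array, before the string constructor copies the data once more.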

Substring, on the other hand, uses optimized methods.

So you pay for this small convenience with memory; the cost may be negligible, but not always.

Answer by Daniel Peñalba

As Henk Holterman said, Take() creates an IEnumerator, and then you need the ToArray() call.

So, if performance is important in your application, or if you take substrings many times in your process, performance could be a problem.

I wrote an example program to benchmark exactly how much slower the Take() method is. Here are the results:

Tested with ten million iterations:

  • Time performing substring: 266 ms
  • Time performing take operation: 1437 ms

And here is the code:

    internal const int RETRIES = 10000000;

    static void Main(string[] args)
    {
        string testString = Guid.NewGuid().ToString();

        long timeSubstring = MeasureSubstring(testString);
        long timeTake = MeasureTake(testString);

        Console.WriteLine("Time substring: {0} ms, Time take: {1} ms",
            timeSubstring, timeTake);
    }

    private static long MeasureSubstring(string test)
    {
        long ini = Environment.TickCount;

        for (int i = 0; i < RETRIES; i++)
        {
            if (test.Length > 4)
            {
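                // Note: Substring(4) returns everything from index 4 onward, not the
                // first 4 characters; Substring(0, 4) is the counterpart of Take(4).
                string tmp = test.Substring(4);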
            }
        }

        return Environment.TickCount - ini;
    }

    private static long MeasureTake(string test)
    {
        long ini = Environment.TickCount;

        for (int i = 0; i < RETRIES; i++)
        {
            var data = new string(test.Take(4).ToArray());
        }

        return Environment.TickCount - ini;
    }
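The benchmark above compares Substring(4), which returns the tail of the string, with Take(4), which returns the first four characters. A like-for-like comparison might look like the sketch below. It assumes System.Diagnostics.Stopwatch for timing, and the names PrefixBenchmark, PrefixWithSubstring, PrefixWithTake and MeasureMilliseconds are made up for this example. No timings are quoted, because they depend on the machine and runtime.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class PrefixBenchmark
{
    // First `count` characters via Substring, guarded by a length check.
    public static string PrefixWithSubstring(string value, int count)
    {
        return value.Length > count ? value.Substring(0, count) : value;
    }

    // First `count` characters via LINQ: enumerate, build a char[], build a string.
    public static string PrefixWithTake(string value, int count)
    {
        return new string(value.Take(count).ToArray());
    }

    // Runs the given prefix function `iterations` times and returns the elapsed milliseconds.
    public static long MeasureMilliseconds(
        Func<string, int, string> prefix, string value, int count, int iterations)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            prefix(value, count);
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }
}
```

A call such as PrefixBenchmark.MeasureMilliseconds(PrefixBenchmark.PrefixWithTake, Guid.NewGuid().ToString(), 4, 10000000) times one variant. Running each variant once before measuring helps keep JIT compilation out of the figures.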

Answer by ken2k

At first I didn't want to answer (as there are already valid answers), but I would like to add something that doesn't fit in a comment:

You're talking about performance / memory issues. Right. As others said, string.Substring is far more efficient, because of how it is internally optimized and because of how LINQ works with string.Take() (enumeration of chars, etc.).

What no one has said is that the main disadvantage of Take() in your case is that it totally destroys the simplicity of a substring. As Tim said, to get the actual string you want, you'll have to write:

string myString = new string(temp.Take(20).ToArray());

Damn... this is so much harder to understand than (see Matthew's extension method):

string myString = temp.Left(20);

LINQ is great for lots of use cases, but shouldn't be used when it isn't necessary. Even a simple loop is sometimes better (i.e. faster, more readable/understandable) than LINQ, so imagine the difference for a simple substring...

To summarize, using LINQ in your case means:

  • worse performance
  • less readable
  • less understandable
  • requires LINQ (so won't work with .NET 2.0, for instance)

Answer by Noctis

A variation of @Daniel's answer that seems more accurate to me.
A Guid's string form is 36 characters long. We create a list of strings with varying lengths (0 to 35 characters with the code below) and aim for taking 18 with the Substring/Take methods, so around half of them will go through.

The results I'm getting suggest that Take will be 6-10 times slower than Substring.

Example results:

Build time: 3812 ms
Time substring: 391 ms, Time take: 1828 ms

Build time: 4172 ms
Time substring: 406 ms, Time take: 2141 ms

So, for 5 million strings, doing roughly 2.5 million operations, the total time is 2.1 seconds, or around 0.0008564 milliseconds = ~1 microsecond per operation. If you feel you need to cut that by a factor of 5 by using Substring, go for it, but I doubt that in real-life situations, outside of tight loops, you'll ever feel the difference.

void Main()
{
    Console.WriteLine("Build time: {0} ms", BuildInput());
    Console.WriteLine("Time substring: {0} ms, Time take: {1} ms", MeasureSubstring(), MeasureTake());
}

internal const int RETRIES = 5000000;
static internal List<string> input;

// Measure substring time
private static long MeasureSubstring()
{
    var v = new List<string>();
    long ini = Environment.TickCount;

    foreach (string test in input)
        if (test.Length > 18)
        {
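            // Note: Substring(18) returns everything from index 18 onward, not the
            // first 18 characters; Substring(0, 18) is the counterpart of Take(18).
            v.Add(test.Substring(18));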
        }
    //v.Count().Dump("entries with substring");
    //v.Take(5).Dump("entries with Sub");

    return Environment.TickCount - ini;
}

// Measure take time
private static long MeasureTake()
{
    var v = new List<string>();
    long ini = Environment.TickCount;

    foreach (string test in input)
        if (test.Length > 18) v.Add(new string(test.Take(18).ToArray()));

    //v.Count().Dump("entries with Take");
    //v.Take(5).Dump("entries with Take");

    return Environment.TickCount - ini;
}

// Create a list with random strings with random lengths
private static long BuildInput()
{
    long ini = Environment.TickCount;
    Random r = new Random();
    input = new List<string>();

    for (int i = 0; i < RETRIES; i++)
        input.Add(Guid.NewGuid().ToString().Substring(1,r.Next(0,36)));

    return Environment.TickCount - ini;
}