C++ equivalent of StringBuffer/StringBuilder?

Notice: this page reproduces a popular StackOverflow question and its answers (originally presented as a Chinese-English parallel translation). It is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/2462951/

Date: 2020-08-27 23:33:27  Source: igfitidea

c++ stl string-concatenation

Asked by An???drew

Is there a C++ Standard Template Library class that provides efficient string concatenation functionality, similar to C#'s StringBuilder or Java's StringBuffer?

Accepted answer by iain

NOTE: this answer has received some attention recently. I am not advocating this as a solution (it is a solution I have seen in the past, before the STL). It is an interesting approach and should only be preferred over std::string or std::stringstream if, after profiling your code, you discover that it makes an improvement.

I normally use either std::string or std::stringstream. I have never had any problems with these. If I know the rough size of the string in advance, I normally reserve some room first.
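As a rough illustration of the reserve-first habit mentioned above, the sketch below joins a list of words into one std::string. The helper name join_words and the single-space separator are made up for this example, and the size estimate is deliberately approximate.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the answer): joins words with single spaces.
std::string join_words(const std::vector<std::string>& words) {
    // Rough upper bound on the final size: every character plus one separator per word.
    std::size_t total = 0;
    for (const std::string& w : words) {
        total += w.size() + 1;
    }

    std::string result;
    result.reserve(total);  // request the room once, before any appends

    for (std::size_t i = 0; i < words.size(); ++i) {
        if (i != 0) {
            result += ' ';
        }
        result += words[i];
    }
    return result;
}
```

Calling join_words({"a", "bc", "d"}) should give "a bc d". Whether the reserve call pays off depends on the input sizes, so it is worth measuring.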

I have seen other people make their own optimized string builder in the distant past.

#include <string>

class StringBuilder {
private:
    std::string main;     // accumulated result
    std::string scratch;  // small buffer that batches short appends

    // Flush threshold; or some other arbitrary number
    static constexpr std::string::size_type ScratchSize = 1024;

    // Move the scratch contents onto the end of the main string.
    void flush() {
        main.append(scratch);
        scratch.clear();
    }

public:
    StringBuilder & append(const std::string & str) {
        scratch.append(str);
        if (scratch.size() > ScratchSize) {
            flush();
        }
        return *this;
    }

    const std::string & str() {
        if (!scratch.empty()) {
            flush();
        }
        return main;
    }
};

It uses two strings: one for the majority of the string and the other as a scratch area for concatenating short strings. It optimises appends by batching the short append operations in one small string and then appending this to the main string, thus reducing the number of reallocations required on the main string as it gets larger.

I have not needed this trick with std::string or std::stringstream. I think it was used with a third-party string library before std::string; it was that long ago. If you adopt a strategy like this, profile your application first.

Answer by jk.

The C++ way would be to use std::stringstream or just plain string concatenation. C++ strings are mutable, so the performance considerations of concatenation are less of a concern.

With regard to formatting, you can do all the same formatting on a stream, but in a different way, similar to cout. Or you can use a strongly typed functor which encapsulates this and provides a String.Format-like interface, e.g. boost::format.
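To make the stream-formatting remark concrete, here is a small sketch using only the standard <iomanip> manipulators (boost::format is not shown). The helper name format_price and the chosen width and precision are assumptions for illustration.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats a label followed by a right-aligned price.
std::string format_price(const std::string& label, double price) {
    std::ostringstream oss;
    oss << label << ':'
        << std::setw(6)              // field width for the next formatted value
        << std::fixed                // fixed-point notation
        << std::setprecision(2)      // two digits after the decimal point
        << price;
    return oss.str();
}
```

With the default "C" locale, format_price("tea", 3.5) should produce "tea:  3.50"; the same manipulators work on cout.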

Answer by Stu

The std::string::append function isn't a good option because it doesn't accept many forms of data. A more useful alternative is to use std::stringstream, like so:

#include <sstream>
// ...

std::stringstream ss;

//put arbitrary formatted data into the stream
ss << 4.5 << ", " << 4 << " whatever";

//convert the stream buffer into a string
std::string str = ss.str();

Answer by dan04

std::string is the C++ equivalent: it's mutable.

Answer by Andy Shellam

You can use .append() for simply concatenating strings.

std::string s = "string1";
s.append("string2");

I think you might even be able to do:

std::string s = "string1";
s += "string2";
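For what it is worth, the += form does compile: std::string overloads operator+= for string literals, other std::string objects, and single characters. A minimal sketch, with build_greeting being a made-up name:

```cpp
#include <string>

// Hypothetical helper showing the operand types that += accepts on std::string.
std::string build_greeting(const std::string& name) {
    std::string s = "Hello";
    s += ", ";   // const char* (string literal)
    s += name;   // another std::string
    s += '!';    // a single char
    return s;
}
```

build_greeting("World") should return "Hello, World!".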

As for the formatting operations of C#'s StringBuilder, I believe that using snprintf (or sprintf if you want to risk writing buggy code ;-) ) to write into a character array and then converting back to a string is about the only option.
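A minimal sketch of the snprintf route described above, assuming a C++11 standard library where std::snprintf with a null buffer reports the required length. The helper name format_record and the format string are invented for the example.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: formats an id and a ratio printf-style into a std::string.
std::string format_record(int id, double ratio) {
    const char* fmt = "id=%d ratio=%.2f";

    // First pass: ask how many characters the output needs (excluding '\0').
    int needed = std::snprintf(nullptr, 0, fmt, id, ratio);
    if (needed < 0) {
        return std::string();  // formatting error: return an empty string
    }

    // Second pass: write into a buffer of exactly the right size (+1 for '\0').
    std::vector<char> buffer(static_cast<std::size_t>(needed) + 1);
    std::snprintf(buffer.data(), buffer.size(), fmt, id, ratio);

    // Convert the character array back to a string, dropping the terminator.
    return std::string(buffer.data(), static_cast<std::size_t>(needed));
}
```

format_record(7, 0.5) should yield "id=7 ratio=0.50". The two-pass approach avoids guessing a buffer size, which is where sprintf tends to go wrong.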

Answer by Daemin

Since std::string in C++ is mutable, you can use that. It has a += operator and an append function.

If you need to append numerical data, use the std::to_string functions.

If you want even more flexibility, in the form of being able to serialise any object to a string, then use the std::stringstream class. But you'll need to implement your own streaming operator functions for it to work with your own custom classes.
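The two options above might look roughly like this. The Point type and the describe_* helper names are hypothetical, chosen only to have something to stream.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical user-defined type.
struct Point {
    int x;
    int y;
};

// Streaming operator so that Point works with any std::ostream.
std::ostream& operator<<(std::ostream& os, const Point& p) {
    return os << '(' << p.x << ", " << p.y << ')';
}

// Numeric data can be appended via std::to_string (C++11).
std::string describe_count(int count) {
    return "count=" + std::to_string(count);
}

// Arbitrary streamable objects go through std::ostringstream.
std::string describe_point(const Point& p) {
    std::ostringstream oss;
    oss << "point=" << p;
    return oss.str();
}
```

describe_count(42) should give "count=42", and describe_point(Point{1, 2}) should give "point=(1, 2)".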

Answer by sergeys

std::string's + doesn't work between two const char* operands (which is what literals like "string to add" are), although += and + with a std::string on one side do accept them. So using stringstream is definitely the closest to what is required - you just use << instead of +

Answer by user2328447

A convenient string builder for C++

As many people answered before, std::stringstream is the method of choice. It works well and has a lot of conversion and formatting options. IMO it has one pretty inconvenient flaw though: you cannot use it as a one-liner or as an expression. You always have to write:

std::stringstream ss;
ss << "my data " << 42;
std::string myString( ss.str() );

which is pretty annoying, especially when you want to initialize strings in the constructor.

The reason is that a) std::stringstream has no conversion operator to std::string, and b) the operator<<() overloads of the stringstream don't return a stringstream reference but a std::ostream reference instead, which cannot be further used as a string stream.

The solution is to derive from std::stringstream and to give it better matching operators:

#include <sstream>
#include <string>

namespace NsStringBuilder {
template<typename T> class basic_stringstream : public std::basic_stringstream<T>
{
public:
    basic_stringstream() {}

    operator const std::basic_string<T> () const                                { return std::basic_stringstream<T>::str();                     }
    basic_stringstream<T>& operator<<   (bool _val)                             { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (char _val)                             { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (signed char _val)                      { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (unsigned char _val)                    { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (short _val)                            { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (unsigned short _val)                   { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (int _val)                              { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (unsigned int _val)                     { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (long _val)                             { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (unsigned long _val)                    { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (long long _val)                        { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (unsigned long long _val)               { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (float _val)                            { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (double _val)                           { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (long double _val)                      { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (void* _val)                            { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (std::streambuf* _val)                  { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (std::ostream& (*_val)(std::ostream&))  { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (std::ios& (*_val)(std::ios&))          { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (std::ios_base& (*_val)(std::ios_base&)){ std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (const T* _val)                         { return static_cast<basic_stringstream<T>&>(std::operator << (*this,_val)); }
    basic_stringstream<T>& operator<<   (const std::basic_string<T>& _val)      { return static_cast<basic_stringstream<T>&>(std::operator << (*this,_val.c_str())); }
};

typedef basic_stringstream<char>        stringstream;
typedef basic_stringstream<wchar_t>     wstringstream;
}

With this, you can write things like

std::string myString( NsStringBuilder::stringstream() << "my data " << 42 );

even in the constructor.

I have to confess I didn't measure the performance, since I have not yet used it in an environment that makes heavy use of string building, but I assume it won't be much worse than std::stringstream, since everything is done via references (except the conversion to string, but that's a copy operation in std::stringstream as well).
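A different way to get a one-liner, without deriving from std::stringstream, is a small variadic helper that streams each argument into a local std::ostringstream. This is a swapped-in alternative rather than the approach of this answer; the name concat is hypothetical, and the sketch assumes C++11.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: streams every argument into one std::ostringstream
// and returns the result, so it can be used inside an expression.
template <typename... Args>
std::string concat(const Args&... args) {
    std::ostringstream oss;
    // C++11 pack-expansion idiom: evaluates (oss << arg) for each argument,
    // left to right, inside a braced initializer list.
    using expander = int[];
    (void)expander{0, ((void)(oss << args), 0)...};
    return oss.str();
}
```

With this, std::string myString(concat("my data ", 42)); should initialize the string to "my data 42", including in a constructor's initializer list. Stream manipulators such as std::endl cannot be passed this way, since their type cannot be deduced.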

Answer by Igor

The rope container may be worthwhile if you have to insert or delete strings at arbitrary positions in the destination string, or for long character sequences. Here is an example from SGI's implementation:

crope r(1000000, 'x');          // crope is rope<char>. wrope is rope<wchar_t>
                                // Builds a rope containing a million 'x's.
                                // Takes much less than a MB, since the
                                // different pieces are shared.
crope r2 = r + "abc" + r;       // concatenation; takes on the order of 100s
                                // of machine instructions; fast
crope r3 = r2.substr(1000000, 3);       // yields "abc"; fast.
crope r4 = r2.substr(1000000, 1000000); // also fast.
reverse(r2.mutable_begin(), r2.mutable_end());
                                // correct, but slow; may take a
                                // minute or more.

Answer by CoffeDeveloper

I wanted to add something new because of the following:

At first attempt I failed to beat the efficiency of std::ostringstream's operator<<, but with more attempts I was able to make a StringBuilder that is faster in some cases.

Every time I append a string, I just store a reference to it somewhere and increase the counter of the total size.
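The basic idea in the previous paragraph (store references, track the total size, merge at the end) can be sketched as follows. This is a simplified illustration, not the answer's actual implementation; the class name RefStringBuilder and its members are invented, and the opaque buffer described next is not modelled.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: collects pointers to strings and merges them on demand.
class RefStringBuilder {
public:
    // Stores only a pointer; the caller must keep `s` alive until str() is called.
    RefStringBuilder& append(const std::string& s) {
        parts_.push_back(&s);
        total_size_ += s.size();
        return *this;
    }

    // Temporaries would dangle, so appending rvalues is disabled in this sketch.
    RefStringBuilder& append(std::string&&) = delete;

    // Merges all referenced strings after reserving the total size once.
    std::string str() const {
        std::string result;
        result.reserve(total_size_);
        for (const std::string* part : parts_) {
            result.append(*part);
        }
        return result;
    }

private:
    std::vector<const std::string*> parts_;  // non-owning references
    std::size_t total_size_ = 0;             // running sum of part sizes
};
```

Appending two live strings "foo" and "bar" and then calling str() should return "foobar". Destroying a referenced string before str() is undefined behaviour.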

The real way I finally implemented it (Horror!) is to use an opaque buffer (std::vector<char>):

  • 1 byte header (2 bits to tell whether the following data is a moved string, a string, or a byte[])
  • 6 bits to tell the length of the byte[]

For byte[]:

  • I directly store the bytes of short strings (for sequential memory access)

For moved strings (strings appended with std::move):

  • The pointer to a std::string object (we have ownership)
  • Set a flag in the class if there are unused reserved bytes there

For strings:

  • The pointer to a std::string object (no ownership)

There's also one small optimization: if the last inserted string was moved in, it checks for reserved but unused bytes and stores further bytes there instead of using the opaque buffer. (This is to save some memory; it actually makes it slightly slower, which may also depend on the CPU, and it is rare to see strings with extra reserved space anyway.)

This was finally slightly faster than std::ostringstream, but it has a few downsides:

  • I assumed fixed-length char types (so 1, 2 or 4 bytes, not good for UTF-8). I'm not saying it will not work for UTF-8, just that out of laziness I didn't check it.
  • I used bad coding practice (an opaque buffer, which easily becomes non-portable; I believe mine is portable, by the way)
  • Lacks all the features of ostringstream
  • If a referenced string is deleted before merging all the strings: undefined behaviour.

Conclusion? Use std::ostringstream.

It already fixes the biggest bottleneck, and gaining a few percentage points in speed with my implementation is not worth the downsides.