C++: Get the size of a std::string's string in bytes

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/6235555/

Time: 2020-08-28 19:44:21  Source: igfitidea

Get size of a std::string's string in bytes

c++ string multibyte

Asked by u359653

I would like to get the number of bytes a std::string's string occupies in memory, not the number of characters. The string contains a multibyte string. Would std::string::size() do this for me?

EDIT: Also, does size() include the terminating null character?

Answered by Lukáš Lalinsky

std::string operates on bytes, not on Unicode characters, so std::string::size() will indeed return the size of the data in bytes (without the overhead that std::string needs to store the data, of course).

No, size() counts only the data you tell std::string to store. The terminating null character that c_str() exposes is not part of the string's contents, so it is not included in the size, unless you explicitly create a string with a trailing null character.

Answered by Martin York

You could be pedantic about it:

std::string x("X");

std::cout << x.size() * sizeof(std::string::value_type);

But std::string::value_type is char and sizeof(char) is defined as 1.

This only becomes important if you typedef the string type (because it may change in the future or because of compiler options).

// Some header file:
typedef   std::basic_string<T_CHAR>  T_string;

// Source a million miles away
T_string   x("X");

std::cout << x.size() * sizeof(T_string::value_type);
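The header snippet leaves T_CHAR undefined, so here is a minimal sketch that assumes it is char16_t, purely to show a case where the multiplication changes the result. The names storedBytes and sample are hypothetical, and the code needs C++11 for char16_t and the u"" literal.

```cpp
#include <cstddef>
#include <string>

// Assumption for illustration: the project-wide character type is char16_t.
typedef char16_t T_CHAR;
typedef std::basic_string<T_CHAR> T_string;

// Size of the stored characters in bytes, whatever T_CHAR turns out to be.
std::size_t storedBytes(const T_string& s) {
    return s.size() * sizeof(T_string::value_type);
}

// One-character sample; a u"" literal is needed because T_CHAR is char16_t here.
T_string sample() {
    return T_string(u"X");
}
```

With this typedef, sample().size() is 1 but storedBytes(sample()) is sizeof(char16_t), which is why multiplying by sizeof(value_type) is the safer habit once the string type is hidden behind a typedef.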

Answered by Will A

std::string::size() is indeed the size in bytes.

Answered by David Rodríguez - dribeas

To get the amount of memory in use by the string, you would have to sum capacity() with the overhead used for management. Note that it is capacity() and not size(). The capacity determines the number of characters (charT) allocated, while size() tells you how many of them are actually in use.

In particular, std::string implementations don't usually *shrink_to_fit* the contents, so if you create a string and then remove elements from the end, size() will be decremented, but in most cases (this is implementation-defined) capacity() will not.

Some implementations might not allocate the exact amount of memory required, but rather obtain blocks of given sizes to reduce memory fragmentation. In an implementation that used power-of-two-sized blocks for the strings, a string of size 17 could allocate as many as 32 characters.
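A rough sketch of that estimate follows. The function names are invented for illustration, and the figure is only an approximation: it assumes one heap buffer of capacity() + 1 characters, so it overestimates for implementations that keep short strings inside the object itself, and it cannot see allocator bookkeeping or block rounding.

```cpp
#include <cstddef>
#include <string>

// Approximate footprint: the string object plus a buffer of capacity() + 1 characters.
// Assumption: the buffer lives on the heap; small-string optimisation makes this an overestimate.
std::size_t approxFootprintBytes(const std::string& s) {
    return sizeof(std::string) + (s.capacity() + 1) * sizeof(std::string::value_type);
}

// Removes everything after the first `keep` characters; size() drops,
// while capacity() is typically left unchanged (requires keep <= s.size()).
void dropTail(std::string& s, std::size_t keep) {
    s.erase(keep);
}
```

For example, building std::string s(100, 'x') and then calling dropTail(s, 17) leaves size() at 17, while capacity() usually still reflects the original 100-character allocation.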

Answered by AProgrammer

Yes, size() will give you the number of chars in the string. One character in a multibyte encoding may take up multiple chars.
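To make the difference concrete, here is a minimal sketch that counts characters rather than chars. It assumes the multibyte encoding is UTF-8 and that the input is well formed; countCodePoints is a hypothetical helper, not a standard library function.

```cpp
#include <cstddef>
#include <string>

// Counts UTF-8 code points by skipping continuation bytes (bit pattern 10xxxxxx).
// Assumption: the input is valid UTF-8; malformed input is not detected.
std::size_t countCodePoints(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) {
            ++count;  // a lead byte or a plain ASCII byte starts a new code point
        }
    }
    return count;
}
```

For the bytes "h\xC3\xA9llo" (the UTF-8 encoding of "héllo"), size() reports 6 while countCodePoints reports 5.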

Answered by JayRock

There is an inherent conflict in the question as written: std::string is defined as std::basic_string<char, ...>, that is, its element type is char (1 byte), but later you stated "the string contains a multibyte string" ("multibyte" == wchar_t?).

The size() member function does not count a trailing null. Its value represents the number of characters (not bytes).

Assuming you intended to say your multibyte string is a std::wstring (an alias for std::basic_string<wchar_t, ...>), the memory footprint of the std::wstring's characters, including the null terminator, is:

std::wstring myString;
 ...
size_t bytesCount = (myString.size() + 1) * sizeof(wchar_t);

It's instructive to consider how one would write a reusable template function that works for ANY potential instantiation of std::basic_string<>, like this**:

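#include <cstddef>
#include <string>

// Return the number of bytes occupied by inString's characters;
// pass bCountNull = true to include the terminator of inString.c_str().
template <typename Elem>
inline std::size_t stringBytes(const std::basic_string<Elem>& inString, bool bCountNull)
{
   // size() counts elements; add one for the terminator only when requested.
   return (inString.size() + (bCountNull ? 1 : 0)) * sizeof(Elem);
}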

** For simplicity, this ignores the traits and allocator types, which are rarely specified explicitly for std::basic_string<> (they have defaults).
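For completeness, a variant that also deduces the traits and allocator types might look like the sketch below. The name basicStringBytes and the defaulted countNull parameter are choices made here for illustration, not part of the answer above.

```cpp
#include <cstddef>
#include <string>

// Bytes occupied by the characters of any std::basic_string instantiation,
// optionally counting the null terminator exposed by c_str().
template <typename CharT, typename Traits, typename Alloc>
std::size_t basicStringBytes(const std::basic_string<CharT, Traits, Alloc>& s,
                             bool countNull = false) {
    return (s.size() + (countNull ? 1 : 0)) * sizeof(CharT);
}
```

Because all three template parameters are deduced, the same call works for std::string, std::wstring, std::u32string, or a basic_string with a custom allocator, for example basicStringBytes(std::wstring(L"abc"), true).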