A dynamic buffer type in C++?
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/1874354/
Asked by Vilx-
I'm not exactly a C++ newbie, but I have had little serious dealings with it in the past, so my knowledge of its facilities is rather sketchy.
I'm writing a quick proof-of-concept program in C++ and I need a dynamically sizeable buffer of binary data. That is, I'm going to receive data from a network socket and I don't know how much there will be (although not more than a few MB). I could write such a buffer myself, but why bother if the standard library probably has something already? I'm using VS2008, so some Microsoft-specific extension is just fine by me. I only need four operations:
- Create the buffer
- Write data to the buffer (binary junk, not zero-terminated)
- Get the written data as a char array (together with its length)
- Free the buffer
What is the name of the class/function set/whatever that I need?
Added: Several votes go to std::vector. All nice and fine, but I don't want to push several MB of data byte-by-byte. The socket will give data to me in few-KB large chunks, so I'd like to write them all at once. Also, at the end I will need to get the data as a simple char*, because I will need to pass the whole blob along to some Win32 API functions unmodified.
Answered by GManNickG
You want a std::vector:
std::vector<char> myData;
vector will automatically allocate and deallocate its memory for you. Use push_back to add new data (vector will resize for you if required), and the indexing operator [] to retrieve data.
If at any point you can guess how much memory you'll need, I suggest calling reserve so that subsequent push_backs won't have to reallocate as much.
If you want to read in a chunk of memory and append it to your buffer, easiest would probably be something like:
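std::vector<char> myData;
for (;;) {
    const int BufferSize = 1024;
    char rawBuffer[BufferSize];
    // assuming get_network_data returns a signed count (negative on error, as recv does);
    // with an unsigned result the <= 0 test below could never catch an error value
    const int bytesRead = get_network_data(rawBuffer, sizeof(rawBuffer));
    if (bytesRead <= 0) {
        break;
    }
    // append the chunk that was just read
    myData.insert(myData.end(), rawBuffer, rawBuffer + bytesRead);
}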
myData now has all the read data, read chunk by chunk. However, we're copying twice: once into rawBuffer and once more into the vector.
We instead try something like this:
std::vector<char> myData;
for (;;) {
    const int BufferSize = 1024;
    // grow the vector so the next chunk can be written straight into it
    const size_t oldSize = myData.size();
    myData.resize(myData.size() + BufferSize);
    const unsigned bytesRead = get_network_data(&myData[oldSize], BufferSize);
    // trim back to the bytes actually received
    myData.resize(oldSize + bytesRead);
    if (bytesRead == 0) {
        break;
    }
}
Which reads directly into the buffer, at the cost of occasionally over-allocating.
This can be made smarter by e.g. doubling the vector size for each resize to amortize resizes, as the first solution does implicitly. And of course, you can reserve() a much larger buffer up front if you have a priori knowledge of the probable size of the final buffer, to minimize resizes.
Both are left as an exercise for the reader. :)
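One possible take on the first exercise, as a sketch under stated assumptions rather than a definitive implementation: double the vector whenever its free space runs out, read straight into the unused tail, and trim once at the end. ChunkSource, read_some and read_all are hypothetical names; ChunkSource only stands in for the socket and get_network_data. The expectedSize parameter covers the second exercise by letting the caller pass a size guess up front.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a socket: hands out a fixed payload in pieces.
struct ChunkSource {
    const char* data;
    std::size_t remaining;

    // Copies up to maxBytes into dest and returns the count (0 when drained).
    std::size_t read_some(char* dest, std::size_t maxBytes) {
        const std::size_t n = remaining < maxBytes ? remaining : maxBytes;
        std::memcpy(dest, data, n);
        data += n;
        remaining -= n;
        return n;
    }
};

// Reads everything from the source, doubling the buffer when it fills up.
std::vector<char> read_all(ChunkSource& source, std::size_t expectedSize = 4096) {
    assert(expectedSize > 0);
    std::vector<char> buffer(expectedSize);  // usable storage, not just capacity
    std::size_t used = 0;

    for (;;) {
        if (used == buffer.size()) {
            buffer.resize(buffer.size() * 2);  // geometric growth amortizes reallocations
        }
        const std::size_t bytesRead =
            source.read_some(&buffer[used], buffer.size() - used);
        if (bytesRead == 0) {
            break;
        }
        used += bytesRead;
    }

    buffer.resize(used);  // drop the unused tail
    return buffer;
}
```

Note that resize() value-initializes the new bytes, so this trades a little zero-filling for never copying a chunk twice.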
Finally, if you need to treat your data as a raw-array:
some_c_function(myData.data(), myData.size());
std::vector is guaranteed to be contiguous.
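A minimal sketch of handing the vector to a C-style function. checksum_bytes is a hypothetical stand-in for some_c_function, and raw_bytes and checksum_buffer are illustrative helper names. The helper uses &buffer[0] because vector::data() only became standard in C++11 and may be missing on older compilers; it also guards the empty case, since &buffer[0] must not be evaluated on an empty vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical C-style consumer: sums the bytes it is given.
extern "C" unsigned long checksum_bytes(const char* data, std::size_t length) {
    assert(data != 0 || length == 0);
    unsigned long sum = 0;
    for (std::size_t i = 0; i < length; ++i) {
        sum += static_cast<unsigned char>(data[i]);
    }
    return sum;
}

// Pointer to the first byte, or a null pointer for an empty buffer.
const char* raw_bytes(const std::vector<char>& buffer) {
    return buffer.empty() ? 0 : &buffer[0];
}

// Passes the vector's contiguous storage and its length to the C function.
unsigned long checksum_buffer(const std::vector<char>& buffer) {
    return checksum_bytes(raw_bytes(buffer), buffer.size());
}
```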
Answered by Nikola Smiljanić
std::vector<unsigned char> buffer;
Every push_back adds a new char at the end (reallocating if needed). You can call reserve to minimize the number of allocations if you roughly know how much data to expect.
buffer.reserve(1000000);
If you have something like this:
unsigned char buffer[1000];
std::vector<unsigned char> vec(buffer, buffer + 1000);
Answered by Wyzard
std::string would work for this:
- It supports embedded nulls.
- You can append multi-byte chunks of data to it by calling append() on it with a pointer and a length.
- You can get its contents as a char array by calling data() on it, and the current length by calling size() or length() on it.
- Freeing the buffer is handled automatically by the destructor, but you can also call clear() on it to erase its contents without destroying it.
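The points above can be sketched as follows. append_chunk and count_zero_bytes are hypothetical helper names used only for illustration; the std::string calls themselves (append with pointer and length, data, size) are the ones the answer describes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends one received chunk; the explicit length keeps embedded '\0' bytes.
void append_chunk(std::string& buffer, const char* chunk, std::size_t length) {
    assert(chunk != 0 || length == 0);
    if (length != 0) {
        buffer.append(chunk, length);
    }
}

// Counts zero bytes by walking the raw array returned by data().
std::size_t count_zero_bytes(const std::string& buffer) {
    const char* bytes = buffer.data();
    std::size_t zeros = 0;
    for (std::size_t i = 0; i < buffer.size(); ++i) {
        if (bytes[i] == '\0') {
            ++zeros;
        }
    }
    return zeros;
}
```

The important detail is to always pass the length explicitly: append(chunk) without a length would stop at the first zero byte.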
Answered by sbk
One more vote for std::vector. Minimal code, skips the extra copy GMan's code does:
std::vector<char> buffer;
static const size_t MaxBytesPerRecv = 1024;
size_t bytesRead;
do
{
    const size_t oldSize = buffer.size();
    buffer.resize(oldSize + MaxBytesPerRecv);
    // pseudo: like the winsock recv() functions, receive() gets a buffer and the maximum number of bytes to write to it
    bytesRead = receive(&buffer[oldSize], MaxBytesPerRecv);
    // shrink the vector; this is practically a no-op: it only modifies the internal size, no data is moved or freed
    buffer.resize(oldSize + bytesRead);
} while (bytesRead > 0);
As for calling WinAPI functions - use &buffer[0] (yeah, it's a little bit clumsy, but that's the way it is) to pass to the char* arguments, buffer.size() as length.
And a final note: you can use std::string instead of std::vector; there shouldn't be any difference (except that you can write buffer.data() instead of &buffer[0] if your buffer is a string).
Answered by Jerry Coffin
I'd take a look at Boost basic_streambuf, which is designed for this kind of purpose. If you can't (or don't want to) use Boost, I'd consider std::basic_streambuf, which is quite similar, but a little more work to use. Either way, you basically derive from that base class and override underflow() to read data from the socket into the buffer. You'll normally attach an std::istream to the buffer, so other code reads from it about the same way as it would read user input from the keyboard (or whatever).
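A rough sketch of the std::streambuf route, under the assumption that the socket can be modelled by a receive(dest, maxBytes) call returning 0 at end of data. FakeSocket and SocketStreambuf are hypothetical names; only std::streambuf, underflow(), setg() and std::istream come from the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <streambuf>

// Hypothetical stand-in for a socket: delivers a payload a few bytes at a time.
struct FakeSocket {
    const char* data;
    std::size_t remaining;
    std::size_t pieceSize;  // most bytes delivered per call

    // Copies the next piece into dest and returns its size (0 when drained).
    std::size_t receive(char* dest, std::size_t maxBytes) {
        assert(dest != 0);
        std::size_t n = remaining < maxBytes ? remaining : maxBytes;
        if (n > pieceSize) {
            n = pieceSize;
        }
        std::memcpy(dest, data, n);
        data += n;
        remaining -= n;
        return n;
    }
};

// Stream buffer that refills its get area from the socket on demand.
class SocketStreambuf : public std::streambuf {
public:
    explicit SocketStreambuf(FakeSocket& socket) : socket_(socket) {
        // Empty get area: the first read triggers underflow().
        setg(chunk_, chunk_, chunk_);
    }

protected:
    // Called by the stream when the get area is exhausted.
    virtual int_type underflow() {
        if (gptr() < egptr()) {
            return traits_type::to_int_type(*gptr());
        }
        const std::size_t n = socket_.receive(chunk_, sizeof(chunk_));
        if (n == 0) {
            return traits_type::eof();
        }
        setg(chunk_, chunk_, chunk_ + n);
        return traits_type::to_int_type(*gptr());
    }

private:
    FakeSocket& socket_;
    char chunk_[1024];
};
```

Code that consumes the data then wraps the buffer in a stream, e.g. `SocketStreambuf sb(sock); std::istream in(&sb);`, and uses the usual extraction operators or read(). This suits incremental parsing more than collecting one large blob for a Win32 call.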
Answered by Jerry Coffin
An alternative which is not from STL but might be of use - Boost.Circular buffer
Answered by Xavier Nodet
Use std::vector, a growing array that guarantees the storage is contiguous (your third point).
Answered by Useless
If you do use std::vector, you're just using it to manage the raw memory for you.
You could just malloc the biggest buffer you think you'll need, and keep track of the write offset/total bytes read so far (they're the same thing).
If you get to the end ... either realloc or choose a way to fail.
I know, it isn't very C++y, but this is a simple problem and the other proposals seem like heavyweight ways to introduce an unnecessary copy.
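For comparison, a sketch of what the malloc/realloc approach might look like. RawBuffer and the raw_buffer_* functions are hypothetical names. The caller asks for room, lets the socket write directly at data + used, then records how many bytes arrived, so no chunk is copied a second time. Size overflow is not handled, which seems acceptable for the few-MB case in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Plain growable byte buffer; 'used' doubles as the write offset.
struct RawBuffer {
    char* data;
    std::size_t capacity;
    std::size_t used;
};

// Allocates the initial block; returns false if malloc fails.
bool raw_buffer_init(RawBuffer& buf, std::size_t capacity) {
    assert(capacity > 0);
    buf.data = static_cast<char*>(std::malloc(capacity));
    buf.capacity = buf.data ? capacity : 0;
    buf.used = 0;
    return buf.data != 0;
}

// Makes room for at least 'extra' more bytes, growing via realloc when needed.
// On failure the buffer is left untouched and false is returned.
bool raw_buffer_reserve(RawBuffer& buf, std::size_t extra) {
    if (buf.capacity - buf.used >= extra) {
        return true;
    }
    std::size_t newCapacity = buf.capacity * 2;
    if (newCapacity - buf.used < extra) {
        newCapacity = buf.used + extra;
    }
    char* grown = static_cast<char*>(std::realloc(buf.data, newCapacity));
    if (grown == 0) {
        return false;  // the old block is still valid; the caller chooses how to fail
    }
    buf.data = grown;
    buf.capacity = newCapacity;
    return true;
}

// Records that 'written' bytes were stored at data + used.
void raw_buffer_commit(RawBuffer& buf, std::size_t written) {
    assert(written <= buf.capacity - buf.used);
    buf.used += written;
}

// Releases the memory and resets the bookkeeping.
void raw_buffer_free(RawBuffer& buf) {
    std::free(buf.data);
    buf.data = 0;
    buf.capacity = 0;
    buf.used = 0;
}
```

Unlike std::vector, nothing here frees the memory automatically, so every exit path has to call raw_buffer_free.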
Answered by Brian D. Coryell
Regarding your comment "I don't see an append()", inserting at the end is the same thing.
vec.insert(vec.end,
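As an illustration of the point (the snippet above is cut off, so this is not a reconstruction of it), appending a whole chunk with the range form of insert might look like this. append_bytes is a hypothetical helper name; note that end() is a member function call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: appends 'length' bytes starting at 'chunk' to the vector.
void append_bytes(std::vector<char>& vec, const char* chunk, std::size_t length) {
    assert(chunk != 0 || length == 0);
    if (length == 0) {
        return;
    }
    // Inserting a range at end() behaves like an append.
    vec.insert(vec.end(), chunk, chunk + length);
}
```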