Memcpy, string and terminator (C++)
Original question: http://stackoverflow.com/questions/5952512/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by Danilo
I have to write a function that fills a char* buffer for an assigned length with the content of a string. If the string is too long, I just have to cut it. The buffer is not allocated by me but by the user of my function. I tried something like this:
int writebuff(char* buffer, int length){
    string text="123456789012345";
    memcpy(buffer, text.c_str(),length);
    //buffer[length]='\0';
    return 1;
}

int main(){
    char* buffer = new char[10];
    writebuff(buffer,10);
    cout << "After: "<<buffer<<endl;
}
My question is about the terminator: should it be there or not? This function is used in a much larger code base, and sometimes I seem to get strange characters when the string needs to be cut.
Any hints on the correct procedure to follow?
Answered by Mark Ransom
A C-style string must be terminated with a zero character, '\0'.
In addition, you have another problem with your code: it may try to copy from beyond the end of your source string. This is classic undefined behavior. It may look like it works, until the one time that the string is allocated at the end of a heap memory block and the copy runs off into a protected area of memory and fails spectacularly. You should copy only up to the minimum of the length of the buffer and the length of the string.
P.S. For completeness, here's a good version of your function. Thanks to Naveen for pointing out the off-by-one error in your terminating null. I've taken the liberty of using your return value to indicate the length of the returned string, or the number of characters required if the length passed in was <= 0.
int writebuff(char* buffer, int length)
{
    string text="123456789012345";
    int size = static_cast<int>(text.size());
    if (length <= 0)
        return size;    // report the number of characters required
    if (size < length)
    {
        // the whole string fits: copy it together with its terminator
        memcpy(buffer, text.c_str(), size+1);
        return size;
    }
    // too long: copy what fits and terminate explicitly
    memcpy(buffer, text.c_str(), length-1);
    buffer[length-1]='\0';
    return length-1;
}
Answered by Naveen
If you want to treat the buffer as a string, you should null-terminate it. To do that, copy length-1 characters using memcpy and set the character at index length-1 to \0.
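The copy-then-terminate approach above can be sketched roughly as follows. The helper name copy_truncated and its signature are hypothetical, not from the original post. The sketch also clamps the copy to the size of the source, an added assumption so that the copy never reads past the end of a short string; when the string is truncated, the terminator lands at index length-1, as the answer describes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper (not from the original post): copies `text` into
// `buffer`, truncating if needed, and always null-terminates.
// Returns the number of characters written, excluding the terminator.
std::size_t copy_truncated(char* buffer, std::size_t length, const std::string& text)
{
    assert(buffer != nullptr);
    if (length == 0)
        return 0;   // no room even for the terminator

    // Copy at most length-1 characters, and never more than the source holds.
    const std::size_t count = std::min(length - 1, text.size());
    std::memcpy(buffer, text.data(), count);
    buffer[count] = '\0';   // terminator goes right after the copied data
    return count;
}
```

With the 10-byte buffer from the question, copy_truncated(buffer, 10, "123456789012345") leaves "123456789" in the buffer and returns 9.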
Answered by Nim
It seems you are using C++. Given that, the simplest approach is (assuming that NUL termination is required by the interface spec):
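int writebuff(char* buffer, int length)
{
    string text = "123456789012345";
    if (length <= 0)
        return 1; // nothing can be written
    std::fill_n(buffer, length, 0); // reset the entire buffer
    // use the built-in copy method from std::string, it will decide what's best.
    text.copy(buffer, length);
    // only over-write the last character if the source fills the whole buffer
    if (static_cast<size_t>(length) <= text.size())
        buffer[length-1] = 0;
    return 1; // eh?
}
Answered by Alok Save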
char* buffers must be null-terminated unless you explicitly pass the length along with the buffer everywhere and state that the buffer is not null-terminated.
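As a rough illustration of the alternative mentioned above, the sketch below fills a buffer without a terminator and hands back the number of valid bytes, so every consumer has to use the (pointer, length) pair. The names fill_unterminated and view_as_string are hypothetical and exist only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: fills `buffer` WITHOUT writing a terminator and
// returns how many bytes of it are valid.
std::size_t fill_unterminated(char* buffer, std::size_t length, const std::string& text)
{
    assert(buffer != nullptr);
    const std::size_t count = std::min(length, text.size());
    std::memcpy(buffer, text.data(), count);
    return count;
}

// Hypothetical helper: rebuilds a std::string from the (pointer, length)
// pair. No terminator is needed because the length is passed explicitly.
std::string view_as_string(const char* buffer, std::size_t count)
{
    return std::string(buffer, count);
}
```

A buffer filled this way must not be streamed with operator<<, which expects a terminator; std::cout.write(buffer, count) prints exactly count bytes instead.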
Answered by jtniehof
I agree with Necrolis that strncpy is the way to go, but it will not write the null terminator if the string is too long. You had the right idea in putting in an explicit terminator, but as written your code puts it one past the end. (This is in C, since you seemed to be doing more C than C++?)
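int writebuff(char* buffer, int length){
    const char* text="123456789012345";
    strncpy(buffer, text, length);
    buffer[length-1]='\0';
    return 1;
}
Answered by Rob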
First, I don't know whether writebuff should terminate the string or not. That is a design question, to be answered by the person who decided that writebuff should exist at all.
Second, taking your specific example as a whole, there are two problems. One is that you pass an unterminated string to operator<<(ostream, char*). The second is that the commented-out line writes beyond the end of the indicated buffer. Both of these invoke undefined behavior.
(Third is a design flaw: can you know that length is always less than the length of text?)
Try this:
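int writebuff(char* buffer, int length){
    string text="123456789012345";
    memcpy(buffer, text.c_str(),length);
    buffer[length-1]='\0';
    return 1;
}

int main(){
    char* buffer = new char[10];
    writebuff(buffer,10);
    cout << "After: "<<buffer<<endl;
}
Answered by Necrolis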
It should most definitely be there*; this prevents strings that are too long for the buffer from filling it completely and causing an overflow later on when it's accessed. Though IMO strncpy should be used instead of memcpy, you'll still have to null-terminate it. (Also, your example leaks memory.)
*If you're ever in doubt, go the safest route!
Answered by Nawaz
"my question is about the terminator: should it be there or not?"
Yes. It should be there. Otherwise, how would you later know where the string ends? And how would cout know? It would keep printing garbage until it encounters a byte whose value happens to be \0. Your program might even crash.
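To make the point concrete: a consumer such as cout has only the pointer and scans forward until it finds a \0. When the capacity of the buffer is known, a bounded check can tell whether a terminator is present at all. This is a minimal sketch; the helper name is_terminated is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns true if a '\0' occurs somewhere within the
// first `capacity` bytes of `buffer`, i.e. the buffer holds a valid
// C-style string that ends inside its own storage.
bool is_terminated(const char* buffer, std::size_t capacity)
{
    assert(buffer != nullptr);
    // memchr never looks beyond `capacity` bytes, unlike strlen.
    return std::memchr(buffer, '\0', capacity) != nullptr;
}
```

Such a check is useful in debug builds or tests, right after calling a function like writebuff, to catch a missing terminator before the buffer is printed.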
As a side note, your program is leaking memory. It doesn't free the memory it allocates. But since you're exiting from main(), it doesn't matter much; after all, once the program ends, all the memory goes back to the OS, whether you deallocate it or not. Still, it's good practice in general not to forget to deallocate memory (or any other resource) yourself.
Answered by rlc
Whether or not you should terminate the string with a \0 depends on the specification of your writebuff function. If what you have in buffer should be a valid C-style string after calling your function, you should terminate it with a \0.
Note, though, that c_str() will terminate with a \0 for you, so you could use text.size() + 1 as the size of the source string. Also note that if length is larger than the size of the string, you will copy further than what text provides with your current code (you can use min(length - 2, text.size() + 1/*trailing \0*/) to prevent that, and set buffer[length - 1] = 0 to cap it off).
The buffer allocated in main is leaked, btw.
Answered by JohnMcG
In main(), you should delete the buffer you allocated with new (using delete[], since it came from new[]), or allocate it statically (char buf[10]). Yes, it's only 10 bytes, and yes, it's a memory "pool," not a leak, since it's a one-time allocation, and yes, you need that memory around for the entire running time of the program. But it's still a good habit to get into.
In C/C++ the general contract with character buffers is that they be null-terminated, so I would include the terminator unless I had been explicitly told not to. And if I left it out, I would comment it, and maybe even use a typedef or name on the char * parameter indicating that the result is a string that is not null-terminated.
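The options for avoiding the leak can be sketched as below. The function fill is a hypothetical stand-in for writebuff (simplified so the example is self-contained), and the three with_* functions are illustrative names only. Each one gives the buffer an owner that releases it automatically, so no explicit delete[] is needed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for writebuff: writes a truncated, terminated
// copy of a fixed string into `buffer`.
void fill(char* buffer, std::size_t length)
{
    assert(buffer != nullptr && length > 0);
    const char text[] = "123456789012345";
    const std::size_t count = std::min(length - 1, sizeof text - 1);
    std::memcpy(buffer, text, count);
    buffer[count] = '\0';
}

std::string with_stack_array()
{
    char buf[10];                               // automatic storage, released on return
    fill(buf, sizeof buf);
    return buf;
}

std::string with_unique_ptr()
{
    std::unique_ptr<char[]> buf(new char[10]);  // delete[] runs automatically
    fill(buf.get(), 10);
    return buf.get();
}

std::string with_vector()
{
    std::vector<char> buf(10);                  // vector owns and frees its storage
    fill(buf.data(), buf.size());
    return buf.data();
}
```

All three variants return "123456789" for the 10-byte buffer used in the question; which one fits best depends on whether the size is known at compile time and how long the buffer has to live.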