Notice: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow post: http://stackoverflow.com/questions/5999418/

Time: 2020-09-02 08:39:17  Source: igfitidea

Why is strtok() Considered Unsafe?

Tags: c, security, strtok

Asked by user541686

What feature(s) of strtok are unsafe (in terms of buffer overflow) that I need to watch out for?

What's a little weird to me is that strtok_s (which is "safe") in Visual C++ has an extra "context" parameter, but it looks like it's the same in other ways... is it the same, or is it actually different?

Accepted answer by Heisenbug

According to the strtok_s section of this document:

6.7.3.1 The strtok_s function

The strtok_s function fixes two problems in the strtok function:

  1. A new parameter, s1max, prevents strtok_s from storing outside of the string being tokenized. (The string being divided into tokens is both an input and output of the function since strtok_s stores null characters into the string.)
  2. A new parameter, ptr, eliminates the static internal state that prevents strtok from being re-entrant (Subclause 1.1.12). (The ISO/IEC 9899 function wcstok and the ISO/IEC 9945 (POSIX) function strtok_r fix this problem identically.)
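
As a rough illustration of the second point, here is a minimal sketch using the POSIX function strtok_r, which the quoted text says fixes the re-entrancy problem in the same way. It assumes a POSIX system; the helper name count_tokens is hypothetical and not part of any library. The s1max bounds check is not shown, since it belongs to the strtok_s interface described in the quoted document rather than to strtok_r.

```c
#define _POSIX_C_SOURCE 200809L  /* expose strtok_r on POSIX systems */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the tokens in buf, modifying buf in place.
 * The parsing state lives in the local variable `save`, not in storage
 * hidden inside the library. */
size_t count_tokens(char *buf, const char *delims)
{
    assert(buf != NULL && delims != NULL);

    char *save = NULL;   /* plays the role of the `ptr` parameter */
    size_t n = 0;

    for (char *tok = strtok_r(buf, delims, &save);
         tok != NULL;
         tok = strtok_r(NULL, delims, &save)) {
        n++;
    }
    return n;
}
```

Because every call passes its own save pointer, two callers (or two threads) can tokenize different strings without sharing any state.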

Answer by Bob

There is nothing unsafe about it. You just need to understand how it works and how to use it. After you write your code and unit tests, it only takes a couple of extra minutes to re-run the unit tests with valgrind to make sure you are operating within memory bounds. The man page says it all:

BUGS

Be cautious when using these functions. If you do use them, note that:

  • These functions modify their first argument.
  • These functions cannot be used on constant strings.
  • The identity of the delimiting character is lost.
  • The strtok() function uses a static buffer while parsing, so it's not thread safe. Use strtok_r() if this matters to you.
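
The first three points can be seen in a small sketch like the following, which tokenizes a writable copy so that a constant string is never passed to strtok. The helper name count_tokens_in_copy and its scratch-buffer parameters are assumptions made for illustration only.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: tokenizes a copy of `text`, leaving `text` untouched.
 * Returns the number of tokens, or -1 if the scratch buffer is too small. */
int count_tokens_in_copy(const char *text, const char *delims,
                         char *scratch, size_t scratch_size)
{
    assert(text != NULL && delims != NULL && scratch != NULL);

    size_t len = strlen(text);
    if (len >= scratch_size)
        return -1;                    /* no room for text plus '\0' */
    memcpy(scratch, text, len + 1);   /* writable copy, terminator included */

    int n = 0;
    /* strtok overwrites each delimiter that ends a token with '\0',
     * so afterwards scratch no longer records which delimiter was there. */
    for (char *tok = strtok(scratch, delims); tok != NULL;
         tok = strtok(NULL, delims)) {
        n++;
    }
    return n;
}
```

After a call such as count_tokens_in_copy("a,b;c", ",;", scratch, sizeof scratch), the original literal is intact, while scratch holds the tokens separated by null characters.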

Answer by Vladislav Vaintroub

strtok is thread-safe in Visual C++ (but nowhere else), as it uses thread-local storage to save its state between calls. Everywhere else, a global variable is used to save the strtok() state.

However, even in VC++, where strtok is thread-safe, it is still a bit weird: you cannot use strtok() on different strings in the same thread at the same time. For example, this would not work well:

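     #include <stdio.h>
     #include <string.h>

     int main(void)
     {
        char string[] = "a b c";    /* illustrative inputs, not from the original answer */
        char string2[] = "x y";
        const char *seps = " ";
        char *token, *token2;

        token = strtok( string, seps );
        while(token)
        {
           printf("token=%s\n", token);
           token2 = strtok(string2, seps);   /* overwrites the state saved for string */
           while(token2)
           {
               printf("token2=%s\n", token2);
               token2 = strtok( NULL, seps );
           }
           token = strtok( NULL, seps );     /* continues from string2's state, not string's */
        }
        return 0;
     }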

The reason it would not work well: each thread can save only a single state in thread-local storage, and here one would need two states, one for the first string and one for the second. So while strtok is thread-safe with VC++, it is not reentrant.

What strtok_s (or strtok_r everywhere else) provides is an explicit state, and with that, tokenizing becomes reentrant.
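
A minimal sketch of nested tokenizing with explicit state might look like this. It assumes a POSIX system with strtok_r; the helper name count_fields and the record/field layout are hypothetical and chosen only to show two tokenizing loops running at once.

```c
#define _POSIX_C_SOURCE 200809L  /* expose strtok_r on POSIX systems */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits `records` on `outer_seps`, then splits each
 * record on `inner_seps`, and returns the total number of inner fields.
 * Each loop keeps its own state, so neither disturbs the other. */
size_t count_fields(char *records, const char *outer_seps,
                    const char *inner_seps)
{
    assert(records != NULL && outer_seps != NULL && inner_seps != NULL);

    char *outer_state = NULL;   /* state for the outer loop */
    char *inner_state = NULL;   /* separate state for the inner loop */
    size_t fields = 0;

    for (char *rec = strtok_r(records, outer_seps, &outer_state);
         rec != NULL;
         rec = strtok_r(NULL, outer_seps, &outer_state)) {
        for (char *field = strtok_r(rec, inner_seps, &inner_state);
             field != NULL;
             field = strtok_r(NULL, inner_seps, &inner_state)) {
            fields++;
        }
    }
    return fields;
}
```

With Visual C++, strtok_s takes the same three arguments (string, delimiters, context pointer), so the same structure should carry over by swapping the function name.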

Answer by Suroot

If the string is not properly null-terminated, strtok will read past the end of the buffer (a buffer overflow). Also note (this is something I learned the hard way) that strtok does NOT care about quoted strings inside the input. For example, with / as the delimiter, "hello"/"world" is split into "hello" and "world", whereas "hello/world" is split into "hello and world". Notice that it splits on the / and ignores the fact that it is inside quotation marks.
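
The quoting behaviour can be checked with a small sketch like the one below. The helper name split_on_slash and its output-array parameters are assumptions for illustration; the only library call used is strtok itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits buf on '/' in place, stores up to max_out
 * token pointers in out, and returns how many were stored. Quotation
 * marks are treated as ordinary characters, because strtok has no
 * notion of quoting. */
size_t split_on_slash(char *buf, char *out[], size_t max_out)
{
    assert(buf != NULL && out != NULL);

    size_t n = 0;
    for (char *tok = strtok(buf, "/");
         tok != NULL && n < max_out;
         tok = strtok(NULL, "/")) {
        out[n++] = tok;
    }
    return n;
}
```

Given a buffer containing "hello/world" with the quotation marks included, this yields two tokens, each carrying one stray quotation mark. Input that needs quoting rules calls for a purpose-built parser rather than strtok.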
