C: How does the strtok function in C work?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original: http://stackoverflow.com/questions/21097253/

Time: 2020-09-02 10:36:59  Source: igfitidea

How does the strtok function in C work?

Tags: c, strtok

Asked by user2426316

I found this sample program which explains the strtok function:

#include <stdio.h>
#include <string.h>

int main ()
{
    char str[] ="- This, a sample string.";
    char * pch;
    printf ("Splitting string \"%s\" into tokens:\n",str);
    pch = strtok (str," ,.-");
    while (pch != NULL)
    {
        printf ("%s\n",pch);
        pch = strtok (NULL, " ,.-");
    }
    return 0;
}

However, I don't see how this can work.

How is it possible that pch = strtok (NULL, " ,.-"); returns a new token? I mean, we are calling strtok with NULL. This doesn't make a lot of sense to me.

Answered by Floris

There are two things to know about strtok. As was mentioned, it "maintains internal state". Also, it messes up the string you feed it. Essentially, it writes a '\0' where it finds one of the delimiters you supplied, and returns a pointer to the start of the token. Internally it remembers where the last token ended, and the next time you call it, it starts from there.


The important corollary is that you cannot use strtok on a const char* "hello world"; type of string, since modifying the contents of a string literal is undefined behavior and will typically give you an access violation.


The "good" thing about strtok is that it doesn't actually copy strings - so you don't need to manage additional memory allocation etc. But unless you understand the above, you will have trouble using it correctly.

Example - if you have "this,is,a,string", successive calls to strtok will generate pointers as follows (the ^ marks the value returned). Note that a '\0' is written where the delimiters are found; this means the source string is modified:

t  h  i  s  ,  i  s  ,  a  ,  s  t  r  i  n  g  \0         this,is,a,string

t  h  i  s  \0 i  s  ,  a  ,  s  t  r  i  n  g  \0         this
^
t  h  i  s  \0 i  s  \0 a  ,  s  t  r  i  n  g  \0         is
               ^
t  h  i  s  \0 i  s  \0 a  \0 s  t  r  i  n  g  \0         a
                        ^
t  h  i  s  \0 i  s  \0 a  \0 s  t  r  i  n  g  \0         string
                              ^

Hope it makes sense.


Answered by Sean

strtok maintains internal state. When you call it with a non-NULL pointer, it re-initializes itself to use the string you supply. When you call it with NULL, it uses that string, and any other state it currently has, to return the next token.


Because of the way strtok works, you need to ensure that you link with a multithreaded version of the C runtime if you're writing a multithreaded application. On runtimes that keep this state per thread (Microsoft's C runtime does), this ensures that each thread gets its own internal state for strtok; the C standard itself does not require strtok to be thread-safe.

Answered by Andy Thomas

The strtok() function stores data between calls. It uses that data when you call it with a NULL pointer.

From http://www.cplusplus.com/reference/cstring/strtok/:


The point where the last token was found is kept internally by the function to be used on the next call (particular library implementations are not required to avoid data races).


Answered by Juan Ramirez

The strtok function stores data in an internal static variable which is shared among all threads.

For thread safety you should use strtok_r.


From http://www.opensource.apple.com/source/Libc/Libc-167/string.subproj/strtok.c


Take a look at static char *last;

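char *
strtok(s, delim)
    register char *s;
    register const char *delim;
{
    register char *spanp;
    register int c, sc;
    char *tok;
    static char *last;

    if (s == NULL && (s = last) == NULL)
        return (NULL);

    /*
     * Skip (span) leading delimiters (s += strspn(s, delim), sort of).
     */
cont:
    c = *s++;
    for (spanp = (char *)delim; (sc = *spanp++) != 0;) {
        if (c == sc)
            goto cont;
    }

    if (c == 0) {       /* no non-delimiter characters */
        last = NULL;
        return (NULL);
    }
    tok = s - 1;

    /*
     * Scan token (scan for delimiters: s += strcspn(s, delim), sort of).
     * Note that delim must have one NUL; we stop if we see that, too.
     */
    for (;;) {
        c = *s++;
        spanp = (char *)delim;
        do {
            if ((sc = *spanp++) == c) {
                if (c == 0)
                    s = NULL;
                else
                    s[-1] = 0;
                last = s;
                return (tok);
            }
        } while (sc != 0);
    }
    /* NOTREACHED */
}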