Notice: This page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me), citing the original address and author information. StackOverflow original: http://stackoverflow.com/questions/14367727/

Time: 2020-09-02 05:04:00  Source: igfitidea

How does strchr implementation work

Tags: c, pointers, const, strchr

Asked by Marc

I tried to write my own implementation of the strchr() method.

It now looks like this:

char *mystrchr(const char *s, int c) {
    while (*s != (char) c) {
        if (!*s++) {
            return NULL;
        }
    }
    return (char *)s;
}

The last line originally was

return s;

But this didn't work because s is const. I found out that there needs to be this cast (char *), but I honestly don't know what I am doing there :( Can someone explain?

Answered by Keith Thompson

I believe this is actually a flaw in the C Standard's definition of the strchr() function. (I'll be happy to be proven wrong.) (Replying to the comments, it's arguable whether it's really a flaw; IMHO it's still poor design. It can be used safely, but it's too easy to use it unsafely.)

Here's what the C standard says:

char *strchr(const char *s, int c);

The strchr function locates the first occurrence of c (converted to a char) in the string pointed to by s. The terminating null character is considered to be part of the string.
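
The last sentence of that description has a practical consequence: searching for '\0' is valid and yields a pointer to the terminator rather than NULL. A minimal sketch of that behavior, using a hypothetical helper name (length_via_strchr is not a standard function):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, for illustration only: because the terminating
   null character counts as part of the string, strchr(s, '\0') never
   returns NULL for a valid string and points at the terminator. */
size_t length_via_strchr(const char *s) {
    const char *end = strchr(s, '\0');
    return (size_t)(end - s);
}
```

For any valid string this should agree with strlen(s).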

Which means that this program:

#include <stdio.h>
#include <string.h>

int main(void) {
    const char *s = "hello";
    char *p = strchr(s, 'l');
    *p = 'L';
    return 0;
}

even though it carefully defines the pointer to the string literal as a pointer to const char, has undefined behavior, since it modifies the string literal. gcc, at least, doesn't warn about this, and the program dies with a segmentation fault.

The problem is that strchr() takes a const char* argument, which means it promises not to modify the data that s points to -- but it returns a plain char*, which permits the caller to modify the same data.

Here's another example; it quietly modifies a const-qualified object without any casts (I first thought this did not have undefined behavior, but on further thought I believe it does):

#include <stdio.h>
#include <string.h>

int main(void) {
    const char s[] = "hello";
    char *p = strchr(s, 'l');
    *p = 'L';
    printf("s = \"%s\"\n", s);
    return 0;
}
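
Both programs above are accepted without a cast only because the result of strchr() is stored in a plain char *. A caller who wants the compiler's help can keep the qualifier on the result instead. A minimal sketch, assuming a hypothetical helper named first_offset (not part of the original answer):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, for illustration only: returns the offset of the
   first occurrence of c in s, or -1 if c does not occur. */
ptrdiff_t first_offset(const char *s, int c) {
    /* Storing the result in a pointer to const keeps the promise made by
       the parameter type; an assignment such as *p = 'L' would then be
       rejected by the compiler. */
    const char *p = strchr(s, c);
    return p != NULL ? p - s : -1;
}
```

The discipline is entirely on the caller's side: nothing in the declaration of strchr() enforces it.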

Which means, I think, (to answer your question) that a C implementation of strchr() has to cast its result to convert it from const char* to char*, or do something equivalent.

This is why C++, in one of the few changes it makes to the C standard library, replaces strchr() with two overloaded functions of the same name:

const char * strchr ( const char * str, int character );
      char * strchr (       char * str, int character );

Of course C can't do this.
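
C has no overloading, but since C11 a macro built on _Generic can approximate the C++ pair by selecting a function from the argument's type. The following is only a sketch: find_char, find_char_const and find_char_mut are hypothetical names, not standard functions.

```c
#include <string.h>

/* Hypothetical names, for illustration only. */
static inline const char *find_char_const(const char *s, int c) {
    return strchr(s, c);   /* char * converts implicitly to const char * */
}

static inline char *find_char_mut(char *s, int c) {
    return strchr(s, c);
}

/* &(s)[0] turns an array argument into a pointer to its first element,
   so the selection sees either char * or const char *. */
#define find_char(s, c) _Generic(&(s)[0], \
        const char *: find_char_const,    \
        char *: find_char_mut)((s), (c))
```

With this, find_char applied to a const char array yields a const char *, so writing through the result no longer compiles without a cast, while a plain char array still gives back a modifiable pointer.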

An alternative would have been to replace strchr by two functions, one taking a const char* and returning a const char*, and another taking a char* and returning a char*. Unlike in C++, the two functions would have to have different names, perhaps strchr and strcchr.

(Historically, const was added to C after strchr() had already been defined. This was probably the only way to keep strchr() without breaking existing code.)

strchr() is not the only C standard library function that has this problem. The list of affected functions (I think this list is complete but I don't guarantee it) is:

void *memchr(const void *s, int c, size_t n);
char *strchr(const char *s, int c);
char *strpbrk(const char *s1, const char *s2);
char *strrchr(const char *s, int c);
char *strstr(const char *s1, const char *s2);

(all declared in <string.h>) and:

void *bsearch(const void *key, const void *base,
    size_t nmemb, size_t size,
    int (*compar)(const void *, const void *));

(declared in <stdlib.h>). All these functions take a pointer to const data that points to the initial element of an array, and return a non-const pointer to an element of that array.
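
bsearch() can be handled by the caller in the same way: keeping the result in a pointer to const preserves the qualifier that the return type drops. A minimal sketch, assuming a hypothetical helper named find_int and a sorted array of int:

```c
#include <stddef.h>
#include <stdlib.h>

/* Comparison callback for bsearch(): orders two ints. */
static int compare_ints(const void *a, const void *b) {
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper, for illustration only: looks up key in a sorted
   const array and keeps the result const-qualified. */
const int *find_int(const int *sorted, size_t n, int key) {
    /* bsearch() returns void *, which converts implicitly; choosing
       const int * here restores the qualifier of the input array. */
    return bsearch(&key, sorted, n, sizeof *sorted, compare_ints);
}
```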

Answered by AnT

The practice of returning non-const pointers to const data from non-modifying functions is actually an idiom rather widely used in C language. It is not always pretty, but it is rather well established.

The rationale here is simple: strchr by itself is a non-modifying operation. Yet we need strchr functionality for both constant strings and non-constant strings, which would also propagate the constness of the input to the constness of the output. Neither C nor C++ provides any elegant support for this concept, meaning that in both languages you will have to write two virtually identical functions in order to avoid taking any risks with const-correctness.

In C++ you would be able to use function overloading by declaring two functions with the same name

const char *strchr(const char *s, int c);
char *strchr(char *s, int c);

In C you have no function overloading, so in order to fully enforce const-correctness in this case you would have to provide two functions with different names, something like

const char *strchr_c(const char *s, int c);
char *strchr(char *s, int c);

Although in some cases this might be the right thing to do, it is typically (and rightfully) considered too cumbersome and involved by C standards. You can resolve this situation in a more compact (albeit more risky) way by implementing only one function

char *strchr(const char *s, int c);

which returns a non-const pointer into the input string (by using a cast at the exit, exactly as you did it). Note that this approach does not violate any rules of the language, although it provides the caller with the means to violate them. By casting away the constness of the data this approach simply delegates the responsibility to observe const-correctness from the function itself to the caller. As long as the caller is aware of what's going on and remembers to "play nice", i.e. uses a const-qualified pointer to point to const data, any temporary breaches in the wall of const-correctness created by such a function are repaired instantly.

I see this trick as a perfectly acceptable approach to reducing unnecessary code duplication (especially in absence of function overloading). The standard library uses it. You have no reason to avoid it either, assuming you understand what you are doing.

Now, as for your implementation of strchr, it looks weird to me from the stylistic point of view. I would use the loop header to iterate over the full range we are operating on (the full string), and use the inner if to catch the early termination condition

for (; *s != '\0'; ++s)
    if (*s == c)
        return (char *) s;

return NULL;

But things like that are always a matter of personal preference. Someone might prefer to just

for (; *s != '\0' && *s != c; ++s)
    ;

return *s == c ? (char *) s : NULL;

Some might say that modifying function parameter (s) inside the function is a bad practice.

Answered by Alberto Miranda

The const keyword means that the data the parameter points to cannot be modified through it.

You couldn't return s directly because s is declared as const char *s and the return type of the function is char *. If the compiler allowed you to do that, it would be possible to override the const restriction.

Adding an explicit cast to char * tells the compiler that you know what you're doing (though as Eric explained, it would be better if you didn't do it).

UPDATE: For the sake of context I'm quoting Eric's answer, since he seems to have deleted it:

You should not be modifying s since it is a const char *.

Instead, define a local variable that represents the result of type char * and use that in place of s in the method body.

Answered by A B

The function return value should be a constant pointer to a character:

strchr accepts a const char * and should return const char * also. You are returning a non-const pointer, which is potentially dangerous since the return value points into the input character array (the caller might be expecting the constant argument to remain constant, but it is modifiable if any part of it is returned as a char * pointer).

The function return value should be NULL if no matching character is found:

Also, strchr is supposed to return NULL if the sought character is not found. If it returns non-NULL when the character is not found, or s in this case, the caller (if he thinks the behavior is the same as strchr) might assume that the first character in the result actually matches (without the NULL return value there is no way to tell whether there was a match or not).

(I'm not sure if that is what you intended to do.)

Here is an example of a function that does this:

I wrote and ran several tests on this function; I added a few really obvious sanity checks to avoid potential crashes:

#include <string.h>   /* strlen, size_t, NULL */

const char *mystrchr1(const char *s, int c) {
    /* Sanity checks: reject a null string and values outside 0..255. */
    if (s == NULL) {
        return NULL;
    }
    if ((c > 255) || (c < 0)) {
        return NULL;
    }
    size_t s_len;
    size_t i;
    s_len = strlen(s);
    /* Note: unlike the standard strchr(), the terminating '\0' is never matched. */
    for (i = 0; i < s_len; i++) {
        if ((char) c == s[i]) {
            return (const char*) &s[i];
        }
    }
    return NULL;
}
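
A const-returning search can also be written in a single pass that follows the standard's description quoted earlier: c is converted to char and the terminator counts as part of the string. A minimal sketch; the name const_strchr is hypothetical, and the NULL check is an extra that the standard function does not perform:

```c
#include <stddef.h>

/* Hypothetical function, for illustration only. */
const char *const_strchr(const char *s, int c) {
    const char target = (char) c;   /* the standard converts c to char */

    if (s == NULL) {                /* extra check; strchr() itself has none */
        return NULL;
    }
    for (;; ++s) {
        if (*s == target) {
            return s;               /* also matches the terminator when c is 0 */
        }
        if (*s == '\0') {
            return NULL;            /* reached the end without a match */
        }
    }
}
```

Because the return type is const char *, a caller that really needs to modify the string has to write the cast itself, which makes the decision visible at the call site.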