C language: string input and output in C

Question

Asked by rookie

I have this snippet of the code:

char* receiveInput(){
    char *s;
    scanf("%s",s);

    return s;
}

int main()
{
    char *str = receiveInput();
    int length = strlen(str);

    printf("Your string is %s, length is %d\n", str, length);

    return 0;
}

I receive this output:

Your string is hellà?", length is 11

my input was:

helloworld!

can somebody explain why, and why this style of the coding is bad, thanks in advance

Answer 1

Answered by Chris Lutz

Several answers have addressed what you've done wrong and how to fix it, but you also said (emphasis mine):

can somebody explain why, and why this style of the coding is bad

I think scanf is a terrible way to read input. It's inconsistent with printf, makes it easy to forget to check for errors, makes it hard to recover from errors, and is incompatible with ordinary (and easier to do correctly) read operations (like fgets and company).

First, note that the "%s" format will read only until it sees whitespace. Why whitespace? Why does "%s" print out an entire string, but read in strings in such a limited capacity?

If you'd like to read in an entire line, as you may often be wont to do, scanf provides... with "%[^\n]". What? What is that? When did this become Perl?

But the real problem is that neither of those is safe. They both freely overflow with no bounds checking. Want bounds checking? Okay, you got it: "%10s" (and "%10[^\n]" is starting to look even worse). That will read at most 10 characters and add a terminating nul character automatically, so the array has to hold at least 11 bytes. So that's good... for when our array size never needs to change.
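
To make the width rule concrete, here is a minimal sketch that uses sscanf so the input can be a plain string. The helper name first_word_length is made up for illustration; the point it tries to show is that the field width counts stored characters, so a width of 10 needs an 11-byte array.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies the first whitespace-delimited word of
   `input` into a fixed 11-byte buffer and returns its length,
   or 0 if no word was found.
   "%10s" stores at most 10 characters plus the terminating nul. */
size_t first_word_length(const char *input)
{
    char word[11]; /* 10 characters + 1 for the nul */

    if (sscanf(input, "%10s", word) != 1)
        return 0;
    return strlen(word);
}
```

Called with "helloworld!", this should return 10: the trailing "!" is simply left unread instead of overflowing the buffer.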

What if we want to pass the size of our array as an argument to scanf? printf can do this:

char string[] = "Hello, world!";
// the precision argument for "%.*s" must be an int, hence the cast
printf("%.*s\n", (int)sizeof string, string); // prints whole message
printf("%.*s\n", 6, string); // prints just "Hello,"

Want to do the same thing with scanf? Here's how:

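static char tmp[/*bit twiddling to get the log10 of SIZE_MAX plus a few*/];
// if we did the math right we shouldn't need to use snprintf
// assumes bufsize is a size_t; the width is bufsize - 1 because scanf
// stores the terminating nul in addition to the characters it reads
snprintf(tmp, sizeof tmp, "%%%zus", bufsize - 1);
scanf(tmp, buffer);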

That's right - scanf doesn't support the "%.*s" variable precision that printf does (in scanf, "*" means "read but don't store"), so to do dynamic bounds checking with scanf we have to construct our own format string in a temporary buffer. This is all kinds of bad, and even though it's actually safe here it will look like a really bad idea to anyone just dropping in.

Meanwhile, let's look at another world. Let's look at the world of fgets. Here's how we read in a line of data with fgets:

fgets(buffer, bufsize, stdin);

Infinitely less headache, no wasted processor time converting an integer precision into a string that will only be reparsed by the library back into an integer, and all the relevant elements are sitting there on one line for us to see how they work together.

Granted, this may not read an entire line. It will only read an entire line if the line is shorter than bufsize - 1 characters. Here's how we can read an entire line:

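#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Reads one line of any length; the caller must free() the result.
// Returns NULL on allocation failure, read error, or EOF with nothing read.
char *readline(FILE *file)
{
    size_t size  = 80; // start off small
    size_t curr  = 0;
    char *buffer = malloc(size);
    if(buffer == NULL) return NULL;
    while(fgets(buffer + curr, size - curr, file))
      {
        if(strchr(buffer + curr, '\n')) return buffer; // success
        curr += strlen(buffer + curr); // skip what we've already read
        size *= 2;
        char *tmp = realloc(buffer, size);
        if(tmp == NULL) { free(buffer); return NULL; }
        buffer = tmp;
      }
    if(curr > 0 && !ferror(file)) return buffer; // last line had no '\n'
    free(buffer);
    return NULL;
}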

The curr variable is an optimization to prevent us from rechecking data we've already read, and is unnecessary (although useful as we read more data). We could even use the return value of strchr to strip off the ending "\n" character if you prefer.
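
Stripping the newline with the pointer that strchr returns could look roughly like this. The helper name strip_newline is an assumption for illustration, not part of the answer's code.

```c
#include <string.h>

/* Hypothetical helper: removes the first '\n' in `line` in place
   (for a line read by fgets, that is the trailing newline).
   Returns 1 if a newline was removed, 0 otherwise. */
int strip_newline(char *line)
{
    char *nl = strchr(line, '\n');

    if (nl == NULL)
        return 0;
    *nl = '\0'; /* overwrite the newline with the terminator */
    return 1;
}
```

The return value doubles as a "did we get a complete line?" flag, which is the same test readline uses to decide whether to keep reading.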

Notice also that size_t size = 80; as a starting place is completely arbitrary. We could use 81, or 79, or 100, or add it as a user-supplied argument to the function. We could even add a size_t (*inc)(size_t) argument, and change size *= 2; to size = inc(size);, allowing the user to control how fast the array grows. These can be useful for efficiency, when reallocations get costly and boatloads of lines of data need to be read and processed.
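
As a rough sketch of that idea, the variant below takes both the starting size and the growth policy as arguments. The names readline_with, grow_double and grow_by_64 are invented here, and the callback uses size_t rather than int so that it matches the type of size; it also assumes the starting size is at least 2 and that inc always returns a larger size.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Two hypothetical growth policies. */
size_t grow_double(size_t size) { return size * 2; }
size_t grow_by_64(size_t size)  { return size + 64; }

/* Variant of readline with a caller-chosen start size (assumed >= 2)
   and growth policy (assumed to return a larger size).
   The caller must free() the result. Returns NULL on allocation
   failure, read error, or EOF with nothing read. */
char *readline_with(FILE *file, size_t start, size_t (*inc)(size_t))
{
    size_t size = start;
    size_t curr = 0;
    char *buffer = malloc(size);

    if (buffer == NULL)
        return NULL;
    while (fgets(buffer + curr, (int)(size - curr), file)) {
        if (strchr(buffer + curr, '\n'))
            return buffer;              /* got a complete line */
        curr += strlen(buffer + curr);  /* skip what we already have */
        size = inc(size);               /* let the caller decide growth */
        char *tmp = realloc(buffer, size);
        if (tmp == NULL) {
            free(buffer);
            return NULL;
        }
        buffer = tmp;
    }
    if (curr > 0 && !ferror(file))
        return buffer;                  /* final line without '\n' */
    free(buffer);
    return NULL;
}
```

A caller that expects mostly short lines might use readline_with(stdin, 32, grow_double), while one that wants predictable memory use could pass grow_by_64 instead.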

We could write the same with scanf, but think of how many times we'd have to rewrite the format string. We could limit it to a constant increment, instead of the doubling (easily) implemented above, and never have to adjust the format string; we could give in and just store the number, do the math as above, and use snprintf to convert it to a format string every time we reallocate so that scanf can convert it back to the same number; we could limit our growth and starting position in such a way that we can manually adjust the format string (say, just increment the digits), but this could get hairy after a while and may require recursion (!) to work cleanly.

Furthermore, it's hard to mix reading with scanf and reading with other functions. Why? Say you want to read an integer from a line, then read a string from the next line. You try this:

int i;
char buf[BUFSIZE];
scanf("%i", &i);
fgets(buf, BUFSIZE, stdin);

With the input "2" followed by a newline, that will read the "2", but then fgets will read an empty line because scanf didn't read the newline! Okay, take two:

...
scanf("%i\n", &i);
...

You think this eats up the newline, and it does - but it also eats up leading whitespace on the next line, because scanf can't tell the difference between newlines and other forms of whitespace. (Also, it turns out you're writing a Python parser, and leading whitespace in lines is important.) To make this work, you have to call getchar or something to read in the newline and throw it away:

...
scanf("%i", &i);
getchar();
...

Isn't that silly? What happens if you use scanf in a function, but don't call getchar because you don't know whether the next read is going to be scanf or something saner (or whether or not the next character is even going to be a newline)? Suddenly the best way to handle the situation seems to be to pick one or the other: do we use scanf exclusively and never have access to fgets-style full-control input, or do we use fgets exclusively and make it harder to perform complex parsing?

Actually, the answer is we don't. We use fgets (or other non-scanf functions) exclusively, and when we need scanf-like functionality, we just call sscanf on the strings! We don't need to have scanf mucking up our file streams unnecessarily! We can have all the precise control over our input we want and still get all the functionality of scanf formatting. And even if we couldn't, many scanf format options have near-direct corresponding functions in the standard library, like the infinitely more flexible strtol and strtod functions (and friends). Plus, i = strtoumax(str, NULL, 10) for C99 sized integer types is a lot cleaner looking than scanf("%" SCNuMAX, &i);, and a lot safer (we can use that strtoumax line unchanged for smaller types and let the implicit conversion handle the extra bits, but with scanf we have to make a temporary uintmax_t to read into).
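
Here is a minimal sketch of that fgets-then-sscanf pattern applied to the "integer on one line, string on the next" case from earlier. The function name read_int_then_line and the 128-byte scratch buffer are assumptions made for the example.

```c
#include <stdio.h>

/* Hypothetical helper: reads an integer from one line of `in`, then
   copies the following line, untouched, into `next` (capacity
   `nextsize`). Returns 1 on success, 0 on any failure. */
int read_int_then_line(FILE *in, int *value, char *next, int nextsize)
{
    char line[128]; /* assumed long enough for the number line */

    if (fgets(line, sizeof line, in) == NULL)
        return 0;               /* EOF or read error */
    if (sscanf(line, "%i", value) != 1)
        return 0;               /* line did not start with a number */
    if (fgets(next, nextsize, in) == NULL)
        return 0;               /* no second line */
    return 1;
}
```

Because fgets consumes the whole first line, newline included, the second line arrives with its leading whitespace intact and no getchar call is needed.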

The moral of this story: avoid scanf. If you need the formatting it provides, and don't want to (or can't) do it (more efficiently) yourself, use fgets/sscanf.
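
For the "do it yourself" route, a strtol-based parser might look something like the sketch below. The name parse_int and its exact rejection rules (no digits, out of range, trailing junk) are choices made for this example rather than anything the standard library dictates.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parses a base-10 int from `str`.
   Returns 1 and stores the value on success; returns 0 if there are
   no digits, if the value does not fit in an int, or if anything
   other than whitespace follows the number. */
int parse_int(const char *str, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(str, &end, 10);
    if (end == str)
        return 0;                       /* no digits at all */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;                       /* out of range */
    while (*end == ' ' || *end == '\t' || *end == '\n')
        end++;                          /* allow trailing whitespace */
    if (*end != '\0')
        return 0;                       /* trailing junk such as "12x" */
    *out = (int)v;
    return 1;
}
```

Fed a line from fgets, this reports every failure through its return value, which is the kind of error recovery that is awkward to get out of scanf.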

Answer 2

Answered by peoro

scanf doesn't allocate memory for you.

You need to allocate memory for the variable passed to scanf.

You could do it like this:

char* receiveInput(){
    char *s = (char*) malloc( 100 );
    scanf("%s",s);
    return s;
}

But be warned:

The function that calls receiveInput takes ownership of the returned memory: you'll have to free(str) after you print it in main. (Giving ownership away like this is usually not considered good practice.)
An easy fix is to take the allocated memory as a parameter.
If the input string is longer than 99 characters (in my case), your program will suffer a buffer overflow (which is what is already happening in your code).
An easy fix is to pass scanf a maximum field width, one less than the length of your buffer:
```
scanf("%99s",s);
```

The fixed code could look like this:

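#include <stdio.h>
#include <string.h>

// s must have room for at least 100 chars!!!
// Returns s, or NULL if nothing could be read.
char* receiveInput( char *s ){
    if( scanf("%99s",s) != 1 )
        return NULL;
    return s;
}
int main(void)
{
    char str[100];
    if( receiveInput( str ) == NULL )
        return 1;
    size_t length = strlen(str); // strlen returns size_t, so print with %zu

    printf("Your string is %s, length is %zu\n", str, length);

    return 0;
}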

Answer 3

Answered by Joze

You first have to allocate memory for s in your receiveInput() function. For example:

s = (char *)calloc(50, sizeof(char));
