What is the easiest way to count the newlines in an ASCII file?

Source: Stack Overflow, http://stackoverflow.com/questions/4278845/. Content is provided under the CC BY-SA 4.0 license: you are free to use and share it, but you must attribute it to the original authors.

c

Asked by Sunscreen

Which is the fastest way to get the lines of an ASCII file?

Answered by Jerry Coffin

Normally you read files in C using fgets. You can also use scanf("%[^\n]"), but quite a few people reading the code are likely to find that confusing and foreign.

Edit: on the other hand, if you really do just want to count lines, a slightly modified version of the scanf approach can work quite nicely:

while (EOF != (scanf("%*[^\n]"), scanf("%*c"))) 
    ++lines;

The advantage of this is that with the '*' in each conversion, scanf reads and matches the input, but does nothing with the result. That means we don't have to waste memory on a large buffer to hold the content of a line that we don't care about (and still take a chance of getting a line that's even larger than that, so our count ends up wrong unless we go to even more work to figure out whether the input we read ended with a newline).

Unfortunately, we do have to break the scanf up into two pieces like this. scanf stops scanning when a conversion fails, and if the input contains a blank line (two consecutive newlines) we expect the first conversion to fail. Even if that fails, however, we want the second conversion to happen, to read the next newline and move on to the next line. Therefore, we attempt the first conversion to "eat" the content of the line, and then do the %c conversion to read the newline (the part we really care about). We continue doing both until the second call to scanf returns EOF (which will normally be at the end of the file, though it can also happen in case of something like a read error).
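As a rough sketch, the two-call loop can be wrapped in a function that works on any stream. The name count_newlines_scanf and the use of fscanf on a FILE * (rather than scanf on stdin) are assumptions made for illustration; they are not part of the original answer.

```c
#include <stdio.h>

/* Hypothetical helper: counts newline characters on `in` using the
   two-call scanf idiom described above. */
unsigned long count_newlines_scanf(FILE *in)
{
    unsigned long lines = 0;

    for (;;) {
        /* Skip the content of the line. On a blank line this conversion
           fails to match and returns 0; EOF means no input is left. */
        if (fscanf(in, "%*[^\n]") == EOF)
            break;
        /* Consume one character, which is the newline if there is one.
           EOF here means the last line had no trailing newline. */
        if (fscanf(in, "%*c") == EOF)
            break;
        ++lines;
    }
    return lines;
}
```

Calling count_newlines_scanf(stdin) should behave like the loop above; a final line without a trailing newline is not counted.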

Edit2: Of course, there is another possibility that's (at least arguably) simpler and easier to understand:

int ch;

while (EOF != (ch=getchar()))
    if (ch=='\n')
        ++lines;

The only part of this that some people find counterintuitive is that ch must be defined as an int, not a char, for the code to work correctly.
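To see why the type matters: getchar and fgetc return each byte as an unsigned char converted to int, or the negative value EOF. If ch were a char, then on platforms where char is signed a 0xFF byte would typically become -1 and compare equal to EOF, ending the loop early; where char is unsigned, ch could never equal EOF and the loop would not terminate. The sketch below keeps ch as an int. The function name count_newlines_stream is a hypothetical one chosen for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: counts newline characters on `in`. */
unsigned long count_newlines_stream(FILE *in)
{
    unsigned long lines = 0;
    int ch;  /* int, so EOF (negative) stays distinct from the byte 0xFF (255) */

    while ((ch = fgetc(in)) != EOF)
        if (ch == '\n')
            ++lines;
    return lines;
}
```

With an int, a 0xFF byte in the input is read as 255 and the loop simply continues past it.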

Answered by Kamal

Here's a solution based on fgetc() which will work for lines of any length and doesn't require you to allocate a buffer.

#include <stdio.h>

int main()
{
    FILE                *fp = stdin;    /* or use fopen to open a file */
    int                 c;              /* Nb. int (not char) for the EOF */
    unsigned long       newline_count = 0;

        /* count the newline characters */
    while ( (c=fgetc(fp)) != EOF ) {
        if ( c == '\n' )
            newline_count++;
    }

    printf("%lu newline characters\n", newline_count);
    return 0;
}

Answered by Krzysztof Szewczyk

Come on, why compare every character? That is very slow: on a 10 MB file it takes about 3 s.
The solution below is faster.

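#include <stdio.h>
#include <string.h>

/* Note: the original called fgetline(), which is not a standard C function;
   fgets() is substituted here as an assumption about the intent. */
unsigned long count_lines_of_file(const char *file_path) {
    FILE *fp = fopen(file_path, "r");
    unsigned long line_count = 0;
    char buf[4096];

    if (fp == NULL) {
        return 0;
    }
    /* A long line may arrive in several chunks; count only the chunk that ends it. */
    while (fgets(buf, sizeof buf, fp) != NULL) {
        if (strchr(buf, '\n') != NULL)
            line_count++;
    }

    fclose(fp);
    return line_count;
}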

Answered by vlabrecque

Maybe I'm missing something, but why not simply:

#include <stdio.h>
int main(void) {
  int n = 0;
  int c;
  while ((c = getchar()) != EOF) {
    if (c == '\n')
      ++n;
  }
  printf("%d\n", n);
}

If you want to count partial lines as well (i.e. a final line that ends with [^\n]EOF instead of a newline):

#include <stdio.h>
int main(void) {
  int n = 0;
  int pc = EOF;
  int c;
  while ((c = getchar()) != EOF) {
    if (c == '\n')
      ++n;
    pc = c;
  }
  if (pc != EOF && pc != '\n')
    ++n;
  printf("%d\n", n);
}

Answered by icanhasserver

What about this?

#include <stdio.h>
#include <string.h>

#define BUFFER_SIZE 4096

int main(int argc, char** argv)
{
    int count;
    int bytes;
    FILE* f;
    char buffer[BUFFER_SIZE + 1];
    char* ptr;

    if (argc != 2 || !(f = fopen(argv[1], "r")))
    {
        return -1;
    }

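    count = 0;
    /* Loop on fread's result rather than feof(), so an empty file or a size
       that is an exact multiple of BUFFER_SIZE is not treated as an error. */
    while ((bytes = fread(buffer, sizeof(char), BUFFER_SIZE, f)) > 0)
    {
        buffer[bytes] = '\0';
        /* Count each newline in this chunk; strchr returns NULL when none remain. */
        for (ptr = strchr(buffer, '\n'); ptr; ptr = strchr(ptr + 1, '\n'))
        {
            ++count;
        }
    }

    if (ferror(f))
    {
        fclose(f);
        return -1;
    }

    fclose(f);

    printf("%d\n", count);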

    return 0;
}
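A possible variant of this block-reading idea uses memchr instead of strchr. Because memchr takes an explicit length, the buffer needs no terminator and embedded NUL bytes do not cut the scan short. This is only a sketch: the function name count_newlines_block and the CHUNK_SIZE value are assumptions for illustration and have not been benchmarked.

```c
#include <stdio.h>
#include <string.h>

#define CHUNK_SIZE 4096  /* illustrative size, not tuned */

/* Hypothetical helper: counts newline characters on `in`, reading in blocks. */
unsigned long count_newlines_block(FILE *in)
{
    char chunk[CHUNK_SIZE];
    unsigned long count = 0;
    size_t got;

    while ((got = fread(chunk, 1, sizeof chunk, in)) > 0) {
        const char *p = chunk;
        const char *end = chunk + got;

        /* Search only the bytes actually read in this chunk. */
        while ((p = memchr(p, '\n', (size_t)(end - p))) != NULL) {
            ++count;
            ++p;  /* resume just past the newline that was found */
        }
    }
    return count;
}
```

A caller would open the file with fopen, pass the stream to count_newlines_block, and check ferror afterwards to distinguish a read error from end of file.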