What is the simplest way to count newlines in an ASCII file in C?

Question

Asked by Sunscreen

Which is the fastest way to get the lines of an ASCII file?

Answer 1

Answered by Jerry Coffin

Normally you read files in C using fgets. You can also use scanf("%[^\n]"), but quite a few people reading the code are likely to find that confusing and foreign.

Edit: on the other hand, if you really do just want to count lines, a slightly modified version of the scanf approach can work quite nicely:

while (EOF != (scanf("%*[^\n]"), scanf("%*c"))) 
    ++lines;

The advantage of this is that with the '*' in each conversion, scanf reads and matches the input, but does nothing with the result. That means we don't have to waste memory on a large buffer to hold the content of a line that we don't care about (and still take a chance of getting a line that's even larger than that, so our count ends up wrong unless we go to even more work to figure out whether the input we read ended with a newline).

Unfortunately, we do have to break up the scanf into two pieces like this. scanf stops scanning when a conversion fails, and if the input contains a blank line (two consecutive newlines) we expect the first conversion to fail. Even if that fails, however, we want the second conversion to happen, to read the next newline and move on to the next line. Therefore, we attempt the first conversion to "eat" the content of the line, and then do the %c conversion to read the newline (the part we really care about). We continue doing both until the second call to scanf returns EOF (which will normally be at the end of the file, though it can also happen in case of something like a read error).
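
As a rough illustration of the two-step idea described above, here is a minimal sketch that wraps it in a function reading from a FILE * with fscanf instead of scanf. The function name count_newlines_scanf and the rc variable are made up for this example and are not part of the original answer.

```c
#include <stdio.h>

/* Hypothetical helper (not from the answer): counts newline characters
   read from fp using the two-step scanf technique. */
unsigned long count_newlines_scanf(FILE *fp)
{
    unsigned long lines = 0;

    for (;;) {
        /* Step 1: discard the rest of the current line. On a blank line
           this conversion fails and consumes nothing, which is fine. */
        int rc = fscanf(fp, "%*[^\n]");
        (void)rc;

        /* Step 2: discard one character, which is the newline if there
           is one. EOF here means end of input or a read error. */
        if (fscanf(fp, "%*c") == EOF)
            break;
        ++lines;
    }
    return lines;
}
```

Calling count_newlines_scanf(stdin) should behave like the loop in the answer. A final line without a trailing newline is not counted, because the second call hits end of file before it can read a character.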

Edit2: Of course, there is another possibility that's (at least arguably) simpler and easier to understand:

int ch;

while (EOF != (ch=getchar()))
    if (ch=='\n')
        ++lines;

The only part of this that some people find counterintuitive is that ch must be defined as an int, not a char, for the code to work correctly.
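
To see why the type matters: getchar returns each byte as an unsigned char value converted to int, and EOF is a negative int, so no real byte can equal it. Once the value is narrowed to char, that guarantee is lost. The sketch below counts how many byte values would be mistaken for EOF under each type; both function names are invented for this illustration, and the result of the char version depends on whether plain char is signed on your platform.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical demo: how many byte values compare equal to EOF when
   kept in an int, the way getchar() returns them. Always 0, because
   EOF is negative and byte values are 0..UCHAR_MAX. */
int bytes_mistaken_for_eof_int(void)
{
    int n = 0;
    for (int b = 0; b <= UCHAR_MAX; ++b) {
        int ch = b;
        if (ch == EOF)
            ++n;
    }
    return n;
}

/* Same check after narrowing to plain char. Where char is signed, one
   byte value (typically 0xFF) usually collides with EOF; where char is
   unsigned, nothing ever equals EOF, so a read loop would never stop. */
int bytes_mistaken_for_eof_char(void)
{
    int n = 0;
    for (int b = 0; b <= UCHAR_MAX; ++b) {
        char ch = (char)b;
        if (ch == EOF)
            ++n;
    }
    return n;
}
```

Either outcome of the char version is a bug in a read loop: a premature stop on one particular byte, or a loop that never sees end of file.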

Answer 2

Answered by Kamal

Here's a solution based on fgetc() which will work for lines of any length and doesn't require you to allocate a buffer.

#include <stdio.h>

int main()
{
    FILE                *fp = stdin;    /* or use fopen to open a file */
    int                 c;              /* Nb. int (not char) for the EOF */
    unsigned long       newline_count = 0;

        /* count the newline characters */
    while ( (c=fgetc(fp)) != EOF ) {
        if ( c == '\n' )
            newline_count++;
    }

    printf("%lu newline characters\n", newline_count);
    return 0;
}

Answer 3

Answered by Krzysztof Szewczyk

Come on, why compare every character? It is very slow: on a 10 MB file it takes about 3 s.
The solution below is faster.

unsigned long count_lines_of_file(char *file_patch) {
    FILE *fp = fopen(file_patch, "r");
    unsigned long line_count = 0;

    if(fp == NULL){
        return 0;
    }
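    /* NOTE: fgetline() is not a standard C function; it is assumed here to be
       a helper that returns non-NULL for each line read and NULL at EOF. */
    while ( fgetline(fp) )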
        line_count++;

    fclose(fp);
    return line_count;
}
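
Since fgetline is not part of standard C, here is one possible way to get a similar line-at-a-time count with the standard fgets. This is only a sketch under assumptions: the name count_lines_fgets and the 4096-byte buffer size are arbitrary choices for the example, and it makes no claim about matching the speed quoted above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts newline-terminated lines on fp with fgets.
   The 4096-byte buffer size is an arbitrary choice. */
unsigned long count_lines_fgets(FILE *fp)
{
    char buf[4096];
    unsigned long line_count = 0;

    while (fgets(buf, sizeof buf, fp) != NULL) {
        size_t len = strlen(buf);
        /* A long line arrives in several chunks; only the chunk that
           ends in '\n' completes it. */
        if (len > 0 && buf[len - 1] == '\n')
            ++line_count;
    }
    return line_count;
}
```

Checking the last character of each chunk is what keeps lines longer than the buffer from being counted more than once. A trailing partial line is not counted.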

Answer 4

Answered by vlabrecque

Maybe I'm missing something, but why not simply:

#include <stdio.h>
int main(void) {
  int n = 0;
  int c;
  while ((c = getchar()) != EOF) {
    if (c == '\n')
      ++n;
  }
  printf("%d\n", n);
}

If you also want to count a partial last line, i.e. one that ends at EOF without a newline ([^\n]EOF):

#include <stdio.h>
int main(void) {
  int n = 0;
  int pc = EOF;
  int c;
  while ((c = getchar()) != EOF) {
    if (c == '\n')
      ++n;
    pc = c;
  }
  if (pc != EOF && pc != '\n')
    ++n;
  printf("%d\n", n);
}

Answer 5

Answered by icanhasserver

What about this?

#include <stdio.h>
#include <string.h>

#define BUFFER_SIZE 4096

int main(int argc, char** argv)
{
    int count;
    int bytes;
    FILE* f;
    char buffer[BUFFER_SIZE + 1];
    char* ptr;

    if (argc != 2 || !(f = fopen(argv[1], "r")))
    {
        return -1;
    }

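    count = 0;
    /* Loop on fread's return value rather than feof(), so an empty file or
       a file whose size is a multiple of BUFFER_SIZE is handled correctly. */
    while ((bytes = (int)fread(buffer, sizeof(char), BUFFER_SIZE, f)) > 0)
    {
        buffer[bytes] = '\0';
        /* Count each '\n' in this chunk; searching from ptr + 1 moves past
           the newline that was just found. */
        for (ptr = strchr(buffer, '\n'); ptr; ptr = strchr(ptr + 1, '\n'))
        {
            ++count;
        }
    }

    if (ferror(f))
    {
        fclose(f);
        return -1;
    }

    fclose(f);

    printf("%d\n", count);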

    return 0;
}
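
A variation on the same block-reading idea, sketched here with memchr instead of strchr so that the buffer needs no terminating '\0' and embedded NUL bytes do not cut a chunk short. The function name count_newlines_block and the buffer size are assumptions made for this example, not part of the answer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts '\n' bytes on fp by reading fixed-size
   blocks with fread and scanning each block with memchr. */
unsigned long count_newlines_block(FILE *fp)
{
    char buf[4096];
    unsigned long count = 0;
    size_t n;

    /* fread returns the number of bytes read; 0 means EOF or an error. */
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
        const char *p = buf;
        const char *end = buf + n;

        /* Find each newline in the remaining part of the block. */
        while ((p = memchr(p, '\n', (size_t)(end - p))) != NULL) {
            ++count;
            ++p; /* continue just past the newline found */
        }
    }
    return count;
}
```

Because the count is accumulated per newline rather than per block, the total does not depend on how many blocks the file spans. A caller that needs to tell EOF from a read error can check ferror(fp) afterwards.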
