C: Finding the line size of each row in a text file

Note: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/2137156/

Time: 2020-09-02 04:15:39  Source: igfitidea

Finding line size of each row in a text file

c

Asked by MRP

How can you count the number of characters or digits in each line? Is there something like EOF, but for the end of a line?

Answer by Sam

You can iterate through each character in the line and keep incrementing a counter until the end-of-line character ('\n') is encountered. Make sure to open the file in text mode ("r") and not binary mode ("rb"); otherwise the stream won't automatically convert the platform's native line-ending sequence into '\n' characters.

Here is an example:

int charcount( FILE *const fin )
{
    int c, count;

    count = 0;
    for( ;; )
    {
        c = fgetc( fin );
        if( c == EOF || c == '\n' )
            break;
        ++count;
    }

    return count;
}

Here's an example program to test the above function:

#include <stdio.h>

int main( int argc, char **argv )
{
    FILE *fin;

    fin = fopen( "test.txt", "r" );
    if( fin == NULL )
        return 1;

    printf( "Character count: %d.\n", charcount( fin ) );

    fclose( fin );
    return 0;
}
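
The function above measures only the line at the current read position. As a minimal sketch of how the same loop could report every line, the variant below returns -1 once the stream is exhausted, so the caller can tell an empty line from end-of-file. The names next_line_length and print_line_lengths are made up for this illustration and are not part of the original answer.

```c
#include <stdio.h>

/* Returns the length of the next line (not counting '\n'),
   or -1 if the stream is already at end-of-file.
   next_line_length is a hypothetical helper name. */
long next_line_length(FILE *fin)
{
    long count = 0;
    int c;

    while ((c = fgetc(fin)) != EOF && c != '\n')
        ++count;

    /* Hitting EOF before reading anything means no line was left. */
    if (c == EOF && count == 0)
        return -1;
    return count;
}

/* Prints the length of every line in the stream, one per output line. */
void print_line_lengths(FILE *fin)
{
    long len, line = 1;

    while ((len = next_line_length(fin)) != -1)
        printf("Line %ld: %ld characters\n", line++, len);
}
```

Calling print_line_lengths(fin) in place of the printf call in the test program would print one count per line of test.txt.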

Answer by danben

Regarding reading a file line by line, look at fgets.

char *fgets(char *restrict s, int n, FILE *restrict stream);

The fgets() function shall read bytes from stream into the array pointed to by s, until n-1 bytes are read, or a newline is read and transferred to s, or an end-of-file condition is encountered. The string is then terminated with a null byte.

The only problem here may be if you can't guarantee a maximum line size in your file. If that is the case, you can iterate over characters until you see a line feed.

Regarding end of line:

Short answer: \n is the newline character (also called a line feed).

Long answer, from Wikipedia:

Systems based on ASCII or a compatible character set use either LF (Line feed, 0x0A, 10 in decimal) or CR (Carriage return, 0x0D, 13 in decimal) individually, or CR followed by LF (CR+LF, 0x0D 0x0A); see below for the historical reason for the CR+LF convention. These characters are based on printer commands: The line feed indicated that one line of paper should feed out of the printer, and a carriage return indicated that the printer carriage should return to the beginning of the current line.

* LF:    Multics, Unix and Unix-like systems (GNU/Linux, AIX, Xenix, Mac OS X, FreeBSD, etc.), BeOS, Amiga, RISC OS, and others
* CR+LF: DEC RT-11 and most other early non-Unix, non-IBM OSes, CP/M, MP/M, DOS, OS/2, Microsoft Windows, Symbian OS
* CR:    Commodore 8-bit machines, Apple II family, Mac OS up to version 9 and OS-9

But since you are not likely to be working with a representation that uses carriage return only, looking for a line feed should be fine.

Answer by David

\n is the newline character in C. In other languages, such as C#, you may use something like Environment.NewLine to overcome platform differences.

If you already know that your string is one line (let's call it line), use strlen(line) to get the number of characters in it. Subtract 1 if it ends with '\n'.
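
A minimal sketch of that subtraction, assuming line is a NUL-terminated string holding at most one trailing '\n' (for example, as filled in by fgets). The name line_length is a hypothetical helper, not something from the original answer.

```c
#include <string.h>

/* Returns the number of characters in line, not counting a single
   trailing '\n' if one is present. line_length is a made-up name. */
size_t line_length(const char *line)
{
    size_t len = strlen(line);

    /* Check len > 0 first so an empty string is never indexed at -1. */
    if (len > 0 && line[len - 1] == '\n')
        --len;
    return len;
}
```

The len > 0 check matters: without it, an empty string would read one byte before the start of the array.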

If the string has newline characters in it, you'll need to split it around them and then call strlen() on each substring.
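
One possible way to do that split without modifying the string is to walk it with strcspn, which measures each piece directly so no separate strlen() call is needed. This is only a sketch: split_line_lengths, its out array, and the max limit are assumptions made for the example. (strtok would also work on a writable copy, but it skips empty lines.)

```c
#include <string.h>

/* Stores the length of each line of text in out[], for at most max
   lines, and returns how many lengths were stored. Newlines are not
   counted. split_line_lengths is a hypothetical helper name. */
size_t split_line_lengths(const char *text, size_t *out, size_t max)
{
    size_t n = 0;

    while (*text != '\0' && n < max) {
        size_t len = strcspn(text, "\n"); /* chars before '\n' or end */

        out[n++] = len;
        text += len;
        if (*text == '\n')
            ++text;                       /* step over the newline */
    }
    return n;
}
```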

Answer by Alok Singhal

If you open a file in text mode, i.e., without a b in the second argument to fopen(), you can read characters one by one until you hit a '\n' to determine the line size. The underlying system should take care of translating the end-of-line terminators to just one character, '\n'. On some systems, the last line of a text file may not end with a '\n', so that is a special case.

Pseudocode:

count := 0
c := next()
while c != EOF and c != '\n'
    count := count + 1
    c := next()

The above will count the number of characters in a given line. next() is a function that returns the next character from your file.

Alternatively, you can use fgets() with a buffer:

char buf[SIZE];
count = 0;
while (fgets(buf, sizeof buf, fp) != NULL) {
    /* see if the string represented by buf has a '\n' in it,
       if yes, add the index of that '\n' to count, and that's
       the number of characters on that line, which you can
       return to the caller.  If not, add sizeof buf - 1 to count */
}
/* If count is non-zero here, the last line ended without a newline */
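
One way the commented loop above might be filled in is sketched below. It adds the actual length of each chunk rather than sizeof buf - 1, so a final line that ends without a newline is still counted correctly. The name fgets_line_length and the 64-byte buffer are assumptions for the example, and the sketch assumes the file contains no embedded NUL bytes.

```c
#include <stdio.h>
#include <string.h>

/* Returns the length of the next line (not counting '\n'), or -1 if
   the stream is already at end-of-file. Lines longer than the buffer
   are read in several chunks. fgets_line_length is a made-up name. */
long fgets_line_length(FILE *fp)
{
    char buf[64];          /* deliberately small; any size >= 2 works */
    long count = 0;
    int got_any = 0;

    while (fgets(buf, sizeof buf, fp) != NULL) {
        size_t n = strcspn(buf, "\n");  /* index of '\n', or chunk length */

        got_any = 1;
        count += (long)n;
        if (buf[n] == '\n')
            return count;               /* reached the end of the line */
        /* No newline in this chunk: the line continues, keep reading. */
    }

    /* fgets returned NULL: either a last line without '\n', or no data. */
    return got_any ? count : -1;
}
```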

Answer by whamalai

The original question was how to get the number of characters in "each line" (a given line? or the current line?), while the answers have mostly shown how to determine the length of the first line in a file. One can easily apply some of them to determine the length of the current line (without guessing a maximum buffer length beforehand).

However, what one often needs in practice is the maximum length of any line in a file. Then one can reserve a buffer, use fgets to read the file line by line, and use some nice functions (strtok, strtod, etc.) to parse the lines. In practice, you can use any of the previous solutions to determine the length of one line, then scan through all lines and take the maximum.

A simple snippet that reads the file character by character:

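    #include <stdio.h>

    /* Returns the length of the longest line in f, not counting '\n'. */
    long max_line_length(FILE *f)
    {
        long max = 0, i = 0;
        int c;
        do {
            if ((c = fgetc(f)) != EOF && c != '\n') {
                i++;
            } else {
                if (i > max) max = i;   /* a line (or the file) just ended */
                i = 0;
            }
        } while (c != EOF);
        return max;
    }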

Note: In practice, it would suffice to have an upper bound for the maximum length. A quick-and-dirty solution would be to use the file size as that upper bound.
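
To tie this answer together, here is a rough two-pass sketch: the first pass finds the longest line with the character-by-character scan, and the second pass reads each line with fgets into a buffer sized to fit it. The function names, the return convention, and the choice to count lines in the second pass are all assumptions made for illustration. The stream must be seekable for rewind to work.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Pass one: length of the longest line, not counting '\n'. */
static long max_line_length(FILE *f)
{
    long max = 0, i = 0;
    int c;

    do {
        c = fgetc(f);
        if (c != EOF && c != '\n') {
            i++;
        } else {
            if (i > max) max = i;
            i = 0;
        }
    } while (c != EOF);
    return max;
}

/* Pass two: read every line into a buffer that is exactly big enough.
   Returns the number of lines read, or -1 if allocation fails.
   count_lines_two_pass is a hypothetical name. */
long count_lines_two_pass(FILE *f)
{
    size_t size = (size_t)max_line_length(f) + 2; /* '\n' and '\0' */
    char *buf = malloc(size);
    long lines = 0;

    if (buf == NULL)
        return -1;

    rewind(f);                            /* start over for pass two */
    while (fgets(buf, (int)size, f) != NULL) {
        buf[strcspn(buf, "\n")] = '\0';   /* strip the newline, if any */
        ++lines;                          /* buf holds one whole line here */
    }

    free(buf);
    return lines;
}
```

Inside the second loop is where parsing with strtok or strtod would go, since each call to fgets is guaranteed to return a complete line.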
