Modify existing contents of a file in C
Note: this page is a side-by-side translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/21958155/
Asked by zee
int main()
{
    FILE *ft;
    char ch;
    ft=fopen("abc.txt","r+");
    if(ft==NULL)
    {
        printf("can not open target file\n");
        exit(1);
    }
    while(1)
    {
        ch=fgetc(ft);
        if(ch==EOF)
        {
            printf("done");
            break;
        }
        if(ch=='i')
        {
            fputc('a',ft);
        }
    }
    fclose(ft);
    return 0;
}
As one can see, I want to edit abc.txt in such a way that i is replaced by a in it.
The program runs fine, but when I open abc.txt externally, it seems to be unedited.
Any possible reason for that?
Why, in this case, is the character after i not replaced by a, as the answers suggest?
Answer by Jonathan Leffler
Analysis
There are multiple problems:
- fgetc() returns an int, not a char; it has to return every valid char value plus a separate value, EOF. As written, you can't reliably detect EOF. If char is an unsigned type, you'll never find EOF; if char is a signed type, you'll misidentify some valid character (often ÿ, y-umlaut, U+00FF, LATIN SMALL LETTER Y WITH DIAERESIS) as EOF.
- If you switch between input and output on a file opened for update mode, you must use a file positioning operation (fseek(), rewind(), nominally fsetpos()) between reading and writing; and you must use a positioning operation or fflush() between writing and reading.
- It is a good idea to close what you open (now fixed in the code).
- If your writes worked, you'd overwrite the character after the i with a.
Synthesis
These changes lead to:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *ft;
    char const *name = "abc.txt";
    int ch;                             /* int, so that EOF is distinguishable */

    ft = fopen(name, "r+");
    if (ft == NULL)
    {
        fprintf(stderr, "cannot open target file %s\n", name);
        exit(1);
    }
    while ((ch = fgetc(ft)) != EOF)
    {
        if (ch == 'i')
        {
            fseek(ft, -1, SEEK_CUR);    /* step back over the 'i' just read */
            fputc('a',ft);
            fseek(ft, 0, SEEK_CUR);     /* required before reading again */
        }
    }
    fclose(ft);
    return 0;
}
There is room for more error checking.
Exegesis
Input followed by output requires seeks
The fseek(ft, 0, SEEK_CUR); statement is required by the C standard.
ISO/IEC 9899:2011 §7.21.5.3 The fopen function
¶7 When a file is opened with update mode ('+' as the second or third character in the above list of mode argument values), both input and output may be performed on the associated stream. However, output shall not be directly followed by input without an intervening call to the fflush function or to a file positioning function (fseek, fsetpos, or rewind), and input shall not be directly followed by output without an intervening call to a file positioning function, unless the input operation encounters end-of-file. Opening (or creating) a text file with update mode may instead open (or create) a binary stream in some implementations.
(Emphasis added.)
fgetc() returns an int
Quotes from ISO/IEC 9899:2011, the current C standard.
§7.21 Input/output <stdio.h>
§7.21.1 Introduction
EOF
which expands to an integer constant expression, with type int and a negative value, that is returned by several functions to indicate end-of-file, that is, no more input from a stream;
§7.21.7.1 The fgetc function
int fgetc(FILE *stream);
¶2 If the end-of-file indicator for the input stream pointed to by stream is not set and a next character is present, the fgetc function obtains that character as an unsigned char converted to an int and advances the associated file position indicator for the stream (if defined).
Returns
¶3 If the end-of-file indicator for the stream is set, or if the stream is at end-of-file, the end-of-file indicator for the stream is set and the fgetc function returns EOF. Otherwise, the fgetc function returns the next character from the input stream pointed to by stream. If a read error occurs, the error indicator for the stream is set and the fgetc function returns EOF. 289)
289) An end-of-file and a read error can be distinguished by use of the feof and ferror functions.
So, EOF is a negative integer (conventionally it is -1, but the standard does not require that). The fgetc() function either returns EOF or the value of the character as an unsigned char (in the range 0..UCHAR_MAX, usually 0..255).
§6.2.5 Types
¶3 An object declared as type char is large enough to store any member of the basic execution character set. If a member of the basic execution character set is stored in a char object, its value is guaranteed to be nonnegative. If any other character is stored in a char object, the resulting value is implementation-defined but shall be within the range of values that can be represented in that type.
¶5 An object declared as type signed char occupies the same amount of storage as a "plain" char object.
¶6 For each of the signed integer types, there is a corresponding (but different) unsigned integer type (designated with the keyword unsigned) that uses the same amount of storage (including sign information) and has the same alignment requirements.
¶15 The three types char, signed char, and unsigned char are collectively called the character types. The implementation shall define char to have the same range, representation, and behavior as either signed char or unsigned char. 45)
45) CHAR_MIN, defined in <limits.h>, will have one of the values 0 or SCHAR_MIN, and this can be used to distinguish the two options. Irrespective of the choice made, char is a separate type from the other two and is not compatible with either.
This justifies my assertion that plain char can be a signed or an unsigned type.
Now consider:
char c = fgetc(fp);
if (c == EOF)
…
Suppose fgetc() returns EOF, and plain char is an unsigned (8-bit) type, and EOF is -1. The assignment puts the value 0xFF into c, which is a positive integer. When the comparison is made, c is promoted to an int (and hence to the value 255), and 255 is not negative, so the comparison fails.
Conversely, suppose that plain char is a signed (8-bit) type and the character set is ISO 8859-15. If fgetc() returns ÿ, the value assigned will be the bit pattern 0b11111111, which is the same as -1, so in the comparison, c will be converted to -1 and the comparison c == EOF will return true even though a valid character was read.
You can tweak the details, but the basic argument remains valid while sizeof(char) < sizeof(int). There are DSP chips where that doesn't apply; you have to rethink the rules. Even so, the basic point remains: fgetc() returns an int, not a char.
If your data is truly ASCII (7-bit data), then all characters are in the range 0..127 and you won't run into the 'misinterpretation of ÿ' problem. However, if your char type is unsigned, you still have the 'cannot detect EOF' problem, so your program will run for a long time. If you need to consider portability, you will take this into account. These are the professional-grade issues that you need to handle as a C programmer. You can kludge your way to programs that work on your system for your data relatively easily and without taking all these nuances into account. But your program won't work on other people's systems.
Answer by Lee Duhem
You are not changing the 'i' in abc.txt, you are changing the next character after 'i'. Try to put fseek(ft, -1, SEEK_CUR); before your fputc('a', ft);.
After you read an 'i' character, the file position indicator of ft points at the character after this 'i', and when you write a character with fputc(), it is written at the current file position, i.e. over the character after 'i'. See fseek(3) for further details.
Answer by OregonTrail
After reading 'i' you need to "step back" to write to the correct location.
if(ch=='i')
{
    fseek(ft, -1, SEEK_CUR);
    fputc('a',ft);
}

