C: Checking a character to be a newline
Notice: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not to this page's maintainer): StackOverFlow
Original question: http://stackoverflow.com/questions/15733673/
Asked by Taygrim
How to check whether a character is a newline character in any encoding in C?
I have a task to write my own wc program. If I just use if (s[i] == '\n'), it gives a different answer than the original wc when I run it on itself.
Here is the code:
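    #include <ctype.h>   /* isblank */
    #include <errno.h>   /* errno */
    #include <unistd.h>  /* read */

    /* error() is the asker's own reporting helper; its definition is not shown. */

    typedef struct
    {
        int newline;  /* number of lines */
        int word;     /* number of words */
        int byte;     /* number of bytes */
    } info;

    info count(int descr)
    {
        info kol;
        kol.newline = 0;
        kol.word = 0;
        kol.byte = 0;
        int len = 512;
        char s[512];
        int n;
        errno = 0;
        int flag1 = 1;  /* the next byte starts a new line */
        int flag2 = 1;  /* the previous byte was a separator */
        while ((n = read(descr, s, len)) != 0)
        {
            if (n == -1)
                error("Error while reading.", errno);
            errno = 0;
            kol.byte += n;
            for (int i = 0; i < n; i++)
            {
                /* Lines are counted at their first byte, so a last line
                   without a trailing '\n' is counted as well. */
                if (flag1)
                {
                    kol.newline++;
                    flag1 = 0;
                }
                /* Cast to unsigned char: passing a negative char value
                   to isblank() is undefined behavior. */
                if (isblank((unsigned char)s[i]) || s[i] == '\n')
                    flag2 = 1;
                else
                {
                    if (flag2)
                    {
                        kol.word++;
                        flag2 = 0;
                    }
                }
                if (s[i] == '\n')
                    flag1 = 1;
            }
        }
        return kol;
    }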
It works fine for all text files, but when I run it on the executable I get from compiling it, it doesn't give the answer wc gives.
Accepted answer by Keith Thompson
The way to check whether a character s[i] is a newline character is simply:
if (s[i] == '\n')
If you're reading from a file that's been opened in text mode (including stdin), then whatever representation the underlying system uses to mark the end of a line will be translated to a single '\n' character.
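As a minimal sketch of that point, the function below counts '\n' characters on a stdio stream. The name count_newlines is hypothetical and not part of the answer, and the sketch assumes the caller opened the stream in text mode, for example with fopen(path, "r"). Note that the question's code uses read() on a descriptor instead, which on POSIX systems delivers the raw bytes.

```c
#include <stdio.h>

/* Count '\n' characters on a stream opened in text mode.
 * count_newlines is a hypothetical helper name used for illustration. */
long count_newlines(FILE *fp)
{
    long lines = 0;
    int c;  /* int rather than char, so EOF can be told apart from data */

    while ((c = getc(fp)) != EOF) {
        if (c == '\n')
            lines++;
    }
    return lines;
}
```

A caller would open the file, pass the FILE pointer to count_newlines, and close it afterwards; a final line without a trailing '\n' is not counted by this sketch.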
You say you're trying to write your own wc program, and by comparing to '\n' you're getting different results than the system's wc. You haven't told us enough to guess why that's happening. Show us your code and tell us exactly what's happening.
You might run into problems if you're reading a file that's encoded differently -- say, trying to read a Unix-format text file on a Windows system. But then wc would have the same problem.
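For comparison with the system wc: POSIX describes wc as counting newline characters and words delimited by white space. The sketch below follows that reading for single-byte input in the C locale. The struct, its field names, and count_buffer are all hypothetical, and this is not a copy of any real wc implementation.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical counters; the names are illustrative only. */
struct counts {
    long lines;  /* number of '\n' bytes seen */
    long words;  /* runs of non-whitespace bytes */
    long bytes;  /* total bytes seen */
};

/* Add the n bytes in buf to c. *in_word carries the "inside a word"
 * state across successive buffers; set it to 0 before the first call. */
void count_buffer(struct counts *c, const char *buf, size_t n, int *in_word)
{
    c->bytes += (long)n;
    for (size_t i = 0; i < n; i++) {
        unsigned char ch = (unsigned char)buf[i];

        if (ch == '\n')
            c->lines++;
        if (isspace(ch)) {
            *in_word = 0;
        } else if (!*in_word) {
            *in_word = 1;
            c->words++;
        }
    }
}
```

Unlike the code in the question, this counts '\n' bytes rather than line starts, so input whose last byte is not '\n' yields one line fewer, and it uses isspace() rather than isblank(). Whether either difference explains the asker's mismatch is not confirmed anywhere in the thread.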
Answer by Dave
There are several newline characters in ASCII and Unicode.
The most famous are \r and \n, from ASCII. Technically these are carriage return and line feed. Windows uses both together, \r\n (technically carriage return means go to column 0 and line feed means go to the next line, but nothing I know of obeys that in practice); Unix uses just \n. Some (not common) OSs use just \r.
Most apps stop there, and don't suffer for it. What follows is more theoretical.
Unicode complicates things. U+000D and U+000A are identical to \r and \n (same binary representation in UTF-8). Then there's U+0085 "next line", U+2028 "line separator" and U+2029 "paragraph separator". You can also check vertical tab (U+000B) if you want to check everything. See here: http://en.wikipedia.org/wiki/Newline#Unicode
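A minimal sketch of such a check, assuming the input has already been decoded into Unicode code points (the decoding step is not shown). The function name is_unicode_newline is hypothetical, and the set covers only the characters named in this answer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if cp is one of the line-breaking code points named in
 * the answer. is_unicode_newline is a hypothetical name. */
bool is_unicode_newline(uint32_t cp)
{
    switch (cp) {
    case 0x000A: /* line feed, '\n' */
    case 0x000B: /* vertical tab */
    case 0x000D: /* carriage return, '\r' */
    case 0x0085: /* next line */
    case 0x2028: /* line separator */
    case 0x2029: /* paragraph separator */
        return true;
    default:
        return false;
    }
}
```

This works on code points, not bytes: in UTF-8, U+0085, U+2028 and U+2029 are multi-byte sequences, so comparing single char values against them would not work.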
Answer by Ale
As far as I know, there is no standard function like the isXXXXX() ones for this (the closest is isspace(), which is also true for other characters: space, tab, form feed...). Simply comparing to '\n' should solve your problem; depending on what you consider to be a newline character, you might also want to check for '\r' (carriage return). The UNIX standard line separator is '\n'; Mac (before OS X) used '\r' (now '\n' is more common, but '\r' is sometimes still used by some applications, e.g. MS Office); DOS/Windows use the "\r\n" sequence.
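If all three conventions should count as line breaks, one possible sketch is the following. The name count_line_breaks is hypothetical, and the function assumes the whole input is in one buffer: a "\r\n" pair split across two separate buffers would be counted twice.

```c
#include <stddef.h>

/* Count line terminators in buf[0..n), treating "\r\n", a lone '\r'
 * and a lone '\n' each as a single line break.
 * count_line_breaks is a hypothetical name. */
size_t count_line_breaks(const char *buf, size_t n)
{
    size_t breaks = 0;

    for (size_t i = 0; i < n; i++) {
        if (buf[i] == '\r') {
            breaks++;
            /* Skip the '\n' of a "\r\n" pair so it is not counted twice. */
            if (i + 1 < n && buf[i + 1] == '\n')
                i++;
        } else if (buf[i] == '\n') {
            breaks++;
        }
    }
    return breaks;
}
```

This deliberately differs from wc, which counts only '\n', so it is not a way to reproduce wc's output.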

