Why does Windows use CR LF?
Note: this page reproduces a popular Stack Overflow question and its answers, with translation, under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/6521685/
Asked by Kyle
I understand the difference between the two so there's no need to go into that, but I'm just wondering what the reasoning is behind why Windows uses both CR and LF to indicate a line break. It seems like the Linux method (just using LF) makes a lot more sense, saves space, and is easier to parse.
Answered by Anders Abel
Historically, when using teletypes, CR would return the carriage to the first position of the line while LF would feed to the next line. Using CR+LF in the files themselves made it possible to send a file directly to the printer, without any kind of printer driver.
Thanks to @zaph for pointing out that it was teletypes and not dot-matrix printers.
Answered by OMA
@sshannin posted a URL from Raymond Chen's blog, but it doesn't work anymore. The blog has changed its internal software, so the URLs changed.
After crawling through the old posts in the new blog I've found it here.
Quote from the blog:
Why is the line terminator CR+LF?
This protocol dates back to the days of teletypewriters. CR stands for “carriage return” – the CR control character returned the print head (“carriage”) to column 0 without advancing the paper. LF stands for “linefeed” – the LF control character advanced the paper one line without moving the print head. So if you wanted to return the print head to column zero (ready to print the next line) and advance the paper (so it prints on fresh paper), you need both CR and LF.
If you go to the various internet protocol documents, such as RFC 0821 (SMTP), RFC 1939 (POP), RFC 2060 (IMAP), or RFC 2616 (HTTP), you'll see that they all specify CR+LF as the line termination sequence. So the real question is not “Why do CP/M, MS-DOS, and Win32 use CR+LF as the line terminator?” but rather “Why did other people choose to differ from these standards documents and use some other line terminator?”
Unix adopted plain LF as the line termination sequence. If you look at the stty options, you'll see that the onlcr option specifies whether a LF should be changed into CR+LF. If you get this setting wrong, you get stairstep text, where
        each line begins
                where the previous line left off.
So even unix, when left in raw mode, requires CR+LF to terminate lines. The implicit CR before LF is a unix invention, probably as an economy, since it saves one byte per line.
The unix ancestry of the C language carried this convention into the C language standard, which requires only “\n” (which encodes LF) to terminate lines, putting the burden on the runtime libraries to convert raw file data into logical lines.
The C language also introduced the term “newline” to express the concept of “generic line terminator”. I'm told that the ASCII committee changed the name of character 0x0A to “newline” around 1996, so the confusion level has been raised even higher.
Here's another discussion of the subject, from a unix perspective.
I've changed this second link to a snapshot in The Wayback Machine, since the actual page is not available anymore.
I hope this answers your question.
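To make the quoted description concrete, here is a minimal Python sketch of a teletype-like renderer. It is only an illustration under simple assumptions: the function names `render` and `onlcr` are made up for this example (the second merely imitates what the stty onlcr option is described as doing), and overstriking is modeled by letting the later character win.

```python
def onlcr(text: str) -> str:
    """Hypothetical helper: imitate the onlcr translation, LF -> CR+LF."""
    return text.replace("\n", "\r\n")


def render(stream: str) -> list[str]:
    """Render a character stream roughly the way a simple teletype would.

    CR moves the print head to column 0 without advancing the paper.
    LF advances the paper one line without moving the print head.
    """
    lines = [[]]      # one list of characters per printed line
    row, col = 0, 0   # current paper line and print head column
    for ch in stream:
        if ch == "\r":
            col = 0
        elif ch == "\n":
            row += 1
            if row == len(lines):
                lines.append([])
        else:
            line = lines[row]
            # Pad with spaces up to the print head position.
            while len(line) < col:
                line.append(" ")
            if col < len(line):
                # A real teletype would overstrike; this model keeps the later character.
                line[col] = ch
            else:
                line.append(ch)
            col += 1
    return ["".join(line) for line in lines]


raw = "one\ntwo\nthree"
print(render(raw))         # LF only: stairstep text
print(render(onlcr(raw)))  # CR+LF: every line starts at column 0
```

With LF alone the second and third lines start where the previous one left off, which is the stairstep effect described in the quote; after the LF to CR+LF translation each line starts at the left margin.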
Answered by Dave Markle
It comes from the teletype machines (and typewriters) from the days of yore.
It used to be that when you were done typing a line, you had to move the typewriter's carriage (which held the paper and slid to the left as you typed) back to the start of the line (CR). You then had to advance the paper down a line (LF) to move to the next line.
There were cases where you might not have wanted a line feed when returning the carriage, such as when you were going to strike through a character with a dash (you'd just overwrite it).
But basically, it boils down to convention. DOS used the full CR/LF convention, and UNIX shortened it a bit. Now we're stuck!
Answered by likejudo
Others have given the answer, but I wanted to add... I guess you are too young to have used a typewriter? ;) The carriage is a drum. Moving it horizontally to the right brings the stationary type head back to the left margin of the page. Rotating the carriage using your finger and thumb advances the page by one or more lines.
Answered by nobar
I have seen more than one account to the effect that the reason to send two characters (and sometimes more) instead of one was in order to better match the data transfer rate to the physical printing rate (this was a long time ago). Moving the print-head took longer than printing a single character and sending extra characters was a way of preventing the data transfer from getting ahead of the printing device. So the reason we have multiple characters for end-of-line in Windows is basically the same as the reason we have QWERTY keyboards -- it was intended to slow things down.
Obviously the reason this practice continues in Windows to this day is based on some notion of ongoing backwards compatibility, and ultimately, just simple inertia.
Of note however, this convention is not strictly enforced by Windows at the operating system level. Any Windows application is free to ignore the convention, depending on what other applications it is trying to be compatible with.
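As a rough illustration of an application choosing its own convention, the Python sketch below writes the same lines with either terminator and then inspects the raw bytes. The helper names `write_lines` and `detect_eol` are hypothetical, invented for this example; the only library behavior relied on is that `open(..., newline="")` turns off Python's newline translation when writing.

```python
import os
import tempfile


def write_lines(path, lines, eol):
    """Hypothetical helper: write lines with an explicit terminator."""
    # newline="" disables Python's own newline translation on write,
    # so exactly the characters in `eol` end up in the file.
    with open(path, "w", newline="") as f:
        for line in lines:
            f.write(line + eol)


def detect_eol(data: bytes) -> str:
    """Hypothetical helper: guess the dominant convention from raw bytes."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf   # LFs that are not part of a CR+LF pair
    cr = data.count(b"\r") - crlf   # CRs that are not part of a CR+LF pair
    counts = {"CRLF": crlf, "LF": lf, "CR": cr}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "none"


with tempfile.TemporaryDirectory() as d:
    for name, eol in [("unix.txt", "\n"), ("dos.txt", "\r\n")]:
        path = os.path.join(d, name)
        write_lines(path, ["first", "second"], eol)
        with open(path, "rb") as f:
            data = f.read()
        print(name, detect_eol(data), data)
```

The same program produces either kind of file on any platform, which matches the point that the convention lives in applications and libraries rather than being enforced by the operating system.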
Interestingly, the Wikipedia article about "Newline" claims that Windows 8 may introduce a change to using only LF. The article also states that Mac OS X introduced a transition from CR to just LF.
Answered by Nick Heidke
From Wikipedia:
The sequence CR+LF was in common use on many early computer systems that had adopted teletype machines, typically an ASR33, as a console device, because this sequence was required to position those printers at the start of a new line.