Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/433493/

Time: 2020-08-28 22:58:50  Source: igfitidea

Why does HTML require that multiple spaces show up as a single space in the browser?

html formatting whitespace

Asked by Rudd Zwolinski

I have long recognized that any set of whitespace in an HTML file will only be displayed as a single space. For instance, this:

<p>Hello.        Hello. Hello. Hello.                       Hello.</p>

displays as:

Hello. Hello. Hello. Hello. Hello.

This is perfectly fine, since if you need multiple spaces in pre-formatted text you can just use the <pre> tag. But what is the reason? More precisely, why is this in the specification for HTML?

Accepted answer by tristan

Spaces are compacted in HTML because there's a distinction between how HTML is formatted and how it should be rendered. Consider a page like this:

<html>
    <body>
        <a href="mylink">A link</a>
    </body>
</html>

If whitespace were not collapsed and the HTML was indented using spaces, for example, the link would be preceded by several spaces.
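
One way to see this effect is to switch collapsing off with CSS. The sketch below reuses the page from this answer and assumes a browser that supports the standard `white-space` property. The inline `style` attribute is an addition for illustration only and is not part of the original answer.

```html
<html>
    <!-- white-space: pre asks the browser to keep spaces and line breaks as written -->
    <body style="white-space: pre">
        <a href="mylink">A link</a>
    </body>
</html>
```

With this style applied, the indentation before the link and the line break in front of it should be rendered literally, so the link appears pushed down and shifted right instead of sitting at the top left of the page.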

Answer by Turnkey

To try to address the "why": it may be because HTML was based on SGML, which had specified it that way. SGML was in turn based on GML from the late 1960s. The reason for the white space handling could very well be that data was entered one "card" at a time back then, which could result in undesired breakup of sentences and paragraphs. One difference in the old GML is that it specified that there has to be two spaces between sentences (like the old typewriter rules), which may have established a precedent that spaces are independent of the markup.

Answer by Zach Hirsch

As others have said, it's in the HTML specification.

If you want to preserve whitespace in output, you can use the <pre> tag:

<pre>This     text has              extra spaces

and

    newlines</pre>

But this will also generally display the text in a different font.
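
If the font change is unwanted, CSS offers another route. The sketch below is an illustration and not part of the original answer: it applies the standard `white-space: pre-wrap` declaration to an ordinary paragraph, which should preserve runs of spaces and line breaks while leaving the font as inherited. The class name `keep-spaces` is made up for this example.

```html
<style>
  /* keep-spaces is a hypothetical class name.
     pre-wrap preserves spaces and newlines but still wraps long lines. */
  .keep-spaces { white-space: pre-wrap; }
</style>

<p class="keep-spaces">This     text has              extra spaces

and

    newlines</p>
```

Using `white-space: pre` instead would also preserve the spacing, but it would stop long lines from wrapping, which is closer to how <pre> behaves by default.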

Answer by enobrev

Not only is it in the specification, but there is some sense to it. If spaces weren't compacted, you would have to put all your HTML on a single line. So something like this:

<div>
    <h1>Title</h1>
    <p>
       This is some text
       <a href="#">Read More</a>
    </p>
</div>

Would have some strange alignment with spaces all over the place. The only way to get it right would be to compact that code, which would be difficult to maintain.

Answer by S.Lott

"Why are multiple spaces converted to single spaces?"

First, "why" questions are hard to answer. It's in the spec. That's pretty much the end of it.

Consider that there are several kinds of white space.

  • White space between tags. <p>\n<b>hi</b>\n</p>

  • White space in the content within a tag. <p>Hi <i>everyone</i>.</p>

  • White space in a <pre> or CDATA section.

The first two are hard to distinguish. Whitespace between tags, even in XML, is "optional". But when you have what is called a "mixed content model" -- tags intermixed with content -- the subtlety of "between tags" and "in the content but between tags" and "in the content but not between tags" is impossible to sort out.

So they don't sort it out. Whitespace between tags and whitespace in the content is all optional.

Answer by Michael

If browsers did not do this, it could be difficult to format your HTML code to make it easily readable. For example, you might want to format your code like this:

<html>
<body>
    <div>
        I like to indent all content that is inside div tags.
    </div>
</body>
</html>

If the browser does not ignore the eight or so spaces before the text inside the div tag, your webpage might not look the way you intended it to look.

Answer by BoltClock

Usually, these design decisions are not documented in any specification and can only be gleaned from working group discussion archives that happen to be publicly accessible, or explained by the spec authors themselves. However, in this particular case, HTML 3.2 does state the following:

Except within literal text (e.g. the PRE element), HTML treats contiguous sequences of white space characters as being equivalent to a single space character (ASCII decimal 32). These rules allow authors considerable flexibility when editing the marked-up text directly. Note that future revisions to HTML may allow for the interpretation of the horizontal tab character (ASCII decimal 9) with respect to a tab rule defined by an associated style sheet.

The behavior you see today is of course much more complicated than what was specified in HTML 3.2, but I believe the reasoning still applies. One example of where this flexibility can be useful is when you have a long paragraph that you intend to hard-wrap and indent:

<H1>Lorem ipsum</H1>
<P>Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Fastidii oportere
   consulatu no quo. Vix saepe labores an, pri illud mentitum et, ex suas quas
   duo. Sit utinam volutpat ea, id vis cibo meis dolorum, eam docendi
   accommodare voluptatibus no. Id quaeque electram vim, ut sed singulis
   neglegentur, ne graece alterum has. Simul partiendo quaerendum et his.

If whitespace wasn't collapsed, you would end up with a paragraph with unusually large gaps where the text is hard-wrapped due to the indentation.

No other HTML specification suggests any sort of reasoning behind this design decision. In particular, HTML 4 only describes the collapsing behavior, and HTML5 and the living spec both defer to CSS, which doesn't explain anything either. Earlier versions of HTML also do not contain any explanation, although the following excerpt does appear in an example snippet in HTML 2.0:

<OL>
...
  <UL COMPACT>
  ...
  <LI> Whitespace may be used to assist in reading the
       HTML source.
  </UL>
...
</OL>

Answer by Paul Dixon

To answer "why is this in the specification for HTML?" you have to consider the origins of HTML.

Tim Berners-Lee designed HTML for sharing of scientific documents. He based it on pre-existing syntax ideas in SGML, which also has similar treatments of whitespace.

One can imagine that earlier writers of HTML at CERN did so without the aid of WYSIWYG tools, and so the ability to treat whitespace in this way aids legibility of such hand-written source files.

Answer by Chris Farmer

It's in the HTML spec. It's the part about inter-word spaces being rendered as an ASCII space.

http://www.w3.org/TR/html401/struct/text.html

Answer by casperOne

Simple, it's in the specification.

From the HTML specification, section 9.1:

In particular, user agents should collapse input white space sequences when producing output inter-word space.
