Invisible Delimiter for Strings in HTML

Notice: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me), citing the original address: http://stackoverflow.com/questions/2812253/

Time: 2020-08-29 03:02:24

Tags: html, non-printing-characters

Asked by noah

I need a way to identify certain strings in HTML markup. I know what the strings are, but it is possible that they could be substrings of other strings in the document. To find them, I output a special delimiter character (currently using \032). On page load, we go through the HTML and record the location of the strings, and remove the delimiter.

Unfortunately, most browsers show the delimiter characters until we can find and remove them all. I'd like to avoid that if possible. Is there a character or string that will be preserved in the HTML content (so a comment won't work) but won't be visible to the user? It also needs to be something that is fairly unlikely to appear next to a string, so something like &nbsp; wouldn't work either.

EDIT: Sorry, I forgot to mention that the strings will be in attributes, so any sort of tag won't work.
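
For readers who want to see the process described in the question, here is a minimal sketch of the "record the location, then remove the delimiter" step applied to a single attribute value. The helper name extractDelimited and the sample value are hypothetical, and the sketch assumes each string has one delimiter before it and one after it; the asker's real code is not shown in the question.

```javascript
// Hypothetical helper: strips every `delimiter` character from `text` and
// records where each delimited string sits in the cleaned-up result.
// Assumes delimiters come in pairs: one before and one after each string.
function extractDelimited(text, delimiter) {
    var parts = text.split(delimiter);
    var clean = "";
    var found = [];
    for (var i = 0; i < parts.length; i++) {
        // Odd-numbered parts lie between an opening and a closing delimiter.
        // The last part is skipped because it has no closing delimiter.
        if (i % 2 === 1 && i < parts.length - 1) {
            found.push({ start: clean.length, value: parts[i] });
        }
        clean += parts[i];
    }
    return { clean: clean, found: found };
}

var SUB = "\x1A"; // the character written as \032 (octal) in the question
var sample = "title=\"" + SUB + "cat" + SUB + " in category\"";
var result = extractDelimited(sample, SUB);
console.log(result.clean);
console.log(result.found);
```

Recording offsets relative to the cleaned text keeps the positions valid after the delimiters are gone.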

Answered by Anon

&zwnj; - zero-width non-joiner (see http://htmlhelp.org/reference/html40/entities/special.html)

On the off chance that this already appears in your text, double it up (e.g. &zwnj;&zwnj;mytext&zwnj;&zwnj;).

Edit in response to comment: works in Firefox 3. Note that you have to search for the Unicode value of the entity (U+200C), not the entity name.

<html>
<body>
    <div id="test">
        This is a &zwnj;test
    </div>

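    <script type="application/javascript">
        // The parser decodes &zwnj; to the character U+200C, so search for
        // that character rather than for the literal text "&zwnj;".
        var myDiv = document.getElementById("test");
        var content = myDiv.innerHTML;
        var pos = content.indexOf("\u200C");
        alert(pos); // index of the delimiter, or -1 if it is not present
    </script>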
</body>
</html>
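
The doubling suggestion above could be sketched roughly as follows. This assumes each marked string is wrapped in two zero-width non-joiners on each side, and that the real text never contains two of them in a row; the helper name stripMarked and the sample value are hypothetical.

```javascript
var ZWNJ = "\u200C";    // the character that &zwnj; decodes to
var MARK = ZWNJ + ZWNJ; // doubled delimiter, as suggested in the answer

// Hypothetical helper: finds MARK...MARK pairs in an attribute value,
// records each marked string with its offset in the cleaned value, and
// strips the marks. A lone ZWNJ in the real text is left untouched.
function stripMarked(value) {
    var found = [];
    var clean = "";
    var pos = 0;
    while (true) {
        var open = value.indexOf(MARK, pos);
        var close = open === -1 ? -1 : value.indexOf(MARK, open + MARK.length);
        if (close === -1) break; // no complete pair left
        clean += value.slice(pos, open);
        var marked = value.slice(open + MARK.length, close);
        found.push({ start: clean.length, value: marked });
        clean += marked;
        pos = close + MARK.length;
    }
    clean += value.slice(pos); // keep whatever follows the last pair
    return { clean: clean, found: found };
}

var marked = stripMarked("na" + ZWNJ + "me " + MARK + "cat" + MARK + "egory");
console.log(marked.found);
```

In a browser, the value could come from element.getAttribute(name) and the cleaned string could be written back with element.setAttribute(name, clean).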

Answered by amphetamachine

You could insert them into <span> elements. This will work only for in-page text (not attributes or the like).

Otherwise, you could insert a whitespace character that your program doesn't already output as part of the HTML, like a tab character (\x09), a vertical tab (\x0b), a bare carriage return (\x0d), meaning one without a newline beside it as in Windows line endings, or just a null byte (\x00).
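
One way to check the "doesn't already output" condition is to scan the generated markup for each candidate before choosing one. The sketch below is only an illustration: the helper name pickUnusedDelimiter is hypothetical, and it treats any carriage return as "in use", which is stricter than the bare-carriage-return case mentioned above. It says nothing about how a given browser treats these characters.

```javascript
// Candidate characters named in the answer above.
var CANDIDATES = ["\x09", "\x0b", "\x0d", "\x00"];

// Hypothetical helper: returns the first candidate that does not occur
// anywhere in `html`, or null if every candidate is already in use.
function pickUnusedDelimiter(html, candidates) {
    for (var i = 0; i < candidates.length; i++) {
        if (html.indexOf(candidates[i]) === -1) {
            return candidates[i];
        }
    }
    return null;
}

// The sample markup already contains a tab and a carriage return,
// so the vertical tab is the first candidate that is still free.
var chosen = pickUnusedDelimiter("<p>\tindented</p>\r\n", CANDIDATES);
console.log(chosen === "\x0b");
```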

Answered by Kangkan

The best thing to insert, in my view, is a pair of tags with some special id, like <span id="delimiter" class="Delimiter"></span>, which the browser does not display. It will not show up in the rendered content, yet it remains present in the document, so you don't need to remove it.

Answered by Tgr

You could use left-to-right (LTR) marks. Is this for some sort of XSS testing? If so, this might be of interest: Taint support for PHP
