Invisible Delimiter for Strings in HTML
Original question: http://stackoverflow.com/questions/2812253/
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Asked by noah
I need a way to identify certain strings in HTML markup. I know what the strings are, but they could also be substrings of other strings in the document. To find them, I output a special delimiter character (currently \032). On page load, we go through the HTML, record the location of the strings, and remove the delimiter.
Unfortunately, most browsers show the delimiter characters until we can find and remove them all. I'd like to avoid that if possible. Is there a character or string that will be preserved in the HTML content (so a comment won't work) but won't be visible to the user? It also needs to be something that is fairly unlikely to appear next to a string, so something like &nbsp; wouldn't work either.
EDIT: Sorry, I forgot to mention that the strings will be in attributes, so any sort of tag won't work.
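The page-load step described in the question (find the delimited strings, record where they are, remove the delimiters) could be sketched roughly as below. This is only an illustration: the function name extractDelimited is hypothetical, and it assumes each string is wrapped by one delimiter on each side, which the question does not state explicitly. It works on a plain string such as an attribute value.

```javascript
// Hypothetical helper, not from the original question.
// Assumes every marked string has one delimiter before it and one after it.
function extractDelimited(text, delimiter = "\x1a") { // "\x1a" is the same character as octal \032
  const found = [];
  let clean = "";
  let openAt = -1; // index in `clean` where the current marked string starts
  for (const ch of text) {
    if (ch !== delimiter) {
      clean += ch; // ordinary character: keep it
    } else if (openAt === -1) {
      openAt = clean.length; // opening delimiter: remember the position, drop the character
    } else {
      // closing delimiter: record the string and its location in the cleaned text
      found.push({ start: openAt, end: clean.length, value: clean.slice(openAt) });
      openAt = -1;
    }
  }
  return { clean, found };
}

const result = extractDelimited("title: \x1acat\x1a, not concatenate");
console.log(result.clean);
console.log(result.found);
```

Because the positions refer to the cleaned text, they stay valid after the delimiters are gone, and the unmarked "cat" inside "concatenate" is never reported.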
Answered by Anon
&zwnj; - zero-width non-joiner (see http://htmlhelp.org/reference/html40/entities/special.html)
On the off chance that this already appears in your text, double it up (e.g. &zwnj;&zwnj;mytext&zwnj;&zwnj;).
Edit in response to comment: this works in Firefox 3. Note that you have to search for the Unicode value of the entity (\u200C), not the entity name.
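<html>
<body>
  <div id="test">
    This is a &zwnj;test
  </div>
  <script type="application/javascript">
    // innerHTML contains the decoded character, so search for U+200C, not the text "&zwnj;".
    var myDiv = document.getElementById("test");
    var content = myDiv.innerHTML;
    var pos = content.indexOf("\u200C");
    alert(pos); // index of the zero-width non-joiner within the div's markup
  </script>
</body>
</html>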
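Since the question's edit says the strings live in attributes, the doubled-marker idea above might be applied to an attribute value along these lines. This is a sketch under assumptions: findMarked is a hypothetical name, and it assumes every marked string is wrapped in exactly two zero-width non-joiners on each side.

```javascript
const ZWNJ = "\u200C";    // the character that &zwnj; decodes to
const MARK = ZWNJ + ZWNJ; // doubled marker, as suggested in the answer

// Hypothetical helper: find substrings wrapped in doubled ZWNJ markers,
// strip the markers, and report where each substring sits in the cleaned value.
function findMarked(value) {
  const pattern = new RegExp(MARK + "(.*?)" + MARK, "gs");
  const found = [];
  let removed = 0; // number of marker characters stripped before the current match
  const clean = value.replace(pattern, (match, inner, offset) => {
    found.push({ start: offset - removed, text: inner });
    removed += 2 * MARK.length;
    return inner; // keep the string itself, drop the markers around it
  });
  return { clean, found };
}

const marked = "alt text for " + MARK + "mytext" + MARK;
console.log(findMarked(marked));
```

In a browser, one could loop over an element's attributes, call findMarked on each attribute value, and write the clean value back with setAttribute. A single stray zero-width non-joiner in the text is left untouched, which is the point of doubling the marker.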
Answered by amphetamachine
You could insert them into <span> elements. This will work only for in-page text (not attributes or the like).
Otherwise, you could insert a whitespace character that your program doesn't already output as part of the HTML, like a tab character (\x09), a vertical tab (\x0b), a bare carriage return (\x0d) without a newline beside it (as in Windows line endings), or just a null byte (\x00).
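One way to check the "doesn't already output" condition is to scan the generated HTML for each candidate before choosing one. The sketch below is only illustrative: pickUnusedDelimiter is a hypothetical helper, and it does not verify whether a given browser preserves these characters inside attribute values.

```javascript
// Candidate characters named in the answer above: tab, vertical tab, carriage return, null.
const CANDIDATES = ["\x09", "\x0b", "\x0d", "\x00"];

// Hypothetical helper: return the first candidate that never occurs in `html`,
// or null if every candidate is already in use.
// Note: this rejects "\x0d" whenever any carriage return is present, which is
// stricter than the "bare carriage return" condition in the answer.
function pickUnusedDelimiter(html, candidates = CANDIDATES) {
  for (const ch of candidates) {
    if (!html.includes(ch)) return ch;
  }
  return null;
}

const page = '<p title="a\tb">text</p>\r\n';
const delimiter = pickUnusedDelimiter(page);
console.log(JSON.stringify(delimiter));
```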
Answered by Kangkan
The best thing to insert that is not visible in the browser would be a pair of tags with some special id, like <span id="delimiter" class="Delimiter"></span>. This will not show up in the rendered content, but it will still be present in the document. You don't need to remove them.
Answered by Tgr
You could use left-to-right (LTR) marks (&lrm;, U+200E). Is this for some sort of XSS testing? If so, this might be of interest: Taint support for PHP.