Recommended method for escaping HTML in Java

Question

Asked by Ben Lings

Is there a recommended way to escape <, >, " and & characters when outputting HTML in plain Java code? (Other than manually doing the following, that is.)

String source = "The less than sign (<) and ampersand (&) must be escaped before using them in HTML";
String escaped = source.replace("&", "&amp;").replace("<", "&lt;"); // ampersand first, so "&lt;" is not re-escaped; ...

Answer 1

Answered by dfa

StringEscapeUtils from Apache Commons Lang:

import static org.apache.commons.lang.StringEscapeUtils.escapeHtml;
// ...
String source = "The less than sign (<) and ampersand (&) must be escaped before using them in HTML";
String escaped = escapeHtml(source);

For Commons Lang version 3:

import static org.apache.commons.lang3.StringEscapeUtils.escapeHtml4;
// ...
String escaped = escapeHtml4(source);

Answer 2

Answered by Adamski

An alternative to Apache Commons: use Spring's HtmlUtils.htmlEscape(String input) method.

Answer 3

Answered by AUU

For some purposes, Spring's HtmlUtils is enough:

import org.springframework.web.util.HtmlUtils;
// ...
HtmlUtils.htmlEscapeDecimal("&"); //gives &#38;
HtmlUtils.htmlEscape("&"); //gives &amp;

Answer 4

Answered by Martin Dimitrov

There is a newer version of the Apache Commons Lang library, and it uses a different package name (org.apache.commons.lang3). StringEscapeUtils now has different static methods for escaping different types of documents (http://commons.apache.org/proper/commons-lang/javadocs/api-3.0/index.html). So, to escape a string for HTML 4.0:

import static org.apache.commons.lang3.StringEscapeUtils.escapeHtml4;

String output = escapeHtml4("The less than sign (<) and ampersand (&) must be escaped before using them in HTML");

Answer 5

Answered by Jeff Williams

Be careful with this. There are a number of different "contexts" within an HTML document: inside an element, quoted attribute value, unquoted attribute value, URL attribute, JavaScript, CSS, and so on. You'll need to use a different encoding method for each of these to prevent cross-site scripting (XSS). Check the OWASP XSS Prevention Cheat Sheet for details on each of these contexts. You can find escaping methods for each of these contexts in the OWASP ESAPI library: https://github.com/ESAPI/esapi-java-legacy.
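To make the idea of contexts concrete, here is a minimal sketch using only the JDK. It is not the ESAPI API; the class and method names are hypothetical, and it covers only three of the contexts listed above (element content, quoted attribute value, and a URL query parameter placed in a quoted attribute). Unquoted attributes, JavaScript, and CSS need their own rules, for which a vetted library is the safer choice.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Illustrative only: one escaper per output context. All names are hypothetical. */
final class ContextEscapingSketch {

    /** Escapes text placed between tags, e.g. <p>HERE</p>. */
    static String forElementContent(String s) {
        return s.replace("&", "&amp;")   // ampersand first, so later entities are not re-escaped
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }

    /** Escapes text placed inside a quoted attribute value, e.g. title="HERE". */
    static String forQuotedAttribute(String s) {
        return forElementContent(s)
                .replace("\"", "&quot;")
                .replace("'", "&#39;");
    }

    /** Percent-encodes one query parameter value, then escapes it for a quoted attribute. */
    static String forUrlParameterInAttribute(String s) {
        // The Charset overload of URLEncoder.encode requires Java 10 or later.
        String encoded = URLEncoder.encode(s, StandardCharsets.UTF_8);
        return forQuotedAttribute(encoded);
    }
}
```

The same input ends up looking different depending on where it is written: ContextEscapingSketch.forElementContent leaves quotes alone, forQuotedAttribute does not, and forUrlParameterInAttribute percent-encodes before any HTML escaping is applied.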

Answer 6

Answered by OriolJ

On Android (API 16 or greater) you can use:

Html.escapeHtml(textToEscape);

or, for lower API levels:

TextUtils.htmlEncode(textToEscape);

Answer 7

Answered by Adam Gent

While @dfa's answer of org.apache.commons.lang.StringEscapeUtils.escapeHtml is nice, and I have used it in the past, it should not be used for escaping HTML (or XML) attributes; otherwise the whitespace will be normalized (meaning all adjacent whitespace characters become a single space).

I know this because I have had bugs filed against my library (JATL) for attributes where whitespace was not preserved. Thus I have a drop-in (copy-and-paste) class (some of which I stole from JDOM) that differentiates the escaping of attributes and element content.

While proper attribute escaping may not have mattered as much in the past, it is increasingly of interest given the use of HTML5's data- attributes.
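The distinction between attribute and element escaping can be sketched as follows. This is not the JATL or JDOM class mentioned above; the class and method names are hypothetical. The assumption is that writing tabs and line breaks as numeric character references keeps a parser from normalizing them to spaces inside an attribute value.

```java
/** Illustrative only: separate escapers for element content and attribute values. */
final class AttributeEscapingSketch {

    /** Element content: whitespace is left untouched. */
    static String escapeElementContent(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Attribute value: also escapes the double quote and encodes tab, LF, and CR. */
    static String escapeAttributeValue(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\t': out.append("&#9;");   break; // keep the tab as a character reference
                case '\n': out.append("&#10;");  break; // keep the line feed
                case '\r': out.append("&#13;");  break; // keep the carriage return
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

For a value destined for a data- attribute, AttributeEscapingSketch.escapeAttributeValue would be the one to call; escapeElementContent leaves the same whitespace as literal characters.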

Answer 8

Answered by Bruno Eberhard

Nice short method:

public static String escapeHTML(String s) {
    StringBuilder out = new StringBuilder(Math.max(16, s.length()));
    for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        if (c > 127 || c == '"' || c == '\'' || c == '<' || c == '>' || c == '&') {
            out.append("&#");
            out.append((int) c);
            out.append(';');
        } else {
            out.append(c);
        }
    }
    return out.toString();
}

Based on https://stackoverflow.com/a/8838023/1199155 (the ampersand is missing there). According to http://www.w3.org/TR/html4/sgml/entities.html, the only characters below 128 that have named entities are ", &, < and >; the if clause checks these four plus the apostrophe.
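One caveat with the method above: it iterates over char values, so a character outside the Basic Multilingual Plane (stored in a Java String as a surrogate pair) is written as two references to the surrogate halves, which are not valid character references. A possible variant, offered as a sketch rather than a replacement, iterates over code points instead. The class name is hypothetical.

```java
/** Illustrative variant of the short method above that iterates over code points. */
final class CodePointEscapeSketch {

    static String escapeHtml(String s) {
        StringBuilder out = new StringBuilder(Math.max(16, s.length()));
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i); // reads a full surrogate pair as one code point
            if (cp > 127 || cp == '"' || cp == '\'' || cp == '<' || cp == '>' || cp == '&') {
                out.append("&#").append(cp).append(';'); // decimal numeric character reference
            } else {
                out.append((char) cp); // plain ASCII, copied as-is
            }
            i += Character.charCount(cp); // advance by 1 or 2 chars
        }
        return out.toString();
    }
}
```

With this variant, a string holding the single code point U+1F600 is emitted as one reference, &#128512;, rather than two.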

Answer 9

Answered by okrasz

For those who use Google Guava:

import com.google.common.html.HtmlEscapers;
// ...
String source = "The less than sign (<) and ampersand (&) must be escaped before using them in HTML";
String escaped = HtmlEscapers.htmlEscaper().escape(source);

Answer 10

Answered by Luca Stancapiano

org.apache.commons.lang3.StringEscapeUtils is now deprecated. Use org.apache.commons.text.StringEscapeUtils instead, by adding the following dependency:

    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-text</artifactId>
        <version>${commons.text.version}</version>
    </dependency>
