
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/30620543/

Time: 2020-11-02 17:23:07  Source: igfitidea

Java: how to encode single quotes and double quotes as HTML entities?

java html

Asked by GMsoF

How can I encode " into &quot; and ' into &#39;?

I am quite surprised that the single quote and the double quote are not defined in HTML Entities 4.0, so StringEscapeUtils is not able to escape these two characters into their respective entities.

Is there any other string-related tool that can do this?

Is there any reason why the single quote and the double quote are not defined in HTML Entities 4.0?

Besides the single quote and the double quote, is there any framework that can encode all Unicode characters into their respective entities? Since any Unicode character can be translated manually into a decimal entity and displayed in HTML, I wonder whether there is a tool that does the conversion automatically.

Answered by learningloop

  1. Single quote and double quote not defined in HTML 4.0

Only the single quote is undefined in HTML 4.0; the double quote has been defined as &quot; since HTML 2.0.
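As a minimal sketch of what this means in practice, the snippet below escapes only the two quote characters using plain JDK string replacement. The class and method names (QuoteEscaper, escapeQuotes) are hypothetical and not part of any library. It assumes the decimal reference &#39; is preferred for the single quote, since HTML 4.0 has no named entity for it, and it deliberately leaves &, < and > untouched.

```java
class QuoteEscaper {
    // Hypothetical helper: replaces only the two quote characters.
    // The double quote uses the named entity defined since HTML 2.0;
    // the single quote uses a decimal reference because HTML 4.0
    // defines no named entity for it.
    static String escapeQuotes(String s) {
        return s.replace("\"", "&quot;").replace("'", "&#39;");
    }

    public static void main(String[] args) {
        System.out.println(escapeQuotes("It's \"good\"")); // It&#39;s &quot;good&quot;
    }
}
```

For real HTML output the other special characters usually need escaping as well, so this is only suitable when the input is known to contain no markup.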

  2. StringEscapeUtils not able to escape these 2 characters into respective entities

escapeXml11 in StringEscapeUtils supports converting the single quote into &apos;.

For example:

StringEscapeUtils.escapeXml11("'"); // Returns &apos;
StringEscapeUtils.escapeHtml4("\""); // Returns &quot;

  3. Is there any other String related tool able to do this?

HtmlUtils from the Spring Framework takes care of single quotes and double quotes; it can also convert the values to decimal references (such as &#39; and &#34;). The following example is taken from an answer to a related question:

import org.springframework.web.util.HtmlUtils;
[...]
HtmlUtils.htmlEscapeDecimal("&"); // gives &#38;
HtmlUtils.htmlEscape("&"); // gives &amp;

  4. Any reason why single quote and double quote is not defined in HTML Entities 4.0?

As per "Character entity references in HTML 4", the single quote is not defined. The double quote has been available since HTML 2.0, whereas the single quote (&apos;) is supported as part of XHTML 1.0.
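One practical consequence: text escaped with an XML-oriented escaper contains &apos;, which an HTML 4 consumer is not required to recognize. The sketch below shows one possible workaround, rewriting that named entity as the decimal reference &#39;. The class and method names (AposNormalizer, aposToDecimal) are hypothetical, and the input string stands in for the output of an XML escaper such as escapeXml11.

```java
class AposNormalizer {
    // Hypothetical helper: rewrites the XML/XHTML named entity &apos;
    // as a decimal character reference, which HTML 4 also understands.
    static String aposToDecimal(String escaped) {
        return escaped.replace("&apos;", "&#39;");
    }

    public static void main(String[] args) {
        // Assumed input: text that has already been XML-escaped.
        System.out.println(aposToDecimal("It&apos;s &quot;good&quot;")); // It&#39;s &quot;good&quot;
    }
}
```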

  5. Tool or method to encode all Unicode characters into their respective entities

A very good and simple Java implementation is mentioned in an answer to a related question.

The following is a sample program based on that answer:

import org.apache.commons.lang3.StringEscapeUtils;

public class HTMLCharacterEscaper {
    public static void main(String[] args) {        
        //With StringEscapeUtils
        System.out.println("Using SEU: " + StringEscapeUtils.escapeHtml4("\" ?"));
        System.out.println("Using SEU: " + StringEscapeUtils.escapeXml11("'"));

        //Single quote & double quote
        System.out.println(escapeHTML("It's good"));
        System.out.println(escapeHTML("\" Grit \""));

        //Unicode characters
        System.out.println(escapeHTML("This is copyright symbol ©"));
        System.out.println(escapeHTML("Paragraph symbol ¶"));
        System.out.println(escapeHTML("This is pound £"));      
    }

    public static String escapeHTML(String s) {
        StringBuilder out = new StringBuilder(Math.max(16, s.length()));
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 127 || c == '"' || c == '<' || c == '>' || c == '&' || c == '\'') {
                out.append("&#");
                out.append((int) c);
                out.append(';');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

}
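
The escapeHTML method above walks the string one char at a time, so a character outside the Basic Multilingual Plane (stored in Java as a surrogate pair) would come out as two separate references, one per surrogate. The variant below is a sketch of the same approach that iterates by code point instead. The class name CodePointEscaper is hypothetical, and the set of escaped characters is copied from the program above.

```java
class CodePointEscaper {
    // Same rule as escapeHTML above (non-ASCII, quotes, <, >, & become
    // decimal references), but iterating over Unicode code points so that
    // a surrogate pair yields a single reference.
    static String escapeHtml(String s) {
        StringBuilder out = new StringBuilder(Math.max(16, s.length()));
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp > 127 || cp == '"' || cp == '\'' || cp == '<' || cp == '>' || cp == '&') {
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp); // advance by 1 or 2 chars
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeHtml("It's good"));    // It&#39;s good
        System.out.println(escapeHtml("\u00A9 2015"));   // &#169; 2015
        System.out.println(escapeHtml("\uD83D\uDE00"));  // &#128512;
    }
}
```

An unpaired surrogate in the input is still emitted as its own numeric reference, so input that may be malformed would need separate validation.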

The following are some interesting links that I came across while researching this answer: