How to unescape XML in Java
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/2833956/
Asked by Bas Hendriks
I need to unescape an XML string containing escaped XML tags:
&lt;
&gt;
&amp;
etc...
I did find some libraries that can do this, but I'd rather use a single method.
Can someone help?
cheers, Bas Hendriks
Accepted answer by Bozho
Answer by texclayton
Here's a simple method to unescape XML. It handles the predefined XML entities and decimal numerical entities (&#nnnn;). Modifying it to handle hex entities (&#xhhhh;) should be simple.
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static String unescapeXML( final String xml )
{
    Pattern xmlEntityRegex = Pattern.compile( "&(#?)([^;]+);" );
    //Before Java 9, Matcher.appendReplacement requires a StringBuffer instead of a StringBuilder
    StringBuffer unescapedOutput = new StringBuffer( xml.length() );
    Matcher m = xmlEntityRegex.matcher( xml );
    Map<String,String> builtinEntities = null;
    String entity;
    String hashmark;
    String ent;
    int code;
    while ( m.find() ) {
        ent = m.group(2);
        hashmark = m.group(1);
        if ( (hashmark != null) && (hashmark.length() > 0) ) {
            try {
                code = Integer.parseInt( ent );
                //toChars also covers code points above U+FFFF, which a (char) cast would truncate
                entity = new String( Character.toChars( code ) );
            } catch ( IllegalArgumentException e ) {
                //not a valid decimal code point (e.g. a hex entity) - leave it unchanged
                entity = m.group();
            }
        } else {
            //must be a non-numerical entity
            if ( builtinEntities == null ) {
                builtinEntities = buildBuiltinXMLEntityMap();
            }
            entity = builtinEntities.get( ent );
            if ( entity == null ) {
                //not a known entity - ignore it
                entity = "&" + ent + ';';
            }
        }
        //quoteReplacement keeps '$' and '\' in the entity from being read as group references
        m.appendReplacement( unescapedOutput, Matcher.quoteReplacement( entity ) );
    }
    m.appendTail( unescapedOutput );
    return unescapedOutput.toString();
}

//Maps the five predefined XML entity names to their replacement text
private static Map<String,String> buildBuiltinXMLEntityMap()
{
    Map<String,String> entities = new HashMap<String,String>(10);
    entities.put( "lt", "<" );
    entities.put( "gt", ">" );
    entities.put( "amp", "&" );
    entities.put( "apos", "'" );
    entities.put( "quot", "\"" );
    return entities;
}
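The answer says that handling hex entities (&#xhhhh;) should be a simple modification. One possible way to do it is sketched below. This is an illustration rather than the author's code: the class name NumericEntityDemo and the method name unescapeNumericEntities are made up for this example, and the sketch only covers numeric references, leaving named entities such as &amp; untouched.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are not from the original answer.
class NumericEntityDemo {
    // Group 1 captures hex digits (&#x41;), group 2 captures decimal digits (&#65;).
    private static final Pattern NUMERIC_ENTITY =
            Pattern.compile("&#(?:x([0-9a-fA-F]+)|([0-9]+));");

    static String unescapeNumericEntities(String xml) {
        Matcher m = NUMERIC_ENTITY.matcher(xml);
        StringBuffer out = new StringBuffer(xml.length());
        while (m.find()) {
            String replacement;
            try {
                int code = (m.group(1) != null)
                        ? Integer.parseInt(m.group(1), 16)
                        : Integer.parseInt(m.group(2), 10);
                // Leave the reference as written if it is not a valid code point.
                replacement = Character.isValidCodePoint(code)
                        ? new String(Character.toChars(code))
                        : m.group();
            } catch (NumberFormatException e) {
                // Too many digits to fit in an int: leave the reference as written.
                replacement = m.group();
            }
            // quoteReplacement stops '$' and '\' from being treated as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: ABC costs $5
        System.out.println(unescapeNumericEntities("&#x41;&#66;&#x43; costs &#36;5"));
    }
}
```

Matching the digits in the regex itself means malformed input such as &#xZZ; is simply never matched, so it passes through unchanged instead of raising a NumberFormatException.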
Answer by msangel
If you work with JSP, use su:unescapeXml from openutils-elfunctions
Answer by Balazs Zsoldos
Here is one that I wrote in ten minutes. It does not use regular expressions, only simple iterations. I do not think that this can be enhanced to be much faster.
public static String unescape(final String text) {
    StringBuilder result = new StringBuilder(text.length());
    int i = 0;
    int n = text.length();
    while (i < n) {
        char charAt = text.charAt(i);
        if (charAt != '&') {
            result.append(charAt);
            i++;
        } else {
            // Compare against each predefined entity and skip its full length on a match.
            if (text.startsWith("&amp;", i)) {
                result.append('&');
                i += 5;
            } else if (text.startsWith("&apos;", i)) {
                result.append('\'');
                i += 6;
            } else if (text.startsWith("&quot;", i)) {
                result.append('"');
                i += 6;
            } else if (text.startsWith("&lt;", i)) {
                result.append('<');
                i += 4;
            } else if (text.startsWith("&gt;", i)) {
                result.append('>');
                i += 4;
            } else {
                // Not a known entity: keep the '&' instead of silently dropping it.
                result.append(charAt);
                i++;
            }
        }
    }
    return result.toString();
}
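To make the behaviour of this kind of single-pass scan easier to check, here is a compact, self-contained restatement. It is only a sketch: the class name IterativeUnescapeDemo and the lookup arrays are assumptions made for this example, and it assumes an unrecognized '&' should be copied through unchanged.

```java
// Hypothetical demo class restating the iterative scan with a lookup table.
class IterativeUnescapeDemo {
    // The five predefined XML entities and the characters they stand for.
    private static final String[] NAMES = {"&amp;", "&apos;", "&quot;", "&lt;", "&gt;"};
    private static final char[] CHARS = {'&', '\'', '"', '<', '>'};

    static String unescape(String text) {
        StringBuilder result = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            int matched = -1;
            if (text.charAt(i) == '&') {
                // Look for a predefined entity starting at position i.
                for (int k = 0; k < NAMES.length; k++) {
                    if (text.startsWith(NAMES[k], i)) {
                        matched = k;
                        break;
                    }
                }
            }
            if (matched >= 0) {
                result.append(CHARS[matched]);
                i += NAMES[matched].length();
            } else {
                // Ordinary character, or an '&' that starts no known entity.
                result.append(text.charAt(i));
                i++;
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        // prints: <a href="x">Tom & Jerry</a>
        System.out.println(unescape("&lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;"));
        // prints: &lt; &unknown; &
        System.out.println(unescape("&amp;lt; &unknown; &"));
    }
}
```

The second call shows why a single pass matters: &amp;lt; becomes &lt; and is not unescaped a second time, which repeated String.replace calls in the wrong order would do.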