C# WebUtility.HtmlDecode vs HttpUtility.HtmlDecode

Note: this page is a Chinese/English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must comply with the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/17352981/

Time: 2020-08-10 09:16:46  Source: igfitidea

WebUtility.HtmlDecode vs HttpUtility.HtmlDecode

Tags: c#, html, .net, windows-phone

Asked by zi3guw

I was using WebUtility.HtmlDecode to decode HTML. It turns out that it doesn't decode properly: for example, the HTML entity for an en dash (already rendered as "–" in this copy) is supposed to decode to a "–" character, but WebUtility.HtmlDecode does not decode it. HttpUtility.HtmlDecode, however, does.

Debug.WriteLine(WebUtility.HtmlDecode("–"));
Debug.WriteLine(HttpUtility.HtmlDecode("–"));


> –
> –

[Image: decode screenshot]

The documentation for both of these is the same: Converts a string that has been HTML-encoded for HTTP transmission into a decoded string.

Why are they different, which one should I be using, and what will change if I switch to HttpUtility.HtmlDecode to get "–" to decode correctly?

Accepted answer by Kevin Gosse

The implementations of the two methods are indeed different on Windows Phone.

WebUtility.HtmlDecode:

public static void HtmlDecode(string value, TextWriter output)
{
    if (value != null)
    {
        if (output == null)
        {
            throw new ArgumentNullException("output");
        }
        if (!StringRequiresHtmlDecoding(value))
        {
            output.Write(value);
        }
        else
        {
            int length = value.Length;
            for (int i = 0; i < length; i++)
            {
                bool flag;
                uint num4;
                char ch = value[i];
                if (ch != '&')
                {
                    goto Label_01B6;
                }
                int num3 = value.IndexOfAny(_htmlEntityEndingChars, i + 1);
                if ((num3 <= 0) || (value[num3] != ';'))
                {
                    goto Label_01B6;
                }
                string entity = value.Substring(i + 1, (num3 - i) - 1);
                if ((entity.Length <= 1) || (entity[0] != '#'))
                {
                    goto Label_0188;
                }
                if ((entity[1] == 'x') || (entity[1] == 'X'))
                {
                    flag = uint.TryParse(entity.Substring(2), NumberStyles.AllowHexSpecifier, NumberFormatInfo.InvariantInfo, out num4);
                }
                else
                {
                    flag = uint.TryParse(entity.Substring(1), NumberStyles.Integer, NumberFormatInfo.InvariantInfo, out num4);
                }
                if (flag)
                {
                    switch (_htmlDecodeConformance)
                    {
                        case UnicodeDecodingConformance.Strict:
                            flag = (num4 < 0xd800) || ((0xdfff < num4) && (num4 <= 0x10ffff));
                            goto Label_0151;

                        case UnicodeDecodingConformance.Compat:
                            flag = (0 < num4) && (num4 <= 0xffff);
                            goto Label_0151;

                        case UnicodeDecodingConformance.Loose:
                            flag = num4 <= 0x10ffff;
                            goto Label_0151;
                    }
                    flag = false;
                }
            Label_0151:
                if (!flag)
                {
                    goto Label_01B6;
                }
                if (num4 <= 0xffff)
                {
                    output.Write((char) num4);
                }
                else
                {
                    char ch2;
                    char ch3;
                    ConvertSmpToUtf16(num4, out ch2, out ch3);
                    output.Write(ch2);
                    output.Write(ch3);
                }
                i = num3;
                goto Label_01BD;
            Label_0188:
                i = num3;
                char ch4 = HtmlEntities.Lookup(entity);
                if (ch4 != '\0')
                {
                    ch = ch4;
                }
                else
                {
                    output.Write('&');
                    output.Write(entity);
                    output.Write(';');
                    goto Label_01BD;
                }
            Label_01B6:
                output.Write(ch);
            Label_01BD:;
            }
        }
    }
}

HttpUtility.HtmlDecode:

public static string HtmlDecode(string html)
{
    if (html == null)
    {
        return null;
    }
    if (html.IndexOf('&') < 0)
    {
        return html;
    }
    StringBuilder sb = new StringBuilder();
    StringWriter writer = new StringWriter(sb, CultureInfo.InvariantCulture);
    int length = html.Length;
    for (int i = 0; i < length; i++)
    {
        char ch = html[i];
        if (ch == '&')
        {
            int num3 = html.IndexOfAny(s_entityEndingChars, i + 1);
            if ((num3 > 0) && (html[num3] == ';'))
            {
                string entity = html.Substring(i + 1, (num3 - i) - 1);
                if ((entity.Length > 1) && (entity[0] == '#'))
                {
                    try
                    {
                        if ((entity[1] == 'x') || (entity[1] == 'X'))
                        {
                            ch = (char) int.Parse(entity.Substring(2), NumberStyles.AllowHexSpecifier, CultureInfo.InvariantCulture);
                        }
                        else
                        {
                            ch = (char) int.Parse(entity.Substring(1), CultureInfo.InvariantCulture);
                        }
                        i = num3;
                    }
                    catch (FormatException)
                    {
                        i++;
                    }
                    catch (ArgumentException)
                    {
                        i++;
                    }
                }
                else
                {
                    i = num3;
                    char ch2 = HtmlEntities.Lookup(entity);
                    if (ch2 != '\0')
                    {
                        ch = ch2;
                    }
                    else
                    {
                        writer.Write('&');
                        writer.Write(entity);
                        writer.Write(';');
                        continue;
                    }
                }
            }
        }
        writer.Write(ch);
    }
    return sb.ToString();
}
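
To make the numeric-entity branch of the decompiled WebUtility code easier to follow, here is a minimal sketch of just that step without the goto labels. The names `Conformance`, `NumericEntitySketch`, `IsAllowed` and `TryDecode` are hypothetical stand-ins made up for illustration (the real code uses `UnicodeDecodingConformance` and `ConvertSmpToUtf16`), and which conformance mode is active on a given platform is not something this sketch knows.

```csharp
using System;
using System.Globalization;

// Hypothetical stand-in for UnicodeDecodingConformance in the listing.
public enum Conformance { Strict, Compat, Loose }

public static class NumericEntitySketch
{
    // Mirrors the three range checks in the switch statement of the listing.
    public static bool IsAllowed(uint codePoint, Conformance mode)
    {
        switch (mode)
        {
            case Conformance.Strict:
                // Any code point up to U+10FFFF except the surrogate range.
                return codePoint < 0xD800 || (codePoint > 0xDFFF && codePoint <= 0x10FFFF);
            case Conformance.Compat:
                // Only non-zero code points that fit in a single UTF-16 unit.
                return codePoint > 0 && codePoint <= 0xFFFF;
            case Conformance.Loose:
                return codePoint <= 0x10FFFF;
            default:
                return false;
        }
    }

    // body is the text between '&' and ';', for example "#8211" or "#x2013".
    public static bool TryDecode(string body, Conformance mode, out string decoded)
    {
        decoded = null;
        if (body == null || body.Length < 2 || body[0] != '#')
        {
            return false;
        }
        bool isHex = body[1] == 'x' || body[1] == 'X';
        uint codePoint;
        bool parsed = isHex
            ? uint.TryParse(body.Substring(2), NumberStyles.AllowHexSpecifier, NumberFormatInfo.InvariantInfo, out codePoint)
            : uint.TryParse(body.Substring(1), NumberStyles.Integer, NumberFormatInfo.InvariantInfo, out codePoint);
        if (!parsed || !IsAllowed(codePoint, mode))
        {
            return false;
        }
        // One UTF-16 unit for the BMP, a surrogate pair above it.
        decoded = codePoint <= 0xFFFF
            ? ((char)codePoint).ToString()
            : char.ConvertFromUtf32((int)codePoint);
        return true;
    }
}
```

Under these assumptions, `TryDecode("#8211", Conformance.Compat, out s)` succeeds and yields the en dash, while a code point above U+FFFF is rejected in `Compat` mode and accepted in `Strict` mode.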

Interestingly, WebUtility doesn't exist on WP7. Also, the WP8 implementation of WebUtility is identical to the desktop one. The desktop implementation of HttpUtility.HtmlDecode is just a wrapper around WebUtility.HtmlDecode. Last but not least, Silverlight 5 has the same implementation of HttpUtility.HtmlDecode as Windows Phone, and does not implement WebUtility.

From there, I can venture a guess: since the Windows Phone 7 runtime is based on Silverlight, WP7 inherited the Silverlight version of HttpUtility.HtmlDecode, and WebUtility wasn't present. Then came WP8, whose runtime is based on WinRT. WinRT brought WebUtility, and the old version of HttpUtility.HtmlDecode was kept to ensure compatibility with legacy WP7 apps.

As for which one you should use: if you want to target WP7, you have no choice but to use HttpUtility.HtmlDecode. If you're targeting WP8, just pick the method whose behavior best suits your needs. WebUtility is probably the future-proof choice, in case Microsoft decides to ditch the Silverlight runtime in an upcoming version of Windows Phone. But I'd go with the practical choice of HttpUtility, so that you don't have to worry about manually supporting the example from your question.
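
For readers who do want to stay on WebUtility, "manually supporting" numeric entities could look roughly like the sketch below. This is not code from the answer: `HtmlDecodeFallback`, `Decode` and `DecodeOne` are hypothetical names, and the sketch assumes that named entities can still be handed to `WebUtility.HtmlDecode` one at a time. Each entity is decoded exactly once in a single pass, so input such as `&amp;lt;` is not decoded twice.

```csharp
using System;
using System.Globalization;
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlDecodeFallback
{
    // Matches &#xHEX; or &#DEC; or &name; (digit counts capped so int.Parse cannot overflow).
    private static readonly Regex Entity = new Regex(
        @"&(?:#[xX](?<hex>[0-9a-fA-F]{1,6})|#(?<dec>[0-9]{1,7})|(?<name>[A-Za-z][A-Za-z0-9]*));",
        RegexOptions.CultureInvariant);

    public static string Decode(string html)
    {
        if (string.IsNullOrEmpty(html))
        {
            return html;
        }
        // Regex.Replace does not rescan replaced text, so nothing is decoded twice.
        return Entity.Replace(html, DecodeOne);
    }

    private static string DecodeOne(Match m)
    {
        if (m.Groups["name"].Success)
        {
            // Named entities are delegated to the framework, one entity at a time.
            return WebUtility.HtmlDecode(m.Value);
        }
        int codePoint = m.Groups["hex"].Success
            ? int.Parse(m.Groups["hex"].Value, NumberStyles.AllowHexSpecifier, CultureInfo.InvariantCulture)
            : int.Parse(m.Groups["dec"].Value, CultureInfo.InvariantCulture);
        bool valid = codePoint > 0 && codePoint <= 0x10FFFF
            && (codePoint < 0xD800 || codePoint > 0xDFFF);
        // Invalid references are left untouched rather than guessed at.
        return valid ? char.ConvertFromUtf32(codePoint) : m.Value;
    }
}
```

The design choice here is to reject surrogates and out-of-range values instead of casting blindly to `char`, which is stricter than the Silverlight-style HttpUtility listing.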

Answer by Jan Dobkowski

The two methods do exactly the same thing. Moreover, if you decompile them, the implementations look as if one was simply copied from the other.

The only difference is the intended use. HttpUtility is contained in the System.Web assembly and is expected to be used in ASP.NET applications, which are built on top of this assembly. WebUtility is contained in the System assembly, which is referenced by nearly all applications, and is provided for more general-purpose or client use.
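
As a quick way to check this claim on a given platform, the two calls can be compared side by side. The sketch below assumes a project that references both assemblies (System.Web is not available everywhere, notably not in every client profile); `DecodeComparison` and `SameResult` are hypothetical names.

```csharp
using System;
using System.Net;   // WebUtility
using System.Web;   // HttpUtility (needs a reference to System.Web or its equivalent)

public static class DecodeComparison
{
    // Returns true when both decoders produce the same string for this input.
    public static bool SameResult(string encoded)
    {
        string viaWebUtility = WebUtility.HtmlDecode(encoded);
        string viaHttpUtility = HttpUtility.HtmlDecode(encoded);
        return string.Equals(viaWebUtility, viaHttpUtility, StringComparison.Ordinal);
    }
}
```

Calling `DecodeComparison.SameResult(...)` with the inputs you care about shows whether the two methods agree on your target; as the accepted answer points out, they are not guaranteed to agree on Windows Phone.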

Answer by crea7or

Just a note for others who find this through search: use either function mentioned in the question, but never use Windows.Data.Html.HtmlUtilities.ConvertToText(string input). It is 70 times slower than WebUtility.HtmlDecode and causes crashes! The crash is reported as mshtml!IEPeekMessage in the DevCenter. It looks like this function calls Internet Explorer to convert the string. Just avoid it.
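
A speed difference like this can be checked with a small Stopwatch harness. The sketch below is an assumption about how one might measure it, not how the figure above was obtained; `DecodeTiming` and `Measure` are hypothetical names, and the decoder is passed in as a delegate so the harness itself does not depend on WinRT.

```csharp
using System;
using System.Diagnostics;

public static class DecodeTiming
{
    // Times `iterations` calls of the given decoder on the same input.
    public static TimeSpan Measure(Func<string, string> decode, string input, int iterations)
    {
        decode(input); // warm-up call so one-time JIT cost is not included

        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            decode(input);
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

For example, `DecodeTiming.Measure(System.Net.WebUtility.HtmlDecode, "&lt;p&gt;text&lt;/p&gt;", 100000)` gives a baseline; on a platform where `HtmlUtilities.ConvertToText` is available, it can be passed the same way and the two elapsed times compared.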
