Note: this page is based on a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Stack Overflow original: http://stackoverflow.com/questions/575440/

Time: 2020-08-04 08:52:58  Source: igfitidea

URL Encoding using C#

Tags: c#, .net, urlencode

Asked by masfenix

I have an application which sends a POST request to the VB forum software and logs someone in (without setting cookies or anything).

Once the user is logged in I create a variable that creates a path on their local machine.

c:\tempfolder\date\username

The problem is that some usernames throw an "Illegal chars" exception. For example, if my username were mas|fenix, it would throw an exception.

string path = Path.Combine(
  Environment.GetFolderPath(Environment.SpecialFolder.CommonApplicationData),
  DateTime.Now.ToString("ddMMyyhhmm") + "-" + form1.username);

I don't want to remove it from the string, but a folder with their username is created through FTP on a server. And this leads to my second question. If I am creating a folder on the server can I leave the "illegal chars" in? I only ask this because the server is Linux based, and I am not sure if Linux accepts it or not.
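
One way to see which names the local file system would reject is to test them against Path.GetInvalidFileNameChars(). The sketch below is only an illustration; the class and method names are made up for this example. The set that method returns depends on the platform the code runs on, so a check made on a Windows client does not settle what the Linux server will accept.

```csharp
using System.IO;

static class FileNameCheck
{
    // Hypothetical helper: true if the name contains a character that the
    // current platform reports as invalid in file names.
    public static bool HasInvalidChars(string name)
    {
        return name.IndexOfAny(Path.GetInvalidFileNameChars()) >= 0;
    }
}
```

On Windows, FileNameCheck.HasInvalidChars("mas|fenix") should return true, because '|' is in the Windows set.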

EDIT: It seems that URL encode is NOT what I want.. Here's what I want to do:

old username = mas|fenix
new username = mas%xxfenix

Where %xx is the ASCII value or any other value that would easily identify the character.
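
The mas|fenix to mas%xxfenix mapping described above can be sketched with Uri.HexEscape, which turns a single character into its %XX form. This is a minimal sketch, not the asker's code: the class name, the method name and the list of unsafe characters are assumptions chosen for illustration (the Windows-reserved file-name characters plus '%' itself, so the result can be decoded unambiguously).

```csharp
using System;
using System.Text;

static class UserNameEscaper
{
    // Assumed set: characters Windows rejects in file names, plus '%'.
    const string Unsafe = "%<>:\"/\\|?*";

    public static string Escape(string name)
    {
        var sb = new StringBuilder();
        foreach (char c in name)
        {
            // Control characters (below 32) are escaped as well.
            if (c < 32 || Unsafe.IndexOf(c) >= 0)
            {
                sb.Append(Uri.HexEscape(c));
            }
            else
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }
}
```

With this sketch, UserNameEscaper.Escape("mas|fenix") gives mas%7Cfenix.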

Accepted answer by Gregory A Beamer

Edit: Note that this answer is now out of date. See Siarhei Kuchuk's answer below for a better fix.

UrlEncoding will do what you are suggesting here. With C#, you simply use HttpUtility, as mentioned.

You can also use a Regex to find the illegal characters and then replace them, but this gets far more complex, as you will need some form of state machine (switch ... case, for example) to substitute the correct characters. Since UrlEncode does this up front, it is rather easy.

As for Linux versus Windows, there are some characters that are acceptable in Linux but not in Windows. I would not worry about that, though, as the folder name can be recovered by decoding the URL-encoded string with UrlDecode, so you can round-trip the changes.
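
The round trip mentioned above might look like the following sketch. The class and method names are invented for illustration, and on .NET Framework the project needs a reference to System.Web. Per the table in Simon Tewsi's answer further down, UrlEncode leaves some characters (such as *) untouched, so this alone does not guarantee a name that Windows accepts.

```csharp
using System.Web;

static class FolderNameRoundTrip
{
    // Hypothetical helper: user name -> URL-encoded folder name.
    public static string ToFolderName(string userName)
    {
        return HttpUtility.UrlEncode(userName);
    }

    // Hypothetical helper: folder name -> original user name.
    public static string ToUserName(string folderName)
    {
        return HttpUtility.UrlDecode(folderName);
    }
}
```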

Answer by teedyay

Url Encoding is easy in .NET. Use:

System.Web.HttpUtility.UrlEncode(string url)

If that'll be decoded to get the folder name, you'll still need to exclude characters that can't be used in folder names (*, ?, /, etc.)

Answer by Dan Herbert

You should encode only the user name or other part of the URL that could be invalid. URL encoding a URL can lead to problems since something like this:

string url = HttpUtility.UrlEncode("http://www.google.com/search?q=Example");

Will yield

http%3a%2f%2fwww.google.com%2fsearch%3fq%3dExample

This is obviously not going to work well. Instead, you should encode ONLY the value of the key/value pair in the query string, like this:

string url = "http://www.google.com/search?q=" + HttpUtility.UrlEncode("Example");

Hopefully that helps. Also, as teedyay mentioned, you'll still need to make sure illegal file-name characters are removed, or else the file system won't like the path.

Answer by useful

If you can't see System.Web, change your project settings. The target framework should be ".NET Framework 4" instead of ".NET Framework 4 Client Profile".

Answer by Siarhei Kuchuk

A better way is to use

Uri.EscapeUriString

so that you do not need to reference the Full Profile of .NET 4.
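
Uri.EscapeUriString is aimed at whole URIs and leaves reserved characters such as / and ? in place (see the table in Simon Tewsi's answer). For a single value such as a user name, its sibling Uri.EscapeDataString, which lives in the same System assembly, may be the closer fit. The sketch below swaps in that method; the helper name is made up for illustration.

```csharp
using System;

static class UriEscapeDemo
{
    // Hypothetical helper: escapes a single value, not a whole URL.
    public static string EscapeValue(string value)
    {
        return Uri.EscapeDataString(value);
    }
}
```

For example, UriEscapeDemo.EscapeValue("mas|fenix") gives mas%7Cfenix, and a space becomes %20.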

Answer by Simon Tewsi

I've been experimenting with the various methods .NET provides for URL encoding. Perhaps the following table will be useful (it is the output from a test app I wrote):

Unencoded UrlEncoded UrlEncodedUnicode UrlPathEncoded EscapedDataString EscapedUriString HtmlEncoded HtmlAttributeEncoded HexEscaped
A         A          A                 A              A                 A                A           A                    %41
B         B          B                 B              B                 B                B           B                    %42

a         a          a                 a              a                 a                a           a                    %61
b         b          b                 b              b                 b                b           b                    %62

0         0          0                 0              0                 0                0           0                    %30
1         1          1                 1              1                 1                1           1                    %31

[space]   +          +                 %20            %20               %20              [space]     [space]              %20
!         !          !                 !              !                 !                !           !                    %21
"         %22        %22               "              %22               %22              &quot;      &quot;               %22
#         %23        %23               #              %23               #                #           #                    %23
$         %24        %24               $              %24               $                $           $                    %24
%         %25        %25               %              %25               %25              %           %                    %25
&         %26        %26               &              %26               &                &amp;       &amp;                %26
'         %27        %27               '              '                 '                &#39;       &#39;                %27
(         (          (                 (              (                 (                (           (                    %28
)         )          )                 )              )                 )                )           )                    %29
*         *          *                 *              %2A               *                *           *                    %2A
+         %2b        %2b               +              %2B               +                +           +                    %2B
,         %2c        %2c               ,              %2C               ,                ,           ,                    %2C
-         -          -                 -              -                 -                -           -                    %2D
.         .          .                 .              .                 .                .           .                    %2E
/         %2f        %2f               /              %2F               /                /           /                    %2F
:         %3a        %3a               :              %3A               :                :           :                    %3A
;         %3b        %3b               ;              %3B               ;                ;           ;                    %3B
<         %3c        %3c               <              %3C               %3C              &lt;        &lt;                 %3C
=         %3d        %3d               =              %3D               =                =           =                    %3D
>         %3e        %3e               >              %3E               %3E              &gt;        >                    %3E
?         %3f        %3f               ?              %3F               ?                ?           ?                    %3F
@         %40        %40               @              %40               @                @           @                    %40
[         %5b        %5b               [              %5B               %5B              [           [                    %5B
\         %5c        %5c               \              %5C               %5C              \           \                    %5C
]         %5d        %5d               ]              %5D               %5D              ]           ]                    %5D
^         %5e        %5e               ^              %5E               %5E              ^           ^                    %5E
_         _          _                 _              _                 _                _           _                    %5F
`         %60        %60               `              %60               %60              `           `                    %60
{         %7b        %7b               {              %7B               %7B              {           {                    %7B
|         %7c        %7c               |              %7C               %7C              |           |                    %7C
}         %7d        %7d               }              %7D               %7D              }           }                    %7D
~         %7e        %7e               ~              ~                 ~                ~           ~                    %7E

Ā         %c4%80     %u0100            %c4%80         %C4%80            %C4%80           Ā           Ā                    [OoR]
ā         %c4%81     %u0101            %c4%81         %C4%81            %C4%81           ā           ā                    [OoR]
Ē         %c4%92     %u0112            %c4%92         %C4%92            %C4%92           Ē           Ē                    [OoR]
ē         %c4%93     %u0113            %c4%93         %C4%93            %C4%93           ē           ē                    [OoR]
Ī         %c4%aa     %u012a            %c4%aa         %C4%AA            %C4%AA           Ī           Ī                    [OoR]
ī         %c4%ab     %u012b            %c4%ab         %C4%AB            %C4%AB           ī           ī                    [OoR]
Ō         %c5%8c     %u014c            %c5%8c         %C5%8C            %C5%8C           Ō           Ō                    [OoR]
ō         %c5%8d     %u014d            %c5%8d         %C5%8D            %C5%8D           ō           ō                    [OoR]
Ū         %c5%aa     %u016a            %c5%aa         %C5%AA            %C5%AA           Ū           Ū                    [OoR]
ū         %c5%ab     %u016b            %c5%ab         %C5%AB            %C5%AB           ū           ū                    [OoR]

The columns represent encodings as follows:

  • UrlEncoded: HttpUtility.UrlEncode

  • UrlEncodedUnicode: HttpUtility.UrlEncodeUnicode

  • UrlPathEncoded: HttpUtility.UrlPathEncode

  • EscapedDataString: Uri.EscapeDataString

  • EscapedUriString: Uri.EscapeUriString

  • HtmlEncoded: HttpUtility.HtmlEncode

  • HtmlAttributeEncoded: HttpUtility.HtmlAttributeEncode

  • HexEscaped: Uri.HexEscape

NOTES:

  1. HexEscape can only handle the first 256 characters (code points 0 to 255). Therefore it throws an ArgumentOutOfRangeException for the Latin Extended-A characters (e.g. ā).

  2. This table was generated in .NET 4.0 (see Levi Botelho's comment below that says the encoding in .NET 4.5 is slightly different).
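
If characters above 255 need a %XX form too, one option is to fall back to percent-encoding their UTF-8 bytes, which gives the same result as the EscapedDataString column for those rows. This is a sketch under that assumption, not part of the test app; the class and method names are hypothetical.

```csharp
using System;
using System.Text;

static class HexEscapeFallback
{
    // Hypothetical helper: uses Uri.HexEscape for characters up to 255 and
    // percent-encodes the UTF-8 bytes of anything above that.
    // Assumes c is not a lone surrogate.
    public static string Escape(char c)
    {
        if (c <= 255)
        {
            return Uri.HexEscape(c);
        }
        var sb = new StringBuilder();
        foreach (byte b in Encoding.UTF8.GetBytes(new[] { c }))
        {
            sb.Append('%').Append(b.ToString("X2"));
        }
        return sb.ToString();
    }
}
```

With this sketch, 'A' still gives %41, while 'ā' (U+0101) gives %C4%81 instead of throwing.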

EDIT:

I've added a second table with the encodings for .NET 4.5. See this answer: https://stackoverflow.com/a/21771206/216440

EDIT 2:

Since people seem to appreciate these tables, I thought you might like the source code that generates the table, so you can play around yourselves. It's a simple C# console application, which can target either .NET 4.0 or 4.5:

using System;
using System.Collections.Generic;
using System.Text;
// Need to add a Reference to the System.Web assembly.
using System.Web;

namespace UriEncodingDEMO2
{
    class Program
    {
        static void Main(string[] args)
        {
            EncodeStrings();

            Console.WriteLine();
            Console.WriteLine("Press any key to continue...");
            Console.Read();
        }

        public static void EncodeStrings()
        {
            string stringToEncode = "ABCD" + "abcd"
            + "0123" + " !\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~" + "ĀāĒēĪīŌōŪū";

            // Need to set the console encoding to display non-ASCII characters correctly (eg the 
            //  Latin Extended-A characters such as ĀāĒē...).
            Console.OutputEncoding = Encoding.UTF8;

            // Will also need to set the console font (in the console Properties dialog) to a font 
            //  that displays the extended character set correctly.
            // The following fonts all display the extended characters correctly:
            //  Consolas
            //  DejaVu Sans Mono
            //  Lucida Console

            // Also, in the console Properties, set the Screen Buffer Size and the Window Size 
            //  Width properties to at least 140 characters, to display the full width of the 
            //  table that is generated.

            Dictionary<string, Func<string, string>> columnDetails =
                new Dictionary<string, Func<string, string>>();
            columnDetails.Add("Unencoded", (unencodedString => unencodedString));
            columnDetails.Add("UrlEncoded",
                (unencodedString => HttpUtility.UrlEncode(unencodedString)));
            columnDetails.Add("UrlEncodedUnicode",
                (unencodedString => HttpUtility.UrlEncodeUnicode(unencodedString)));
            columnDetails.Add("UrlPathEncoded",
                (unencodedString => HttpUtility.UrlPathEncode(unencodedString)));
            columnDetails.Add("EscapedDataString",
                (unencodedString => Uri.EscapeDataString(unencodedString)));
            columnDetails.Add("EscapedUriString",
                (unencodedString => Uri.EscapeUriString(unencodedString)));
            columnDetails.Add("HtmlEncoded",
                (unencodedString => HttpUtility.HtmlEncode(unencodedString)));
            columnDetails.Add("HtmlAttributeEncoded",
                (unencodedString => HttpUtility.HtmlAttributeEncode(unencodedString)));
            columnDetails.Add("HexEscaped",
                (unencodedString
                    =>
                    {
                        // Uri.HexEscape can only handle the first 256 characters (0 to 255) so for the 
                        //  Latin Extended-A characters, such as Ā, it will throw an 
                        //  ArgumentOutOfRangeException.
                        try
                        {
                            return Uri.HexEscape(unencodedString.ToCharArray()[0]);
                        }
                        catch
                        {
                            return "[OoR]";
                        }
                    }));

            char[] charactersToEncode = stringToEncode.ToCharArray();
            string[] stringCharactersToEncode = Array.ConvertAll<char, string>(charactersToEncode,
                (character => character.ToString()));
            DisplayCharacterTable<string>(stringCharactersToEncode, columnDetails);
        }

        private static void DisplayCharacterTable<TUnencoded>(TUnencoded[] unencodedArray,
            Dictionary<string, Func<TUnencoded, string>> mappings)
        {
            foreach (string key in mappings.Keys)
            {
                Console.Write(key.Replace(" ", "[space]") + " ");
            }
            Console.WriteLine();

            foreach (TUnencoded unencodedObject in unencodedArray)
            {
                string stringCharToEncode = unencodedObject.ToString();
                foreach (string columnHeader in mappings.Keys)
                {
                    int columnWidth = columnHeader.Length + 1;
                    Func<TUnencoded, string> encoder = mappings[columnHeader];
                    string encodedString = encoder(unencodedObject);

                    // ASSUMPTION: Column header will always be wider than encoded string.
                    Console.Write(encodedString.Replace(" ", "[space]").PadRight(columnWidth));
                }
                Console.WriteLine();
            }
        }
    }
}

Answer by m1m1k

Ideally these would go in a class called "FileNaming", or maybe just rename Encode to "FileNameEncode". Note: these are not designed to handle full paths, just the folder and/or file names. Ideally you would Split("/") your full path first and then check the pieces. And obviously, instead of a union, you could just add the "%" character to the list of characters not allowed in Windows, but I think it's more helpful/readable/factual this way. Decode() is exactly the same, except that its Replace call swaps the arguments, Replace(Uri.HexEscape(s[0]), s), so the escaped form is replaced by the character.

// Requires: using System; using System.Collections.Generic; using System.Linq;
public static List<string> urlEncodedCharacters = new List<string>
{
  "/", "\\", "<", ">", ":", "\"", "|", "?", "%" //and others, but not *
};
//This set includes "*", which UrlEncode() leaves alone, so we can't rely on UrlEncode() only - instead we'll use Uri.HexEscape
public static List<string> specialCharactersNotAllowedInWindows = new List<string>
{
  "/", "\\", "<", ">", ":", "\"", "|", "?", "*" //Windows disallowed character set
};

    public static string Encode(string fileName)
    {
        //CheckForFullPath(fileName); // optional: make sure it's not a path?
        // Escape "%" first, so the "%" introduced by the later escapes is not escaped again.
        List<string> charactersToChange = new List<string> { "%" };
        charactersToChange.AddRange(urlEncodedCharacters
            .Union(specialCharactersNotAllowedInWindows)
            .Where(x => x != "%"));   // the union drops the duplicates

        charactersToChange.ForEach(s => fileName = fileName.Replace(s, Uri.HexEscape(s[0])));   // "?" => "%3F"

        return fileName;
    }
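
The Decode() counterpart described above is not shown in the answer, so here is one possible sketch of the pair. The class name and the single combined character list are assumptions made for illustration. The ordering matters: "%" is escaped first when encoding and restored last when decoding, otherwise an escaped "%" would be mistaken for the start of another escape.

```csharp
using System;

public static class FileNamingSketch
{
    // Assumed set: "%" plus the characters Windows rejects in file names.
    static readonly string[] Escaped = { "%", "/", "\\", "<", ">", ":", "\"", "|", "?", "*" };

    public static string Encode(string fileName)
    {
        // "%" is first in the list, so it is escaped before any new "%" appears.
        foreach (string s in Escaped)
        {
            fileName = fileName.Replace(s, Uri.HexEscape(s[0]));
        }
        return fileName;
    }

    public static string Decode(string fileName)
    {
        // Walk the list backwards so that "%25" is turned back into "%" last.
        for (int i = Escaped.Length - 1; i >= 0; i--)
        {
            fileName = fileName.Replace(Uri.HexEscape(Escaped[i][0]), Escaped[i]);
        }
        return fileName;
    }
}
```

For example, FileNamingSketch.Encode("50%_a/b") gives 50%25_a%2Fb, and Decode turns that back into 50%_a/b.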

Thanks @simon-tewsi for the very useful table above!

Answer by Davut Gürbüz

In addition to @Dan Herbert's answer, we should generally encode just the values.

Split takes a params parameter, so you can call Split('&','='). The expression is split first by '&' and then by '=', so the odd-indexed elements are all values to be encoded, as shown below.

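public static void EncodeQueryString(ref string queryString)
{
    // Assumes every key has exactly one "=" followed by its value.
    var array = queryString.Split('&', '=');
    var result = new System.Text.StringBuilder();
    for (int i = 0; i < array.Length; i++) {
        if (i % 2 == 1)
        {
            // Odd elements are values: encode them.
            result.Append('=').Append(System.Web.HttpUtility.UrlEncode(array[i]));
        }
        else
        {
            // Even elements are keys: keep them as they are.
            if (i > 0) result.Append('&');
            result.Append(array[i]);
        }
    }
    // Rebuild the string instead of using Replace, which could also hit matching text in keys.
    queryString = result.ToString();
}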

Answer by Charlie

The .NET implementation of UrlEncode does not comply with RFC 3986.

  1. Some characters are not encoded but should be. The !()* characters are listed in the RFC's section 2.2 as reserved characters that must be encoded, yet .NET fails to encode them.

  2. Some characters are encoded but should not be. The .-_ characters are not listed in the RFC's section 2.2 as reserved characters, so they should not be encoded, yet .NET erroneously encodes them.

  3. The RFC specifies that, to be consistent, implementations should use upper-case HEXDIG, whereas .NET produces lower-case HEXDIG.
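
For comparison, an encoder that follows the points above can be sketched by hand: it leaves only the unreserved characters of RFC 3986 section 2.3 (letters, digits, and -._~) untouched and writes every other UTF-8 byte as an upper-case %XX. This is an illustrative sketch with made-up names, not a .NET API.

```csharp
using System.Text;

static class Rfc3986Sketch
{
    // Unreserved characters from RFC 3986 section 2.3.
    const string Unreserved =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";

    public static string Encode(string value)
    {
        var sb = new StringBuilder();
        foreach (byte b in Encoding.UTF8.GetBytes(value))
        {
            char c = (char)b;
            if (b < 128 && Unreserved.IndexOf(c) >= 0)
            {
                sb.Append(c);                             // unreserved: keep as-is
            }
            else
            {
                sb.Append('%').Append(b.ToString("X2"));  // upper-case hex digits
            }
        }
        return sb.ToString();
    }
}
```

With this sketch, Rfc3986Sketch.Encode("a b!(*)") gives a%20b%21%28%2A%29.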

Answer by Athari

Since .NET Framework 4.5 and .NET Standard 1.0 you should use WebUtility.UrlEncode. Advantages over alternatives:

  1. It is part of .NET Framework 4.5+, .NET Core 1.0+, .NET Standard 1.0+, UWP 10.0+ and all Xamarin platforms as well. HttpUtility, while available in .NET Framework earlier (.NET Framework 1.1+), became available on other platforms much later (.NET Core 2.0+, .NET Standard 2.0+) and is still unavailable in UWP (see related question).

  2. In .NET Framework, it resides in System.dll, so it does not require any additional references, unlike HttpUtility.

  3. It properly escapes characters for URLs, unlike Uri.EscapeUriString (see comments to drweb86's answer).

  4. It does not have any limits on the length of the string, unlike Uri.EscapeDataString (see related question), so it can be used for POST requests, for example.
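
A minimal sketch of the WebUtility calls mentioned above; the wrapper class and method names are only for illustration, while WebUtility.UrlEncode and WebUtility.UrlDecode are the actual methods in System.Net.

```csharp
using System.Net;

static class WebUtilityDemo
{
    // Thin wrappers, so the calls can be tried without a System.Web reference.
    public static string Encode(string value)
    {
        return WebUtility.UrlEncode(value);
    }

    public static string Decode(string value)
    {
        return WebUtility.UrlDecode(value);
    }
}
```

Decoding the encoded form should give back the original string, for example for a value such as "mas|fenix & co".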