Which character set should I assume the encoded characters in a URL to be in?

Date: 2020-03-06 14:47:38  Source: igfitidea

RFC 1738 specifies the syntax of URLs and states:

URLs are written only with the graphic printable characters of the
  US-ASCII coded character set. The octets 80-FF hexadecimal are not
  used in US-ASCII, and the octets 00-1F and 7F hexadecimal represent
  control characters; these must be encoded.

However, it does not say which character encoding those octets are supposed to represent.
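To make the ambiguity concrete, here is a minimal sketch using Python's standard `urllib.parse`. The sample character and the two charsets are chosen purely for illustration; nothing in the RFC singles them out.

```python
from urllib.parse import quote, unquote

text = "é"

# The same character yields different octets, and therefore different
# percent-escapes, depending on the charset chosen by the encoder.
utf8_form = quote(text, encoding="utf-8")
latin1_form = quote(text, encoding="latin-1")
print(utf8_form)    # %C3%A9
print(latin1_form)  # %E9

# A receiver that guesses the wrong charset silently gets mojibake.
misread = unquote(utf8_form, encoding="latin-1")
print(misread)      # Ã©
```

Both escaped forms are valid URL syntax, so the URL alone does not reveal which charset produced it.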

RFC 2396 seems to try to improve the situation, but:

For original character sequences that
  contain non-ASCII characters, however, the situation is more
  difficult. Internet protocols that transmit octet sequences intended to
  represent character sequences are expected to provide some way of
  identifying the charset used, if there might be more than one
  [RFC2277].  However, there is currently no provision within the
  generic URI syntax to accomplish this identification. An individual URI
  scheme may require a single charset, define a default charset, or
  provide a way to indicate the charset used.
  
  It is expected that a systematic treatment of character encoding within URI will be
  developed as a future modification of this specification.

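Is there any way for a client to determine which character set the encoded octets will be interpreted in, or for a server to determine which character set a client used when encoding?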

It seems to me that most servers default to UTF-8, but that is really a choice they make, not something the specification mandates.
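A server that makes this choice might decode along the following lines. This is only a sketch of one possible heuristic: the function name `decode_percent_encoded` and the Latin-1 fallback are assumptions for illustration, not behavior required by any of the RFCs quoted above.

```python
from urllib.parse import unquote_to_bytes

def decode_percent_encoded(component: str, fallback: str = "latin-1") -> str:
    """Hypothetical helper: assume UTF-8, fall back to a legacy charset."""
    # Recover the raw octets first; no charset is assumed at this step.
    raw = unquote_to_bytes(component)
    try:
        # Strict decoding rejects octet sequences that are not valid UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # The fallback is a guess; the URL carries no charset label.
        return raw.decode(fallback)

print(decode_percent_encoded("caf%C3%A9"))  # café (valid UTF-8)
print(decode_percent_encoded("caf%E9"))     # café (via the Latin-1 fallback)
```

The fallback can still guess wrong, because some legacy-encoded octet sequences happen to be valid UTF-8 as well.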

Solution

According to the quoted text, a URL is ASCII. That's all.

URIs, OTOH, allow a larger character set; usually UTF-8, as you said yourself.

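One thing to keep in mind is that URLs are a subset of URIs. So the real question is: which of the two do we actually type into a browser? My guess is that you can type a URI and the browser should do its best to convert it into a URL (HTTP/1.1 supports this, AFAICR). For non-ASCII characters, that means representing them as hexadecimal escapes of their octets, usually encoded as UTF-8.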

I believe the specification we are looking for is RFC 3987, which describes IRIs (Internationalized Resource Identifiers).
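The conversion described above, in which non-ASCII characters are replaced by percent-escapes of their UTF-8 octets, can be sketched as follows. The helper name `iri_to_uri` and the sample address are hypothetical, and the sketch deliberately ignores non-ASCII host names, which need separate IDNA handling.

```python
from urllib.parse import quote

# Characters that are legal in a URI as-is: the reserved set, the
# unreserved punctuation, and "%" so that existing escapes are kept.
URI_SAFE = ":/?#[]@!$&'()*+,;=%-._~"

def iri_to_uri(iri: str) -> str:
    """Hypothetical helper: percent-encode everything else as UTF-8 octets."""
    return quote(iri, safe=URI_SAFE, encoding="utf-8")

print(iri_to_uri("http://example.com/café?q=naïve"))
# http://example.com/caf%C3%A9?q=na%C3%AFve
```

Because the result contains only ASCII, it fits the RFC 1738 character set, and a receiver that also assumes UTF-8 can recover the original characters.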
