Notice: this page is derived from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/2222519/

Time: 2020-09-13 18:44:39  Source: igfitidea

URL Encoding with Underscores in a Directory Name?

apache encoding url-rewriting directory apache2

Asked by leeand00

We've run into an odd argument where I work, and I may be wrong on this, so this is why I am asking.

Our software writes a directory to an Apache server and, in doing so, replaces each underscore in the directory name with %5F.

For instance, if the name of the directory is held as a string in our software it is "andy_test", but when the software writes the directory to the Apache server it becomes "andy%5Ftest". Unfortunately, when you access the URL on the server it ends up as "andy%255Ftest".

Somehow this seems wrong to me. Once again, the progression is:

  1. andy_test <- (as a string in the software)
  2. andy%5Ftest <- (listed as a directory on the server)
  3. andy%255Ftest <- (must be used when calling the same directory as a URL on the server from a web browser.)

I'm assuming that "%5F" is the encoding for underscore, and that "%25" is the encoding for "%".
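
That assumption can be checked quickly. The following is a minimal sketch using Python's standard `urllib.parse.unquote`; the choice of Python is only for illustration and is not part of the original setup.

```python
from urllib.parse import unquote

# Decode the individual escapes mentioned above.
underscore = unquote("%5F")   # "_"
percent = unquote("%25")      # "%"

# Decoding the browser URL once gives the name stored on the server;
# decoding it a second time gives the original string.
once = unquote("andy%255Ftest")   # "andy%5Ftest"
twice = unquote(once)             # "andy_test"

print(underscore, percent, once, twice)
```

So "%5F" is the escape for an underscore, and "%25" is the escape for the percent sign itself.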

It seems to me that the directory should be listed on the server as plain andy_test, and that an encoded URI such as "andy%5Ftest" would at most be what you use to access that directory on the Apache server.

I asked the guys on the backend about it, and they said that they were just "encoding anything that was not a letter or a number."

So I guess I'm a bit confused on this. Can you tell me who is right, and direct me to some information on why?

Answered by ziya

You should not encode the directory names as you create them (as you suggested). Encoding should only happen at the last stage where it is handed out to the browser. That's why you are ending up with 'double' encoding: %25 is % and 5F is the leftover from the first encoding of underscore.
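
As a rough illustration of how the double encoding arises, here is a sketch in Python. The function `encode_non_alnum` is hypothetical: it imitates the backend rule described in the question ("encoding anything that was not a letter or a number") and is not the actual backend code. The second step stands in for whatever layer encodes the name again when the link is produced.

```python
from urllib.parse import quote

def encode_non_alnum(name: str) -> str:
    """Hypothetical backend rule: percent-encode every byte that is
    not an ASCII letter or digit."""
    out = []
    for byte in name.encode("utf-8"):
        ch = chr(byte)
        if ch.isascii() and ch.isalnum():
            out.append(ch)
        else:
            out.append(f"%{byte:02X}")
    return "".join(out)

# First encoding: applied when the directory is created on disk.
dir_on_disk = encode_non_alnum("andy_test")   # "andy%5Ftest"

# Second encoding: applied when the name is turned into a URL.
# The literal "%" in the directory name is itself escaped as "%25".
url_segment = quote(dir_on_disk, safe="")     # "andy%255Ftest"

print(dir_on_disk, url_segment)
```

Dropping the first step and keeping the directory name as "andy_test" on disk leaves only one encoding, at the point where the URL is generated.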

Also, note that according to RFC 1738 you don't need to encode underscores at all.

2.2. URL Character Encoding Issues

...

Thus, only alphanumerics, the special characters "$-_.+!*'(),", and reserved characters used for their reserved purposes may be used unencoded within a URL.
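
To make the quoted rule concrete, the sketch below passes the special characters from that RFC sentence to Python's `urllib.parse.quote` as its `safe` set. The helper name `encode_segment` and the sample strings are assumptions made for illustration only.

```python
from urllib.parse import quote

# The special characters the quoted RFC 1738 passage allows unencoded.
RFC1738_UNENCODED_SPECIALS = "$-_.+!*'(),"

def encode_segment(segment: str) -> str:
    """Percent-encode one path segment, leaving alphanumerics and the
    RFC 1738 special characters untouched."""
    return quote(segment, safe=RFC1738_UNENCODED_SPECIALS)

plain = encode_segment("andy_test")       # underscore is left alone
escaped = encode_segment("andy test#1")   # space and "#" are escaped

print(plain)     # andy_test
print(escaped)   # andy%20test%231
```

Under this rule "andy_test" passes through unchanged, so there is nothing to encode for this particular directory name.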

Answered by Vinko Vrsalovic

There is double encoding happening in what you are showing. Two steps should be enough:

andy_test is both the string in the software and the actual name of the directory or script in the filesystem (the resource the web server accesses).

andy%5Ftest is andy_test URL-encoded. This is the string the browser should use (encoding is not really needed in the underscore case, but may be in other cases).

andy%255ftest is just andy_test URL-encoded twice, which makes no sense and should never be necessary. Just decide WHERE you will do the encoding. If you do it both at the code level and at the web server level, this is what can happen, and the result is broken links unless you also decode twice, which is neither needed nor sane.
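
One way to picture "encode in exactly one place" is the sketch below. It assumes the directory is stored under its plain name and that encoding happens only when a link is built; `build_link`, `segment_to_dir_name`, and the `example.com` base URL are hypothetical names chosen for this illustration.

```python
from urllib.parse import quote, unquote

def build_link(base_url: str, dir_name: str) -> str:
    """Encode the plain directory name once, at the moment the URL is built."""
    return f"{base_url.rstrip('/')}/{quote(dir_name, safe='')}/"

def segment_to_dir_name(segment: str) -> str:
    """Decode a URL path segment once to recover the name on disk."""
    return unquote(segment)

# The directory is stored as "andy_test"; no encoding on disk.
link = build_link("http://example.com/files", "andy_test")
print(link)   # http://example.com/files/andy_test/

# A name that does need escaping still round-trips with one decode.
tricky = "100% done"
tricky_link = build_link("http://example.com/files", tricky)
segment = tricky_link.rstrip("/").rsplit("/", 1)[-1]
print(segment, "->", segment_to_dir_name(segment))
```

Because the name on disk is never encoded, a single decode of the URL segment always maps back to the real directory.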
