How do you unescape URLs in Java?

Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original question on StackOverflow: http://stackoverflow.com/questions/623861/

Date: 2020-08-11 17:04:17  Source: igfitidea

java url-encoding

Asked by Penchant

When I read the xml through a URL's InputStream, and then cut out everything except the url, I get "http://cliveg.bu.edu/people/sganguly/player/%20Rang%20De%20Basanti%20-%20Tu%20Bin%20Bataye.mp3".

As you can see, there are a lot of "%20"s.

I want the url to be unescaped.

Is there any way to do this in Java, without using a third-party library?

Accepted answer by ng.

This is not escaped XML; it is URL-encoded text. It looks to me like you want to use the following on the URL strings:

URLDecoder.decode(url);

This will give you the correct text. The result of decoding the link you provided is this:

http://cliveg.bu.edu/people/sganguly/player/ Rang De Basanti - Tu Bin Bataye.mp3

The %20 is an escaped space character. To get the above I used the URLDecoder class.

Answer by Mario

I'm having problems using this method when I have special characters like á, é, í, etc. My (probably wild) guess is that wide characters are not being encoded properly... well, at least I was expecting to see sequences like %uC2BF instead of %C2%BF.

Edited: My bad. This post explains the difference between URL encoding and JavaScript's escape sequences: "URI encoding in UNICODE for apache httpclient 4".
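To make the %C2%BF observation concrete, here is a minimal sketch showing that URL encoding percent-escapes each byte of the character's encoded form, so the charset passed to URLDecoder decides what comes back. The class name CharsetDecodeDemo and the helper decodeAs are made up for illustration; only URLEncoder.encode and URLDecoder.decode come from the JDK.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class, not part of any answer above.
class CharsetDecodeDemo {

    // Hypothetical helper: decodes percent-escapes using the named charset.
    static String decodeAs(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // U+00BF (inverted question mark) is two bytes in UTF-8: 0xC2 0xBF,
        // so it is escaped as two %XX sequences rather than one %uXXXX.
        System.out.println(URLEncoder.encode("\u00BF", "UTF-8")); // prints %C2%BF

        // Decoding with the matching charset yields the single original character.
        String asUtf8 = decodeAs("%C2%BF", "UTF-8");
        System.out.println(asUtf8.length()); // prints 1

        // Decoding the same bytes as ISO-8859-1 yields two characters (U+00C2, U+00BF).
        String asLatin1 = decodeAs("%C2%BF", "ISO-8859-1");
        System.out.println(asLatin1.length()); // prints 2
    }
}
```

If accented characters come out garbled after decoding, a mismatch between the charset used for encoding and the one used for decoding is a likely cause.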

Answer by freedev

URLDecoder.decode(String s) has been deprecated since Java 5.

You should use URLDecoder.decode(String s, String enc).

For example:

URLDecoder.decode(url, "UTF-8")

Regarding the encoding to use:

Note: The World Wide Web Consortium Recommendation states that UTF-8 should be used. Not doing so may introduce incompatibilities.
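For completeness, here is a small runnable sketch that applies the two-argument form to the URL from the question. The class name UrlDecodeDemo and the helper decodeUtf8 are hypothetical names chosen for this example. The two-argument overload declares the checked UnsupportedEncodingException, so the helper wraps it on the assumption that UTF-8 is always available on a Java platform.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class, not part of any answer above.
class UrlDecodeDemo {

    // Hypothetical helper: decodes percent-escapes as UTF-8 and hides the checked exception.
    static String decodeUtf8(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8, so this is not expected.
            throw new IllegalStateException("UTF-8 is not supported", e);
        }
    }

    public static void main(String[] args) {
        String encoded = "http://cliveg.bu.edu/people/sganguly/player/"
                + "%20Rang%20De%20Basanti%20-%20Tu%20Bin%20Bataye.mp3";
        // Each %20 becomes a space; the rest of the URL is copied through unchanged.
        System.out.println(decodeUtf8(encoded));
    }
}
```

On Java 10 and later there is also an overload that takes a java.nio.charset.Charset (for example StandardCharsets.UTF_8) and does not throw a checked exception, which may be more convenient if you can rely on that version.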
