Disclaimer: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/573184/

Time: 2020-08-11 16:21:53  Source: igfitidea

Java - Convert String to valid URI object

Tags: java, android, encoding, utf-8

Asked by lostInTransit

I am trying to get a java.net.URI object from a String. The string has some characters which need to be replaced by their percent-escape sequences. But when I use URLEncoder to encode the String with UTF-8 encoding, even the / characters are replaced with their escape sequences.

How can I get a valid encoded URL from a String object?

http://www.google.com?q=a b gives http%3A%2F%2Fwww.google.com... whereas I want the output to be http://www.google.com?q=a%20b

Can someone please tell me how to achieve this?

I am trying to do this in an Android app, so I have access to only a limited number of libraries.

Accepted answer by Hans Doggen

You might try org.apache.commons.httpclient.util.URIUtil.encodeQuery in the Apache commons-httpclient project.

Like this (see URIUtil):

URIUtil.encodeQuery("http://www.google.com?q=a b")

will become:

http://www.google.com?q=a%20b

You can of course do it yourself, but URI parsing can get pretty messy...

Answer by Jason Day

You can use the multi-argument constructors of the URI class. From the URI javadoc:

The multi-argument constructors quote illegal characters as required by the components in which they appear. The percent character ('%') is always quoted by these constructors. Any other characters are preserved.

So if you use

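// URI has no two-argument constructor; this is the (scheme, schemeSpecificPart, fragment) form.
URI uri = new URI("http", "www.google.com?q=a b", null);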

Then you get http:www.google.com?q=a%20b, which isn't quite right, but it's a little closer.

If you know that your string will not have URL fragments (e.g. http://example.com/page#anchor), then you can use the following code to get what you want:

String s = "http://www.google.com?q=a b";
String[] parts = s.split(":",2);
URI uri = new URI(parts[0], parts[1], null);

To be safe, you should scan the string for # characters, but this should get you started.
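
One way the # scan could look is sketched below. The class name SchemeSplitSketch and the helper toUri are made up for illustration, and the sketch assumes the first # in the input starts the fragment and that the string has the shape scheme:rest. It passes the fragment to the three-argument URI constructor separately, so it is not treated as part of the scheme-specific part.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class SchemeSplitSketch {
    /** Hypothetical helper: splits off scheme and fragment, then lets URI quote the rest. */
    static URI toUri(String s) {
        String fragment = null;
        int hash = s.indexOf('#');
        if (hash >= 0) {
            // Assumption: the first '#' marks the start of the fragment.
            fragment = s.substring(hash + 1);
            s = s.substring(0, hash);
        }
        String[] parts = s.split(":", 2);
        if (parts.length < 2) {
            throw new IllegalArgumentException("No scheme found in: " + s);
        }
        try {
            // (scheme, schemeSpecificPart, fragment): illegal characters are quoted per component.
            return new URI(parts[0], parts[1], fragment);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot build a URI from: " + s, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toUri("http://www.google.com?q=a b"));
        System.out.println(toUri("http://example.com/my page#anchor"));
    }
}
```

Like the snippet above, this quotes any % already present in the input, so it is only suitable for strings that have not been encoded yet.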

Answer by TofuBeer

The java.net blog had a class the other day that might have done what you want (but it is down right now so I cannot check).

This code here could probably be modified to do what you want:

http://svn.apache.org/repos/asf/incubator/shindig/trunk/java/common/src/main/java/org/apache/shindig/common/uri/UriBuilder.java

Here is the one I was thinking of from java.net: https://urlencodedquerystring.dev.java.net/

Answer by Tim Cooper

If you don't like libraries, how about this?

Note that you should not use this function on the whole URL; instead, use it on the individual components (e.g. just the "a b" component) as you build up the URL. Otherwise the computer won't know which characters are supposed to have a special meaning and which are supposed to be literal.

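/** Converts a string into something you can safely insert into a URL. */
public static String encodeURIcomponent(String s)
{
    StringBuilder o = new StringBuilder();
    // Walk the UTF-8 bytes rather than the chars, so non-ASCII characters
    // become valid multi-byte escapes (e.g. "%C3%A9") instead of broken hex.
    for (byte b : s.getBytes(java.nio.charset.StandardCharsets.UTF_8)) {
        int ch = b & 0xFF;
        if (isUnsafe(ch)) {
            o.append('%');
            o.append(toHex(ch / 16));
            o.append(toHex(ch % 16));
        }
        else o.append((char) ch);
    }
    return o.toString();
}

/** Maps a value in the range 0..15 to its uppercase hex digit. */
private static char toHex(int ch)
{
    return (char)(ch < 10 ? '0' + ch : 'A' + ch - 10);
}

private static boolean isUnsafe(int ch)
{
    // Control characters, DEL, and every byte of a non-ASCII character are escaped.
    if (ch >= 127 || ch < 32)
        return true;
    return " %$&+,/:;=?@<>#".indexOf(ch) >= 0;
}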
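
To illustrate the point about encoding components rather than the whole URL, here is a minimal sketch that assembles a query string piece by piece. The names QueryBuilderSketch, encodeComponent and withQuery are hypothetical, and the encoder is a stricter stand-in for the function above (it escapes every byte except the RFC 3986 unreserved characters), not the same character set.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class QueryBuilderSketch {
    /** Hypothetical helper: percent-encodes every UTF-8 byte except unreserved characters. */
    static String encodeComponent(String s) {
        StringBuilder out = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean unreserved = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_' || c == '~';
            if (unreserved) {
                out.append((char) c);
            } else {
                out.append(String.format("%%%02X", c));
            }
        }
        return out.toString();
    }

    /** Appends already-separated key/value pairs; only the pieces are encoded, never the delimiters. */
    static String withQuery(String base, Map<String, String> params) {
        StringBuilder url = new StringBuilder(base);
        char separator = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            url.append(separator)
               .append(encodeComponent(e.getKey()))
               .append('=')
               .append(encodeComponent(e.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "a b");
        params.put("note", "50% & more");
        System.out.println(withQuery("http://www.google.com", params));
    }
}
```

Because the ? = and & delimiters are written by the builder and never passed through the encoder, a literal & inside a value cannot be mistaken for a parameter separator.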

Answer by MrCranky

Or perhaps you could use this class:

http://developer.android.com/reference/java/net/URLEncoder.html

It has been present in Android since API level 1.

Annoyingly however, it treats spaces specially (replacing them with + instead of %20). To get round this we simply use this fragment:

URLEncoder.encode(value, "UTF-8").replace("+", "%20");
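
A small self-contained sketch of that fragment follows; the class name UrlEncoderSketch and the helper encodeValue are invented for illustration. It also shows why the replace is safe: a literal + in the input has already been encoded as %2B, so every + left in the output stands for a space.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class UrlEncoderSketch {
    /** Hypothetical helper: form-encodes a single value, then turns '+' back into "%20". */
    static String encodeValue(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every Java platform is required to support.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        System.out.println(URLEncoder.encode("a b+c", "UTF-8")); // a+b%2Bc
        System.out.println(encodeValue("a b+c"));                // a%20b%2Bc
        // Only the value is encoded; the rest of the URL is written out as-is.
        System.out.println("http://www.google.com?q=" + encodeValue("a b"));
    }
}
```

As in the question, this is meant for individual values; running it over a whole URL would also escape the :, / and ? delimiters.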

Answer by bensnider

Android has always had the Uri class as part of the SDK: http://developer.android.com/reference/android/net/Uri.html

You can simply do something like:

String requestURL = String.format("http://www.example.com/?a=%s&b=%s", Uri.encode("foo bar"), Uri.encode("100% fubar'd"));

Answer by Hervé Donner

I had a similar problem in one of my projects when creating a URI object from a string. I couldn't find any clean solution either. Here's what I came up with:

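import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

public static URI encodeURL(String url) throws MalformedURLException, URISyntaxException
{
    // Let URL split the string into components; the multi-argument URI constructor then escapes each one.
    URL urlLink = new URL(url);
    // Reuse the URL's own scheme rather than hard-coding "http", so https URLs keep their scheme.
    return new URI(urlLink.getProtocol(), urlLink.getHost(), urlLink.getPath(), urlLink.getQuery(), urlLink.getRef());
}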

You can use the following URI constructor instead to specify a port if needed:

URI uri = new URI(scheme, userInfo, host, port, path, query, fragment);

Answer by Craig B

I'm going to add one suggestion here aimed at Android users. The approach below avoids having to pull in any external libraries. Also, the search-and-replace solutions suggested in some of the answers above are perilous and should be avoided.

Give this a try:

String urlStr = "http://abc.dev.domain.com/0007AC/ads/800x480 15sec h.264.mp4";
URL url = new URL(urlStr);
URI uri = new URI(url.getProtocol(), url.getUserInfo(), url.getHost(), url.getPort(), url.getPath(), url.getQuery(), url.getRef());
url = uri.toURL();

You can see that in this particular URL, I need to have those spaces encoded so that I can use it for a request.

This takes advantage of a couple of features available to you in Android classes. First, the URL class can break a URL into its proper components, so there is no need for you to do any string search/replace work. Second, the URI class properly escapes each component when you construct a URI from components rather than from a single string.

The beauty of this approach is that you can take any valid url string and have it work without needing any special knowledge of it yourself.
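
For reference, here is the same idea wrapped in a runnable sketch. The class UrlToUriSketch and the helper toEncodedUri are hypothetical names, and the input is assumed to be not yet encoded: as the javadoc quoted earlier says, these constructors always quote %, so an existing escape such as %20 comes out as %2520.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

final class UrlToUriSketch {
    /** Hypothetical helper: URL splits the string, the 7-argument URI constructor escapes each part. */
    static URI toEncodedUri(String raw) {
        try {
            URL url = new URL(raw);
            // getPort() returns -1 when no port is given, which URI treats as "undefined".
            return new URI(url.getProtocol(), url.getUserInfo(), url.getHost(), url.getPort(),
                    url.getPath(), url.getQuery(), url.getRef());
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalArgumentException("Not a usable URL: " + raw, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toEncodedUri("http://abc.dev.domain.com/0007AC/ads/800x480 15sec h.264.mp4"));
        System.out.println(toEncodedUri("http://example.com:8080/a b?q=c d#frag"));
        // Caveat: input that is already escaped gets escaped a second time.
        System.out.println(toEncodedUri("http://example.com/a%20b"));
    }
}
```

If the input may already be partly encoded, it would need to be decoded first or handled by a different method; this sketch makes no attempt to detect that case.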

Answer by Amol Ghotankar

Well I tried using

String converted = URLDecoder.decode("toconvert","UTF-8");

I hope this is what you were actually looking for?

Answer by dgiugg

Even if this is an old post with an already accepted answer, I post my alternative answer because it works well for the present issue and it seems nobody mentioned this method.

With the java.net.URI class:

URI uri = URI.create(URLString);

And if you want a URL-formatted string corresponding to it:

String validURLString = uri.toASCIIString();

Unlike many other methods (e.g. java.net.URLEncoder), this one replaces only non-ASCII characters (like ç, é...).

In the above example, if URLString is the following String:

"http://www.domain.com/façon+word"

the resulting validURLString will be:

"http://www.domain.com/fa%C3%A7on+word"

which is a well-formatted URL.
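
A minimal sketch of this approach, with made-up names ToAsciiSketch and toAscii, is shown below. It also checks the question's original input: URI.create does not escape characters that are illegal in a URI, such as a space, and throws IllegalArgumentException instead, so this method seems best suited to strings whose only problem is non-ASCII characters.

```java
import java.net.URI;

final class ToAsciiSketch {
    /** Hypothetical helper: parses the string as a URI and escapes its non-ASCII characters. */
    static String toAscii(String url) {
        return URI.create(url).toASCIIString();
    }

    public static void main(String[] args) {
        // \u00e7 is the letter c with a cedilla, written as an escape to avoid source-encoding issues.
        System.out.println(toAscii("http://www.domain.com/fa\u00e7on+word"));
        try {
            toAscii("http://www.google.com?q=a b");
        } catch (IllegalArgumentException e) {
            // A space is illegal in a URI, so URI.create rejects the string rather than encoding it.
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```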
