Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/2230676/

Time: 2020-08-13 04:59:36  Source: igfitidea

How to check for a valid URL in Java?

Tags: java, validation, url

Asked by Eric Wilson

What is the best way to check if a URL is valid in Java?

I tried calling new URL(urlString) and catching a MalformedURLException, but it seems to be happy with anything that begins with http://.

I'm not concerned about establishing a connection, just validity. Is there a method for this? An annotation in Hibernate Validator? Should I use a regex?

Edit: Some examples of accepted URLs are http://*** and http://my favorite site!.

Accepted answer by Tendayi Mawushe

Consider using the Apache Commons UrlValidator class:

UrlValidator urlValidator = new UrlValidator();
urlValidator.isValid("http://my favorite site!");

There are several properties that you can set to control how this class behaves. By default, http, https, and ftp are accepted.

Answer by Adam Matan

validator package:

There seems to be a nice package by Yonatan Matalon called UrlUtil. Quoting its API:

isValidWebPageAddress(java.lang.String address, boolean validateSyntax, 
                      boolean validateExistance) 
Checks if the given address is a valid web page address.

Sun's approach - check the network address

Sun's Java site offers a connection attempt as a solution for validating URLs.

Other regex code snippets:

There are regex validation attempts at Oracle's site and weberdev.com.

Answer by uckelman

Judging by the source code for URI, the

public URL(URL context, String spec, URLStreamHandler handler)

constructor does more validation than the other constructors. You might try that one, but YMMV.

Answer by Prasanna Pilla

Here is a way I tried and found useful:

URL u = new URL(name); // this would check for the protocol
u.toURI(); // does the extra checking required for validation of URI 
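
Both calls above throw checked exceptions, so in practice they would probably be wrapped in a small helper. A minimal sketch, assuming a hypothetical class UrlSyntaxCheck and method isValid (neither name is from the answer):

```java
import java.net.MalformedURLException;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical helper; the class and method names are illustrative only.
class UrlSyntaxCheck {

    // Returns true if the string passes both the URL and the URI parser.
    static boolean isValid(String candidate) {
        try {
            // Rejects a missing or unknown protocol.
            URL url = new URL(candidate);
            // Rejects characters that are illegal in a URI, such as spaces.
            url.toURI();
            return true;
        } catch (MalformedURLException | URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("http://stackoverflow.com/questions/2230676/"));
        System.out.println(isValid("http://my favorite site!"));
    }
}
```

This only checks syntax and never opens a connection. Note that on Java 20 and later the URL(String) constructor is deprecated, so the compiler will print a warning.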

Answer by user123444555621

I'd love to post this as a comment to Tendayi Mawushe's answer, but I'm afraid there is not enough space ;)

This is the relevant part from the Apache Commons UrlValidator source:

/**
 * This expression derived/taken from the BNF for URI (RFC2396).
 */
private static final String URL_PATTERN =
        "/^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?/";
//         12            3  4          5       6   7        8 9

/**
 * Schema/Protocol (ie. http:, ftp:, file:, etc).
 */
private static final int PARSE_URL_SCHEME = 2;

/**
 * Includes hostname/ip and port number.
 */
private static final int PARSE_URL_AUTHORITY = 4;

private static final int PARSE_URL_PATH = 5;

private static final int PARSE_URL_QUERY = 7;

private static final int PARSE_URL_FRAGMENT = 9;

You can easily build your own validator from there.
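
As a rough illustration of what building on that expression could look like, here is a sketch using java.util.regex. Only the expression and the group numbers come from the source quoted above. The class name UrlParts, the AUTHORITY pattern and the http/https-only policy are my own assumptions, not part of Commons Validator. The surrounding slashes are dropped because java.util.regex does not use Perl-style delimiters.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class; not part of Apache Commons Validator.
class UrlParts {
    // Expression from the quoted source, without the "/" delimiters.
    private static final Pattern URL_PATTERN = Pattern.compile(
            "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?");

    static final int PARSE_URL_SCHEME = 2;
    static final int PARSE_URL_AUTHORITY = 4;
    static final int PARSE_URL_PATH = 5;
    static final int PARSE_URL_QUERY = 7;
    static final int PARSE_URL_FRAGMENT = 9;

    // Assumption: a host made of letters, digits, dots and hyphens,
    // optionally followed by a port. Adjust to your own requirements.
    private static final Pattern AUTHORITY = Pattern.compile(
            "[A-Za-z0-9]([A-Za-z0-9.-]*[A-Za-z0-9])?(:\\d{1,5})?");

    // Returns the requested capture group, or null if it did not participate.
    static String part(String url, int group) {
        Matcher m = URL_PATTERN.matcher(url);
        return m.matches() ? m.group(group) : null;
    }

    // Assumed policy: http or https scheme plus a plausible authority.
    static boolean isValid(String url) {
        String scheme = part(url, PARSE_URL_SCHEME);
        String authority = part(url, PARSE_URL_AUTHORITY);
        if (scheme == null || authority == null) {
            return false;
        }
        boolean schemeOk = scheme.equalsIgnoreCase("http")
                || scheme.equalsIgnoreCase("https");
        return schemeOk && AUTHORITY.matcher(authority).matches();
    }

    public static void main(String[] args) {
        String url = "http://example.com:8080/a/b?x=1#frag";
        System.out.println(part(url, PARSE_URL_AUTHORITY));
        System.out.println(isValid(url));
        System.out.println(isValid("http://my favorite site!"));
    }
}
```

The expression itself matches almost any string, since every component is optional. It only splits the input into parts, so the real validation lives in the per-component checks you add afterwards.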

Answer by isapir

I didn't like any of the implementations (because they use a Regex which is an expensive operation, or a library which is an overkill if you only need one method), so I ended up using the java.net.URI class with some extra checks, and limiting the protocols to: http, https, file, ftp, mailto, news, urn.

And yes, catching exceptions can be an expensive operation, but probably not as bad as Regular Expressions:

import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final static Set<String> protocols, protocolsWithHost;

static {
  protocolsWithHost = new HashSet<String>( 
      Arrays.asList( new String[]{ "file", "ftp", "http", "https" } ) 
  );
  protocols = new HashSet<String>( 
      Arrays.asList( new String[]{ "mailto", "news", "urn" } ) 
  );
  protocols.addAll(protocolsWithHost);
}

public static boolean isURI(String str) {
  int colon = str.indexOf(':');
  if (colon < 3)                      return false;

  String proto = str.substring(0, colon).toLowerCase();
  if (!protocols.contains(proto))     return false;

  try {
    URI uri = new URI(str);
    if (protocolsWithHost.contains(proto)) {
      if (uri.getHost() == null)      return false;

      String path = uri.getPath();
      if (path != null) {
        for (int i=path.length()-1; i >= 0; i--) {
          if ("?<>:*|\"".indexOf( path.charAt(i) ) > -1)
            return false;
        }
      }
    }

    return true;
  } catch ( Exception ex ) {}

  return false;
}

Answer by Andrei Volgin

My favorite approach, without external libraries:

try {
    URI uri = new URI(name);

    // perform checks for scheme, authority, host, etc., based on your requirements

    if ("mailto".equals(uri.getScheme())) {/*Code*/}
    if (uri.getHost() == null) {/*Code*/}

} catch (URISyntaxException e) {
}
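
To make the placeholder checks concrete, here is one possible way to fill them in. This is only a sketch under my own assumptions: the class name UriChecks, the method name isHttpUrl and the policy (http or https with a parsable host) are not from the answer.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper; names and policy are illustrative assumptions.
class UriChecks {

    // Accepts only http/https URIs that have a parsable host.
    static boolean isHttpUrl(String name) {
        if (name == null) {
            return false;
        }
        try {
            URI uri = new URI(name);
            String scheme = uri.getScheme();
            boolean httpScheme = "http".equalsIgnoreCase(scheme)
                    || "https".equalsIgnoreCase(scheme);
            // getHost() is null when the URI has no authority or when the
            // authority cannot be parsed as a server (host plus optional port).
            return httpScheme && uri.getHost() != null;
        } catch (URISyntaxException e) {
            // Not a syntactically valid URI, e.g. it contains spaces.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isHttpUrl("https://example.com/path?q=1"));
        System.out.println(isHttpUrl("http://***"));
    }
}
```

The host check is what rejects inputs like http://*** from the question, and the URISyntaxException branch covers strings with spaces.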

Answer by Joe

The most "foolproof" way is to check for the availability of the URL:

public boolean isURL(String url) {
  try {
     (new java.net.URL(url)).openStream().close();
     return true;
  } catch (Exception ex) { }
  return false;
}
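
The version above has no timeout and downloads the start of the response body. A possible variation, sketched under my own assumptions (the class name UrlAvailability, the HEAD request and the timeout parameter are not from the answer), limits itself to HTTP(S) and gives up after a configurable delay:

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper; names and policy are illustrative assumptions.
class UrlAvailability {

    // Returns true if an HTTP(S) HEAD request gets a 2xx or 3xx response.
    static boolean isReachable(String url, int timeoutMillis) {
        HttpURLConnection http = null;
        try {
            URLConnection connection = new URL(url).openConnection();
            if (!(connection instanceof HttpURLConnection)) {
                return false; // e.g. file: or ftp: URLs
            }
            http = (HttpURLConnection) connection;
            http.setRequestMethod("HEAD");
            http.setConnectTimeout(timeoutMillis);
            http.setReadTimeout(timeoutMillis);
            int status = http.getResponseCode();
            return status >= 200 && status < 400;
        } catch (IOException | IllegalArgumentException e) {
            // Malformed URL, unreachable host, timeout, or unusable host name.
            return false;
        } finally {
            if (http != null) {
                http.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(isReachable("not a url", 2000));
    }
}
```

Keep in mind that this performs a real network request, which the question explicitly did not ask for, and that a syntactically valid URL will be reported as unavailable whenever the server is down or rejects HEAD requests.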