如何使用 Java 检查 URL 是否存在或返回 404?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/1378199/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-12 11:21:15  来源:igfitidea点击:

How to check if a URL exists or returns 404 with Java?

javaurlhttp-status-code-404

提问by Sergio del Amo

String urlString = "http://www.nbc.com/Heroes/novels/downloads/Heroes_novel_001.pdf";
URL url = new URL(urlString);
if(/* Url does not return 404 */) {
    System.out.println("exists");
} else {
    System.out.println("does not exists");
}
urlString = "http://www.nbc.com/Heroes/novels/downloads/Heroes_novel_190.pdf";
url = new URL(urlString);
if(/* Url does not return 404 */) {
    System.out.println("exists");
} else {
    System.out.println("does not exists");
}

This should print

这应该打印

exists
does not exists

TEST

测试

public static String URL = "http://www.nbc.com/Heroes/novels/downloads/";

public static int getResponseCode(String urlString) throws MalformedURLException, IOException {
    URL u = new URL(urlString); 
    HttpURLConnection huc =  (HttpURLConnection)  u.openConnection(); 
    huc.setRequestMethod("GET"); 
    huc.connect(); 
    return huc.getResponseCode();
}

System.out.println(getResponseCode(URL + "Heroes_novel_001.pdf")); 
System.out.println(getResponseCode(URL + "Heroes_novel_190.pdf"));   
System.out.println(getResponseCode("http://www.example.com")); 
System.out.println(getResponseCode("http://www.example.com/junk"));           

Output

输出

200
200
200
404

200
200
200
404

SOLUTION

解决方案

Add the next line before .connect() and the output would be 200, 404, 200, 404

在 .connect() 之前添加下一行,输出将是 200, 404, 200, 404

huc.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729)");

回答by Brian Agnew

Use HttpUrlConnectionby calling openConnection()on your URL object.

通过调用您的 URL 对象来使用HttpUrlConnectionopenConnection()

getResponseCode()will give you the HTTP response once you've read from the connection.

一旦您从连接中读取信息,getResponseCode()就会为您提供 HTTP 响应。

e.g.

例如

   URL u = new URL("http://www.example.com/"); 
   HttpURLConnection huc = (HttpURLConnection)u.openConnection(); 
   huc.setRequestMethod("GET"); 
   huc.connect() ; 
   OutputStream os = huc.getOutputStream(); 
   int code = huc.getResponseCode(); 

(not tested)

(未测试)

回答by ZZ Coder

There is nothing wrong with your code. It's the NBC.com doing tricks on you. When NBC.com decides that your browser is not capable of displaying PDF, it simply sends back a webpage regardless what you are requesting, even if it doesn't exist.

您的代码没有任何问题。是 NBC.com 在欺骗你。当 NBC.com 确定您的浏览器无法显示 PDF 时,它只会发送回一个网页,无论您请求什么,即使它不存在。

You need to trick it back by telling it your browser is capable, something like,

你需要通过告诉它你的浏览器有能力来欺骗它,比如,

conn.setRequestProperty("User-Agent",
    "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.0.13) Gecko/2009073021 Firefox/3.0.13");

回答by RealHowTo

You may want to add

您可能想要添加

HttpURLConnection.setFollowRedirects(false);
// note : or
//        huc.setInstanceFollowRedirects(false)

if you don't want to follow redirection (3XX)

如果您不想遵循重定向 (3XX)

Instead of doing a "GET", a "HEAD" is all you need.

您只需要一个“HEAD”而不是“GET”。

huc.setRequestMethod("HEAD");
return (huc.getResponseCode() == HttpURLConnection.HTTP_OK);

回答by mmm

this worked for me:

这对我有用:

URL u = new URL ( "http://www.example.com/");
HttpURLConnection huc =  ( HttpURLConnection )  u.openConnection (); 
huc.setRequestMethod ("GET");  //OR  huc.setRequestMethod ("HEAD"); 
huc.connect () ; 
int code = huc.getResponseCode() ;
System.out.println(code);

thanks for the suggestions above.

感谢上述建议。

回答by BullyWiiPlaza

Based on the given answers and information in the question, this is the code you should use:

根据问题中给出的答案和信息,这是您应该使用的代码:

public static boolean doesURLExist(URL url) throws IOException
{
    // We want to check the current URL
    HttpURLConnection.setFollowRedirects(false);

    HttpURLConnection httpURLConnection = (HttpURLConnection) url.openConnection();

    // We don't need to get data
    httpURLConnection.setRequestMethod("HEAD");

    // Some websites don't like programmatic access so pretend to be a browser
    httpURLConnection.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729)");
    int responseCode = httpURLConnection.getResponseCode();

    // We only accept response code 200
    return responseCode == HttpURLConnection.HTTP_OK;
}

Of course tested and working.

当然经过测试和工作。