Original source: http://stackoverflow.com/questions/4328711/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Read URL to string in a few lines of Java code
Asked by Pomponius
I'm trying to find Java's equivalent to Groovy's:
String content = "http://www.google.com".toURL().getText();
I want to read the content of a URL into a string. I don't want to pollute my code with buffered streams and loops for such a simple task. I looked into Apache's HttpClient, but I don't see a one- or two-line implementation there either.
Answered by Joseph Weissman
This answer refers to an older version of Java. You may want to look at ccleve's answer.
Here is the traditional way to do this:
import java.net.*;
import java.io.*;

public class URLConnectionReader {
    public static String getText(String url) throws Exception {
        URL website = new URL(url);
        URLConnection connection = website.openConnection();
        // Decodes the response with the platform default charset.
        BufferedReader in = new BufferedReader(
                new InputStreamReader(
                        connection.getInputStream()));

        StringBuilder response = new StringBuilder();
        String inputLine;

        // readLine() strips line terminators, so the result contains no line breaks.
        while ((inputLine = in.readLine()) != null)
            response.append(inputLine);

        in.close();

        return response.toString();
    }

    public static void main(String[] args) throws Exception {
        String content = URLConnectionReader.getText(args[0]);
        System.out.println(content);
    }
}
As @extraneon has suggested, IOUtils allows you to do this in a very eloquent way that's still in the Java spirit:
InputStream in = new URL( "http://jakarta.apache.org" ).openStream();

try {
    System.out.println( IOUtils.toString( in ) );
} finally {
    IOUtils.closeQuietly(in);
}
Answered by extraneon
If you have the input stream (see Joe's answer), also consider IOUtils.toString(inputStream).
http://commons.apache.org/io/api-1.4/org/apache/commons/io/IOUtils.html#toString(java.io.InputStream)
Answered by ccleve
Now that some time has passed since the original answer was accepted, there's a better approach:
String out = new Scanner(new URL("http://www.google.com").openStream(), "UTF-8").useDelimiter("\\A").next();
If you want a slightly fuller implementation, which is not a single line, do this:
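import java.io.IOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public static String readStringFromURL(String requestURL) throws IOException
{
    try (Scanner scanner = new Scanner(new URL(requestURL).openStream(),
            StandardCharsets.UTF_8.toString()))
    {
        // "\\A" matches only the beginning of input, so the whole stream is one token
        scanner.useDelimiter("\\A");
        return scanner.hasNext() ? scanner.next() : "";
    }
}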
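The `\\A` delimiter is a regex anchor that matches only the beginning of input, so the Scanner never finds another delimiter and `next()` returns the entire stream as one token. Below is a minimal sketch of that trick, separated from networking so it can be tried on any InputStream. The class and method names (`ScannerSlurp`, `readAll`) are made up for illustration, and UTF-8 is an assumption about the content.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

// Hypothetical helper illustrating the "\\A" delimiter trick; not part of any library.
final class ScannerSlurp {

    // Reads the whole stream as a single token, decoding it as UTF-8 (an assumption).
    static String readAll(InputStream in) {
        try (Scanner scanner = new Scanner(in, StandardCharsets.UTF_8.name())) {
            scanner.useDelimiter("\\A");
            // hasNext() is false for an empty stream, where next() would throw.
            return scanner.hasNext() ? scanner.next() : "";
        }
    }

    public static void main(String[] args) {
        byte[] bytes = "line 1\nline 2\n".getBytes(StandardCharsets.UTF_8);
        System.out.print(readAll(new ByteArrayInputStream(bytes)));
    }
}
```

The `hasNext()` check matters for empty responses: without it, `next()` throws NoSuchElementException, which is why the one-liner form is best kept for cases where the body is known to be non-empty.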
Answered by steve
Or just use Apache Commons IOUtils.toString(URL url), or the variant that also accepts an encoding parameter.
Answered by takacsot
Additional example using Guava:
URL xmlData = ...
String data = Resources.toString(xmlData, Charsets.UTF_8);
Answered by Jeanne Boyarsky
Now that more time has passed, here's a way to do it in Java 8:
// Assumes `url` is a java.net.URL and `pageText` is a String declared earlier.
URLConnection conn = url.openConnection();
try (BufferedReader reader = new BufferedReader(new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
    pageText = reader.lines().collect(Collectors.joining("\n"));
}
Answered by Brad Parks
The following works with Java 7/8 and secure URLs, and shows how to add a cookie to your request as well. Note that this is mostly a direct copy of another great answer on this page, with the cookie example added and a clarification that it works with secure URLs as well ;-)
If you need to connect to a server with an invalid or self-signed certificate, this will throw security errors unless you import the certificate. If you need this functionality, you could consider the approach detailed in this answer to this related question on StackOverflow.
Example
String result = getUrlAsString("https://www.google.com");
System.out.println(result);
outputs
<!doctype html><html itemscope="" .... etc
Code
import java.net.URL;
import java.net.URLConnection;
import java.io.BufferedReader;
import java.io.InputStreamReader;

public static String getUrlAsString(String url)
{
    try
    {
        URL urlObj = new URL(url);
        URLConnection con = urlObj.openConnection();

        // Reading the response needs only doInput, which is true by default;
        // setDoOutput(true) is for sending a request body and is not needed here.
        con.setRequestProperty("Cookie", "myCookie=test123");
        con.connect();

        StringBuilder response = new StringBuilder();
        String inputLine;
        String newLine = System.getProperty("line.separator");

        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream())))
        {
            while ((inputLine = in.readLine()) != null)
            {
                response.append(inputLine).append(newLine);
            }
        }

        return response.toString();
    }
    catch (Exception e)
    {
        throw new RuntimeException(e);
    }
}
Answered by Sean Reilly
There's an even better way as of Java 9:
URL u = new URL("http://www.example.com/");
try (InputStream in = u.openStream()) {
    return new String(in.readAllBytes(), StandardCharsets.UTF_8);
}
Like the original Groovy example, this assumes that the content is UTF-8 encoded. (If you need something more clever than that, you need to create a URLConnection and use it to figure out the encoding.)
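One way to read "figure out the encoding" is to take the charset parameter from the Content-Type header that the URLConnection reports. The sketch below does that and falls back to UTF-8 when nothing usable is declared. The class and method names (`CharsetAwareReader`, `charsetFrom`, `read`) and the fallback choice are assumptions for illustration, and `readAllBytes()` requires Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the class and method names are illustrative only.
final class CharsetAwareReader {

    // Extracts the charset parameter from a Content-Type value such as
    // "text/html; charset=ISO-8859-1". Falls back to UTF-8 (an assumption)
    // when the header is missing, has no charset, or names an unknown one.
    static Charset charsetFrom(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                    String name = p.substring(8).trim().replace("\"", "");
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        break; // unknown or malformed name: use the fallback
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    // Reads the URL and decodes it with the charset the server declared, if any.
    static String read(URL url) throws IOException {
        URLConnection conn = url.openConnection();
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), charsetFrom(conn.getContentType()));
        }
    }
}
```

This only honours the HTTP header. Pages that declare their encoding solely inside the document (for example in an HTML meta tag) would still be decoded with the fallback.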
Answered by Dave
Here's Jeanne's lovely answer, but wrapped in a tidy function for muppets like me:
private static String getUrl(String aUrl) throws MalformedURLException, IOException
{
    String urlData = "";
    URL urlObj = new URL(aUrl);
    URLConnection conn = urlObj.openConnection();
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8)))
    {
        urlData = reader.lines().collect(Collectors.joining("\n"));
    }
    return urlData;
}
Answered by jschnasse
URL to String in pure Java
Example call
String str = getStringFromUrl(new URL("YourUrl"));
Implementation
You can use the method described in this answer on How to read URL to an InputStream and combine it with this answer on How to read InputStream to String.
The outcome will be something like
public String getStringFromUrl(URL url) throws IOException {
    return inputStreamToString(urlToInputStream(url, null));
}

public String inputStreamToString(InputStream inputStream) throws IOException {
    try (ByteArrayOutputStream result = new ByteArrayOutputStream()) {
        byte[] buffer = new byte[1024];
        int length;
        while ((length = inputStream.read(buffer)) != -1) {
            result.write(buffer, 0, length);
        }
        // toString(String) works on all Java versions; toString(Charset) needs Java 10+.
        return result.toString(StandardCharsets.UTF_8.name());
    }
}
private InputStream urlToInputStream(URL url, Map<String, String> args) {
    HttpURLConnection con = null;
    InputStream inputStream = null;
    try {
        con = (HttpURLConnection) url.openConnection();
        con.setConnectTimeout(15000);
        con.setReadTimeout(15000);
        if (args != null) {
            for (Entry<String, String> e : args.entrySet()) {
                con.setRequestProperty(e.getKey(), e.getValue());
            }
        }
        con.connect();
        int responseCode = con.getResponseCode();
        /* By default the connection will follow redirects. The following
         * block is only entered if the implementation of HttpURLConnection
         * does not perform the redirect. The exact behavior depends on
         * the actual implementation (e.g. sun.net).
         * !!! Attention: This block allows the connection to
         * switch protocols (e.g. HTTP to HTTPS), which is <b>not</b>
         * default behavior. See: https://stackoverflow.com/questions/1884230
         * for more info!!!
         */
        if (responseCode < 400 && responseCode > 299) {
            String redirectUrl = con.getHeaderField("Location");
            try {
                URL newUrl = new URL(redirectUrl);
                return urlToInputStream(newUrl, args);
            } catch (MalformedURLException e) {
                // Location was relative: resolve it against the original host
                URL newUrl = new URL(url.getProtocol() + "://" + url.getHost() + redirectUrl);
                return urlToInputStream(newUrl, args);
            }
        }
        /*!!!!!*/

        inputStream = con.getInputStream();
        return inputStream;
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
Pros
It is pure Java
It can be easily enhanced by adding different headers (instead of passing a null object, like the example above does), authentication, etc.
Handling of protocol switches is supported
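To illustrate the point about adding headers, here is a reduced sketch that takes a map of request headers and applies each entry before reading the body. It is not the implementation above: it leaves out the redirect handling, uses plain URLConnection, and assumes UTF-8 content. The names (`HeaderAwareFetcher`, `fetch`) are hypothetical, and `readAllBytes()` requires Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical helper; names and header values are illustrative only.
final class HeaderAwareFetcher {

    // Opens the URL, applies each entry of `headers` as a request header,
    // and returns the body decoded as UTF-8 (an assumption about the content).
    static String fetch(URL url, Map<String, String> headers) throws IOException {
        URLConnection con = url.openConnection();
        con.setConnectTimeout(15000);
        con.setReadTimeout(15000);
        for (Map.Entry<String, String> e : headers.entrySet()) {
            con.setRequestProperty(e.getKey(), e.getValue());
        }
        // try-with-resources closes the stream once the body has been read
        try (InputStream in = con.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```

A call might look like `HeaderAwareFetcher.fetch(new URL("https://www.example.com/"), Map.of("Accept", "text/html"))`. Authentication would typically be passed the same way, as an Authorization header whose value depends on the server.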