Java JSoup: Error fetching URL

Question

Asked by PICKAB00

I'm creating an application that fetches values from a specific website and prints them to the console. The value comes from a <span> element, and I'm using JSoup.

My challenge has to do with this error:

Error fetching URL

Here is my Java code:

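import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class TestSl {
    public static void main(String[] args) throws IOException {
        // This request is the one that fails with HTTP 403
        Document doc = Jsoup.connect("https://stackoverflow.com/questions/11970938/java-html-parser-to-extract-specific-data").get();
        Elements spans = doc.select("span[class=hidden-text]");
        for (Element span : spans) {
            System.out.println(span.text());
        }
    }
}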

And here is the error on Console:

Exception in thread "main" org.jsoup.HttpStatusException: HTTP error fetching URL. Status=403, URL=https://stackoverflow.com/questions/11970938/java-html-parser-to-extract-specific-data
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:590)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:540)
    at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:227)
    at org.jsoup.helper.HttpConnection.get(HttpConnection.java:216)
    at TestSl.main(TestSl.java:19)

What am I doing wrong and how can I resolve it?

Answer 1

Answered by Jared Rummler

Set the user-agent header:

.userAgent("Mozilla")

Example:

Document document = Jsoup.connect("https://stackoverflow.com/questions/11970938/java-html-parser-to-extract-specific-data").userAgent("Mozilla").get();
Elements elements = document.select("span.hidden-text");
for (Element element : elements) {
  System.out.println(element.text());
}

Output:

Stack Exchange
Inbox
Reputation and Badges

source: https://stackoverflow.com/a/7523425/1048340

Perhaps this is related: https://meta.stackexchange.com/questions/277369/a-terms-of-service-update-restricting-companies-that-scrape-your-profile-informa
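
To make it clearer what the .userAgent("Mozilla") call changes, here is a minimal JDK-only sketch (no JSoup) that builds a GET request with an explicit User-Agent header and prints the header value without sending anything. It assumes Java 11 or later for java.net.http. The class name UserAgentSketch and the helper buildRequest are made up for illustration. The idea that the server rejected the original request because of its default client identifier is an inference from the fix above, not something the 403 response itself confirms.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class UserAgentSketch {

    // Hypothetical helper: builds a GET request carrying an explicit
    // User-Agent header. The request is only constructed, never sent.
    static HttpRequest buildRequest(String url, String userAgent) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", userAgent)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest(
                "https://stackoverflow.com/questions/11970938/java-html-parser-to-extract-specific-data",
                "Mozilla");

        // Show the header value the request would carry on the wire
        System.out.println(request.headers().firstValue("User-Agent").orElse("(none)"));
    }
}
```

With JSoup the equivalent header is set by .userAgent(...) on the connection, as in the answer's example, so this sketch is only a way to inspect the header in isolation.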
