Web scraping in Java for a beginner

Note: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/6446356/

Date: 2020-10-30 15:54:44  Source: igfitidea


Tags: java, web-scraping, html-parsing, webharvest, htmlcleaner

Asked by user807593

I am new to Java, and I would like to become really good at web scraping and parsing data.

Are there any sites about web scraping that would help me understand how APIs like HtmlCleaner, Web-Harvest, and HTMLParser work?

I'm still not proficient enough in Java to read their Javadocs and understand how all their methods work, and I cannot find Java code examples (tutorials) on the web that would help me.

Answered by Marsellus Wallace

Why don't you try this library: JSoup?

The cookbook introduction is a good place to start, or you can go straight to the other specific code examples if you prefer.
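To give a rough idea of what working with JSoup looks like, here is a minimal sketch that parses an HTML string and lists its links. It assumes the jsoup jar is on the classpath. The class name JsoupSketch, the helper extractLinks, and the example.com / example.org URLs are made up for illustration and do not come from the original answer.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class JsoupSketch {

    // Hypothetical helper: returns the absolute URL of every <a href> in the HTML.
    // baseUri is used to resolve relative links such as "/docs".
    static List<String> extractLinks(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        List<String> links = new ArrayList<>();
        // "a[href]" is a CSS selector: all anchor elements that have an href attribute.
        for (Element anchor : doc.select("a[href]")) {
            links.add(anchor.absUrl("href"));
        }
        return links;
    }

    public static void main(String[] args) {
        // Sample markup standing in for a downloaded page (an assumption for this sketch).
        String html = "<html><head><title>Demo page</title></head><body>"
                + "<a href=\"/docs\">Docs</a> "
                + "<a href=\"https://example.org/x\">External</a>"
                + "</body></html>";

        // Print the page title, then each resolved link.
        System.out.println(Jsoup.parse(html).title());
        for (String link : extractLinks(html, "https://example.com/")) {
            System.out.println(link);
        }
    }
}
```

To scrape a live page instead of a string, Jsoup.connect(url).get() returns the same kind of Document, so the selector code stays unchanged.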