Java Jsoup: get all heading tags
Original question: http://stackoverflow.com/questions/12988256/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Jsoup: get all heading tags
Asked by Tropicalista
I'm trying to parse an HTML document with Jsoup to get all heading tags. In addition, I need to group the heading tags as [h1], [h2], etc.
hh = doc.select("h[0-6]");
But this gives me an empty array.
Answered by ollo
Your selector here means an h tag with an attribute named "0-6"; it is not treated as a regex. But you can combine multiple selectors instead: hh = doc.select("h1, h2, h3, h4, h5, h6");
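To make the difference concrete, here is a minimal sketch that counts matches for both selectors. The class name HeadingSelectorDemo, the helper countMatches, and the sample markup are assumptions made up for this illustration; only Jsoup.parse and select come from Jsoup itself.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class HeadingSelectorDemo {
    // Hypothetical sample markup, used only for illustration.
    static final String SAMPLE_HTML =
        "<html><body><h1>Title</h1><p>Intro</p><h2>Section</h2><h3>Detail</h3></body></html>";

    /** Counts the elements in the given HTML that match a Jsoup CSS selector. */
    static int countMatches(String html, String cssQuery) {
        Document doc = Jsoup.parse(html);
        return doc.select(cssQuery).size();
    }

    public static void main(String[] args) {
        // "h[0-6]" asks for <h> elements carrying an attribute named "0-6",
        // so no heading in the sample matches it.
        System.out.println("h[0-6] matches: " + countMatches(SAMPLE_HTML, "h[0-6]"));

        // A comma-separated list matches any of the listed tags.
        System.out.println("combined matches: "
            + countMatches(SAMPLE_HTML, "h1, h2, h3, h4, h5, h6"));
    }
}
```

The comma acts as an "or" between selectors, so the combined query should pick up every h1 to h6 element in the document.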
Grouping: do you need a group with all h tags plus a group for each of h1, h2, ..., or only a group for each of h1, h2, ...?
Here's an example of how you can do this:
// Group of all h-Tags
Elements hTags = doc.select("h1, h2, h3, h4, h5, h6");
// Group of all h1-Tags
Elements h1Tags = hTags.select("h1");
// Group of all h2-Tags
Elements h2Tags = hTags.select("h2");
// ... etc.
If you only want a group for each h1, h2, ... tag, you can drop the first selector and replace hTags with doc in the others.
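If writing one select call per level gets repetitive, the grouping can also be done in a single pass. The following is only a sketch of one possible approach, not something from the original answer: the class name HeadingGroups, the helper groupHeadings, and the sample markup are hypothetical, and it collects heading texts rather than Elements to keep the result easy to print.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class HeadingGroups {
    /**
     * Groups heading texts by tag name ("h1" ... "h6").
     * Within each group the texts keep their document order.
     */
    static Map<String, List<String>> groupHeadings(String html) {
        Document doc = Jsoup.parse(html);
        // TreeMap keeps the keys sorted, so h1 comes before h2, and so on.
        Map<String, List<String>> groups = new TreeMap<>();
        for (Element heading : doc.select("h1, h2, h3, h4, h5, h6")) {
            groups.computeIfAbsent(heading.tagName(), k -> new ArrayList<>())
                  .add(heading.text());
        }
        return groups;
    }

    public static void main(String[] args) {
        // Hypothetical sample markup, used only for illustration.
        String html = "<h1>Title</h1><h2>First</h2><p>text</p><h2>Second</h2>";
        System.out.println(groupHeadings(html));
    }
}
```

Levels that do not occur in the document simply have no entry in the map, so callers should check for a missing key instead of expecting an empty list.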
Answered by Sai Sunder
Use doc.select("h1,h2,h3,h4,h5,h6") to get all heading tags. Use doc.select("h1"), doc.select("h2"), and so on to get each of those tags separately. See the various things you can do with a select statement at http://preciselyconcise.com/apis_and_installations/jsoup/j_selector.php
Answered by Mike Slinn
Here is a Scala version of the answer that uses Ammonite's syntax to specify the Maven coordinates for Jsoup:
import $ivy.`org.jsoup:jsoup:1.11.3`
val html = scala.io.Source.fromURL("https://scalacourses.com").mkString
val doc = org.jsoup.Jsoup.parse(html)
doc.select("h1, h2, h3, h4, h5, h6").eachText()