HTML XPath: parse "SRC" from an IMG tag?

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): StackOverflow, original URL: http://stackoverflow.com/questions/1179641/

Date: 2020-08-29 00:23:49  Source: igfitidea

XPath to Parse "SRC" from IMG tag?

Tags: html, parsing, xpath, screen-scraping

Asked by dMix

Right now I can successfully grab the full element from an HTML page with this XPath:

//img[@class='photo-large']

For example, it returns this:

<img src="http://example.com/img.jpg" class='photo-large' />

But I only need the SRC URL (http://example.com/img.jpg). Any help?

Answered by Jeff Yates

You are so close to answering this yourself that I am somewhat reluctant to answer it for you. However, the following XPath should give you what you want (provided the source is XHTML, of course): appending /@src selects the src attribute of the matched element instead of the element itself.

//img[@class='photo-large']/@src
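As a minimal sketch of the difference between the two expressions, the following uses REXML, the XML library that ships with standard Ruby installs. The sample markup is hypothetical (modelled on the question's example), and REXML is an assumption made here for self-containment; it is not the tool used in the original question. Because REXML is an XML parser, the input must be well-formed, which matches the "provided the source is XHTML" caveat.

```ruby
require 'rexml/document'

# Hypothetical sample markup, modelled on the question's example.
xhtml = <<~XHTML
  <html>
    <body>
      <img src="http://example.com/img.jpg" class="photo-large" />
      <img src="http://example.com/thumb.jpg" class="photo-small" />
    </body>
  </html>
XHTML

doc = REXML::Document.new(xhtml)

# Selecting the element returns the whole <img> node.
element = REXML::XPath.first(doc, "//img[@class='photo-large']")

# Appending /@src selects the attribute node; .value is its string content.
src = REXML::XPath.first(doc, "//img[@class='photo-large']/@src").value

puts element.name
puts src
```

Here `element` is the full `img` element, while `src` is only the string held by its `src` attribute.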

For further tips, check out W3Schools. They have excellent tutorials on such things and a great reference too.

Answered by andre-r

Using Hpricot, this works:

doc.at('//img[@class="photo-large"]')['src']

In case you have more than one image, the following gives an array:

doc.search('//img[@class="photo-large"]').map { |e| e['src'] }

However, Nokogiri is many times faster, and it "can be used as a drop-in replacement" for Hpricot.
Here is the version for Nokogiri, in which this XPath for selecting attributes works:

doc.at('//img[@class="photo-large"]/@src').to_s

Or, for many images:

doc.search('//img[@class="photo-large"]/@src').map(&:to_s)
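The multi-image case can be sketched without installing any gems by swapping in REXML (bundled with standard Ruby) for Nokogiri. This is an assumption made for illustration, not the answer's own code, and the markup and URLs below are hypothetical. An XPath that ends in /@src yields attribute nodes, so they are mapped to plain strings.

```ruby
require 'rexml/document'

# Hypothetical markup with two large photos and one small one.
xhtml = <<~XHTML
  <div>
    <img src="http://example.com/a.jpg" class="photo-large" />
    <img src="http://example.com/b.jpg" class="photo-large" />
    <img src="http://example.com/c.jpg" class="photo-small" />
  </div>
XHTML

doc = REXML::Document.new(xhtml)

# match returns every matching attribute node in document order;
# map(&:value) turns each node into its string value.
srcs = REXML::XPath.match(doc, "//img[@class='photo-large']/@src").map(&:value)

puts srcs
```

Only the two `photo-large` images contribute to `srcs`; the `photo-small` image is filtered out by the predicate.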

Answered by nithish peddi

//img/@src

You can just use this if you want the link of every image on the page, regardless of class.

Example:

<img alt="" class="avatar width-full rounded-2" height="230" src="https://avatars3.githubusercontent.com/...;s=460" width="230">
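A rough sketch of how the unfiltered expression behaves, again using REXML from standard Ruby as an assumed stand-in parser. The markup and URLs are hypothetical, and each `img` is written self-closed because an XML parser such as REXML rejects the unclosed HTML form shown above.

```ruby
require 'rexml/document'

# Hypothetical page fragment: two images with different classes.
xhtml = <<~XHTML
  <div>
    <img alt="" class="avatar" height="230" src="http://example.com/avatar.png" width="230" />
    <img alt="logo" class="logo" src="http://example.com/logo.png" />
  </div>
XHTML

doc = REXML::Document.new(xhtml)

# With no predicate, //img/@src matches the src of every <img> in the document.
all_srcs = REXML::XPath.match(doc, '//img/@src').map(&:value)

# Adding a predicate narrows the result to one class.
avatar_srcs = REXML::XPath.match(doc, "//img[@class='avatar']/@src").map(&:value)

puts all_srcs
puts avatar_srcs
```

The unfiltered form is convenient for collecting every image link, but it also picks up logos, icons, and other images that a class predicate would exclude.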