Original source: http://stackoverflow.com/questions/28163618/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Fetching all href links from the page source using Selenium WebDriver with Java
Asked by QualityThoughts
I am trying to test the HTTP response of every href link on a page: I use WebDriver to fetch all the links from the page and then http.connect to get the response status.
Code snippet to get the links from anchor tags:
List<WebElement> list = driver.findElements(By.cssSelector("a"));
for (WebElement link : list) {
    System.out.println(link.getText());
}
But my page has many more href links that are not in anchor tags (<a>) and may sit outside the body of the page, in the header section or elsewhere. Some examples are shown below. The WebDriver code above does not fetch all of these types of links. I also need to extract src links in some cases.
<script src="https://www.test.com/js/50/f59ae5bd.js"></script>
<link rel="stylesheet" href="www.test.com/css/27/5a92c391c7be2e9.css" rel="stylesheet" type="text/css" />
<link sizes="72x72" href="https://www.test.com/css/27/5a92c391c7b/kj32.png" />
<li><a href="https://www.test.com/webapps/mpp/resortcheck">resortcheck</a>
I would appreciate it if someone could explain how to approach this, or has already solved a similar problem of getting all href links from a page.
Answered by vins
You can use XPath to get all the elements that have an href or src attribute.
List<WebElement> list=driver.findElements(By.xpath("//*[@href or @src]"));
I tried something like this to get all the links to the other resource files, and it works fine.
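WebDriver driver = new FirefoxDriver();
driver.get("http://www.google.com");
// Match every element that has an href or a src attribute
List<WebElement> list = driver.findElements(By.xpath("//*[@href or @src]"));
for (WebElement e : list) {
    // Prefer href; fall back to src when the element has no href
    String link = e.getAttribute("href");
    if (link == null) {
        link = e.getAttribute("src");
    }
    System.out.println(e.getTagName() + "=" + link);
}
driver.quit(); // close the browser when done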
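To see what the expression //*[@href or @src] selects without starting a browser, here is a minimal sketch that runs the same XPath with the JDK's built-in javax.xml.xpath API. It assumes the markup is well-formed XML, which real pages often are not (WebDriver lets the browser deal with that). The class name HrefSrcXPathDemo and the method collectLinks are made up for this illustration, and the sample markup is adapted from the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Selenium.
class HrefSrcXPathDemo {

    // Returns "tag=link" for every element that has an href or src attribute.
    // Assumption: the input is well-formed XML.
    static List<String> collectLinks(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("//*[@href or @src]", doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                // DOM getAttribute returns "" (not null) when the attribute is
                // missing, so check with hasAttribute before falling back to src.
                String link = e.hasAttribute("href")
                        ? e.getAttribute("href")
                        : e.getAttribute("src");
                result.add(e.getTagName() + "=" + link);
            }
            return result;
        } catch (Exception ex) {
            throw new IllegalArgumentException("Could not parse or query the markup", ex);
        }
    }

    public static void main(String[] args) {
        String page = "<html><head>"
                + "<script src=\"https://www.test.com/js/50/f59ae5bd.js\"></script>"
                + "<link sizes=\"72x72\" href=\"https://www.test.com/css/27/5a92c391c7b/kj32.png\"/>"
                + "</head><body><ul><li>"
                + "<a href=\"https://www.test.com/webapps/mpp/resortcheck\">resortcheck</a>"
                + "</li></ul><p>no link here</p></body></html>";
        for (String line : collectLinks(page)) {
            System.out.println(line);
        }
    }
}
```

The script, link and a elements are all matched, whether they sit in the head or the body, while the p element is skipped because it has neither attribute.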
Answered by Uday
What do you mean by links existing outside of the body?
All links are identifiable by their HTML tag. What other ways are there to represent links?
The code below may help:
public static void main(String[] args) {
    WebDriver driver = new FirefoxDriver();
    driver.get("http://www.google.com/");
    // Collect every anchor element and print its href
    List<WebElement> links = driver.findElements(By.tagName("a"));
    for (WebElement ele : links) {
        System.out.println(ele.getAttribute("href"));
    }
}
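Once the URLs are collected, the response status the question asks about can be checked with the JDK's java.net.HttpURLConnection. The following is only a sketch under a few assumptions: the class name LinkStatusChecker and its two helper methods are invented for illustration, the 5 second timeouts are arbitrary, and a HEAD request is used to avoid downloading bodies even though some servers may reject HEAD (switching to GET is the usual fallback). Values such as mailto: or javascript: links are skipped because they have no HTTP status.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper class, not part of Selenium.
class LinkStatusChecker {

    // Keeps only distinct http/https URLs, in first-seen order.
    // Nulls and other schemes (mailto:, javascript:, ...) are dropped.
    static Set<String> checkableUrls(List<String> links) {
        Set<String> urls = new LinkedHashSet<>();
        for (String link : links) {
            if (link == null) {
                continue;
            }
            String trimmed = link.trim();
            String lower = trimmed.toLowerCase(Locale.ROOT);
            if (lower.startsWith("http://") || lower.startsWith("https://")) {
                urls.add(trimmed);
            }
        }
        return urls;
    }

    // Sends a HEAD request and returns the HTTP status code,
    // or -1 if the URL is invalid or the request fails.
    static int statusOf(String url) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestMethod("HEAD");
            conn.setConnectTimeout(5000); // assumed timeout, tune as needed
            conn.setReadTimeout(5000);
            return conn.getResponseCode();
        } catch (IOException | IllegalArgumentException ex) {
            return -1;
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    // Usage: java LinkStatusChecker <url> [<url> ...]
    public static void main(String[] args) {
        for (String url : checkableUrls(Arrays.asList(args))) {
            System.out.println(statusOf(url) + " " + url);
        }
    }
}
```

In a WebDriver test, the strings returned by getAttribute("href") or getAttribute("src") would be gathered into a list and passed to checkableUrls before calling statusOf on each entry.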