How to find broken links using Selenium WebDriver with Java
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/23414150/
Asked by LearningCode
I want to verify broken links on a website and I am using this code:
public static int invalidLink;
String currentLink;
String temp;

public static void main(String[] args) throws IOException {
    // Launch The Browser
    WebDriver driver = new FirefoxDriver();
    // Enter URL
    driver.get("http://www.applicoinc.com");
    // Get all the links URL
    List<WebElement> ele = driver.findElements(By.tagName("a"));
    System.out.println("size:" + ele.size());
    boolean isValid = false;
    for (int i = 0; i < ele.size(); i++) {
        isValid = getResponseCode(ele.get(i).getAttribute("href"));
        if (isValid) {
            System.out.println("ValidLinks:" + ele.get(i).getAttribute("href"));
            driver.get(ele.get(i).getAttribute("href"));
            List<WebElement> ele1 = driver.findElements(By.tagName("a"));
            System.out.println("InsideSize:" + ele1.size());
            for (int j = 0; j < ele1.size(); j++) {
                isValid = getResponseCode(ele.get(j).getAttribute("href"));
                if (isValid) {
                    System.out.println("ValidLinks:" + ele.get(j).getAttribute("href"));
                } else {
                    System.out.println("InvalidLinks:" + ele.get(j).getAttribute("href"));
                }
            }
        } else {
            System.out.println("InvalidLinks:"
                    + ele.get(i).getAttribute("href"));
        }
    }
}

public static boolean getResponseCode(String urlString) {
    boolean isValid = false;
    try {
        URL u = new URL(urlString);
        HttpURLConnection h = (HttpURLConnection) u.openConnection();
        h.setRequestMethod("GET");
        h.connect();
        System.out.println(h.getResponseCode());
        if (h.getResponseCode() != 404) {
            isValid = true;
        }
    } catch (Exception e) {
    }
    return isValid;
}
Answered by Sighil
It seems that some of your href attributes contain values that are not recognized as URLs. What comes immediately to mind is to use a try/catch block to identify such URLs. Try the following piece of code.
package com.automation.test;

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

public class Test {
    public static int invalidLink;
    String currentLink;
    String temp;

    public static void main(String[] args) throws IOException {
        // Launch the browser
        WebDriver driver = new FirefoxDriver();
        // Enter URL
        driver.get("file:///home/sighil/Desktop/file");
        // Get all the links
        List<WebElement> ele = driver.findElements(By.tagName("a"));
        System.out.println("size:" + ele.size());
        for (int i = 0; i < ele.size(); i++) {
            // Read the href once and reuse it for the check and the report
            String href = ele.get(i).getAttribute("href");
            if (getResponseCode(href)) {
                System.out.println("ValidLinks:" + href);
            } else {
                System.out.println("InvalidLinks:" + href);
            }
        }
        // Close the browser when done
        driver.quit();
    }

    // Returns true if the URL could be requested and did not answer 404.
    // Returns false for a 404 and for anything that throws
    // (malformed or null href, non-HTTP scheme, connection failure).
    public static boolean getResponseCode(String urlString) {
        boolean isValid = false;
        try {
            URL u = new URL(urlString);
            HttpURLConnection h = (HttpURLConnection) u.openConnection();
            h.setRequestMethod("GET");
            h.connect();
            int code = h.getResponseCode();
            System.out.println(code);
            if (code != 404) {
                isValid = true;
            }
        } catch (Exception e) {
            // The href is not a usable URL or the request failed: treat it as invalid
        }
        return isValid;
    }
}
I have modified getResponseCode to return a boolean based on whether the URL is valid (true) or invalid (false).
Hope this helps you.
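The point about href values that are not URLs can also be handled before any request is sent. The following is a minimal sketch, not part of the answer above: `LinkFilter` and `isCheckable` are hypothetical names, and the rule "only absolute http/https URLs with a host are worth checking" is an assumption that may be too strict for some sites.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical helper, introduced here for illustration only. */
final class LinkFilter {
    private LinkFilter() {}

    /** Returns true only for absolute http/https URLs that have a host. */
    static boolean isCheckable(String href) {
        if (href == null || href.trim().isEmpty()) {
            return false; // <a> element without an href, or an empty one
        }
        try {
            URI uri = new URI(href.trim());
            String scheme = uri.getScheme();
            // Assumption: mailto:, javascript:, tel:, file: and similar are skipped
            return scheme != null
                    && (scheme.equalsIgnoreCase("http") || scheme.equalsIgnoreCase("https"))
                    && uri.getHost() != null;
        } catch (URISyntaxException e) {
            return false; // not parseable as a URI at all
        }
    }
}
```

In the loop above, one could call `LinkFilter.isCheckable(href)` before `getResponseCode(href)` and report the skipped values separately, so that a `mailto:` link is not listed as a broken page.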
Answered by Steve Weaver Crawford
I would keep it returning an int, and just have the MalformedURLException be a special case, returning -1.
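// openConnection(), connect() and getResponseCode() can throw IOException,
// so the method declares it; only a malformed URL is mapped to -1.
public static int getResponseCode(String urlString) throws IOException {
    try {
        URL u = new URL(urlString);
        HttpURLConnection h = (HttpURLConnection) u.openConnection();
        h.setRequestMethod("GET");
        h.connect();
        return h.getResponseCode();
    } catch (MalformedURLException e) {
        // Special case: the href is not a well-formed URL
        return -1;
    }
}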
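One reason to keep the int is that the caller can then distinguish more cases than valid and invalid. A rough sketch of how the returned code might be interpreted follows; `LinkStatus`, `Verdict` and `classify` are hypothetical names, and treating every 4xx and 5xx response as broken is an assumption (a site may answer 401 or 403 for pages that exist).

```java
/** Hypothetical helper, introduced here for illustration only. */
final class LinkStatus {
    enum Verdict { OK, REDIRECT, BROKEN, UNKNOWN }

    private LinkStatus() {}

    /** Maps an HTTP response code, or the -1 special case, to a verdict. */
    static Verdict classify(int responseCode) {
        if (responseCode >= 200 && responseCode < 300) {
            return Verdict.OK;
        }
        if (responseCode >= 300 && responseCode < 400) {
            return Verdict.REDIRECT;
        }
        if (responseCode >= 400 && responseCode < 600) {
            return Verdict.BROKEN; // assumption: all client and server errors count as broken
        }
        // -1 (malformed URL, or no code discernible) and anything else
        return Verdict.UNKNOWN;
    }
}
```

A caller could then print `LinkStatus.classify(getResponseCode(href))` next to each href instead of a plain valid/invalid flag.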
EDIT: It seems you're sticking with the boolean approach. As I said before, this has its limitations, but it should work fine for demonstration purposes.
With the approach you have, there is no reason to find all the elements a second time. Try this:
// Get all the links
List<WebElement> ele = driver.findElements(By.tagName("a"));
System.out.println("size:" + ele.size());
boolean isValid = false;
for (int i = 0; i < ele.size(); i++) {
    // Read the href once and reuse it for the check and the report
    String nextHref = ele.get(i).getAttribute("href");
    isValid = getResponseCode(nextHref);
    if (isValid) {
        System.out.println("Valid Link:" + nextHref);
    } else {
        System.out.println("INVALID Link:" + nextHref);
    }
}
This is untested code, so if it does not work, please provide more detail than just saying 'it doesn't work': include the output and any stack traces or error messages if possible. Cheers
Answered by Mayur Shah
You can try the code below.
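// Needs java.util.ArrayList, java.util.List and the Selenium
// By, WebDriver, WebElement and FirefoxDriver imports.
public static void main(String[] args) {
    WebDriver driver = new FirefoxDriver();
    List<String> brokenLinks = getBrokenURLs(driver, "http://mayurshah.in", 2, new ArrayList<String>());
    for (String brokenLink : brokenLinks) {
        System.out.println(brokenLink);
    }
    driver.quit();
}

// Collects links up to the given depth, then returns the ones that lead to a 404 page.
public static List<String> getBrokenURLs(WebDriver driver, String appURL, int depth, List<String> links) {
    collectURLs(driver, appURL, depth, links);
    return getBrokenURLs(driver, links, new ArrayList<String>());
}

// Helper split out of the method above so that the recursion only collects URLs.
private static void collectURLs(WebDriver driver, String pageURL, int depth, List<String> links) {
    if (depth <= 0) {
        return;
    }
    driver.navigate().to(pageURL);
    System.out.println("Depth is: " + depth);
    // Copy the hrefs into strings before navigating again:
    // the WebElements go stale once the driver leaves the page.
    List<String> newLinks = new ArrayList<String>();
    for (WebElement linkElement : driver.findElements(By.tagName("a"))) {
        String href = linkElement.getAttribute("href");
        if (href != null && href.startsWith("http") && !links.contains(href)) {
            links.add(href);
            newLinks.add(href);
        }
    }
    for (String link : newLinks) {
        collectURLs(driver, link, depth - 1, links);
    }
}

// Visits each collected URL and records those whose page title marks a 404 page.
public static List<String> getBrokenURLs(WebDriver driver, List<String> links, List<String> brokenLinks) {
    for (String link : links) {
        driver.navigate().to(link);
        if (driver.getTitle().contains("404 Page Not Found")) {
            brokenLinks.add(link);
        }
    }
    return brokenLinks;
}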
In the code above, I first get the list of URLs from the first page. Then I navigate to the first link, which opens the second page, and get all of its URLs. In this way I keep storing URLs by visiting each page one by one until the given depth is reached.
After collecting all the URLs, I verify each one in turn and return the list of URLs that lead to a 404 page.
Hope that helps!
Source: https://softwaretestingboard.com/qna/1380/how-to-find-broken-links-images-from-page-using-webdriver#axzz4wM3UEZtq
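The depth-limited collection described in this answer can be illustrated without a browser. The sketch below is only an assumption-laden model: `DepthLimitedCrawl` and `collect` are hypothetical names, and the `linksOn` function stands in for "load the page with WebDriver and read its href values". It walks the pages level by level and uses a set so that no URL is stored or visited twice.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical helper, introduced here for illustration only. */
final class DepthLimitedCrawl {
    private DepthLimitedCrawl() {}

    /**
     * Returns the start URL plus every URL reachable from it by following
     * at most maxDepth links, in the order they were first seen.
     */
    static Set<String> collect(String start, int maxDepth,
                               Function<String, List<String>> linksOn) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> current = new ArrayDeque<>();
        seen.add(start);
        current.add(start);
        for (int depth = 0; depth < maxDepth && !current.isEmpty(); depth++) {
            Deque<String> next = new ArrayDeque<>();
            for (String page : current) {
                for (String href : linksOn.apply(page)) {
                    // Set.add returns false for a URL that was already collected
                    if (seen.add(href)) {
                        next.add(href);
                    }
                }
            }
            current = next; // the pages found at this level are expanded next
        }
        return seen;
    }
}
```

With WebDriver, `linksOn` would navigate to the page and return its href strings; the resulting set can then be passed to whatever check is used for broken pages.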
Answered by Java By Kiran
In a web application we have to verify whether any of the links are broken, that is, whether clicking a link displays a 'page not found' page. Below is the code:
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

public class VerifyLinks {

    public static void main(String[] args) {
        WebDriver driver = new FirefoxDriver();
        driver.manage().window().maximize();
        driver.get("https://www.google.co.in");
        List<WebElement> allLink = driver.findElements(By.tagName("a"));
        System.out.println("Total links are " + allLink.size());
        for (int i = 0; i < allLink.size(); i++) {
            WebElement ele = allLink.get(i);
            String url = ele.getAttribute("href");
            verifyLinkActive(url);
        }
        // Close the browser when done
        driver.quit();
    }

    // Requests the URL and prints the response message for a 200 or a 404.
    public static void verifyLinkActive(String linkurl) {
        try {
            URL url = new URL(linkurl);
            HttpURLConnection httpUrlConnect = (HttpURLConnection) url.openConnection();
            httpUrlConnect.setConnectTimeout(3000);
            httpUrlConnect.connect();
            int responseCode = httpUrlConnect.getResponseCode();
            if (responseCode == 200) {
                System.out.println(linkurl + " - " + httpUrlConnect.getResponseMessage());
            }
            if (responseCode == HttpURLConnection.HTTP_NOT_FOUND) {
                System.out.println(linkurl + " - " + httpUrlConnect.getResponseMessage()
                        + " - " + HttpURLConnection.HTTP_NOT_FOUND);
            }
        } catch (Exception e) {
            // Malformed or null href, or the connection failed: report it instead of skipping silently
            System.out.println(linkurl + " - could not be checked: " + e.getMessage());
        }
    }
}
For more tutorials, visit https://www.jbktutorials.com/selenium