How to read the text from an image (CAPTCHA) using Selenium WebDriver with Java
Original question: http://stackoverflow.com/questions/18935696/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Asked by SATENDRA
I have a registration web page, but a CAPTCHA is displayed at the end of it.
I am not able to read the text from the image. Here are my code and the output.
@Test
public void loginTest() throws InterruptedException {
    System.out.println("Testing");
    driver.get("https://customer.onlinelic.in/ForgotPwd.htm");
    WebElement element = driver.findElement(By.xpath("//*[@id='forgotPassword']/table/tbody/tr[5]/td[3]/img"));
    System.out.println(" get the instance ");
    String elementTest = element.getAttribute("src");
    System.out.println("Element : " + elementTest);
}
Output: Error
Exception in thread "main" org.openqa.selenium.NoSuchElementException: Unable to locate element: {"method":"xpath","selector":"//*[@id='forgotPassword']/table/tbody/tr[5]/td[3]/img"}
Command duration or timeout: 60.02 seconds
For documentation on this error, please visit: http://seleniumhq.org/exceptions/no_such_element.html
Build info: version: '2.35.0', revision: '8df0c6b', time: '2013-08-12 15:43:19'
System info: os.name: 'Windows 7', os.arch: 'amd64', os.version: '6.1', java.version: '1.6.0_26'
Session ID: 5f5b2e1a-56a4-49ad-8fd3-2870747a7768
Driver info: org.openqa.selenium.firefox.FirefoxDriver
Capabilities [{platform=XP, acceptSslCerts=true, javascriptEnabled=true, browserName=firefox, rotatable=false, locationContextEnabled=true, version=23.0.1, cssSelectorsEnabled=true, databaseEnabled=true, handlesAlerts=true, browserConnectionEnabled=true, nativeEvents=true, webStorageEnabled=true, applicationCacheEnabled=true, takesScreenshot=true}]
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.openqa.selenium.remote.ErrorHandler.createThrowable(ErrorHandler.java:191)
    at org.openqa.selenium.remote.ErrorHandler.throwIfResponseFailed(ErrorHandler.java:145)
    at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:554)
    at org.openqa.selenium.remote.RemoteWebDriver.findElement(RemoteWebDriver.java:307)
    at org.openqa.selenium.remote.RemoteWebDriver.findElementByXPath(RemoteWebDriver.java:404)
    at org.openqa.selenium.By$ByXPath.findElement(By.java:344)
    at org.openqa.selenium.remote.RemoteWebDriver.findElement(RemoteWebDriver.java:299)
    at seleniumtest.CaptchaTest.loginTest(CaptchaTest.java:41)
    at seleniumtest.CaptchaTest.main(CaptchaTest.java:59)
Caused by: org.openqa.selenium.remote.ErrorHandler$UnknownServerException: Unable to locate element: {"method":"xpath","selector":"//*[@id='forgotPassword']/table/tbody/tr[5]/td[3]/img"}
Build info: version: '2.35.0', revision: '8df0c6b', time: '2013-08-12 15:43:19'
System info: os.name: 'Windows 7', os.arch: 'amd64', os.version: '6.1', java.version: '1.6.0_26'
Driver info: driver.version: unknown
    at .FirefoxDriver.prototype.findElementInternal_(file:///C:/Users/lukup/AppData/Local/Temp/anonymous4043037924964932185webdriver-profile/extensions/[email protected]/components/driver_component.js:8880)
    at .fxdriver.Timer.prototype.setTimeout/<.notify(file:///C:/Users/lukup/AppData/Local/Temp/anonymous4043037924964932185webdriver-profile/extensions/[email protected]/components/driver_component.js:396)
Answered by Robbie Wareham
Two problems.
You have the wrong XPath, so you are getting a NoSuchElementException.
Even if you had the right XPath, you would not be able to extract the text, as that would defeat the point of a CAPTCHA.
Answered by Mayur Shah
You cannot read the text from a CAPTCHA. If you could, there would be no point in using a CAPTCHA.
Answered by A.J
The forgot-password form is in an iframe. That is why Selenium cannot find the element. You need to switch to the iframe holding the form first, and then run your findElement call. Your XPath is correct.
Use driver.switchTo().frame(arg0) to switch into the frame. See the WebDriver.TargetLocator Javadoc for the available overloads.
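A minimal sketch of that frame switch, assuming the form lives in the first iframe on the page. The class name CaptchaFrameExample, the method captchaSrc, and the By.tagName("iframe") locator are hypothetical choices for illustration; the real page may need a frame name, id, or index instead. It requires the Selenium Java bindings and a running browser session.

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;

final class CaptchaFrameExample {
    // Hypothetical locator: assumes the form is inside the first iframe on the page.
    private static final By FORM_FRAME = By.tagName("iframe");
    // XPath taken from the question; it is resolved inside the frame's document.
    private static final By CAPTCHA_IMG =
            By.xpath("//*[@id='forgotPassword']/table/tbody/tr[5]/td[3]/img");

    /** Switches into the frame, reads the image's src attribute, then switches back. */
    static String captchaSrc(WebDriver driver) {
        WebElement frame = driver.findElement(FORM_FRAME);
        driver.switchTo().frame(frame);
        try {
            return driver.findElement(CAPTCHA_IMG).getAttribute("src");
        } finally {
            // Return to the top-level document so later lookups are not scoped to the frame.
            driver.switchTo().defaultContent();
        }
    }
}
```

Calling captchaSrc(driver) after driver.get(...) would return the image URL only; it does not read the CAPTCHA text itself.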
As for getting the CAPTCHA text, I didn't understand what you meant by 'store the test and compare'. Ideally you shouldn't be able to read the text from a CAPTCHA (as others have mentioned). One alternative approach I have seen is to store the CAPTCHA value as the image's alt text in the development and QA environments, so that you can read it and enter it in the text box. When the code goes to production or any outside environment, this alt text can be removed.
Answered by Anirudh
The whole purpose of a CAPTCHA is to prevent automation through the UI! You may want to use internal APIs to verify the action instead.
Answered by Johnny
Just to elaborate on the previous answers: CAPTCHA is an acronym for "Completely Automated Public Turing test to tell Computers and Humans Apart". So if a machine can solve it, it is not really doing its job.
If you still need to solve it, one thing you can do is use the API of an external service such as http://www.deathbycaptcha.com. You implement their API, pass them the CAPTCHA, and get the text in return. The average solving time I have observed is around 10-15 seconds.
Example implementation (taken from an external source):
import com.DeathByCaptcha.AccessDeniedException;
import com.DeathByCaptcha.Captcha;
import com.DeathByCaptcha.Client;
import com.DeathByCaptcha.SocketClient;
import com.DeathByCaptcha.HttpClient;

/* Put your DeathByCaptcha account username and password here.
   Use HttpClient for HTTP API. */
Client client = (Client)new SocketClient(username, password);
try {
    double balance = client.getBalance();

    /* Put your CAPTCHA file name, or file object, or arbitrary input stream,
       or an array of bytes, and optional solving timeout (in seconds) here: */
    Captcha captcha = client.decode(captchaFileName, timeout);
    if (null != captcha) {
        /* The CAPTCHA was solved; captcha.id property holds its numeric ID,
           and captcha.text holds its text. */
        System.out.println("CAPTCHA " + captcha.id + " solved: " + captcha.text);

        if (/* check if the CAPTCHA was incorrectly solved */) {
            client.report(captcha);
        }
    }
} catch (AccessDeniedException e) {
    /* Access to DBC API denied, check your credentials and/or balance */
}
Answered by Shravan Kumar
I have a solution that will work for a specific website. You can take a snapshot of the whole page and extract the CAPTCHA image from it. Then divide the full width of the CAPTCHA image by the total number of characters (for a given CAPTCHA this is usually constant). Now we have the individual characters of the CAPTCHA image. Collect all the possible characters of the CAPTCHA by reloading the page repeatedly.
Once you have all the possible characters, then for any given CAPTCHA image you can compare its characters with the images you have collected and decide which letter or digit each one is.
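The width-division step could be sketched as follows, using only the standard java.awt.image classes. The names CaptchaSlicer, slice, and charCount are hypothetical, and the sketch assumes fixed-width characters that do not overlap; it is an illustration of the idea, not the answerer's exact code.

```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

final class CaptchaSlicer {
    /**
     * Splits a CAPTCHA image into charCount equal-width slices, left to right.
     * Leftover pixels on the right edge (width % charCount) are ignored.
     */
    static List<BufferedImage> slice(BufferedImage captcha, int charCount) {
        if (charCount <= 0 || charCount > captcha.getWidth()) {
            throw new IllegalArgumentException("charCount must be between 1 and the image width");
        }
        int sliceWidth = captcha.getWidth() / charCount;
        List<BufferedImage> slices = new ArrayList<>();
        for (int i = 0; i < charCount; i++) {
            // getSubimage shares pixel data with the source image; nothing is copied.
            slices.add(captcha.getSubimage(i * sliceWidth, 0, sliceWidth, captcha.getHeight()));
        }
        return slices;
    }

    public static void main(String[] args) {
        // Stand-in for a real CAPTCHA: a blank 120x40 image, assumed to hold 6 characters.
        BufferedImage captcha = new BufferedImage(120, 40, BufferedImage.TYPE_INT_RGB);
        List<BufferedImage> slices = slice(captcha, 6);
        System.out.println(slices.size() + " slices, each " + slices.get(0).getWidth() + " px wide");
    }
}
```

Each slice can then be written to disk with ImageIO.write and named after the character it shows, which builds up the reference set described above.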
Steps to follow:
- Collect the CAPTCHA image and divide it into individual characters.

    private static BufferedImage cropImage(File filePath, int x, int y, int w, int h) {
        try {
            BufferedImage originalImage = ImageIO.read(filePath);
            // Cut out the w-by-h region whose top-left corner is (x, y).
            return originalImage.getSubimage(x, y, w, h);
        } catch (IOException e) {
            e.printStackTrace();
            return null;
        }
    }

- Now read each character image of the CAPTCHA and compare it with all the reference images collected earlier. You can compare two images using their pixel values:

    public static float getDiff(File f1, File f2, int width, int height) throws IOException {
        BufferedImage bi1 = ImageIO.read(f1);
        BufferedImage bi2 = ImageIO.read(f2);
        float diff = 0;
        for (int i = 0; i < width; i++) {
            for (int j = 0; j < height; j++) {
                int rgb1 = bi1.getRGB(i, j);
                int rgb2 = bi2.getRGB(i, j);
                // Unpack the blue, green, and red channels of each pixel.
                int b1 = rgb1 & 0xff;
                int g1 = (rgb1 & 0xff00) >> 8;
                int r1 = (rgb1 & 0xff0000) >> 16;
                int b2 = rgb2 & 0xff;
                int g2 = (rgb2 & 0xff00) >> 8;
                int r2 = (rgb2 & 0xff0000) >> 16;
                diff += Math.abs(b1 - b2);
                diff += Math.abs(g1 - g2);
                diff += Math.abs(r1 - r2);
            }
        }
        return diff;
    }

- Whichever reference image has the smallest diff value is the actual match. Append its name to a string.
- After reading all the character images of the CAPTCHA, return the string.

[1]: https://i.stack.imgur.com/FYPhd.png
In the picture above, the image name specifies the digit or character.
This works only for simple CAPTCHAs like [1].
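The last two steps (pick the reference image with the smallest diff, then append its name) could be sketched like this. CaptchaMatcher, bestMatch, decode, and the label-to-image map are hypothetical names introduced for illustration, and the sketch works on in-memory images rather than files; it assumes character slices and reference images have roughly the same size.

```java
import java.awt.image.BufferedImage;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

final class CaptchaMatcher {
    /** Sum of absolute per-channel differences over the overlapping area of two images. */
    static long diff(BufferedImage a, BufferedImage b) {
        int width = Math.min(a.getWidth(), b.getWidth());
        int height = Math.min(a.getHeight(), b.getHeight());
        long total = 0;
        for (int x = 0; x < width; x++) {
            for (int y = 0; y < height; y++) {
                int p = a.getRGB(x, y);
                int q = b.getRGB(x, y);
                total += Math.abs(((p >> 16) & 0xff) - ((q >> 16) & 0xff)); // red
                total += Math.abs(((p >> 8) & 0xff) - ((q >> 8) & 0xff));   // green
                total += Math.abs((p & 0xff) - (q & 0xff));                 // blue
            }
        }
        return total;
    }

    /** Returns the label of the reference image with the smallest diff to the character. */
    static String bestMatch(BufferedImage character, Map<String, BufferedImage> templates) {
        if (templates.isEmpty()) {
            throw new IllegalArgumentException("at least one reference image is required");
        }
        String bestLabel = null;
        long bestDiff = Long.MAX_VALUE;
        for (Map.Entry<String, BufferedImage> entry : templates.entrySet()) {
            long d = diff(character, entry.getValue());
            if (d < bestDiff) {
                bestDiff = d;
                bestLabel = entry.getKey();
            }
        }
        return bestLabel;
    }

    /** Builds the CAPTCHA text by appending the best-matching label for each character slice. */
    static String decode(Iterable<BufferedImage> characters, Map<String, BufferedImage> templates) {
        StringBuilder text = new StringBuilder();
        for (BufferedImage character : characters) {
            text.append(bestMatch(character, templates));
        }
        return text.toString();
    }

    /** Helper for the demo: a small image filled with one colour. */
    private static BufferedImage solid(int rgb) {
        BufferedImage img = new BufferedImage(8, 8, BufferedImage.TYPE_INT_RGB);
        for (int x = 0; x < 8; x++) {
            for (int y = 0; y < 8; y++) {
                img.setRGB(x, y, rgb);
            }
        }
        return img;
    }

    public static void main(String[] args) {
        // Solid-colour stand-ins for real character images.
        Map<String, BufferedImage> templates = new LinkedHashMap<>();
        templates.put("0", solid(0x000000));
        templates.put("1", solid(0xFFFFFF));
        System.out.println(decode(Arrays.asList(solid(0xEEEEEE), solid(0x111111)), templates));
    }
}
```

A plain pixel diff like this is sensitive to noise, rotation, and shifted characters, which is consistent with the note that the approach only works for simple CAPTCHAs.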
Answered by virender rana
Here is sample code to read the text from an image:
import java.awt.Image;
import java.awt.image.BufferedImage;
import java.awt.image.RenderedImage;
import java.io.File;
import java.io.IOException;
import java.net.URL;
import javax.imageio.ImageIO;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.testng.annotations.BeforeTest;
import org.testng.annotations.Test;
import com.asprise.util.ocr.OCR;

public class ExtractImage {
    WebDriver driver;

    @BeforeTest
    public void setUpDriver() {
        driver = new FirefoxDriver();
    }

    @Test
    public void start() throws IOException {
        /* Navigate to http://www.mythoughts.co.in/2013/10/extract-and-verify-text-from-image.html
         * and get the image source attribute.
         */
        driver.get("http://www.mythoughts.co.in/2013/10/extract-and-verify-text-from-image.html");
        String imageUrl = driver.findElement(By.xpath("//*[@id='post-body-5614451749129773593']/div[1]/div[1]/div/a/img")).getAttribute("src");
        System.out.println("Image source path : \n" + imageUrl);
        // Download the image and pass it to the OCR engine.
        URL url = new URL(imageUrl);
        Image image = ImageIO.read(url);
        String s = new OCR().recognizeCharacters((RenderedImage) image);
        System.out.println("Text From Image : \n" + s);
        System.out.println("Length of total text : \n" + s.length());
        driver.quit();
        /* Use the code below if you want to read the image from your hard disk:
         *
         BufferedImage image = ImageIO.read(new File("Image location"));
         String imageText = new OCR().recognizeCharacters((RenderedImage) image);
         System.out.println("Text From Image : \n" + imageText);
         System.out.println("Length of total text : \n" + imageText.length());
         */
    }
}
Here is the output of the above program:
Image source path : http://2.bp.blogspot.com/-42SgMHAeF8U/Uk8QlYCoy-I/AAAAAAAADSA/TTAVAAgDhio/s1600/love.jpg
Never M2suse the O, ne Who Likes You Never Say Busy To Th,e One Who Needs You Never cheat The One Who ReaZZy Trust You, Never foJnget The One Who Zways Remember You.
Length of total text : 175