How to log in with PHP cURL using a captcha and a session
Original question: http://stackoverflow.com/questions/5800918/
Source: Stack Overflow, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors.
Asked by yudo hartono
<?php
define('COOKIE', './cookie.txt');
define('MYURL', 'https://register.pandi.or.id/main');

function getUrl($url, $method='', $vars='', $open=false)
{
    $agents = 'Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16';
    $header_array = array(
        "Via: 1.1 register.pandi.or.id",
        "Keep-Alive: timeout=15,max=100",
    );
    static $cookie = false;
    if (!$cookie) {
        $cookie = session_name() . '=' . time();
    }
    $referer = 'https://register.pandi.or.id/main';
    $ch = curl_init();
    if ($method == 'post') {
        curl_setopt($ch, CURLOPT_POST, 1);
        curl_setopt($ch, CURLOPT_POSTFIELDS, "$vars");
    }
    curl_setopt($ch, CURLOPT_HEADER, 1);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HTTPHEADER, $header_array);
    curl_setopt($ch, CURLOPT_USERAGENT, $agents);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 5);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
    curl_setopt($ch, CURLOPT_REFERER, $referer);
    curl_setopt($ch, CURLOPT_COOKIE, $cookie);
    curl_setopt($ch, CURLOPT_COOKIEJAR, COOKIE);
    curl_setopt($ch, CURLOPT_COOKIEFILE, COOKIE);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
    $buffer = curl_exec($ch);
    if (curl_errno($ch)) {
        echo "error " . curl_error($ch);
        die;
    }
    curl_close($ch);
    return $buffer;
}

function save_captcha($ch)
{
    $agents = 'Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16';
    $url = "https://register.pandi.or.id/jcaptcha";
    static $cookie = false;
    if (!$cookie) {
        $cookie = session_name() . '=' . time();
    }
    $ch = curl_init(); // Initialize a CURL session.
    curl_setopt($ch, CURLOPT_URL, $url); // Pass URL as parameter.
    curl_setopt($ch, CURLOPT_USERAGENT, $agents);
    curl_setopt($ch, CURLOPT_COOKIESESSION, true);
    curl_setopt($ch, CURLOPT_COOKIE, $cookie);
    curl_setopt($ch, CURLOPT_COOKIEJAR, COOKIE);
    curl_setopt($ch, CURLOPT_COOKIEFILE, COOKIE);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // Return stream contents.
    curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1); // We'll be returning this
    $data = curl_exec($ch); // Grab the jpg and save the contents
    curl_close($ch); // close curl resource, and free up system resources.
    $captcha_tmpfile = './captcha/captcha-' . rand(1000, 10000) . '.jpg';
    $fp = fopen($tmpdir . $captcha_tmpfile, 'w');
    fwrite($fp, $data);
    fclose($fp);
    return $captcha_tmpfile;
}

if (isset($_POST['captcha'])) {
    $id = "yudohartono";
    $pw = "mypassword";
    $postfields = "navigation=authenticate&login-type=registrant&username=" . $id . "&password=" . $pw . "&captcha_response=" . $_POST['captcha'] . "press=login";
    $url = "https://register.pandi.or.id/main";
    $result = getUrl($url, 'post', $postfields);
    echo $result;
} else {
    $open = getUrl('https://register.pandi.or.id/main', '', '', true);
    $captcha = save_captcha($ch);
    $fp = fopen($tmpdir . "/cookie12.txt", 'r');
    $a = fread($fp, filesize($tmpdir . "/cookie12.txt"));
    fclose($fp);
?>
<form action='' method='POST'>
    <img src='<?php echo $captcha ?>' />
    <input type='text' name='captcha' value=''>
    <input type='submit' value='proses'>
</form>
<?php
    if (!is_readable('cookie.txt') && !is_writable('cookie.txt')) {
        echo "cookie fail to read";
        chmod('../pandi/', '777');
    }
}
This is cookie.txt:
# Netscape HTTP Cookie File
# http://curl.haxx.se/rfc/cookie_spec.html
# This file was generated by libcurl! Edit at your own risk.

register.pandi.or.id  FALSE  /  FALSE  0  JSESSIONID  05CA8241C5B76F70F364CA244E4D1DF4
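One way to check whether the session survives between requests is to read the JSESSIONID out of the cookie jar after each cURL call and compare the values. The following is a minimal sketch, assuming the jar is in the tab-separated Netscape format that libcurl writes; the function name parseCookieJar is hypothetical and not part of the question's script.

```php
<?php
// Hypothetical helper: turn Netscape-format cookie jar text into name => value.
function parseCookieJar(string $contents): array
{
    $cookies = [];
    foreach (preg_split('/\R/', $contents) as $line) {
        // libcurl marks HttpOnly cookies with this prefix; they are not comments.
        if (strpos($line, '#HttpOnly_') === 0) {
            $line = substr($line, strlen('#HttpOnly_'));
        } elseif ($line === '' || $line[0] === '#') {
            continue; // blank line or comment
        }
        $fields = explode("\t", $line);
        if (count($fields) === 7) {
            // domain, subdomain flag, path, secure, expiry, name, value
            $cookies[$fields[5]] = $fields[6];
        }
    }
    return $cookies;
}
```

If parseCookieJar(file_get_contents('./cookie.txt')) reports a different JSESSIONID after the captcha download than after the first page load, the two requests are not sharing a session.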
After I submit the form, it only displays:
HTTP/1.1 200 OK
Date: Wed, 27 Apr 2011 07:38:08 GMT
Server: Apache-Coyote/1.1
X-Powered-By: Servlet 2.4; Tomcat-5.0.28/JBoss-4.0.0 (build: CVSTag=JBoss_4_0_0 date=200409200418)
Content-Length: 0
Via: 1.1 register.pandi.or.id
Content-Type: text/plain
X-Pad: avoid browser bug
When it is not that, I get the error "Captcha invalid". The login to pandi always fails.
What is wrong with my script?
I do not want to break the captcha. I want to display the captcha on my web page and let the user type it in, so that users can register .id domains from my site automatically.
Answer by kapa
A captcha is intended to differentiate between humans and robots (programs). It seems you are trying to log in with a program, so the captcha is doing its job :).
I don't see a legal way around it.
Answer by Randy
It happens because you took your captcha image from the first getUrl call (i.e. the first curl_exec) and processed it, but to submit the answer you call getUrl again (i.e. curl_exec again), which means a new page with a new captcha.
So you are taking the answer to the old captcha and submitting it against the new one. I had the same problem and resolved it.
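The point about old and new captchas can be sketched in PHP as follows. This is a minimal sketch, not Randy's or the asker's actual fix: it assumes the server ties the captcha to the JSESSIONID cookie, and the function names buildLoginFields and sessionRequest are hypothetical. The form field names are copied from the question. Both requests read and write the same cookie jar, and neither sets a hand-made CURLOPT_COOKIE nor CURLOPT_COOKIESESSION (which tells libcurl to ignore session cookies loaded from the file), so the captcha download and the login POST should land in one server-side session.

```php
<?php
// Sketch only: field names come from the question; function names are hypothetical.

// Build the login form fields; http_build_query() will URL-encode them later.
function buildLoginFields(string $user, string $password, string $captcha): array
{
    return [
        'navigation'       => 'authenticate',
        'login-type'       => 'registrant',
        'username'         => $user,
        'password'         => $password,
        'captcha_response' => $captcha,
        'press'            => 'login',
    ];
}

// Perform a GET (or a POST when $postFields is given) using a shared cookie jar.
function sessionRequest(string $url, string $cookieJar, ?array $postFields = null): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEFILE     => $cookieJar, // send cookies saved by earlier requests
        CURLOPT_COOKIEJAR      => $cookieJar, // save cookies when the handle is released
    ]);
    if ($postFields !== null) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($postFields));
    }
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    return $body;
}
```

Under those assumptions, the page-load request would call sessionRequest('https://register.pandi.or.id/jcaptcha', $jar) and save the returned bytes as the image, and the form-submit request would call sessionRequest('https://register.pandi.or.id/main', $jar, buildLoginFields($id, $pw, $_POST['captcha'])) with the same $jar path, without loading the login page again in between.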
Answer by Alex Et Cie
This is possible with a headless browsing solution, e.g. zombie.js, coffee.js on Node. It may also be possible to extract the "image" from the captcha and, using image recognition, "read" the image and convert it to text, which is then posted with the form.
As of today, the only surefire method to "trick" a captcha is to use headless browsing.
Answer by Atanas Atanasov
Yes, Andro Selva is right. The second request returns a new captcha. The captcha is loaded once by the getUrl function and a second time by the save_captcha function, so these are two different images.
The script must do something like this: download the captcha image before closing the cURL handle and before posting, then wait until you provide the captcha answer. I would use preg_match. It will require some JavaScript as well.
If the captcha image is generated by JavaScript, you need to execute that JavaScript with the same cookie or token. In that situation, the easier solution is to record the headers, e.g. with the Live HTTP Headers add-on for Mozilla Firefox.
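The preg_match step mentioned above might look like the following. It is a rough sketch under stated assumptions: the captcha is a plain img tag whose src contains "jcaptcha" (the path used in the question), the sample markup in the usage note is invented, and extractCaptchaSrc is a hypothetical name. A real page may need an HTML parser instead of a regular expression.

```php
<?php
// Hypothetical helper: find the captcha <img> source in the login page HTML.
// The "jcaptcha" path comes from the question; the markup shape is assumed.
function extractCaptchaSrc(string $html): ?string
{
    $pattern = '/<img\b[^>]*\bsrc\s*=\s*["\']([^"\']*jcaptcha[^"\']*)["\']/i';
    if (preg_match($pattern, $html, $m) === 1) {
        // Attribute values are HTML-escaped, so decode entities such as &amp;.
        return html_entity_decode($m[1]);
    }
    return null; // no captcha image found in this page
}
```

For example, given the invented markup `<img src="/jcaptcha?id=7">`, the function returns `/jcaptcha?id=7`, which would then be downloaded in the same cURL session as the login page.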
Answer by Axe
A captcha is a dynamic image created by the server when you hit the page. It keeps changing, so you must extract the captcha from the page, parse it, and then submit the page to log in. The captcha changes every time the page is triggered to load!
Answer by Pih
I do not know how to do it with PHP. You have to get the captcha and find a way to solve it. There are a lot of algorithms that can do it for you, but if you want to use Java, I have already hacked the source code from this link to get the code that solves the captcha, and it works very well for a lot of captcha systems.
So you could try to implement your own captcha solver, which will take a lot of time; try to find an existing implementation for PHP; or, IMHO the best option, use the JDownloader code base.