使用 Curl PHP 模拟 ajax 调用
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/19212293/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Mimicking an ajax call with Curl PHP
提问by Davor
I'm scraping a site using curl (via PHP) and some information I want is a list of products which is by default only showing the first few ones. The rest is passed to the user when they click a button to get the full list of products, which triggers an ajax call to return that list.
我正在使用 curl(通过 PHP)抓取网站,我想要的一些信息是产品列表,默认情况下只显示前几个产品。当用户单击按钮以获取完整的产品列表时,其余部分将传递给用户,这将触发 ajax 调用以返回该列表。
Here is in a nutshell the JS they use:
简而言之,这是他们使用的 JS:
headers['__RequestVerificationToken'] = token;
$.ajax({
type: "post",
url: "/ajax/getProductList",
dataType: 'html',
data: JSON.stringify({ historyPageIndex: 1, displayPeriod: 0, productsType: All }),
contentType: 'application/json; charset=utf-8',
success: function (result) {
$(target).html("");
$(target).html(result);
},
beforeSend: function (XMLHttpRequest) {
if (headers['__RequestVerificationToken']) {
XMLHttpRequest.setRequestHeader("__RequestVerificationToken", headers['__RequestVerificationToken']);
}
}
});
Here is my PHP script:
这是我的 PHP 脚本:
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieLocation);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieLocation);
curl_setopt($ch, CURLOPT_POST, false);
curl_setopt($ch, CURLOPT_URL, 'https://www.domain.com/Applications/ViewProducts');
curl_setopt($ch, CURLOPT_REFERER, 'https://www.domain.com/');
$webpage = curl_exec($ch);
$productsType = trim(find_by_pattren($webpage, '<input id="productsType" name="productsType" type="hidden" value="(.*?)"'));
$token = trim(find_by_pattren($webpage, '<input name="__RequestVerificationToken" type="hidden" value="(.*?)"'));
$postVariables = 'productsType='.$productsType.
'&historyPageIndex=1
&displayPeriod=0
&__RequestVerificationToken='.$token;
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postVariables);
curl_setopt($ch, CURLOPT_URL, 'https://www.domain.com/ajax/getProductList');
curl_setopt($ch, CURLOPT_REFERER, 'https://www.domain.com/Applications/ViewProducts');
$webpage = curl_exec($ch);
This produces an error page with the site. I think the main reasons could be that:
这会产生站点的错误页面。我认为主要原因可能是:
They check whether it's an ajax request (no clue how to fix that)
The token needs to be in the header and not in the post variables
他们检查它是否是一个 ajax 请求(不知道如何解决这个问题)
令牌需要在标题中而不是在帖子变量中
Any idea?
任何的想法?
EDIT: here is the working code:
编辑:这是工作代码:
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieLocation);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieLocation);
curl_setopt($ch, CURLOPT_URL, 'https://www.domain.com/Applications/ViewProducts');
curl_setopt($ch, CURLOPT_REFERER, 'https://www.domain.com/');
$webpage = curl_exec($ch);
$productsType = trim(find_by_pattren($webpage, '<input id="productsType" name="productsType" type="hidden" value="(.*?)"'));
$token = trim(find_by_pattren($webpage, '<input name="__RequestVerificationToken" type="hidden" value="(.*?)"'));
$postVariables = json_encode(array('productsType' => $productsType,
'historyPageIndex' => 1,
'displayPeriod' => 0));
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array("X-Requested-With: XMLHttpRequest", "Content-Type: application/json; charset=utf-8", "__RequestVerificationToken: $token"));
curl_setopt($ch, CURLOPT_POSTFIELDS, $postVariables);
curl_setopt($ch, CURLOPT_URL, 'https://www.domain.com/ajax/getProductList');
curl_setopt($ch, CURLOPT_REFERER, 'https://www.domain.com/Applications/ViewProducts');
$webpage = curl_exec($ch);
回答by AlliterativeAlice
To set the request verification token as a header, more closely mimic an AJAX request, and set the content-type to JSON, use CURLOPT_HEADER.
要将请求验证令牌设置为标头,更接近地模拟 AJAX 请求,并将内容类型设置为 JSON,请使用 CURLOPT_HEADER。
curl_setopt($ch, CURLOPT_HTTPHEADER, array("X-Requested-With: XMLHttpRequest", "Content-Type: application/json; charset=utf-8", "__RequestVerificationToken: $token"));
I also notice that you're superfluously setting CURLOPT_POST to false on line 7 of your code, and that the post data you're sending isn't in JSON format. You should have:
我还注意到您在代码的第 7 行多余地将 CURLOPT_POST 设置为 false,并且您发送的帖子数据不是 JSON 格式。你应该有:
$postVariables = '{"historyPageIndex":1,"displayPeriod":0,"productsType":"All"}';