使用 Curl PHP 模拟 ajax 调用

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/19212293/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-25 19:00:41  来源:igfitidea点击:

Mimicking an ajax call with Curl PHP

phpajaxcurl

提问by Davor

I'm scraping a site using curl (via PHP) and some information I want is a list of products which is by default only showing the first few ones. The rest is passed to the user when they click a button to get the full list of products, which triggers an ajax call to return that list.

我正在使用 curl(通过 PHP)抓取网站,我想要的一些信息是产品列表,默认情况下只显示前几个产品。当用户单击按钮以获取完整的产品列表时,其余部分将传递给用户,这将触发 ajax 调用以返回该列表。

Here is in a nutshell the JS they use:

简而言之,这是他们使用的 JS:

headers['__RequestVerificationToken'] = token;
$.ajax({
type: "post",
url: "/ajax/getProductList",
dataType: 'html',
data: JSON.stringify({ historyPageIndex: 1, displayPeriod: 0, productsType: All }),
contentType: 'application/json; charset=utf-8',
success: function (result) {
    $(target).html("");
    $(target).html(result);
},
beforeSend: function (XMLHttpRequest) {
    if (headers['__RequestVerificationToken']) {
        XMLHttpRequest.setRequestHeader("__RequestVerificationToken", headers['__RequestVerificationToken']);
    }
}
});

Here is my PHP script:

这是我的 PHP 脚本:

curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieLocation);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieLocation);
curl_setopt($ch, CURLOPT_POST, false);
curl_setopt($ch, CURLOPT_URL, 'https://www.domain.com/Applications/ViewProducts');
curl_setopt($ch, CURLOPT_REFERER, 'https://www.domain.com/');
$webpage = curl_exec($ch);
$productsType = trim(find_by_pattren($webpage, '<input id="productsType" name="productsType" type="hidden" value="(.*?)"'));
$token = trim(find_by_pattren($webpage, '<input name="__RequestVerificationToken" type="hidden" value="(.*?)"'));

$postVariables = 'productsType='.$productsType.
'&historyPageIndex=1
&displayPeriod=0
&__RequestVerificationToken='.$token;
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postVariables);
curl_setopt($ch, CURLOPT_URL, 'https://www.domain.com/ajax/getProductList');
curl_setopt($ch, CURLOPT_REFERER, 'https://www.domain.com/Applications/ViewProducts');
$webpage = curl_exec($ch);

This produces an error page with the site. I think the main reasons could be that:

这会产生站点的错误页面。我认为主要原因可能是:

  • They check whether it's an ajax request (no clue how to fix that)

  • The token needs to be in the header and not in the post variables

  • 他们检查它是否是一个 ajax 请求(不知道如何解决这个问题)

  • 令牌需要在标题中而不是在帖子变量中

Any idea?

任何的想法?

EDIT: here is the working code:

编辑:这是工作代码:

curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieLocation);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieLocation);
curl_setopt($ch, CURLOPT_URL, 'https://www.domain.com/Applications/ViewProducts');
curl_setopt($ch, CURLOPT_REFERER, 'https://www.domain.com/');
$webpage = curl_exec($ch);
$productsType = trim(find_by_pattren($webpage, '<input id="productsType" name="productsType" type="hidden" value="(.*?)"'));
$token = trim(find_by_pattren($webpage, '<input name="__RequestVerificationToken" type="hidden" value="(.*?)"'));

$postVariables = json_encode(array('productsType' => $productsType,
'historyPageIndex' => 1,
'displayPeriod' => 0));
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array("X-Requested-With: XMLHttpRequest", "Content-Type: application/json; charset=utf-8", "__RequestVerificationToken: $token"));
curl_setopt($ch, CURLOPT_POSTFIELDS, $postVariables);
curl_setopt($ch, CURLOPT_URL, 'https://www.domain.com/ajax/getProductList');
curl_setopt($ch, CURLOPT_REFERER, 'https://www.domain.com/Applications/ViewProducts');
$webpage = curl_exec($ch);

回答by AlliterativeAlice

To set the request verification token as a header, more closely mimic an AJAX request, and set the content-type to JSON, use CURLOPT_HEADER.

要将请求验证令牌设置为标头,更接近地模拟 AJAX 请求,并将内容类型设置为 JSON,请使用 CURLOPT_HEADER。

curl_setopt($ch, CURLOPT_HTTPHEADER, array("X-Requested-With: XMLHttpRequest", "Content-Type: application/json; charset=utf-8", "__RequestVerificationToken: $token"));

I also notice that you're superfluously setting CURLOPT_POST to false on line 7 of your code, and that the post data you're sending isn't in JSON format. You should have:

我还注意到您在代码的第 7 行多余地将 CURLOPT_POST 设置为 false,并且您发送的帖子数据不是 JSON 格式。你应该有:

$postVariables = '{"historyPageIndex":1,"displayPeriod":0,"productsType":"All"}';