PHP: How to avoid the "HTTP/1.1 999 Request denied" response from LinkedIn?

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/27571419/

Time: 2020-08-25 23:25:12  Source: igfitidea

Tags: php, curl, amazon-web-services, amazon-ec2, linkedin

Asked by zoonman

I'm making a request to a LinkedIn page and receiving an "HTTP/1.1 999 Request denied" response. This happens when the request comes from AWS EC2; on localhost everything works fine.

This is a sample of the code I use to fetch the HTML of the page.

<?php
error_reporting(E_ALL);
$url= 'https://www.linkedin.com/pulse/5-essential-strategies-digital-michelle';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$response = curl_exec($ch);
$info = curl_getinfo($ch);
curl_close($ch);
var_dump($response);
var_dump($info); 

I don't need the whole page content, just the meta tags (title, og tags).
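Once a page body is available, pulling out only the title and og tags can be sketched as follows. This is a minimal illustration, not part of the original question: `extractMetaTags` is a hypothetical helper name, and the sample markup stands in for a fetched page.

```php
<?php
// Hypothetical helper (not from the original post): returns the <title>
// text and every og:* meta tag found in an HTML string.
function extractMetaTags(string $html): array
{
    $doc = new DOMDocument();
    // Real-world markup is rarely valid; buffer parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $tags = [];
    $titles = $doc->getElementsByTagName('title');
    if ($titles->length > 0) {
        $tags['title'] = trim($titles->item(0)->textContent);
    }
    foreach ($doc->getElementsByTagName('meta') as $meta) {
        $property = $meta->getAttribute('property');
        // Keep only Open Graph properties such as og:title or og:image.
        if (strpos($property, 'og:') === 0) {
            $tags[$property] = $meta->getAttribute('content');
        }
    }
    return $tags;
}

// Sample markup standing in for a fetched page (an assumption for the demo).
$html = '<html><head><title>Example</title>'
      . '<meta property="og:title" content="Example OG title">'
      . '</head><body></body></html>';
print_r(extractMetaTags($html));
```

Note that the request code above sets CURLOPT_HEADER, so its $response starts with the response headers; they would need to be stripped (or the option turned off) before passing the body to a parser like this.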

Answered by Guilherme Nascimento

Note that status code 999 does not exist in the W3C Hypertext Transfer Protocol - HTTP/1.1 specification; it is probably a custom code (it sounds like a joke).
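A script can recognize this non-standard code by checking whether the status falls outside the 1xx to 5xx classes that HTTP defines. The sketch below is only an illustration; `parseStatusCode` and `isStandardStatusClass` are hypothetical helper names, not part of any library.

```php
<?php
// Hypothetical helper: extracts the numeric status code from a raw status line,
// or returns null when the line does not look like an HTTP status line.
function parseStatusCode(string $statusLine): ?int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Hypothetical helper: HTTP defines status classes 1xx through 5xx only.
function isStandardStatusClass(int $code): bool
{
    return $code >= 100 && $code <= 599;
}

$code = parseStatusCode('HTTP/1.1 999 Request denied');
var_dump($code);                        // int(999)
var_dump(isStandardStatusClass($code)); // bool(false)
```

When using cURL, the same number is available from curl_getinfo($ch, CURLINFO_HTTP_CODE), so parsing the status line by hand is only needed when working from raw header text.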

LinkedIn does not allow direct access. The probable reasons for blocking requests to any URL that come from other web servers are to:

  1. Prevent unauthorized copying of information
  2. Prevent intrusions
  3. Prevent abuse of requests
  4. Force use of the API

Some server IP addresses are blocked, whereas IPs from residential ("domestic") ISPs are not. When you access LinkedIn with a web browser, you use the IP of your internet provider, which is why the request succeeds there.

The only way to access the data is to use their APIs.

Note: Search engines like Google and Bing probably have their IPs on a whitelist.

Answered by Ondřej Bleha

<?php
header("Content-Type: text/plain");

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://www.linkedin.com/company/technistone-a-s-");

// Send browser-like request headers.
$header = array();
$header[] = "Host: www.linkedin.com";
$header[] = "User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:50.0) Gecko/20100101 Firefox/50.0";
$header[] = "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
$header[] = "Accept-Language: en-US,en;q=0.5";
$header[] = "Connection: keep-alive";
$header[] = "Upgrade-Insecure-Requests: 1";

// CURLOPT_ENCODING sends the Accept-Encoding header and makes cURL decompress the body,
// so the hand-written "Accept-Encoding: gzip, deflate, br" header is dropped here.
curl_setopt($ch, CURLOPT_ENCODING, "gzip");
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
// Return the body from curl_exec() instead of printing it directly.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$my_var = curl_exec($ch);
curl_close($ch);

echo $my_var;

Answered by Mike Veazie - MSFT

I ran into this while doing local web development and using the LinkedIn badge feature (profile.js). I was only getting the "999 Request denied" response in Chrome, so I cleared my browser cache and localStorage and it started to work again.

UPDATE - Clearing the cache was just a coincidence and the issue came back. LinkedIn is having issues with their badge functionality.

I submitted a help thread to their forums: https://www.linkedin.com/help/linkedin/forum/question/714971

Answered by Kaleem

LinkedIn does not support the default encoding 'identity', so if you set the header

'Accept-Encoding': 'gzip, deflate'

you should get the response, but you will have to decompress it.
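In PHP, the decompression step could look roughly like the sketch below. It assumes the zlib extension is available; `decodeBody` is a hypothetical helper name, and the compressed sample is generated locally rather than fetched from LinkedIn.

```php
<?php
// Hypothetical helper: decodes a response body according to the value of
// its Content-Encoding header. Unknown encodings are returned unchanged.
function decodeBody(string $body, string $contentEncoding): string
{
    switch (strtolower(trim($contentEncoding))) {
        case 'gzip':
            $decoded = gzdecode($body);
            break;
        case 'deflate':
            // zlib_decode() accepts zlib-wrapped, gzip, or raw deflate data.
            $decoded = zlib_decode($body);
            break;
        default:
            return $body;
    }
    if ($decoded === false) {
        throw new RuntimeException('Could not decompress response body');
    }
    return $decoded;
}

// Locally compressed sample standing in for a gzip-encoded response.
$compressed = gzencode('<title>Example</title>');
echo decodeBody($compressed, 'gzip'), "\n";
```

With PHP's cURL extension, manual decoding is usually unnecessary: setting CURLOPT_ENCODING both sends the Accept-Encoding header and decompresses the body before curl_exec() returns it.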

Answered by Muhammad Numan

LinkedIn does not allow direct access. They have blacklisted Heroku/AWS IP addresses, and the only way to access the data from such servers is to use their APIs. The pages can still be accessed from a local machine or a headless browser.