Original question: http://stackoverflow.com/questions/25747758/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverFlow
Facebook - Error parsing input URL, no data was cached, or no data was scraped
Asked by bravo net
After some research I found that a lot of people are facing the same issue, but so far I have not found a solution. It started after I switched my server to linode.com.
Let's take an example. www.acemark2u.com is one of the websites hosted on the Linode server. When I try to debug it in https://developers.facebook.com/tools/debug/og/object/, it cannot fetch the scrape information correctly, and if I try one of its pages, www.acemark2u.com/about-us, it just shows me the error "Error parsing input URL, no data was cached, or no data was scraped."
Here is the weird part: when I debug using the IP address 106.187.35.114/~acemark2, everything goes smoothly. It fetches nicely, with no 404 errors for the pages.
I suspect it might be caused by the "gethostbyaddr" function (ref: http://www.gearhack.com/Forums/DisplayComments.php?file=Computer/Network/Internet/Preventing_Your_Web_Server_From_Blocking_Facebook_Share), but so far I don't have a solution.
Accepted answer by bravo net
I found the solution at last.
In my default DNS A/AAAA records, I had not removed these few IP addresses:
2400:8900::f03c:91ff:fe73:a95d Default
mail 2400:8900::f03c:91ff:fe73:a95d Default
www 2400:8900::f03c:91ff:fe73:a95d Default
That is why some users were pointed to the above IP when they accessed the site via its proper web address.
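One way to spot a leftover record like this is to compare what a hostname resolves to against the addresses you expect. The following is a minimal Python sketch, not part of the original answer. The helper names `group_addresses`, `resolve`, and `unexpected_addresses` are hypothetical, and `socket.getaddrinfo` reports what the local resolver returns (which can include hosts-file entries), so treat it as a rough check rather than an authoritative DNS query.

```python
import socket


def group_addresses(addrinfo):
    """Split getaddrinfo()-style tuples into sorted, de-duplicated IPv4 (A) and IPv6 (AAAA) lists."""
    grouped = {"A": set(), "AAAA": set()}
    for family, _socktype, _proto, _canonname, sockaddr in addrinfo:
        if family == socket.AF_INET:
            grouped["A"].add(sockaddr[0])
        elif family == socket.AF_INET6:
            grouped["AAAA"].add(sockaddr[0])
    return {kind: sorted(addresses) for kind, addresses in grouped.items()}


def resolve(hostname):
    """Resolve a hostname through the local resolver and group the results by address family."""
    return group_addresses(socket.getaddrinfo(hostname, 80, proto=socket.IPPROTO_TCP))


def unexpected_addresses(grouped, expected):
    """Return every resolved address that is not in the set of addresses you expect."""
    return sorted(
        address
        for addresses in grouped.values()
        for address in addresses
        if address not in expected
    )
```

For example, `unexpected_addresses(resolve("www.acemark2u.com"), {"106.187.35.114"})` would list any address other than the server's IPv4 address, such as a stale IPv6 entry, assuming the resolver returns it.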
Answer by Kzar
For people experiencing the same issue but for different causes: I discovered a few interesting things about how Facebook "scrapes" pages by checking the server logs while doing some trials.
First of all: if you have never tried to share a page on FB, FB has never tried to scrape it, and it will not do so if you only put the URL in the Debug tool. That is the first reason you get the error: it just states that FB has no information on the page. You must "force" it to scrape the page.
The first time you try to share a page, FB scrapes it (it asks your server for the first 40k of the page and analyses the Open Graph tags). What can happen is that you do not see the image; see the related question "Facebook Share Dialog does not display thumbnails on first load".
The reason is that FB is still scraping your page and caching the image behind the scenes. The next time you share, the image is there as well. How to solve it? Pre-caching: https://developers.facebook.com/docs/sharing/best-practices#precaching
or simply add the image dimensions to the page's Open Graph tags:
<meta property="og:image:width" content="450"/>
<meta property="og:image:height" content="298"/>
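To check that a page actually declares these tags, you can parse its HTML and look for them. Below is a minimal Python sketch using only the standard library. The names `OpenGraphParser`, `extract_og_tags`, and `missing_image_dimensions` are hypothetical helpers, and the sample markup with its `example.com` image URL is made up for illustration. It does not reproduce how Facebook's scraper works.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..."> tags, keeping the first value seen for each property."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing <meta ... /> tags are also routed here by HTMLParser.
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        if prop.startswith("og:") and "content" in attributes:
            self.tags.setdefault(prop, attributes["content"])


def extract_og_tags(html):
    """Return a dict of Open Graph properties found in the given HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.tags


def missing_image_dimensions(tags):
    """List the og:image dimension properties that the page does not declare."""
    return [name for name in ("og:image:width", "og:image:height") if name not in tags]


# Hypothetical sample markup; the image URL is a placeholder.
SAMPLE_HTML = """<head>
<meta property="og:image" content="http://example.com/thumb.jpg"/>
<meta property="og:image:width" content="450"/>
<meta property="og:image:height" content="298"/>
</head>"""
```

Running `missing_image_dimensions(extract_og_tags(SAMPLE_HTML))` on the sample returns an empty list, while a page that only declares `og:image` would get both dimension properties back.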
Answer by Saumil
This question already has an accepted answer, but in case that answer doesn't work for someone, here is what worked for me.
The URL which I provided in og:url was a protected URL, i.e. only signed-in users can view the page it points to. When I changed the URL to point to my homepage, which can be viewed by both signed-in and signed-out users, viz. http://www.ercafe.com, everything worked fine.
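A quick way to test for this situation is to request the og:url anonymously and see whether you get the page itself or an error or login redirect. The following Python sketch is only an illustration under stated assumptions: `same_page`, `is_publicly_scrapable`, and `fetch_anonymously` are hypothetical helper names, and the User-Agent string is an assumed stand-in for a crawler, not a statement about what Facebook sends.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit

# Assumed crawler-like User-Agent, used here only for illustration.
CRAWLER_USER_AGENT = "facebookexternalhit/1.1"


def same_page(requested_url, final_url):
    """True when both URLs share host and path, ignoring scheme and a trailing slash."""
    requested, final = urlsplit(requested_url), urlsplit(final_url)
    return (requested.hostname, requested.path.rstrip("/")) == (
        final.hostname,
        final.path.rstrip("/"),
    )


def is_publicly_scrapable(status, requested_url, final_url):
    """True when an anonymous request got 200 and was not redirected to another page."""
    return status == 200 and same_page(requested_url, final_url)


def fetch_anonymously(url, timeout=10):
    """Send a cookie-less GET and return (status, final_url) after any redirects."""
    request = urllib.request.Request(url, headers={"User-Agent": CRAWLER_USER_AGENT})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.geturl()
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses raise HTTPError; report the code with the requested URL.
        return error.code, url
```

For instance, `status, final_url = fetch_anonymously(og_url)` followed by `is_publicly_scrapable(status, og_url, final_url)` returns False for a page that answers 403 or redirects to a login page. Note that a redirect between `www` and non-`www` hosts is also reported as a different page by this simple comparison.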
Answer by Paul leclercq
We had a similar issue on one of our sites.
We resolved this by disabling Apache mod_security while we used the Facebook object debug tool to "fetch new scrape information".
Answer by michalzuber
For me the solution was replacing the DNS A records
example.sk 3600 1.2.3.4
www.example.sk 3600 1.2.3.4
to
example.sk 3600 1.2.3.4
*.example.sk 3600 1.2.3.4