JavaScript pushState and SEO

Notice: this page reproduces a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/6193858/

Time: 2020-08-23 20:40:41  Source: igfitidea

pushState and SEO

Tags: javascript, web-applications, seo, hashbang, pushstate

Asked by Harry

Many people have been saying, use pushState rather than hashbang.

What I don't understand is, how would you be search-engine friendly without using hashbang?

Presumably your pushState content is generated by client-side JavaScript code.

The scenario is thusly:

I'm on example.com. My user clicks a link: href="example.com/blog"

pushState captures the click, updates the URL, grabs a JSON file from somewhere, and creates the listing of blog posts in the content area.
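
For concreteness, the flow described above might look roughly like the sketch below. Strictly speaking it is a click handler, not pushState itself, that intercepts the click; pushState only updates the URL. The `data-ajax` attribute, the `/api/posts.json` endpoint, the `#content` element, and the function name are all assumptions made up for illustration.

```javascript
// Build the HTML for a list of blog posts from already-parsed JSON data.
// Note: no HTML escaping is done here; real code should escape titles and URLs.
function renderBlogList(posts) {
  const items = posts.map(
    (post) => `<li><a href="${post.url}">${post.title}</a></li>`
  );
  return `<ul>${items.join('')}</ul>`;
}

// Browser-only wiring; skipped when no DOM is available (e.g. under Node).
if (typeof document !== 'undefined') {
  document.addEventListener('click', async (event) => {
    // Only handle links that opted in via a hypothetical data-ajax attribute.
    const link = event.target.closest('a[data-ajax]');
    if (!link) return;

    event.preventDefault();               // stop the normal page load
    history.pushState({}, '', link.href); // update the address bar

    // Hypothetical JSON endpoint that returns [{ url, title }, ...].
    const response = await fetch('/api/posts.json');
    const posts = await response.json();
    document.querySelector('#content').innerHTML = renderBlogList(posts);
  });
}
```

Nothing in this handler runs for a client that does not execute JavaScript, which is exactly the concern raised in the question.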

With hashbangs, Google knows to go to the escaped_fragment URL to get their static content.

With pushState, Google just sees nothing as it can't use the JavaScript code to load the JSON and subsequently create the template.

The only way to do it I can see is to render the template on the server side, but that completely negates the benefits of pushing the application layer to the client.

So am I getting this right, pushState is not SEO friendly for client-side applications at all?

Accepted answer by prograhammer

What about using the meta tag that Google suggests for those who don't want hash-bangs in their URLs: <meta name="fragment" content="!">

See here for more info: https://developers.google.com/webmasters/ajax-crawling/docs/getting-started

Unfortunately I don't think Nicole clarified the issue that I thought the OP was having. The problem is simply that we don't know who we are serving content to if we don't use the hash-bang. pushState does not solve this for us. We don't want search engines telling end-users to navigate to some URL that spits out unformatted JSON. Instead, we create URLs (that trigger other calls to more URLs) that retrieve data via AJAX and present it to the user in the manner we prefer. If the user is not a human, then as an alternative we can serve an HTML snapshot, so that search engines can properly direct human users to the URL where they would expect to find the requested data (and in a presentable manner). But the ultimate challenge is: how do we determine the type of user? Yes, we can possibly use .htaccess or something to rewrite the URL for search engine bots we detect, but I'm not sure how foolproof and futureproof this is. It may also be possible that Google could penalize people for doing this sort of thing, but I have not researched it fully. So the (pushState + Google's meta tag) combo seems to be a likely solution.
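
As a rough sketch of how a server might tell the two audiences apart under this scheme: with the meta tag in place, the crawler is expected to re-request the page with an `_escaped_fragment_` query parameter (with an empty value for URLs that have no hash fragment), so the presence of that parameter can select the HTML snapshot. The function name and the base URL below are made up for illustration.

```javascript
// Hypothetical helper: returns true when the request URL carries the
// _escaped_fragment_ parameter that the AJAX-crawling scheme adds.
function isSnapshotRequest(requestUrl) {
  // The base is only needed so that relative URLs such as "/blog" parse.
  const parsed = new URL(requestUrl, 'http://example.com');
  return parsed.searchParams.has('_escaped_fragment_');
}
```

A request for `/blog?_escaped_fragment_=` would be answered with the snapshot, while a plain `/blog` would get the normal JavaScript-driven page, with no user-agent sniffing involved.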

Answer by Nicole

Is pushState bad if you need search engines to read your content?

No, the talk about pushState is geared around accomplishing the same general process as hashbangs, but with better-looking URLs. Think about what really happens when you use hashbangs...

You say:

With hashbangs, Google knows to go to the escaped_fragment URL to get their static content.

So in other words,

  1. Google sees a link to example.com/#!/blog
  2. Google requests example.com/?_escaped_fragment_=/blog
  3. You return a snapshot of the content the user should see
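
The mapping in step 2 can be sketched as a small function. This is a simplification, not the full scheme: it escapes only a handful of special characters, and the function name is invented for this example.

```javascript
// Hypothetical helper: turn a hashbang URL into the URL a crawler would
// request under the _escaped_fragment_ convention.
function toEscapedFragmentUrl(url) {
  const index = url.indexOf('#!');
  if (index === -1) {
    return url; // no hashbang, nothing to map
  }
  const base = url.slice(0, index);
  const fragment = url.slice(index + 2);
  // Simplified escaping: only %, #, & and + are percent-encoded here.
  const escaped = fragment.replace(/[%#&+]/g, (ch) => encodeURIComponent(ch));
  // Append to an existing query string if the base already has one.
  const separator = base.includes('?') ? '&' : '?';
  return base + separator + '_escaped_fragment_=' + escaped;
}
```

With this, `example.com/#!/blog` maps to `example.com/?_escaped_fragment_=/blog`, matching the steps listed above.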

As you can see, it already relies on the server. If you aren't serving a snapshot of the content from the server, then your site isn't getting indexed properly.

So how will Google see anything with pushState?

With pushState, google just sees nothing as it can't use the javascript to load the json and subsequently create the template.

Actually, Google will see whatever it can request at site.com/blog. A URL still points to a resource on the server, and clients still obey this contract. Of course, for modern clients, JavaScript has opened up new possibilities for retrieving and interacting with content without a page refresh, but the contracts are the same.

So the intended elegance of pushState is that it serves the same content to all users, old and new, JS-capable and not, but the new users get an enhanced experience.

How do you get Google to see your content?

  1. The Facebook approach: serve the same content at the URL site.com/blog that your client app would transform into when you push /blog onto the state. (Facebook doesn't use pushState yet that I know of, but they do this with hashbangs.)

  2. The Twitter approach: redirect all incoming URLs to the hashbang equivalent. In other words, a link to "/blog" pushes /blog onto the state. But if it's requested directly, the browser ends up at #!/blog. (For Googlebot, this would then route to _escaped_fragment_ as you want. For other clients, you could pushState back to the pretty URL.)

So do you lose the _escaped_fragment_ capability with pushState?

In a couple of different comments, you said

escaped fragment is completely different. You can serve pure unthemed content, cached content, and not be put under the load that normal pages are.

The ideal solution is for Google to either do JavaScript sites or implement some way of knowing that there's an escaped fragment URL even for pushstate sites (robots.txt?).

The benefits you mentioned are not isolated to _escaped_fragment_. That it does the rewriting for you and uses a specially-named GET param is really an implementation detail. There is nothing really special about it that you couldn't do with standard URLs. In other words, rewrite /blog to /?content=/blog on your own using mod_rewrite or your server's equivalent.
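
To illustrate the rewrite idea without committing to any particular server, here is a minimal sketch in JavaScript standing in for a mod_rewrite rule. The function name is hypothetical, and a real rule would also need to handle query strings and static assets.

```javascript
// Hypothetical stand-in for a mod_rewrite rule: map a pretty path such as
// "/blog" to an internal URL that carries the path as a "content" parameter.
function rewriteToContentParam(path) {
  if (path === '/' || path.startsWith('/?')) {
    return path; // already the internal form; leave untouched
  }
  return '/?content=' + path;
}
```

The server-side page behind `/?content=/blog` can then be as unthemed, cached, or lightweight as the `_escaped_fragment_` version would have been.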

What if you don't serve server-side content at all?

If you can't rewrite URLs and serve some kind of content at /blog (or whatever state you pushed into the browser), then your server is really no longer abiding by the HTTP contract.

This is important because a page reload (for whatever reason) will pull content at this URL. (See https://wiki.mozilla.org/Firefox_3.6/PushState_Security_Review, which notes that "view-source and reload will both fetch the content at the new URI if one was pushed.")
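
In code terms, the point is that every URL you push must also be answerable by the server. A minimal sketch, assuming a made-up route table and handler name rather than any particular framework:

```javascript
// Hypothetical route table: every path the client may push with pushState
// also has something the server can return for a direct request or reload.
const pages = {
  '/': 'Home',
  '/blog': 'Blog posts',
};

// Hypothetical handler: returns a status and body for a requested path.
function handleRequest(path) {
  if (Object.prototype.hasOwnProperty.call(pages, path)) {
    return { status: 200, body: `<div id="content">${pages[path]}</div>` };
  }
  return { status: 404, body: 'Not found' };
}
```

If the client pushes a state such as `/archive` that is missing from the table, a reload at that URL ends in a 404, which is the broken contract described above.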

It's not that drawing user interfaces once on the client-side and loading content via JS APIs is a bad goal, it's just that it isn't really accounted for with HTTP and URLs and it's basically not backward-compatible.

At the moment, this is the exact thing that hashbangs are intended for: to represent distinct page states that are navigated on the client and not on the server. A reload, for example, will load the same resource, which can then read, parse, and process the hashed value.
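
Reading the hashed value back after a reload could look something like this. It is a sketch under the assumption that the client state is simply the text after `#!`; the function name is invented.

```javascript
// Hypothetical helper: extract the client-side state from a location hash.
// Returns null when the hash is not a hashbang (e.g. "" or "#top").
function parseHashbang(hash) {
  if (!hash.startsWith('#!')) {
    return null;
  }
  return hash.slice(2);
}

// Browser-only wiring; skipped when there is no window (e.g. under Node).
if (typeof window !== 'undefined') {
  const state = parseHashbang(window.location.hash);
  if (state !== null) {
    console.log('Restoring client-side state:', state);
  }
}
```

The server never sees the fragment, so the same resource is served on reload and the script alone decides what to show.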

It just happens to be that they have also been used (notably by Facebook and Twitter) to change the history to a server-side location without a page refresh. It is in those use cases that people are recommending abandoning hashbangs for pushState.

If you render all content client-side, you should think of pushState as part of a more convenient history API, and not a way out of using hashbangs.

Answer by Julian

All interesting talk about pushState and #!, and I still cannot see how pushState replaces #!'s purpose as the original poster asks.

Our solution to make our 99% JavaScript-based Ajax site/application SEOable is using #! of course. Since client rendering is done via HTML, JavaScript and PHP we use the following logic in a loader controlled by our page landing. The HTML files are totally separated from the JavaScript and PHP because we want the same HTML in both (for most part). The JavaScript and PHP do mostly the same thing, but the PHP code is less complicated as the JavaScript is a much richer user experience.

The JavaScript uses jQuery to inject into HTML the content it wants. PHP uses PHPQuery to inject into the HTML the content it wants - using 'almost' the same logic, but much simpler as the PHP version will only be used to display an SEOable version with SEOable links and not be interacted with like the JavaScript version.

The three components that make up a page (page.htm, page.js and page.php) exist for anything that uses the escaped fragment, which tells the loader whether to load the PHP version in place of the JavaScript version. The PHP version doesn't need to exist for non-SEOable content (such as pages that can only be seen after user login). All is straightforward.
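
The loader decision described here might be sketched as follows. This is only a guess at the shape of the logic, written in JavaScript for illustration; the function name and the `hasPhpVersion` flag are hypothetical and not taken from our actual code.

```javascript
// Hypothetical loader decision: pick which version of a page to serve.
// queryString is the part of the URL after "?"; hasPhpVersion says whether
// a page.php exists for this page (it may not for login-only content).
function chooseVersion(queryString, hasPhpVersion) {
  const params = new URLSearchParams(queryString);
  if (params.has('_escaped_fragment_') && hasPhpVersion) {
    return 'page.php'; // crawler request: static, SEOable rendering
  }
  return 'page.js'; // everyone else: the rich JavaScript version
}
```

Both versions would then inject their content into the same page.htm.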

I'm still puzzled how some front end developers get away with developing great sites (with the richness of Google Docs) without using server-side technologies in conjunction with browser ones... If JavaScript is not even enabled, then our 99% JavaScript solution will of course not do anything without the PHP in place.

It is possible to have a nice URL to land on a PHP served page and redirect to a JavaScript version if JavaScript is enabled, but that is not nice from a user perspective since users are the more important audience.

On a side note: if you are making just a simple website that can function without any JavaScript, then I can see pushState being useful if you want to progressively enhance your user experience from simple statically rendered content into something better. But if you want to give your user the best experience from the get-go (say, your latest game written in JavaScript, or something like Google Docs), then its use for this solution is somewhat limiting, as gracefully falling back can only go so far before the user experience becomes painful compared to the vision of the site.