Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/14779190/

Time: 2020-10-26 22:33:18  Source: igfitidea

In a single-page app, what is the right way to deal with wrong URLs (404 errors)?

Tags: javascript, angularjs, http-status-code-404, singlepage, single-page-application

Asked by jssebastian

I am currently writing a web application using AngularJS, but I think this question applies to any client-side JavaScript framework that does routing on the client side (as Angular does).

In a single-page app, what is the right way to deal with wrong URLs?

Looking at a few major sites, I see that Gmail will redirect to the inbox if you type any random URL below https://mail.google.com/mail/. This happens server-side (with an HTTP 3xx redirect code) or client-side, depending on whether the wrong path is before or after the # character. On the other hand, Twitter shows a real HTTP 404 for any invalid URL. A third option would be to show a "soft" 404, a purely client-side error page.

These solutions seem appropriate for different situations. Twitter wants the links to Twitter users and tweets to be real links, so people can share them, post them in news articles, etc., so it is important that invalid links be recognized as such (if I have a broken link to a tweet on my website, a simple crawl will tell me that). In Gmail, on the other hand, you are not expected to share links into your inbox, and I'm not even sure if the links are really permanent/persistent: it seems the URL updating mostly serves the purpose of browser history navigation within the single-page app. The third approach of giving soft errors might be appropriate for situations similar to Gmail, but where there is no reasonable "default" page.

After this long introduction, here are some specific questions:

  • Is it ever acceptable to give a "soft" error page instead of a 404 error, or should a single-page app always redirect to a real 404 if a URL is invalid?
  • Gmail's code may be perfectly bug-free, but if it did have a bug leading to invalid links that end up redirecting back to the inbox, that might be even more confusing for users than an error page. For most web apps out there, which are not as well tested as Gmail, would it be better to show an error page?
  • To implement real 404s for single-page apps, it seems necessary to duplicate the routing logic on the server side. Is there any way around this?
  • When redirecting to a 404, I think the user should be able to see the URL that caused the error, possibly in the URL bar. With the HTML5 History API, I think this can be accomplished by simply triggering a reload of the current page (with the wrong URL), combined with the server-side routing mentioned above. For browsers that do not support this, or when using hashbang notation, this does not seem possible. What's the best way to support all browsers?

Answered by Denis Pshenov

If you care about SEO, one of the ways that angular.io was able to solve this problem (at least with Google, anyway) is by using a noindex meta tag "to indicate soft-404 status which will prevent crawlers from crawling the content of the page". Apparently it can be added to the document via JavaScript.
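
As a rough sketch of adding that tag from JavaScript: the helper name `markAsSoft404` and the optional `doc` parameter are assumptions made for illustration, not something angular.io is known to use. It relies only on standard DOM calls.

```javascript
// Hypothetical helper: mark the current client-rendered page as a soft 404.
// `doc` defaults to the browser's global `document`; passing it in makes
// the function usable outside a browser.
function markAsSoft404(doc = document) {
  // Reuse an existing robots meta tag if the page already has one.
  let meta = doc.querySelector('meta[name="robots"]');
  if (!meta) {
    meta = doc.createElement('meta');
    meta.setAttribute('name', 'robots');
    doc.head.appendChild(meta);
  }
  meta.setAttribute('content', 'noindex');
  return meta;
}
```

A client-side router would presumably call `markAsSoft404()` from its catch-all ("no route matched") handler, right before rendering the error view.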

Alternatively, using JavaScript, you can redirect to a page that will respond with an actual HTTP 404 status code. Google understands JavaScript redirects just fine. Your original /does-not-exist page, when redirected to /404-error?from=does-not-exist, will be associated with the 404 status code returned by the server. The URL structure does not matter; only the status code and the redirect are important here.
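
A minimal sketch of such a redirect, assuming the server really does answer `/404-error` with a 404 status. The function names and the `loc` parameter are illustrative only; the URL shape is copied from the example above.

```javascript
// Build the error-page URL. The `/404-error` path and the `from` parameter
// mirror the example in the text; any URL works as long as the server
// answers it with a real HTTP 404 status.
function buildNotFoundUrl(badPath) {
  const trimmed = badPath.replace(/^\//, ''); // drop the leading slash
  return '/404-error?from=' + encodeURIComponent(trimmed);
}

// Meant to be called from the client-side router's catch-all route.
// `loc` defaults to the browser's `window.location`.
function redirectToNotFound(loc = window.location) {
  // replace() overwrites the current history entry, so pressing Back
  // does not land on the bad URL and trigger the redirect again.
  loc.replace(buildNotFoundUrl(loc.pathname));
}
```

Using `location.replace` rather than `location.assign` is a design choice here: it avoids a redirect loop when the user navigates back.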

Your other options are SSR (Nuxt.js, Next.js, Angular Universal, etc.) or pre-rendering (prerender.io, puppeteer, etc.), which Google calls dynamic rendering, where you respond to search bot requests with a pre-rendered version while human users get your normal client-side rendered app.
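
The bot-versus-human split could look roughly like the sketch below. The user-agent pattern is a deliberately short, illustrative list, and both function names are made up for this example; real services such as prerender.io ship their own detection.

```javascript
// Illustrative and incomplete list of crawler user-agent fragments.
const BOT_PATTERN = /googlebot|bingbot|duckduckbot|baiduspider|yandex/i;

// True when the User-Agent header looks like a search crawler.
function isSearchBot(userAgent) {
  return BOT_PATTERN.test(userAgent || '');
}

// Decide which variant of the page to serve for a request.
function chooseVariant(userAgent) {
  return isSearchBot(userAgent) ? 'prerendered' : 'client';
}
```

The server would then send the pre-rendered HTML (with the correct status code, including 404) for `'prerendered'` and the usual app shell for `'client'`.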

Answered by Adam Gent

tl;dr: Drop hashbang support and opt for PJAX-like behavior if you care about SEO.

Are you making an app or a website? If it is a website, you need to return a 404 so that you don't confuse Google. It needs to be a real 404, not just a "page not found" message (i.e. a 200 response with the message "page not found" is very bad). Also, which browsers do you care to support?
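
To make the "real 404" point concrete, here is a small server-side sketch for Node. The route table is hypothetical and would have to mirror whatever routes the client-side router knows, which is exactly the duplication the question worries about.

```javascript
// Hypothetical route table; it must be kept in sync with the client router.
const KNOWN_ROUTES = [/^\/$/, /^\/about$/, /^\/users\/[^/]+$/];

// 200 for paths the app can render, 404 for everything else.
function statusForPath(pathname) {
  return KNOWN_ROUTES.some((re) => re.test(pathname)) ? 200 : 404;
}

// Request handler in the (req, res) shape that Node's http.createServer expects.
function handleRequest(req, res) {
  const { pathname } = new URL(req.url, 'http://localhost');
  const status = statusForPath(pathname);
  res.writeHead(status, { 'Content-Type': 'text/html; charset=utf-8' });
  // Known routes get the app shell; anything else gets an error page
  // together with a real 404 status code.
  res.end(status === 200 ? '<div id="app"></div>' : '<h1>Page not found</h1>');
}
```

It could be wired up with `require('node:http').createServer(handleRequest).listen(3000)`. The important part is only that the status line says 404 for unknown paths, whatever HTML accompanies it.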

My opinion is that the whole hashbang server-side rendering should be avoided (i.e. the nasty Google SEO #! hack). Either use real pushState, or, for browsers that don't support pushState, re-render the whole page when the URL changes (a real URL change, not a hash change).
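
That fallback might be sketched as follows. `navigate`, the `render` callback, and the injectable `win` parameter are assumptions for illustration; `history.pushState` and `location.assign` are the standard browser APIs.

```javascript
// Navigate to `url` with pushState when available, otherwise fall back
// to a normal full-page load. `win` defaults to the global `window`.
function navigate(url, render, win = window) {
  const hasPushState = Boolean(
    win.history && typeof win.history.pushState === 'function'
  );
  if (hasPushState) {
    win.history.pushState(null, '', url);
    render(url); // hypothetical client-side render callback
    return 'pushstate';
  }
  // No pushState: let the browser request the page from the server,
  // which can then answer with a real status code.
  win.location.assign(url);
  return 'full-reload';
}
```

Either branch leaves a real path in the address bar, so the server always sees the full URL and can return a 404 for it.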

Now the reason this matters is that a #! URL should never return a 404: it doesn't make sense, and it is impossible to mimic server-side, because the server never gets what comes after the #! without running JavaScript.
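
A quick way to see this, using the standard `URL` class (the example URL and the helper name `requestTarget` are made up): the fragment is part of the URL in the browser, but not of the path and query that go into the HTTP request.

```javascript
// Returns the part of a URL that is actually sent to the server
// in the HTTP request line: the path plus the query string.
function requestTarget(href) {
  const url = new URL(href);
  return url.pathname + url.search;
}

const hashbangUrl = 'https://example.com/app?tab=1#!/users/does-not-exist';
console.log(requestTarget(hashbangUrl)); // "/app?tab=1"
console.log(new URL(hashbangUrl).hash);  // "#!/users/does-not-exist"
```

Since the server only ever sees `/app?tab=1`, it has no way of knowing that the route after `#!` does not exist.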

Thus if you really care about SEO, I would do something like PJAX: only use true pushState for routing, and otherwise just fall back to old web 1.0 behavior. Consequently, the links I recommend you share, the ones that can truly be a 404, should not have #! (a traditional # is fine so long as the contents of the page don't change drastically).

Finally, the 404 is mostly not the problem; the 30X (i.e. redirect) responses are. That's because the browser will automatically handle redirects, so your JavaScript AJAX calls will never see a 30X (they will get the response of the redirect target instead, i.e. a 200). To handle 30X responses you will have to send a header back for every request to indicate what the redirected URL is/was (i.e. what you were redirected to) so that you don't mess up the pushState history.
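
One way this could look on the client, as a sketch only: `X-Final-Url` is a hypothetical header name that the server would have to set on every response, and `loadPage` with its injectable `fetchFn` and `hist` parameters is invented for this example.

```javascript
// Hypothetical header name; the server must send it with every response.
const FINAL_URL_HEADER = 'X-Final-Url';

// Pick the URL to record in the history: the server-reported final URL
// if present, otherwise the URL that was originally requested.
function urlForHistory(requestedUrl, response) {
  const finalUrl = response.headers.get(FINAL_URL_HEADER);
  return finalUrl || requestedUrl;
}

// PJAX-style page load. `fetchFn` and `hist` default to the browser globals.
async function loadPage(url, fetchFn = fetch, hist = history) {
  const response = await fetchFn(url); // redirects are followed silently
  const html = await response.text();
  hist.pushState(null, '', urlForHistory(url, response));
  return html;
}
```

With `fetch`, `response.url` and `response.redirected` also expose the final URL after redirects, which may make a custom header unnecessary in browsers that support it.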

Of course, if you need to support hashbang like Twitter used to (and they are the ones that even killed hashbang), you can leverage Google Sitemaps and rel=nofollow to try to mitigate bad SEO.
