Html 如何查找网页上次更新的时间

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/23644436/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-29 01:47:08  来源:igfitidea点击:

How to find when a web page was last updated

htmlupload

提问by guisantogui

Is there a way to find out how much time has passed since a web page was changed?

有没有办法找出自网页更改以来已经过去了多长时间?

For example, I have a page hosted at: www.mywebsitenotupdated.com

例如,我有一个页面托管在: www.mywebsitenotupdated.com

Is there a way to find out when this HTML page was uploaded to the server?

有没有办法找出这个 HTML 页面何时上传到服务器?

I have no access to server; just a link to the webpage.

我无法访问服务器;只是一个网页链接。

采纳答案by Jukka K. Korpela

No, you cannot know when a page was last updated or last changed or uploaded to a server (which might, depending on interpretation, be three different things) just by accessing the page.

不,仅通过访问页面,您无法知道页面上次更新或上次更改或上传到服务器的时间(根据解释,这可能是三种不同的事情)。

A server may, and should (according to the HTTP 1.1 protocol), send a Last-Modifiedheader, which you can find out in several ways, e.g. using Rex Swain's HTTP Viewer. However, according to the protocol, this is just

服务器可以并且应该(根据 HTTP 1.1 协议)发送一个Last-Modifiedheader,您可以通过多种方式找到它,例如使用Rex Swain 的 HTTP Viewer。但是,根据协议,这只是

“the date and time at which the origin server believes the variant was last modified”.

“源服务器认为该变体上次修改的日期和时间”。

And the protocol realistically adds:

该协议实际上增加了:

“The exact meaning of this header field depends on the implementation of the origin server and the nature of the original resource. For files, it may be just the file system last-modified time. For entities with dynamically included parts, it may be the most recent of the set of last-modify times for its component parts. For database gateways, it may be the last-update time stamp of the record. For virtual objects, it may be the last time the internal state changed.”

“这个头域的确切含义取决于源服务器的实现和原始资源的性质。对于文件,它可能只是文件系统上次修改的时间。对于具有动态包含的部件的实体,它可能是其组成部件的一组上次修改时间中的最新时间。对于数据库网关,它可能是记录的最后更新时间戳。对于虚拟对象,这可能是内部状态最后一次改变。”

In practice, web pages are very often dynamically created from a Content Management System or otherwise, and in such cases, the Last-Modifiedheader typically shows a data stamp of creating the response, which is normally very close to the time of the request. This means that the header is practically useless in such cases.

在实践中,网页通常是从内容管理系统或其他方式动态创建的,在这种情况下,Last-Modified标题通常显示创建响应的数据戳,通常非常接近请求的时间。这意味着标题在这种情况下实际上是无用的。

Even in the case of a “static” page (the server simply picks up a file matching the request and sends it), the Last-Modifieddate stamp normally indicates just the last write access to the file on the server. This might relate to a time when the file was restored from a backup copy, or a time when the file was edited on the server without making any change to the content, or a time when it was uploaded onto the server, possibly replacing an older identical copy. In these cases, assuming that the time stamp is technically correct, it indicates a time after which the page has not been changed (but not necessarily the time of last change).

即使在“静态”页面的情况下(服务器只是选择与请求匹配的文件并发送它),Last-Modified日期戳通常仅指示对服务器上文件的最后一次写访问。这可能与文件从备份副本恢复的时间有关,或者文件在服务器上编辑而不对内容进行任何更改的时间,或者文件上传到服务器的时间,可能会替换旧的相同的副本。在这些情况下,假设时间戳在技术上是正确的,它表示页面未更改之前的时间(但不一定是上次更改的时间)。

回答by Vaux42

Open your browsers console(?) and enter the following:

打开浏览器控制台(?)并输入以下内容:

javascript:alert(document.lastModified)

回答by hooke

There is another way to find the page update which could be useful for some occasions (if works:).

还有另一种方法可以找到在某些情况下可能有用的页面更新(如果有效:)。

If the page has been indexed by Google, or by Wayback Machineyou can try to find out what date(s) was(were) saved by them (these methods do not work for any page, and have some limitations, which are extensively investigated in this webmasters.stackexchange question's answers. But in many cases they can help you to find out the page update date(s):

如果页面已被 Google 或Wayback Machine编入索引,您可以尝试找出它们保存的日期(这些方法不适用于任何页面,并且有一些限制,这些限制已被广泛调查)在这个 webmasters.stackexchange问题的答案中。但在许多情况下,它们可以帮助您找出页面更新日期:

  1. Google way: Go by link https://www.google.com.ua/search?q=site%3Awww.example.com&biw=1855&bih=916&source=lnt&tbs=cdr%3A1%2Ccd_min%3A1%2F1%2F2000%2Ccd_max%3A&tbm=
  2. Wayback machine way: Go by link https://web.archive.org/web/*/www.example.com
    • for this stackoverflow page wayback machine gives usmore results: Saved 6 times between June 7, 2014 and November 23, 2016., and you can view all saved copies for each date
  1. 谷歌方式:通过链接https://www.google.com.ua/search?q=site%3Awww.example.com&biw=1855&bih=916&source=lnt&tbs=cdr%3A1%2Ccd_min%3A1%2F1%2F2000%2Ccd_max% 3A&tbm=
    • 您可以通过所需的任何页面 URL 更改搜索字段中的文本。
    • 例如,当前的 stackoverflow 问题页面搜索为我们提供了 2014 年 5 月 14 日的结果 - 这是问题创建日期:在此处输入图片说明
  2. Wayback 机器方式:通过链接https://web.archive.org/web/*/www.example.com
    • 对于此 stackoverflow 页面回溯机器为我们提供了更多结果:Saved 6 times between June 7, 2014 and November 23, 2016.,您可以查看每个日期的所有保存副本

回答by Martin Thoma

For checking the Last Modifiedheader, you can use httpie(docs).

要检查Last Modified标题,您可以使用httpie( docs)。

Installation

安装

pip install httpie --user

Usage

用法

$ http -h https://martin-thoma.com/author/martin-thoma/ | grep 'Last-Modified\|Date'
Date: Fri, 06 Jan 2017 10:06:43 GMT
Last-Modified: Fri, 06 Jan 2017 07:42:34 GMT

The Dateis important as this reports the server time, not your local time. Also, not every server sends Last-Modified(e.g. superuser seems not to do it).

Date很重要,因为这会报告服务器时间,而不是您的本地时间。此外,并非每个服务器都发送Last-Modified(例如超级用户似乎不这样做)。

回答by Marc Maxmeister

This is a Pythonic wayto do it:

这是一种 Pythonic 的方式来做到这一点:

import httplib
import yaml
c = httplib.HTTPConnection(address)
c.request('GET', url_path)
r = c.getresponse()
# get the date into a datetime object
lmd = r.getheader('last-modified')
if lmd != None:
   cur_data = { url: datetime.strptime(lmd, '%a, %d %b %Y %H:%M:%S %Z') }
else:
   print "Hmmm, no last-modified data was returned from the URL."
   print "Returned header:"
   print yaml.dump(dict(r.getheaders()), default_flow_style=False)

The rest of the script includes an example of archiving a page and checking for changes against the new version, and alerting someone by email.

脚本的其余部分包括存档页面和检查新版本的更改以及通过电子邮件提醒某人的示例。

回答by Naloiko Eugene

For me it was the

对我来说是

article:modified_time

in the page source.

在页面源中。

View Page Source

查看页面源代码