Python: converting the datetime field in a Chrome history file (sqlite) to a readable format

Question

Asked by cit

I'm working on a script to collect users' browser history with time stamps (in an educational setting). Firefox 3 keeps its history in a sqlite file, with time stamps in UNIX epoch time (in microseconds). Getting them and converting them to a readable format via a SQL command in Python is pretty straightforward:

sql_select = """ SELECT datetime(moz_historyvisits.visit_date/1000000,'unixepoch','localtime'), 
                        moz_places.url 
                 FROM moz_places, moz_historyvisits 
                 WHERE moz_places.id = moz_historyvisits.place_id
             """
get_hist = list(cursor.execute (sql_select))

Chrome also stores its history in a sqlite file, but its history time stamp is apparently formatted as the number of microseconds since midnight UTC on 1 January 1601.

How can this timestamp be converted to a readable format, as in the Firefox example (like 2010-01-23 11:22:09)? I am writing the script with Python 2.5.x (the version on OS X 10.5) and importing the sqlite3 module.

Answer 1

Accepted answer by cit

This may not be the most Pythonic code in the world, but here's a solution. I cheated on the time zone (EST here) by hard-coding a -5 hour offset:

local_time = datetime.datetime(1601,1,1) + datetime.timedelta(microseconds = ms, hours =-5)

Here's the function. It assumes that the Chrome history file has been copied from another account into /Users/someuser/Documents/tmp/Chrome/History

import codecs
import datetime
import os
import shutil
import sqlite3

def getcr():
    connection = sqlite3.connect('/Users/someuser/Documents/tmp/Chrome/History')
    cursor = connection.cursor()
    # Fetch time and URL in one query so each pair comes from the same row
    rows = cursor.execute("""SELECT last_visit_time, url FROM urls""")
    epoch_start = datetime.datetime(1601,1,1)
    out_path = '/Users/someuser/Documents/tmp/cr/cr_hist.txt'
    crf = codecs.open(out_path, 'w', 'utf-8')
    try:
        for ms, url in rows:
            # ms holds microseconds since 1601-01-01 UTC; the fixed
            # -5 hours shifts to EST and does not account for DST
            local_time = epoch_start + datetime.timedelta(microseconds = ms, hours =-5)
            crf.write('\n%s\n%s\n\n' % (local_time, url))
            crf.write('********* Next Entry *********\n')
    finally:
        crf.close()
        connection.close()

    # Assumption: formatdate is today's date; substitute your own naming scheme
    formatdate = datetime.date.today().strftime('%Y-%m-%d')
    # Assumption: the log directory lives under the same account
    log_dir = '/Users/someuser/Documents/Chrome_History_Logs'
    # Copy the log straight to its final, dated file name
    shutil.copy(out_path, os.path.join(log_dir, '%s.txt' % formatdate))
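
The fixed -5 hour offset above ignores daylight saving time. Here is a minimal sketch of one alternative, assuming Python 3 and only the standard library: build the value as a timezone-aware UTC datetime and let astimezone() apply the offset. The names WEBKIT_EPOCH, webkit_to_utc and webkit_to_local are made up for this illustration and are not part of the original script.

```python
import datetime

# Chrome counts microseconds from this instant (midnight UTC, 1 January 1601)
WEBKIT_EPOCH = datetime.datetime(1601, 1, 1, tzinfo=datetime.timezone.utc)

def webkit_to_utc(webkit_timestamp):
    """Return a timezone-aware UTC datetime for a Chrome timestamp."""
    return WEBKIT_EPOCH + datetime.timedelta(microseconds=int(webkit_timestamp))

def webkit_to_local(webkit_timestamp, tz=None):
    """Convert to tz; with tz=None, astimezone() uses the system's local zone."""
    return webkit_to_utc(webkit_timestamp).astimezone(tz)

# A fixed EST offset, equivalent to the hours=-5 hack (hypothetical example zone)
EST = datetime.timezone(datetime.timedelta(hours=-5))
```

Calling webkit_to_local(ts) with no second argument should follow whatever zone the machine is configured for, including its DST rules, while webkit_to_local(ts, EST) reproduces the fixed offset.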

Answer 2

Answered by Squiqqly

Try this:

sql_select = """ SELECT datetime(last_visit_time/1000000-11644473600,'unixepoch','localtime'),
                        url 
                 FROM urls
                 ORDER BY last_visit_time DESC
             """
get_hist = list(cursor.execute (sql_select))

Or something along those lines; it seems to be working for me.
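
The constant 11644473600 in the query is the number of seconds between 1601-01-01 and 1970-01-01, so subtracting it turns Chrome's value into an ordinary UNIX timestamp. The sketch below, assuming Python 3, derives that constant with datetime and runs the same kind of query against a throwaway in-memory table. The table only mimics the two columns used above, the sample row is invented, and 'localtime' is left out so the result does not depend on the machine's time zone.

```python
import datetime
import sqlite3

WEBKIT_EPOCH = datetime.datetime(1601, 1, 1)
UNIX_EPOCH = datetime.datetime(1970, 1, 1)

# Seconds between the two epochs; this is where 11644473600 comes from
EPOCH_OFFSET_SECONDS = int((UNIX_EPOCH - WEBKIT_EPOCH).total_seconds())

# Stand-in for the real History file: same column names, one invented row
connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE urls (url TEXT, last_visit_time INTEGER)')
sample_time = datetime.datetime(2010, 1, 23, 11, 22, 9)
sample = int((sample_time - WEBKIT_EPOCH).total_seconds()) * 1000000
connection.execute('INSERT INTO urls VALUES (?, ?)', ('http://example.com/', sample))

sql_select = """ SELECT datetime(last_visit_time/1000000 - ?, 'unixepoch'),
                        url
                 FROM urls
                 ORDER BY last_visit_time DESC
             """
get_hist = list(connection.execute(sql_select, (EPOCH_OFFSET_SECONDS,)))
connection.close()
```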

Answer 3

Answered by Andy Mikhaylenko

This is a more Pythonic and memory-friendly way to do what you described (by the way, thanks for the initial code!):

#!/usr/bin/env python

import os
import datetime
import sqlite3
import opster
from itertools import izip

SQL_TIME = 'SELECT time FROM info'
SQL_URL  = 'SELECT c0url FROM pages_content'

def date_from_webkit(webkit_timestamp):
    epoch_start = datetime.datetime(1601,1,1)
    delta = datetime.timedelta(microseconds=int(webkit_timestamp))
    return epoch_start + delta

@opster.command()
def import_history(*paths):
    for path in paths:
        assert os.path.exists(path)
        c = sqlite3.connect(path)
        times = (row[0] for row in c.execute(SQL_TIME))
        urls  = (row[0] for row in c.execute(SQL_URL))
        for timestamp, url in izip(times, urls):
            date_time = date_from_webkit(timestamp)
            print date_time, url
        c.close()

if __name__=='__main__':
    opster.dispatch()

The script can be used this way:

$ ./chrome-tools.py import-history ~/.config/chromium/Default/History* > history.txt

Of course Opster can be thrown out, but it seems handy to me :-)
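
For readers who would rather not install Opster, here is a rough sketch of the same idea using only the standard library, assuming Python 3. It reads the urls table and its last_visit_time and url columns mentioned in the other answers rather than the tables queried above, and the names iter_history and main are chosen just for this example.

```python
import datetime
import sqlite3
import sys

SQL_HISTORY = 'SELECT last_visit_time, url FROM urls ORDER BY last_visit_time DESC'

def date_from_webkit(webkit_timestamp):
    """Convert microseconds since 1601-01-01 UTC to a naive UTC datetime."""
    epoch_start = datetime.datetime(1601, 1, 1)
    delta = datetime.timedelta(microseconds=int(webkit_timestamp))
    return epoch_start + delta

def iter_history(path):
    """Yield (datetime, url) pairs lazily, one row at a time."""
    connection = sqlite3.connect(path)
    try:
        for timestamp, url in connection.execute(SQL_HISTORY):
            yield date_from_webkit(timestamp), url
    finally:
        connection.close()

def main(paths):
    """Print the history of every file named on the command line."""
    for path in paths:
        for date_time, url in iter_history(path):
            print(date_time, url)
```

To use it as a script, call main(sys.argv[1:]) under the usual if __name__ == '__main__': guard. Reading both columns in one query keeps each time paired with its own URL, and the generator avoids loading the whole table into memory.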

Answer 4

Answered by Jason Coon

The sqlite3 module can return datetime objects for datetime fields (this requires a column declared as timestamp and a connection opened with detect_types=sqlite3.PARSE_DECLTYPES). Those objects have a method called strftime for formatting them as readable strings.

You can do something like this once you have the recordset:

for record in get_hist:
  date_string = record[0].strftime("%Y-%m-%d %H:%M:%S")
  url = record[1]
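
As a small illustration of when record[0] really is a datetime object, the sketch below (assuming Python 3) opens an in-memory database with type detection enabled and registers an explicit converter for columns declared as timestamp. The visits table, its columns, and the parse_timestamp helper are invented for the example. Chrome's last_visit_time is a plain integer column, so it would still need one of the conversions shown in the other answers before strftime could be used.

```python
import datetime
import sqlite3

def parse_timestamp(raw):
    """Converter: sqlite3 passes the stored value to converters as bytes."""
    return datetime.datetime.strptime(raw.decode('ascii'), '%Y-%m-%d %H:%M:%S')

# Apply the converter to every column whose declared type is "timestamp"
sqlite3.register_converter('timestamp', parse_timestamp)

# PARSE_DECLTYPES tells sqlite3 to look at declared column types
connection = sqlite3.connect(':memory:', detect_types=sqlite3.PARSE_DECLTYPES)
connection.execute('CREATE TABLE visits (visited timestamp, url TEXT)')
connection.execute(
    "INSERT INTO visits VALUES ('2010-01-23 11:22:09', 'http://example.com/')")

get_hist = list(connection.execute('SELECT visited, url FROM visits'))
connection.close()

for record in get_hist:
    date_string = record[0].strftime("%Y-%m-%d %H:%M:%S")
    url = record[1]
```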
