database 为什么我们需要时态数据库?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/800331/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Why do we need a temporal database?
提问by Arnkrishn
I was reading about temporal databases and it seems they have built in time aspects. I wonder why would we need such a model?
我正在阅读有关时态数据库的内容,似乎它们已建立在时间方面。我想知道为什么我们需要这样一个模型?
How different is it from a normal RDBMS? Can't we have a normal database i.e. RDBMS and say have a trigger which associates a time stamp with each transaction that happens? May be there would be a performance hit. But I'm still skeptical on temporal databases having a strong case in the market.
它与普通 RDBMS 有何不同?我们不能有一个普通的数据库,即 RDBMS 并说有一个触发器,它将时间戳与发生的每个事务相关联?可能会有性能下降。但我仍然怀疑时态数据库在市场上有很强的案例。
Does any of the present databases support such a feature?
是否有任何现有的数据库支持这样的功能?
采纳答案by codekaizen
A temporal database efficiently stores a time series of data, typically by having some fixed timescale (such as seconds or even milliseconds) and then storing only changes in the measured data. A timestamp in an RDBMS is a discretely stored value for each measurement, which is very inefficient. A temporal database is often used in real-time monitoring applications like SCADA. A well-established system is the PI database from OSISoft (http://www.osisoft.com/).
时态数据库可以有效地存储时间序列数据,通常通过具有一些固定的时间尺度(例如秒甚至毫秒),然后仅存储测量数据中的变化。RDBMS 中的时间戳是每个测量的离散存储值,这是非常低效的。时态数据库通常用于像 SCADA 这样的实时监控应用程序。一个完善的系统是来自 OSISoft ( http://www.osisoft.com/)的 PI 数据库。
回答by Jon Guiton
Consider your appointment/journal diary - it goes from Jan 1st to Dec 31st. Now we can query the diary for appointments/journal entries on any day. This ordering is called the valid time. However, appointments/entries are not usually inserted in order.
考虑一下您的约会/日记 - 从 1 月 1 日到 12 月 31 日。现在我们可以在日记中查询任何一天的约会/日记条目。这种排序称为有效时间。但是,约会/条目通常不是按顺序插入的。
Suppose I would like to know what appointments/entries were in my diary on April 4th. That is, all the records that existed in my diary on April 4th. This is the transaction time.
假设我想知道 4 月 4 日我的日记中有哪些约会/条目。也就是4月4日我日记中存在的所有记录。这是交易时间。
Given that appointments/entries can be created and deleted etc. A typical record has a beginning and end valid time that covers the period of the entry and a beginning and end transaction time that indicates the period during which the entry appeared in the diary.
鉴于可以创建和删除约会/条目等。典型的记录具有涵盖条目期间的开始和结束有效时间以及指示条目出现在日记中的期间的开始和结束交易时间。
This arrangement is necessary when the diary may undergo historical revision. Suppose on April 5th I realise that the appointment I had on Feb 14th actually occurred on February 12th i.e. I discover an error in my diary - I can correct the error so that the valid time picture is corrected, but now, my query of what was in the diary on April 4th would be wrong, UNLESS, the transaction times for appointments/entries are also stored. In that case if I query my diary as of April 4th it will show an appointment existed on February 14th but if I query as of April 6th it would show an appointment on February 12th.
在日记可能要进行历史修改时,这种安排是必要的。假设在 4 月 5 日我意识到我在 2 月 14 日的约会实际上发生在 2 月 12 日,即我在我的日记中发现了一个错误 - 我可以更正错误以便更正有效的时间图片,但是现在,我的查询是什么在 4 月 4 日的日记中是错误的,除非,约会/条目的交易时间也被存储。在这种情况下,如果我查询截至 4 月 4 日的日记,它将显示 2 月 14 日存在约会,但如果我查询截至 4 月 6 日,它将显示 2 月 12 日的约会。
This time travel feature of a temporal database makes it possible to record information about how errors are corrected in a database. This is necessary for a true audit picture of data that records when revisions were made and allows queries relating to how data have been revised over time.
时间数据库的这种时间旅行功能可以记录有关如何在数据库中纠正错误的信息。这对于记录修订时间的真实数据审计图是必要的,并允许查询有关数据如何随时间修订的信息。
Most business information should be stored in this bitemporal scheme in order to provide a true audit record and to maximise business intelligence - hence the need for support in a relational database. Notice that each data item occupies a (possibly unbounded) square in the two dimensional time model which is why people often use a GIST index to implement bitemporal indexing. The problem here is that a GIST index is really designed for geographic data and the requirements for temporal data are somewhat different.
大多数业务信息应该存储在这种双时态方案中,以提供真实的审计记录并最大限度地提高商业智能——因此需要关系数据库的支持。请注意,每个数据项在二维时间模型中占据一个(可能是无界的)正方形,这就是人们经常使用 GIST 索引来实现双时态索引的原因。这里的问题是 GIST 索引确实是为地理数据设计的,而对时态数据的要求有些不同。
PostgreSQL 9.0 exclusion constraints should provide new ways of organising temporal data e.g. transaction and valid time PERIODs should not overlap for the same tuple.
PostgreSQL 9.0 排除约束应该提供组织时间数据的新方法,例如事务和有效时间周期不应为同一个元组重叠。
回答by Jonathan Leffler
As I understand it (and over-simplifying enormously), a temporal database records facts about when the data was valid as well as the the data itself, and permits you to query on the temporal aspects. You end up dealing with 'valid time' and 'transaction time' tables, or 'bitemporal tables' involving both 'valid time' and 'transaction time' aspects. You should consider reading either of these two books:
据我了解(并过度简化),时态数据库记录有关数据何时有效以及数据本身的事实,并允许您查询时态方面。您最终会处理“有效时间”和“交易时间”表,或涉及“有效时间”和“交易时间”方面的“双时态表”。您应该考虑阅读这两本书中的任何一本:
- Darwen, Date and Lorentzos "Temporal Data and the Relational Model" (out of print),
- and (at a radically different extreme) "Developing Time-Oriented Database Applications in SQL", Richard T. Snodgrass, Morgan Kaufmann Publishers, Inc., San Francisco, July, 1999, 504+xxiii pages, ISBN 1-55860-436-7. That is out of print but available as PDF on his web site at cs.arizona.edu(so a Google search makes it pretty easy to find).
- Darwen、Date 和 Lorentzos “时间数据和关系模型”(绝版),
- 和(在一个完全不同的极端)“在 SQL 中开发面向时间的数据库应用程序”,Richard T. Snodgrass,Morgan Kaufmann Publishers, Inc.,旧金山,1999 年 7 月,504+xxiii 页,ISBN 1-55860-436- 7. 那是绝版的,但可以在他的网站cs.arizona.edu上以 PDF 格式提供(因此谷歌搜索很容易找到)。
回答by bob
Temporal databases are often used in the financial services industry. One reason is that you are rarely (if ever) allowed to delete any data, so ValidFrom - ValidTo type fields on records are used to provide an indication of when a record was correct.
时态数据库经常用于金融服务行业。一个原因是您很少(如果有的话)被允许删除任何数据,因此记录上的 ValidFrom - ValidTo 类型字段用于提供记录何时正确的指示。
回答by Ron Burk
Besides "what new things can I do with it", it might be useful to consider "what old things does it unify?". The temporal database represents a particular generalization of the "normal" SQL database. As such, it may give you a unified solution to problems that previously appeared unrelated. For example:
除了“我可以用它做什么新事物”之外,考虑“它统一了哪些旧事物?”可能会有所帮助。时态数据库代表了“普通”SQL 数据库的特殊概括。因此,它可以为您以前看似无关的问题提供统一的解决方案。例如:
- Web ConcurrencyWhen your database has a web UI that lets multiple users perform standard Create/Update/Delete (CRUD) modifications, you have to face the concurrent web changes problem. Basically, you need to check that an incoming data modification is not affecting any records that have changed since that user last saw those records. But if you have a temporal database, it quite possibly already associates something like a "revision ID" with each record (due to the difficulty of making timestamps unique and monotonically ascending). If so, then that becomes the natural, "already built-in" mechanism for preventing the clobbering of other users' data during database updates.
- Legal/Tax RecordsThe legal system (including taxes) places rather more emphasis on historical data than most programmers do. Thus, you will often find adviceabout schemas for invoices and such that warns you to beware of deleting records or normalizing in a natural way--which can lead to an inability to answer basic legal questions like "Forget their current address, what address did you mail this invoice to in 2001?" With a temporal framework base, all the machinations to those problems (they usually are halfway steps to having a temporal database) go away. You just use the most natural schema, and delete when it make sense, knowing that you can always go back and answer historical questions accurately.
- Web 并发当您的数据库具有允许多个用户执行标准创建/更新/删除 (CRUD) 修改的 Web UI 时,您必须面对并发 Web 更改问题。基本上,您需要检查传入的数据修改是否不会影响自该用户上次看到这些记录以来发生更改的任何记录。但是,如果您有一个时态数据库,它很可能已经将诸如“修订 ID”之类的东西与每条记录相关联(由于难以使时间戳唯一且单调递增)。如果是这样,那么这将成为一种自然的“内置”机制,用于防止在数据库更新期间破坏其他用户的数据。
- 法律/税务记录法律系统(包括税务)比大多数程序员更重视历史数据。因此,您经常会找到有关发票模式的建议,并警告您小心删除记录或以自然方式规范化——这可能导致无法回答基本的法律问题,例如“忘记他们当前的地址,地址是什么?你把这张发票寄到 2001 年?” 有了时态框架基础,这些问题的所有阴谋(它们通常是拥有时态数据库的中途步骤)都会消失。您只需使用最自然的模式,并在有意义时删除,因为您知道您始终可以返回并准确回答历史问题。
On the other hand, the temporal model itself is half-way to complete revision control, which could inspire further applications. For example, suppose you roll your own temporal facility on top of SQL and allow branching, as in revision control systems. Even limited branching could make it easy to offer "sandboxing" -- the ability to play with and modify the database with abandon without causing any visible changes to other users. That makes it easy to supply highly realistic user training on a complex database.
另一方面,时间模型本身完成了修订控制的一半,这可能会激发进一步的应用。例如,假设您在 SQL 之上滚动您自己的临时工具并允许分支,就像在修订控制系统中一样。即使是有限的分支也可以很容易地提供“沙箱”——能够随意使用和修改数据库,而不会对其他用户造成任何可见的更改。这使得在复杂数据库上提供高度逼真的用户培训变得容易。
Simple branching with a simple merge facility could also simplify some common workflow problems. For example, a non-profit might have volunteers or low-paid workers doing data entry. Giving each worker their own branch could make it easy to allow a supervisor to review their work or enhance it (e.g., de-duplification) before merging it into the main branch where it would become visible to "normal" users. Branches could also simplify permissions. If a user is only granted permission to use/see their unique branch, you don't have to worry about preventing every possible unwanted modification; you'll only merge the changes that make sense anyway.
使用简单的合并工具进行简单的分支也可以简化一些常见的工作流程问题。例如,非营利组织可能有志愿者或低薪工人进行数据输入。为每个工作人员提供自己的分支可以让主管在将其合并到“普通”用户可以看到的主分支之前,轻松地他们的工作或对其进行增强(例如,去重)。分支机构还可以简化权限。如果用户仅被授予使用/查看其独特分支的权限,则您不必担心防止所有可能的不需要的修改;您只会合并有意义的更改。
回答by Scott Weinstein
Two reasons come to mind:
想到两个原因:
- Some are optimized for insert and read only and can offer dramatic perf improvements
- Some have better understandings of time than traditional SQL - allowing for grouping operations by second, minute, hour, etc
- 有些针对插入和只读进行了优化,可以提供显着的性能改进
- 有些比传统 SQL 对时间有更好的理解 - 允许按秒、分、小时等对操作进行分组
回答by Manoj Pandey
Just an update, Temporal database is coming to SQL Server 2016.
只是一个更新,临时数据库即将加入 SQL Server 2016。
To clear all your doubts why one need a Temporal Database, rather than configuring with custom methods, and how efficiently & seamlessly SQL Server configures it for you, check the in-depth video and demo on Channel9.msdn here: https://channel9.msdn.com/Shows/Data-Exposed/Temporal-in-SQL-Server-2016
要清除您为什么需要临时数据库而不是使用自定义方法配置的所有疑问,以及 SQL Server 为您配置它的效率和无缝程度如何,请在此处查看 Channel9.msdn 上的深入视频和演示:https://channel9 .msdn.com/Shows/Data-Exposed/Temporal-in-SQL-Server-2016
MSDN link: https://msdn.microsoft.com/en-us/library/dn935015(v=sql.130).aspx
MSDN 链接:https: //msdn.microsoft.com/en-us/library/dn935015(v=sql.130).aspx
Currently with the CTP2 (beta 2) release of SQL Server 2016 you can play with it.
目前,您可以使用 SQL Server 2016 的 CTP2(测试版 2)版本。
Check this videoon how to use Temporal Tables in SQL Server 2016.
查看此视频,了解如何在 SQL Server 2016 中使用临时表。
回答by Joel
Apart from reading the Wikipedia article? A database that maintains an "audit log" or similar transaction log will have some properties of being "temporal". If you need answers to questions about who did what to whom and whenthen you've got a good candidate for a temporal database.
除了阅读维基百科文章?维护“审计日志”或类似事务日志的数据库将具有一些“临时”属性。如果您需要回答有关谁对谁做了什么以及什么时候做的问题的答案,那么您就有了一个很好的时态数据库候选者。
回答by Scott Kirkwood
You can imagine a simple temporal database that just logs your GPS location every few seconds. The opportunities for compressing this data is great, a normal database you would need to store a timestamp for every row. If you have a great deal of throughput required, knowing the data is temporal and that updates and deletes to a row will never be required permits the program to drop a lot of the complexity inherit in a typical RDBMS.
您可以想象一个简单的时态数据库,它每隔几秒钟就记录一次您的 GPS 位置。压缩这些数据的机会很大,一个普通的数据库你需要为每一行存储一个时间戳。如果您需要大量的吞吐量,知道数据是暂时的,并且永远不需要对行进行更新和删除,这样程序就可以降低典型 RDBMS 中继承的许多复杂性。
Despite this, temporal data is usually just stored in a normal RDBMS. PostgreSQL, for example has some temporal extensions, which makes this a little easier.
尽管如此,时态数据通常只存储在普通的 RDBMS 中。例如,PostgreSQL 有一些时间扩展,这使得这更容易一些。
回答by Uri
My understanding of temporal databases is that are geared towards storing certain types of temporal information. You could simulate that with a standard RDBMS, but by using a database that supports it you have built-in idioms for a lot of concepts and the query language might be optimized for these sort of queries.
我对时态数据库的理解是面向存储某些类型的时态信息。您可以使用标准 RDBMS 模拟它,但是通过使用支持它的数据库,您可以为许多概念内置习语,并且查询语言可能会针对此类查询进行优化。
To me this is a little like working with a GIS-specific database rather than an RDBMS. While you could shove coordinates in a run-of-the-mill RDBMS, having the appropriate representations (e.g., via grid files) may be faster, and having SQL primitives for things like topology is useful.
对我来说,这有点像使用 GIS 特定的数据库而不是 RDBMS。虽然您可以在普通 RDBMS 中推送坐标,但拥有适当的表示(例如,通过网格文件)可能会更快,并且拥有用于拓扑等事物的 SQL 原语很有用。
There are academic databases and some commercial ones. Timecenter has some links.
有学术数据库和一些商业数据库。时间中心有一些链接。