Oracle: create a unique primary key (hash) from database columns

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must comply with the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/1329639/

Time: 2020-09-18 18:52:32  Source: igfitidea

Create a unique primary key (hash) from database columns

sql oracle primary-key hash

Asked by OscarRyz

I have this table which doesn't have a primary key.

I'm going to insert some records into a new table to analyze them, and I'm thinking of creating a new primary key from the values of all the available columns.

If this were a programming language like Java I would:

 int hash = column1 * 31 + column2 * 31 + column3 * 31;

Or something like that. But this is SQL.

How can I create a primary key from the values of the available columns? Simply marking all the columns as the PK won't work for me, because what I need to do is compare them with data from a table in another database.

My table has 3 numbers and a date.

EDIT: What my problem is

I think a bit more background is needed. I'm sorry for not providing it before.

I have a database (dm) that is updated every day from another db (original source). It has records from the past two years.

Last month (July) the update process broke, and for a month no data was loaded into dm.

I manually created a table with the same structure in my Oracle XE and copied the records from the original source into my db (myxe). I copied only the records from July, to create a report needed by the end of the month.

Finally, on Aug 8, the update process was fixed, and the records that had been waiting to be migrated by this automatic process were copied into the database (from the original source to dm).

This process deletes the data from the original source once it has been copied (into dm).

Everything looked fine, but we have just realized that a number of the records got lost (about 25% of July).

So, what I want to do is use my backup (myxe) and insert all the missing records into the database (dm).

The problems here are:

  • They don't have a well defined PK.
  • They are in separate databases.

So I thought that if I could derive a unique PK that came out as the same number in both tables, I could tell which records were missing and insert them.

EDIT 2

So I did the following in my local environment:

select a.* from the_table@PRODUCTION a , the_table b where
a.idle = b.idle and 
a.activity = b.activity and 
a.finishdate = b.finishdate

Which returns all the rows that are present in both databases (the... union?). I got 2,000 records.

What I'm going to do next is delete them all from the target db and then insert all the rows from my db into the target table.

I hope I don't get into something worse :-S

Answered by Quassnoi

Just create a surrogate key:

ALTER TABLE mytable ADD pk_col INT;

UPDATE  mytable
SET     pk_col = rownum;

ALTER TABLE mytable MODIFY pk_col INT NOT NULL;

ALTER TABLE mytable ADD CONSTRAINT pk_mytable_pk_col PRIMARY KEY (pk_col);

or this:

ALTER TABLE mytable ADD pk_col RAW(16);

UPDATE  mytable
SET     pk_col = SYS_GUID();

ALTER TABLE mytable MODIFY pk_col RAW(16) NOT NULL;

ALTER TABLE mytable ADD CONSTRAINT pk_mytable_pk_col PRIMARY KEY (pk_col);

The latter uses GUIDs, which are unique across databases but consume more space and are much slower to generate (your INSERTs will be slow).

Update:

If you need to create the same PRIMARY KEYs on two tables with identical data, use this:

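MERGE
INTO    mytable v
USING   (
        -- ROW_NUMBER() numbers the rows in sorted order;
        -- a plain ROWNUM would be assigned before ORDER BY is applied.
        SELECT  rowid AS rid,
                ROW_NUMBER() OVER (ORDER BY col1, col2, col3) AS rn
        FROM    mytable
        ) s
ON      (v.rowid = s.rid)
WHEN MATCHED THEN
UPDATE
SET     v.pk_col = s.rn;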

Note that the tables must be identical down to the last row (i.e. have the same number of rows with the same data in them).

Update 2:

For your actual problem, you don't need a PK at all.

If you just want to select the records missing from dm, use this (on the dm side):

SELECT  *
FROM    mytable@myxe
MINUS
SELECT  *
FROM    mytable;

This will return all records that exist in mytable@myxe but not in mytable@dm.

Note that MINUS works on distinct rows, so any duplicate rows are collapsed into one.
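
If duplicate rows matter, one possible workaround is to number the copies of each row on both sides before applying MINUS, so that the second copy of a row no longer looks identical to the first. The sketch below assumes the table consists of the hypothetical columns col1, col2, col3 and finishdate; adjust the list to the real columns.

```sql
-- Sketch only: col1, col2, col3 and finishdate are assumed column names.
-- ROW_NUMBER() gives each copy of an identical row its own ordinal (1, 2, ...),
-- so a row stored twice in myxe but once in dm still shows up once in the result.
SELECT  col1, col2, col3, finishdate,
        ROW_NUMBER() OVER (PARTITION BY col1, col2, col3, finishdate
                           ORDER BY col1) AS dup_no
FROM    mytable@myxe
MINUS
SELECT  col1, col2, col3, finishdate,
        ROW_NUMBER() OVER (PARTITION BY col1, col2, col3, finishdate
                           ORDER BY col1) AS dup_no
FROM    mytable;
```

The ORDER BY inside the window is only there to satisfy the syntax: all rows in a partition are identical, so their relative order does not matter.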

Answered by Adamski

The danger of creating a hash value by combining the 3 numbers and the date is that it might not be unique and hence cannot be used safely as a primary key.
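
To make the risk concrete: the formula from the question reduces to 31 * (column1 + column2 + column3), so any two rows whose numbers add up to the same sum get the same value. A quick check against Oracle's DUAL table, using made-up values:

```sql
-- Illustrative values only: three different rows, one hash value.
SELECT  1 * 31 + 2 * 31 + 3 * 31 AS hash_a,   -- row (1, 2, 3)
        3 * 31 + 2 * 31 + 1 * 31 AS hash_b,   -- row (3, 2, 1)
        6 * 31 + 0 * 31 + 0 * 31 AS hash_c    -- row (6, 0, 0)
FROM    dual;
-- All three columns evaluate to 186.
```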

Instead, I'd recommend using an auto-incrementing ID for your primary key.
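
In Oracle, an auto-incrementing ID is commonly built from a sequence. A minimal sketch, assuming mytable already has a numeric column pk_col and using a hypothetical sequence name mytable_seq:

```sql
-- mytable_seq is a hypothetical name; pk_col, col1, col2 and col3 are assumed columns.
CREATE SEQUENCE mytable_seq START WITH 1 INCREMENT BY 1;

-- Number the rows that already exist.
UPDATE  mytable
SET     pk_col = mytable_seq.NEXTVAL;

-- New rows take the next value explicitly.
INSERT INTO mytable (pk_col, col1, col2, col3)
VALUES (mytable_seq.NEXTVAL, 1, 2, 3);
```

Like any surrogate key, the numbers generated in two separate databases will not line up, so this identifies rows within one table but does not by itself help to compare two tables.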

Answered by Cynthia

Assuming that you have ensured uniqueness, you can do almost the same thing in SQL. The only problem will be converting the date to a numeric value so that you can hash it.

Select Table2.SomeFields 
    FROM Table1 LEFT OUTER JOIN Table2 ON
        (Table1.col1 * 31) + (Table1.col2 * 31) + (Table1.col3 * 31) + 
            ((DatePart(year,Table1.date) + DatePart(month,Table1.date) + DatePart(day,Table1.date) )* 31) = Table2.hashedPk

The above query would work for SQL Server; the only difference for Oracle would be in how you handle the date conversion. There are also other functions for converting dates in SQL Server, so this is by no means the only solution.

You can also combine this with Quassnoi's SET statement to populate the new field. Just use the left side of the join condition as the value.
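
For Oracle, the date parts can be taken with EXTRACT. The sketch below mirrors the expression from the join condition above; finishdate and hashed_pk are assumed column names, and the formula is still not guaranteed to produce unique values.

```sql
-- Assumed names: mytable, col1..col3 (NUMBER), finishdate (DATE), hashed_pk (NUMBER).
UPDATE  mytable
SET     hashed_pk = (col1 * 31) + (col2 * 31) + (col3 * 31)
                  + ( EXTRACT(YEAR  FROM finishdate)
                    + EXTRACT(MONTH FROM finishdate)
                    + EXTRACT(DAY   FROM finishdate) ) * 31;
```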

Answered by Philip Kelley

If you're loading your new table with values from the old table, and you then need to join the two tables, you can only "properly" do this if you can uniquely identify each row in the original table. Quassnoi's solution will allow you to do this, if you can first alter the old table by adding a new column.

If you cannot alter the original table, generating some form of hash code based on the columns of the old table would work -- but, again, only if the hash codes uniquely identify each row. (Oracle has checksum functions, right? If so, use them.)
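
Oracle does ship hash functions. ORA_HASH takes an expression and returns a bucket number (by default between 0 and 4294967295), and STANDARD_HASH (12c and later) returns a RAW digest. A sketch using ORA_HASH, assuming integer-valued columns col1, col2, col3 and a DATE column finishdate:

```sql
-- Assumed column names. The '|' separator keeps (1, 23) and (12, 3) apart;
-- the explicit date format makes the string independent of session NLS settings.
SELECT  ORA_HASH(col1 || '|' || col2 || '|' || col3 || '|' ||
                 TO_CHAR(finishdate, 'YYYYMMDDHH24MISS')) AS row_hash
FROM    mytable;
```

A 32-bit bucket number can still collide, so the result is a candidate key to be checked for uniqueness, not a guaranteed one.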

If hash code uniqueness cannot be guaranteed, you may have to settle for a primary key composed of as many columns as are required to ensure uniqueness (e.g. the natural key). If there is no natural key, well, I heard once that Oracle provides a rownum for each row of data; could you use that?
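
A composite key of this kind could be sketched as follows, with assumed column names. Note that ROWNUM is assigned to the rows of a query result as they are returned and is not stored with the row, so it would have to be copied into a real column before it could serve as a key.

```sql
-- Assumed column names. First check whether the combination is unique at all:
SELECT  col1, col2, col3, finishdate, COUNT(*) AS copies
FROM    mytable
GROUP BY col1, col2, col3, finishdate
HAVING  COUNT(*) > 1;

-- If the query above returns no rows and none of the columns contains NULL,
-- the combination can be declared as the primary key.
ALTER TABLE mytable
ADD CONSTRAINT pk_mytable PRIMARY KEY (col1, col2, col3, finishdate);
```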