Compare tables in SQL Server and Oracle

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/4626174/

Date: 2020-09-18 22:22:09  Source: igfitidea

Tags: oracle, sql-server-2005, comparison

Asked by Vikram

I have a set of SQL scripts that migrate data from tables in SQL Server 2005 to Oracle 11. I want to compare the data in the two databases to ensure that it was migrated correctly.

My source is: SQL Server 2005. It has 10 big tables with approx 80 million rows and 20 tables with approx 100,000 rows.

My target is: Oracle 11 on Exadata, which has a table structure similar to the source.

Any idea what I can do? Are there any tools available?

Accepted answer by Benoit

You could design a function that does the following for each column of each table:

  • concatenate row number with value
  • hash it (md5 should be available on both)
  • and compute aggregate bitwise XOR of hash values.

If the results are the same, you probably have the same values in your two databases, without needing to transfer large amounts of data over the network.
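
As a rough illustration of the three steps above, here is a minimal Python sketch that works on an in-memory list standing in for one column. The function name `column_checksum`, the `|` separator, and rendering NULL as an empty string are assumptions made for the example, not part of the answer. In practice the same computation would have to run inside each database, with rows numbered in the same deterministic order (for instance by primary key) and values formatted identically on both sides.

```python
import hashlib

def column_checksum(values):
    """XOR together md5(row_number | value) for every value in one column."""
    acc = 0
    for row_number, value in enumerate(values, start=1):
        # Assumption: NULL becomes an empty string; both databases would have
        # to render numbers and dates as exactly the same text.
        text = f"{row_number}|{'' if value is None else value}"
        digest = hashlib.md5(text.encode("utf-8")).digest()
        acc ^= int.from_bytes(digest, "big")
    return acc

source_column = ["alice", "bob", None, "dave"]
target_same = ["alice", "bob", None, "dave"]
target_changed = ["alice", "bob", None, "dan"]

print(column_checksum(source_column) == column_checksum(target_same))     # True
print(column_checksum(source_column) == column_checksum(target_changed))  # False
```

Including the row number in the hashed text is what makes the checksum sensitive to values that were swapped between rows; a plain XOR of value hashes would not notice that.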

The drawback is: if the results are not the same, you don't know where the difference lies. To narrow it down, you could split your tables into chunks of, say, 1000 rows and compare the checksum of each chunk.
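
The chunking idea might look roughly like the following Python sketch, again using in-memory lists in place of real columns. The names `chunk_checksums` and `differing_chunks` are hypothetical, and the checksum is the XOR of md5 over "row number | value" as an assumed format.

```python
import hashlib

def chunk_checksums(values, chunk_size=1000):
    """Return one XOR-of-md5 checksum per chunk of chunk_size consecutive rows."""
    checksums = []
    for start in range(0, len(values), chunk_size):
        acc = 0
        for offset, value in enumerate(values[start:start + chunk_size]):
            row_number = start + offset + 1
            digest = hashlib.md5(f"{row_number}|{value}".encode("utf-8")).digest()
            acc ^= int.from_bytes(digest, "big")
        checksums.append(acc)
    return checksums

def differing_chunks(source_values, target_values, chunk_size=1000):
    """Return the indexes of the chunks whose checksums disagree."""
    src = chunk_checksums(source_values, chunk_size)
    tgt = chunk_checksums(target_values, chunk_size)
    # A chunk that exists on only one side counts as differing.
    longest = max(len(src), len(tgt))
    src += [None] * (longest - len(src))
    tgt += [None] * (longest - len(tgt))
    return [i for i in range(longest) if src[i] != tgt[i]]

source = [f"row-{n}" for n in range(5000)]
target = list(source)
target[2345] = "corrupted"

print(differing_chunks(source, target))  # [2]
```

Only the rows of the reported chunks then need to be pulled across the network and compared value by value.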

Answer by bernd_k

I would try to set up a linked server to the Oracle database in SSMS. Then I would compare each table using the EXCEPT operator:

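-- Parentheses matter: EXCEPT and UNION are evaluated left to right,
-- so without them only the Oracle-side differences would be returned.
-- Rows in SQL Server that are missing or different in Oracle
(select * from table1
 except
 select * from [link_server]..myschema.TABLE1)
union
-- Rows in Oracle that are missing or different in SQL Server
(select * from [link_server]..myschema.TABLE1
 except
 select * from table1)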

NOTE: I could not get this to work when the Oracle tables contain CLOB columns.

Answer to comment: Install SQL Server on your local machine, where you are the DBA. As long as you have access to the remote machine, you can add a linked server that points to it.

Answer by duffymo

What you seem to be asking is: "How can I unit test my ETL solution to ensure that the data I have in my source database is faithfully replicated in my target database?"

When you go into the details of the number of tables and their sizes, it suggests to me that you want to go as far as comparing every value in every column in every table.

I'm not aware of any such tools. Such a check could run for a long time, but since it's a one-off, it's doable. You could write such a thing in C# or Java or another language you're comfortable with.
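
For what it's worth, a hand-written checker of that kind could be sketched in Python as below. It assumes both sides can deliver rows ordered by primary key and that the keys sort the same way in both databases; the lists here stand in for two database cursors, and `compare_sorted_rows` is a name made up for the example.

```python
def compare_sorted_rows(source_rows, target_rows):
    """Yield (kind, key) for every mismatch between two key-ordered row streams.

    Each row is a tuple whose first element is the primary key.
    """
    source_iter, target_iter = iter(source_rows), iter(target_rows)
    src = next(source_iter, None)
    tgt = next(target_iter, None)
    while src is not None or tgt is not None:
        if tgt is None or (src is not None and src[0] < tgt[0]):
            # Key exists only in the source.
            yield ("missing_in_target", src[0])
            src = next(source_iter, None)
        elif src is None or tgt[0] < src[0]:
            # Key exists only in the target.
            yield ("missing_in_source", tgt[0])
            tgt = next(target_iter, None)
        else:
            # Same key on both sides: compare the full rows.
            if src != tgt:
                yield ("value_mismatch", src[0])
            src = next(source_iter, None)
            tgt = next(target_iter, None)

source = [(1, "a"), (2, "b"), (3, "c"), (5, "e")]
target = [(1, "a"), (2, "B"), (4, "d"), (5, "e")]

for issue in compare_sorted_rows(source, target):
    print(issue)
# ('value_mismatch', 2)
# ('missing_in_target', 3)
# ('missing_in_source', 4)
```

Because it only ever holds one row from each side in memory, this style of comparison should cope with the 80-million-row tables, although every row still has to travel over the network.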

Another thought would be to check a smaller subset of each table. If the ETL tool handles a small subset properly, there's no reason to believe that it'll fail on the rest unless there's an exceptional situation.

You could migrate the data in batches, checking and committing each one as you go.

Another idea might be a statistical approach: take a random sample of rows from each table and perform the checks.
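
A sampling check could be sketched in Python along these lines. The dictionaries keyed by primary key stand in for lookups against the two databases, and `sample_mismatches` and its parameters are hypothetical names chosen for the example.

```python
import random

_MISSING = object()  # marks a key that is absent from the target

def sample_mismatches(source_by_key, target_by_key, sample_size, seed=0):
    """Compare a random sample of source rows with the target; return bad keys."""
    keys = sorted(source_by_key)
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    sampled = rng.sample(keys, min(sample_size, len(keys)))
    return sorted(
        key for key in sampled
        if target_by_key.get(key, _MISSING) != source_by_key[key]
    )

source = {n: (f"name-{n}", n * 10) for n in range(1000)}
target = dict(source)
target[42] = ("name-42", -1)

# Sampling every row is the same as a full comparison.
print(sample_mismatches(source, target, sample_size=1000))  # [42]
```

Keep in mind what a sample can and cannot show: a single bad row among 1000 is caught by a 100-row sample only about one time in ten, so this approach is better at detecting systematic errors than isolated ones.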
