SQL sql查询返回两个表之间的差异
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/2077807/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
sql query to return differences between two tables
提问by Casey
I am trying to compare two tables, SQL Server, to verify some data. I want to return all the rows from both tables where data is either in one or the other. In essence, I want to show all the discrepancies. I need to check three pieces of data in doing so, FirstName, LastName and Product.
我正在尝试比较两个表 SQL Server,以验证一些数据。我想从两个表中返回数据位于一个或另一个中的所有行。从本质上讲,我想展示所有的差异。为此,我需要检查三项数据,FirstName、LastName 和 Product。
I'm fairly new to SQL and it seems like a lot of the solutions I'm finding are over complicating things. I don't have to worry about NULLs.
我对 SQL 还很陌生,而且我发现的很多解决方案似乎都使事情变得过于复杂。我不必担心 NULL。
I started by trying something like this:
我开始尝试这样的事情:
SELECT DISTINCT [First Name], [Last Name], [Product Name] FROM [Temp Test Data]
WHERE ([First Name] NOT IN (SELECT [First Name]
FROM [Real Data]))
I'm having trouble taking this further though.
不过,我在进一步处理这个问题时遇到了麻烦。
Thanks!
谢谢!
EDIT:
编辑:
Based on the answer by @treaschf I have been trying to use a variation of the following query:
根据@treaschf 的回答,我一直在尝试使用以下查询的变体:
SELECT td.[First Name], td.[Last Name], td.[Product Name]
FROM [Temp Test Data] td FULL OUTER JOIN [Data] AS d
ON td.[First Name] = d.[First Name] AND td.[Last Name] = d.[Last Name]
WHERE (d.[First Name] = NULL) AND (d.[Last Name] = NULL)
But I keep getting 0 results back, when I know that there is at least 1 row in td that is not in d.
但是当我知道 td 中至少有 1 行不在 d 中时,我不断得到 0 结果。
EDIT:
编辑:
Ok, I think I figured it out. At least in my few minutes of testing it seems to work good enough.
好吧,我想我想通了。至少在我的几分钟测试中,它似乎工作得很好。
SELECT [First Name], [Last Name]
FROM [Temp Test Data] AS td
WHERE (NOT EXISTS
(SELECT [First Name], [Last Name]
FROM [Data] AS d
WHERE ([First Name] = td.[First Name]) OR ([Last Name] = td.[Last Name])))
This is basically going to tell me what is in my test data that is notin my real data. Which is completely fine for what I need to do.
这基本上是要告诉我我的测试数据中哪些不在我的真实数据中。这对于我需要做的事情来说完全没问题。
回答by erikkallen
( SELECT * FROM table1
EXCEPT
SELECT * FROM table2)
UNION ALL
( SELECT * FROM table2
EXCEPT
SELECT * FROM table1)
回答by treaschf
IF you have tables A
and B
, both with colum C
, here are the records, which are present in table A
but not in B
:
如果您有表A
和B
,并且都带有 colum C
,那么这里是存在于表中A
但不存在于 中的记录B
:
SELECT A.*
FROM A
LEFT JOIN B ON (A.C = B.C)
WHERE B.C IS NULL
To get all the differences with a single query, a full join must be used, like this:
要通过单个查询获取所有差异,必须使用完整连接,如下所示:
SELECT A.*, B.*
FROM A
FULL JOIN B ON (A.C = B.C)
WHERE A.C IS NULL OR B.C IS NULL
What you need to know in this case is, that when a record can be found in A
, but not in B
, than the columns which come from B
will be NULL, and similarly for those, which are present in B
and not in A
, the columns from A
will be null.
在这种情况下,您需要知道的是,当可以在 中找到记录A
但在 中找不到记录时B
,来自 的列B
将为 NULL,类似地,对于那些存在于 中B
而不存在于中A
的列,来自 的列A
将为空。
回答by Maisie John
I know that this may not be a popular answer but I do agree with @Randy Minder on using third party tool when more complex comparison is needed.
我知道这可能不是一个流行的答案,但我同意 @Randy Minder 在需要更复杂的比较时使用第三方工具。
This specific case here is easy and for this case such tools are not needed but this can get complex easily if you introduce more columns, databases on two servers, more complex comparison criteria and such.
此处的这种特定情况很容易,对于这种情况,不需要此类工具,但如果您在两台服务器上引入更多列、数据库、更复杂的比较标准等,则很容易变得复杂。
There are a lot of these tools such as ApexSQL Data Diffor Quest Toadand you can always use them in trial mode to get the job done.
有很多这样的工具,例如ApexSQL Data Diff或Quest Toad,您始终可以在试用模式下使用它们来完成工作。
回答by bilelovitch
To get all the differences between two tables, you can use like me this SQL request :
要获取两个表之间的所有差异,您可以像我一样使用此 SQL 请求:
SELECT 'TABLE1-ONLY' AS SRC, T1.*
FROM (
SELECT * FROM Table1
EXCEPT
SELECT * FROM Table2
) AS T1
UNION ALL
SELECT 'TABLE2-ONLY' AS SRC, T2.*
FROM (
SELECT * FROM Table2
EXCEPT
SELECT * FROM Table1
) AS T2
;
回答by studgeek
Simple variation on @erikkallen answer that shows which table the row is present in:
@erikkallen 答案的简单变体,显示该行存在于哪个表中:
( SELECT 'table1' as source, * FROM table1
EXCEPT
SELECT * FROM table2)
UNION ALL
( SELECT 'table2' as source, * FROM table2
EXCEPT
SELECT * FROM table1)
If you get an error
如果出现错误
All queries combined using a UNION, INTERSECT or EXCEPT operator must have an equal number of expressions in their target lists.
使用 UNION、INTERSECT 或 EXCEPT 运算符组合的所有查询在其目标列表中必须具有相同数量的表达式。
then it may help to add
那么它可能有助于添加
( SELECT 'table1' as source, * FROM table1
EXCEPT
SELECT 'table1' as source, * FROM table2)
UNION ALL
( SELECT 'table2' as source, * FROM table2
EXCEPT
SELECT 'table2' as source, * FROM table1)
回答by Roman Pekar
If you want to get which column values are different, you could use Entity-Attribute-Value model:
如果要获取哪些列值不同,可以使用 Entity-Attribute-Value 模型:
declare @Data1 xml, @Data2 xml
select @Data1 =
(
select *
from (select * from Test1 except select * from Test2) as a
for xml raw('Data')
)
select @Data2 =
(
select *
from (select * from Test2 except select * from Test1) as a
for xml raw('Data')
)
;with CTE1 as (
select
T.C.value('../@ID', 'bigint') as ID,
T.C.value('local-name(.)', 'nvarchar(128)') as Name,
T.C.value('.', 'nvarchar(max)') as Value
from @Data1.nodes('Data/@*') as T(C)
), CTE2 as (
select
T.C.value('../@ID', 'bigint') as ID,
T.C.value('local-name(.)', 'nvarchar(128)') as Name,
T.C.value('.', 'nvarchar(max)') as Value
from @Data2.nodes('Data/@*') as T(C)
)
select
isnull(C1.ID, C2.ID) as ID, isnull(C1.Name, C2.Name) as Name, C1.Value as Value1, C2.Value as Value2
from CTE1 as C1
full outer join CTE2 as C2 on C2.ID = C1.ID and C2.Name = C1.Name
where
not
(
C1.Value is null and C2.Value is null or
C1.Value is not null and C2.Value is not null and C1.Value = C2.Value
)
回答by Kango_V
Try this :
尝试这个 :
SELECT
[First Name], [Last Name]
FROM
[Temp Test Data] AS td EXCEPTION JOIN [Data] AS d ON
(d.[First Name] = td.[First Name] OR d.[Last Name] = td.[Last Name])
Much simpler to read.
读起来简单多了。
回答by GiovaniSalazar
You can use except , for example something like this :
您可以使用 except ,例如这样的东西:
-- DB1..Tb1 have values than DB2..Tb1 not have
Select Col1,Col2,Col3 From DB1..Tb1
except
Select Col1,Col2,Col3 From DB2..Tb1
-- Now we change order
-- DB2..Tb1 have values than DB1..Tb1 not have
Select Col1,Col2,Col3 From DB2..Tb1
except
Select Col1,Col2,Col3 From DB1..Tb1
回答by Adrian-Bogdan Ionescu
This will do the trick, similar with Tiago's solution, return "source" table as well.
这将解决问题,类似于Tiago的解决方案,也返回“源”表。
select [First name], [Last name], max(_tabloc) as _tabloc
from (
select [First Name], [Last name], 't1' as _tabloc from table1
union all
select [First name], [Last name], 't2' as _tabloc from table2
) v
group by [Fist Name], [Last name]
having count(1)=1
result will contain differences between tables, in column _tabloc you will have table reference.
结果将包含表之间的差异,在 _tabloc 列中您将有表引用。
回答by crokusek
Presenting the Cadillac of Diffs as an SP. See within for the basic template that was based on answer by @erikkallen. It supports
将 Diffs 的凯迪拉克展示为 SP。有关基于@erikkallen 的回答的基本模板,请参阅内。它支持
- Duplicate row sensing (most other answers here do not)
- Sort results by argument
- Limit to specific columns
- Ignore columns (e.g. ModifiedUtc)
- Cross database tables names
- Temp tables (use as workaround to diff views)
- 重复行感应(此处的大多数其他答案都没有)
- 按参数对结果进行排序
- 限制到特定列
- 忽略列(例如 ModifiedUtc)
- 跨数据库表名
- 临时表(用作差异视图的解决方法)
Usage:
用法:
exec Common.usp_DiffTableRows '#t1', '#t2';
exec Common.usp_DiffTableRows
@pTable0 = 'ydb.ysh.table1',
@pTable1 = 'xdb.xsh.table2',
@pOrderByCsvOpt = null, -- Order the results
@pOnlyCsvOpt = null, -- Only compare these columns
@pIgnoreCsvOpt = null; -- Ignore these columns (ignored if @pOnlyCsvOpt is specified)
Code:
代码:
alter proc [Common].[usp_DiffTableRows]
@pTable0 varchar(300),
@pTable1 varchar(300),
@pOrderByCsvOpt nvarchar(1000) = null, -- Order the Results
@pOnlyCsvOpt nvarchar(4000) = null, -- Only compare these columns
@pIgnoreCsvOpt nvarchar(4000) = null, -- Ignore these columns (ignored if @pOnlyCsvOpt is specified)
@pDebug bit = 0
as
/*---------------------------------------------------------------------------------------------------------------------
Purpose: Compare rows between two tables.
Usage: exec Common.usp_DiffTableRows '#a', '#b';
Modified By Description
---------- ---------- -------------------------------------------------------------------------------------------
2015.10.06 crokusek Initial Version
2019.03.13 crokusek Added @pOrderByCsvOpt
2019.06.26 crokusek Support for @pIgnoreCsvOpt, @pOnlyCsvOpt.
2019.09.04 crokusek Minor debugging improvement
2020.03.12 crokusek Detect duplicate rows in either source table
---------------------------------------------------------------------------------------------------------------------*/
begin try
if (substring(@pTable0, 1, 1) = '#')
set @pTable0 = 'tempdb..' + @pTable0; -- object_id test below needs full names for temp tables
if (substring(@pTable1, 1, 1) = '#')
set @pTable1 = 'tempdb..' + @pTable1; -- object_id test below needs full names for temp tables
if (object_id(@pTable0) is null)
raiserror('Table name is not recognized: ''%s''', 16, 1, @pTable0);
if (object_id(@pTable1) is null)
raiserror('Table name is not recognized: ''%s''', 16, 1, @pTable1);
create table #ColumnGathering
(
Name nvarchar(300) not null,
Sequence int not null,
TableArg tinyint not null
);
declare
@usp varchar(100) = object_name(@@procid),
@sql nvarchar(4000),
@sqlTemplate nvarchar(4000) =
'
use $database$;
insert into #ColumnGathering
select Name, column_id as Sequence, $TableArg$ as TableArg
from sys.columns c
where object_id = object_id(''$table$'', ''U'')
';
set @sql = replace(replace(replace(@sqlTemplate,
'$TableArg$', 0),
'$database$', (select DatabaseName from Common.ufn_SplitDbIdentifier(@pTable0))),
'$table$', @pTable0);
if (@pDebug = 1)
print 'Sql #CG 0: ' + @sql;
exec sp_executesql @sql;
set @sql = replace(replace(replace(@sqlTemplate,
'$TableArg$', 1),
'$database$', (select DatabaseName from Common.ufn_SplitDbIdentifier(@pTable1))),
'$table$', @pTable1);
if (@pDebug = 1)
print 'Sql #CG 1: ' + @sql;
exec sp_executesql @sql;
if (@pDebug = 1)
select * from #ColumnGathering;
select Name,
min(Sequence) as Sequence,
convert(bit, iif(min(TableArg) = 0, 1, 0)) as InTable0,
convert(bit, iif(max(TableArg) = 1, 1, 0)) as InTable1
into #Columns
from #ColumnGathering
group by Name
having ( @pOnlyCsvOpt is not null
and Name in (select Value from Common.ufn_UsvToNVarcharKeyTable(@pOnlyCsvOpt, default)))
or
( @pOnlyCsvOpt is null
and @pIgnoreCsvOpt is not null
and Name not in (select Value from Common.ufn_UsvToNVarcharKeyTable(@pIgnoreCsvOpt, default)))
or
( @pOnlyCsvOpt is null
and @pIgnoreCsvOpt is null)
if (exists (select 1 from #Columns where InTable0 = 0 or InTable1 = 0))
begin
select 1; -- without this the debugging info doesn't stream sometimes
select * from #Columns order by Sequence;
waitfor delay '00:00:02'; -- give results chance to stream before raising exception
raiserror('Columns are not equal between tables, consider using args @pIgnoreCsvOpt, @pOnlyCsvOpt. See Result Sets for details.', 16, 1);
end
if (@pDebug = 1)
select * from #Columns order by Sequence;
declare
@columns nvarchar(4000) = --iif(@pOnlyCsvOpt is null and @pIgnoreCsvOpt is null,
-- '*',
(
select substring((select ',' + ac.name
from #Columns ac
order by Sequence
for xml path('')),2,200000) as csv
);
if (@pDebug = 1)
begin
print 'Columns: ' + @columns;
waitfor delay '00:00:02'; -- give results chance to stream before possibly raising exception
end
-- Based on https://stackoverflow.com/a/2077929/538763
-- - Added sensing for duplicate rows
-- - Added reporting of source table location
--
set @sqlTemplate = '
with
a as (select ~, Row_Number() over (partition by ~ order by (select null)) -1 as Duplicates from $a$),
b as (select ~, Row_Number() over (partition by ~ order by (select null)) -1 as Duplicates from $b$)
select 0 as SourceTable, ~
from
(
select * from a
except
select * from b
) anb
union all
select 1 as SourceTable, ~
from
(
select * from b
except
select * from a
) bna
order by $orderBy$
';
set @sql = replace(replace(replace(replace(@sqlTemplate,
'$a$', @pTable0),
'$b$', @pTable1),
'~', @columns),
'$orderBy$', coalesce(@pOrderByCsvOpt, @columns + ', SourceTable')
);
if (@pDebug = 1)
print 'Sql: ' + @sql;
exec sp_executesql @sql;
end try
begin catch
declare
@CatchingUsp varchar(100) = object_name(@@procid);
if (xact_state() = -1)
rollback;
-- Disabled for S.O. post
--exec Common.usp_Log
--@pMethod = @CatchingUsp;
--exec Common.usp_RethrowError
--@pCatchingMethod = @CatchingUsp;
throw;
end catch
go
create function Common.Trim
(
@pOriginalString nvarchar(max),
@pCharsToTrim nvarchar(50) = null -- specify null or 'default' for whitespae
)
returns table
with schemabinding
as
/*--------------------------------------------------------------------------------------------------
Purpose: Trim the specified characters from a string.
Modified By Description
---------- -------------- --------------------------------------------------------------------
2012.09.25 S.Rutszy/crok Modified from https://dba.stackexchange.com/a/133044/9415
--------------------------------------------------------------------------------------------------*/
return
with cte AS
(
select patindex(N'%[^' + EffCharsToTrim + N']%', @pOriginalString) AS [FirstChar],
patindex(N'%[^' + EffCharsToTrim + N']%', reverse(@pOriginalString)) AS [LastChar],
len(@pOriginalString + N'~') - 1 AS [ActualLength]
from
(
select EffCharsToTrim = coalesce(@pCharsToTrim, nchar(0x09) + nchar(0x20) + nchar(0x0d) + nchar(0x0a))
) c
)
select substring(@pOriginalString, [FirstChar],
((cte.[ActualLength] - [LastChar]) - [FirstChar] + 2)
) AS [TrimmedString]
--
--cte.[ActualLength],
--[FirstChar],
--((cte.[ActualLength] - [LastChar]) + 1) AS [LastChar]
from cte;
go
create function [Common].[ufn_UsvToNVarcharKeyTable] (
@pCsvList nvarchar(MAX),
@pSeparator nvarchar(1) = ',' -- can pass keyword 'default' when calling using ()'s
)
--
-- SQL Server 2012 distinguishes nvarchar keys up to maximum of 450 in length (900 bytes)
--
returns @tbl table (Value nvarchar(450) not null primary key(Value)) as
/*-------------------------------------------------------------------------------------------------
Purpose: Converts a comma separated list of strings into a sql NVarchar table. From
http://www.programmingado.net/a-398/SQL-Server-parsing-CSV-into-table.aspx
This may be called from RunSelectQuery:
GRANT SELECT ON Common.ufn_UsvToNVarcharTable TO MachCloudDynamicSql;
Modified By Description
---------- -------------- -------------------------------------------------------------------
2011.07.13 internet Initial version
2011.11.22 crokusek Support nvarchar strings and a custom separator.
2017.12.06 crokusek Trim leading and trailing whitespace from each element.
2019.01.26 crokusek Remove newlines
-------------------------------------------------------------------------------------------------*/
begin
declare
@pos int,
@textpos int,
@chunklen smallint,
@str nvarchar(4000),
@tmpstr nvarchar(4000),
@leftover nvarchar(4000),
@csvList nvarchar(max) = iif(@pSeparator not in (char(13), char(10), char(13) + char(10)),
replace(replace(@pCsvList, char(13), ''), char(10), ''),
@pCsvList); -- remove newlines
set @textpos = 1
set @leftover = ''
while @textpos <= len(@csvList)
begin
set @chunklen = 4000 - len(@leftover)
set @tmpstr = ltrim(@leftover + substring(@csvList, @textpos, @chunklen))
set @textpos = @textpos + @chunklen
set @pos = charindex(@pSeparator, @tmpstr)
while @pos > 0
begin
set @str = substring(@tmpstr, 1, @pos - 1)
set @str = (select TrimmedString from Common.Trim(@str, default));
insert @tbl (value) values(@str);
set @tmpstr = ltrim(substring(@tmpstr, @pos + 1, len(@tmpstr)))
set @pos = charindex(@pSeparator, @tmpstr)
end
set @leftover = @tmpstr
end
-- Handle @leftover
set @str = (select TrimmedString from Common.Trim(@leftover, default));
if @str <> ''
insert @tbl (value) values(@str);
return
end
GO
create function Common.ufn_SplitDbIdentifier(@pIdentifier nvarchar(300))
returns @table table
(
InstanceName nvarchar(300) not null,
DatabaseName nvarchar(300) not null,
SchemaName nvarchar(300),
BaseName nvarchar(300) not null,
FullTempDbBaseName nvarchar(300), -- non-null for tempdb (e.g. #Abc____...)
InstanceWasSpecified bit not null,
DatabaseWasSpecified bit not null,
SchemaWasSpecified bit not null,
IsCurrentInstance bit not null,
IsCurrentDatabase bit not null,
IsTempDb bit not null,
OrgIdentifier nvarchar(300) not null
) as
/*-----------------------------------------------------------------------------------------------------------
Purpose: Split a Sql Server Identifier into its parts, providing appropriate default values and
handling temp table (tempdb) references.
Example: select * from Common.ufn_SplitDbIdentifier('t')
union all
select * from Common.ufn_SplitDbIdentifier('s.t')
union all
select * from Common.ufn_SplitDbIdentifier('d.s.t')
union all
select * from Common.ufn_SplitDbIdentifier('i.d.s.t')
union all
select * from Common.ufn_SplitDbIdentifier('#d')
union all
select * from Common.ufn_SplitDbIdentifier('tempdb..#d');
-- Empty
select * from Common.ufn_SplitDbIdentifier('illegal name');
Modified By Description
---------- -------------- -----------------------------------------------------------------------------
2013.09.27 crokusek Initial version.
-----------------------------------------------------------------------------------------------------------*/
begin
declare
@name nvarchar(300) = ltrim(rtrim(@pIdentifier));
-- Return an empty table as a "throw"
--
--Removed for SO post
--if (Common.ufn_IsSpacelessLiteralIdentifier(@name) = 0)
-- return;
-- Find dots starting from the right by reversing first.
declare
@revName nvarchar(300) = reverse(@name);
declare
@firstDot int = charindex('.', @revName);
declare
@secondDot int = iif(@firstDot = 0, 0, charindex('.', @revName, @firstDot + 1));
declare
@thirdDot int = iif(@secondDot = 0, 0, charindex('.', @revName, @secondDot + 1));
declare
@fourthDot int = iif(@thirdDot = 0, 0, charindex('.', @revName, @thirdDot + 1));
--select @firstDot, @secondDot, @thirdDot, @fourthDot, len(@name);
-- Undo the reverse() (first dot is first from the right).
--
set @firstDot = iif(@firstDot = 0, 0, len(@name) - @firstDot + 1);
set @secondDot = iif(@secondDot = 0, 0, len(@name) - @secondDot + 1);
set @thirdDot = iif(@thirdDot = 0, 0, len(@name) - @thirdDot + 1);
set @fourthDot = iif(@fourthDot = 0, 0, len(@name) - @fourthDot + 1);
--select @firstDot, @secondDot, @thirdDot, @fourthDot, len(@name);
declare
@baseName nvarchar(300) = substring(@name, @firstDot + 1, len(@name) - @firstdot);
declare
@schemaName nvarchar(300) = iif(@firstDot - @secondDot - 1 <= 0,
null,
substring(@name, @secondDot + 1, @firstDot - @secondDot - 1));
declare
@dbName nvarchar(300) = iif(@secondDot - @thirdDot - 1 <= 0,
null,
substring(@name, @thirdDot + 1, @secondDot - @thirdDot - 1));
declare
@instName nvarchar(300) = iif(@thirdDot - @fourthDot - 1 <= 0,
null,
substring(@name, @fourthDot + 1, @thirdDot - @fourthDot - 1));
with input as (
select
coalesce(@instName, '[' + @@servername + ']') as InstanceName,
coalesce(@dbName, iif(left(@baseName, 1) = '#', 'tempdb', db_name())) as DatabaseName,
coalesce(@schemaName, iif(left(@baseName, 1) = '#', 'dbo', schema_name())) as SchemaName,
@baseName as BaseName,
iif(left(@baseName, 1) = '#',
(
select [name] from tempdb.sys.objects
where object_id = object_id('tempdb..' + @baseName)
),
null) as FullTempDbBaseName,
iif(@instName is null, 0, 1) InstanceWasSpecified,
iif(@dbName is null, 0, 1) DatabaseWasSpecified,
iif(@schemaName is null, 0, 1) SchemaWasSpecified
)
insert into @table
select i.InstanceName, i.DatabaseName, i.SchemaName, i.BaseName, i.FullTempDbBaseName,
i.InstanceWasSpecified, i.DatabaseWasSpecified, i.SchemaWasSpecified,
iif(i.InstanceName = '[' + @@servername + ']', 1, 0) as IsCurrentInstance,
iif(i.DatabaseName = db_name(), 1, 0) as IsCurrentDatabase,
iif(left(@baseName, 1) = '#', 1, 0) as IsTempDb,
@name as OrgIdentifier
from input i;
return;
end
GO