SQL 将具有相同 ID 的多行合并为一行
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/28569589/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Merge multiple rows with same ID into one row
提问by Hemus San
How can I merge multiple rows with same ID
into one row.
如何将多行相同的行合并ID
为一行。
When value in first and second row in the same column is the same or when there is value in first row and NULL
in second row.
I don't want to merge when value in first and second row in the same column is different.
当同一列的第一行和NULL
第二行的值相同或第一行和第二行有值时。当同一列中第一行和第二行的值不同时,我不想合并。
I have table:
我有表:
ID |A |B |C
1 NULL 31 NULL
1 412 NULL 1
2 567 38 4
2 567 NULL NULL
3 2 NULL NULL
3 5 NULL NULL
4 6 1 NULL
4 8 NULL 5
4 NULL NULL 5
I want to get table:
我想得到表:
ID |A |B |C
1 412 31 1
2 567 38 4
3 2 NULL NULL
3 5 NULL NULL
4 6 1 NULL
4 8 NULL 5
4 NULL NULL 5
采纳答案by Jason W
I think there's a simpler solution to the above answers (which is also correct). It basically gets the merged values that can be merged within a CTE, then merges that with the data not able to be merged.
我认为上述答案有一个更简单的解决方案(这也是正确的)。它基本上获取可以在 CTE 内合并的合并值,然后将其与无法合并的数据合并。
WITH CTE AS (
SELECT
ID,
MAX(A) AS A,
MAX(B) AS B,
MAX(C) AS C
FROM dbo.Records
GROUP BY ID
HAVING MAX(A) = MIN(A)
AND MAX(B) = MIN(B)
AND MAX(C) = MIN(C)
)
SELECT *
FROM CTE
UNION ALL
SELECT *
FROM dbo.Records
WHERE ID NOT IN (SELECT ID FROM CTE)
SQL Fiddle: http://www.sqlfiddle.com/#!6/29407/1/0
SQL 小提琴:http://www.sqlfiddle.com/#!6/29407/1/0
回答by ErikE
WITH Collapsed AS (
SELECT
ID,
A = Min(A),
B = Min(B),
C = Min(C)
FROM
dbo.MyTable
GROUP BY
ID
HAVING
EXISTS (
SELECT Min(A), Min(B), Min(C)
INTERSECT
SELECT Max(A), Max(B), Max(C)
)
)
SELECT
*
FROM
Collapsed
UNION ALL
SELECT
*
FROM
dbo.MyTable T
WHERE
NOT EXISTS (
SELECT *
FROM Collapsed C
WHERE T.ID = C.ID
);
See this working in a SQL Fiddle
看到这个在 SQL Fiddle 中工作
This works by creating all the mergeable rows through the use of Min
and Max
--which should be the same for each column within an ID
and which usefully exclude NULL
s--then appending to this list all the rows from the table that couldn't be merged. The special trick with EXISTS ... INTERSECT
allows for the case when a column has all NULL
values for an ID
(and thus the Min
and Max
are NULL
and can't equal each other). That is, it functions like Min(A) = Max(A) AND Min(B) = Max(B) AND Min(C) = Max(C)
but allows for NULL
s to compare as equal.
这是通过使用Min
和创建所有可合并行来工作的——这Max
对于 an 中的每一列应该是相同的,ID
并且有用地排除 s——NULL
然后将表中无法合并的所有行附加到此列表中。特殊技巧 withEXISTS ... INTERSECT
允许列具有NULL
an 的所有值ID
(因此Min
and Max
areNULL
和 can't 彼此相等)的情况。也就是说,它的功能类似于Min(A) = Max(A) AND Min(B) = Max(B) AND Min(C) = Max(C)
但允许NULL
s 比较相等。
Here's a slightly different (earlier) solution I gave that may offer different performance characteristics, and being more complicated, I like less, but being a single flowing query (without a UNION
) I kind of like more, too.
这是我给出的一个略有不同(早期)的解决方案,它可能提供不同的性能特征,并且更复杂,我喜欢的更少,但作为一个单一的流动查询(没有UNION
),我也更喜欢。
WITH Collapsible AS (
SELECT
ID
FROM
dbo.MyTable
GROUP BY
ID
HAVING
EXISTS (
SELECT Min(A), Min(B), Min(C)
INTERSECT
SELECT Max(A), Max(B), Max(C)
)
), Calc AS (
SELECT
T.*,
Grp = Coalesce(C.ID, Row_Number() OVER (PARTITION BY T.ID ORDER BY (SELECT 1)))
FROM
dbo.MyTable T
LEFT JOIN Collapsible C
ON T.ID = C.ID
)
SELECT
ID,
A = Min(A),
B = Min(B),
C = Min(C)
FROM
Calc
GROUP BY
ID,
Grp
;
This is also in the above SQL Fiddle.
这也在上面的 SQL Fiddle 中。
This uses similar logic as the first query to calculate whether a group should be merged, then uses this to create a grouping key that is either the same for all rows within an ID
or is different for all rows within an ID
. With a final Min
(Max
would have worked just as well) the rows that should be merged are merged because they share a grouping key, and the rows that shouldn't be merged are not because they have distinct grouping keys over the ID
.
这使用与第一个查询类似的逻辑来计算是否应该合并一个组,然后使用它来创建一个分组键,该分组键ID
对于ID
. 使用 final Min
(Max
本来也可以),应该合并的行会被合并,因为它们共享一个分组键,而不应合并的行不是因为它们在ID
.
Depending on your data set, indexes, table size, and other performance factors, either of these queries may perform better, though the second query has some work to do to catch up, with two sorts instead of one.
根据您的数据集、索引、表大小和其他性能因素,这些查询中的任何一个都可能执行得更好,尽管第二个查询需要做一些工作才能赶上,使用两种而不是一种。
回答by peter.petrov
You can try something like this:
你可以尝试这样的事情:
select
isnull(t1.A, t2.A) as A,
isnull(t1.B, t2.B) as B,
isnull(t1.C, t2.C) as C
from
table_name t1
join table_name t2 on t1.ID = t2.ID and .....
You mention the concepts of first and second. How do
you define this order? Place that order defining condition
in here: .....
你提到了第一和第二的概念。你如何
定义这个顺序?将订单定义条件
放在此处:.....
Also, I assume you have exactly 2 rows for each ID value.
另外,我假设每个 ID 值正好有 2 行。