SQL Server 中最优雅的生成排列方式
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/3621494/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
The most elegant way to generate permutations in SQL server
提问by SDReyes
Given a the following table:
给出下表:
Index | Element
---------------
1 | A
2 | B
3 | C
4 | D
We want to generate all the possible permutations (without repetitions) using the elements. the final result (skipping some rows) will look like this:
我们希望使用元素生成所有可能的排列(无重复)。最终结果(跳过一些行)将如下所示:
Results
----------
ABCD
ABDC
ACBD
ACDB
ADAC
ADCA
...
DABC
DACB
DBCA
DBAC
DCAB
DCBA
(24 Rows)
How would you do it?
你会怎么做?
采纳答案by Philip Kelley
After making some perhaps snarky comments, this problem stuck in my brain all evening, and I eventually came up with the following set-based approach. I believe it definitely qualifies as "elegant", but then I also think it qualifies as "kinda dumb". You make the call.
在发表了一些尖刻的评论后,这个问题整晚都在我的脑海里萦绕,我最终想出了以下基于集合的方法。我相信它绝对符合“优雅”的条件,但我也认为它符合“有点愚蠢”的条件。你打电话。
First, set up some tables:
首先,设置一些表:
-- For testing purposes
DROP TABLE Source
DROP TABLE Numbers
DROP TABLE Results
-- Add as many rows as need be processed--though note that you get N! (number of rows, factorial) results,
-- and that gets big fast. The Identity column must start at 1, or the algorithm will have to be adjusted.
-- Element could be more than char(1), though the algorithm would have to be adjusted again, and each element
-- must be the same length.
CREATE TABLE Source
(
SourceId int not null identity(1,1)
,Element char(1) not null
)
INSERT Source (Element) values ('A')
INSERT Source (Element) values ('B')
INSERT Source (Element) values ('C')
INSERT Source (Element) values ('D')
--INSERT Source (Element) values ('E')
--INSERT Source (Element) values ('F')
-- This is a standard Tally table (or "table of numbers")
-- It only needs to be as long as there are elements in table Source
CREATE TABLE Numbers (Number int not null)
INSERT Numbers (Number) values (1)
INSERT Numbers (Number) values (2)
INSERT Numbers (Number) values (3)
INSERT Numbers (Number) values (4)
INSERT Numbers (Number) values (5)
INSERT Numbers (Number) values (6)
INSERT Numbers (Number) values (7)
INSERT Numbers (Number) values (8)
INSERT Numbers (Number) values (9)
INSERT Numbers (Number) values (10)
-- Results are iteratively built here. This could be a temp table. An index on "Length" might make runs
-- faster for large sets. Combo must be at least as long as there are characters to be permuted.
CREATE TABLE Results
(
Combo varchar(10) not null
,Length int not null
)
Here's the routine:
这是例行公事:
SET NOCOUNT on
DECLARE
@Loop int
,@MaxLoop int
-- How many elements there are to process
SELECT @MaxLoop = max(SourceId)
from Source
-- Initialize first value
TRUNCATE TABLE Results
INSERT Results (Combo, Length)
select Element, 1
from Source
where SourceId = 1
SET @Loop = 2
-- Iterate to add each element after the first
WHILE @Loop <= @MaxLoop
BEGIN
-- See comments below. Note that the "distinct" remove duplicates, if a given value
-- is to be included more than once
INSERT Results (Combo, Length)
select distinct
left(re.Combo, @Loop - nm.Number)
+ so.Element
+ right(re.Combo, nm.Number - 1)
,@Loop
from Results re
inner join Numbers nm
on nm.Number <= @Loop
inner join Source so
on so.SourceId = @Loop
where re.Length = @Loop - 1
-- For performance, add this in if sets will be large
--DELETE Results
-- where Length <> @Loop
SET @Loop = @Loop + 1
END
-- Show results
SELECT *
from Results
where Length = @MaxLoop
order by Combo
The general idea is: when adding a new element (say "B") to any string (say, "A"), to catch all permutations you would add B to all possible positions (Ba, aB), resulting in a new set of strings. Then iterate: Add a new element (C) to each position in a string (AB becomes Cab, aCb, abC), for all strings (Cba, bCa, baC), and you have the set of permutations. Iterate over each result set with the next character until you run out of characters... or resources. 10 elements is 3.6 million permutations, roughly 48MB with the above algorithm, and 14 (unique) elements would hit 87 billion permutations and 1.163 terabytes.
一般的想法是:当向任何字符串(例如“A”)添加新元素(例如“B”)时,要捕获所有排列,您将 B 添加到所有可能的位置(Ba,aB),从而产生一个新集合的字符串。然后迭代:向字符串中的每个位置添加一个新元素(C)(AB 变为 Cab, aCb, abC),对于所有字符串(Cba, bCa, baC),您就有了一组排列。用下一个字符迭代每个结果集,直到用完字符...或资源。10 个元素是 360 万个排列,上述算法大约为 48MB,14 个(唯一)元素将达到 870 亿个排列和 1.163 TB。
I'm sure it could eventually be wedged into a CTE, but in the end all that would be is a glorified loop. The logic is clearer this way, and I can't help but think the CTE execution plan would be a nightmare.
我相信它最终会被纳入 CTE,但最终这将是一个美化的循环。这样逻辑更清晰,我不禁认为CTE执行计划将是一场噩梦。
回答by A-K
DECLARE @s VARCHAR(5);
SET @s = 'ABCDE';
WITH Subsets AS (
SELECT CAST(SUBSTRING(@s, Number, 1) AS VARCHAR(5)) AS Token,
CAST('.'+CAST(Number AS CHAR(1))+'.' AS VARCHAR(11)) AS Permutation,
CAST(1 AS INT) AS Iteration
FROM dbo.Numbers WHERE Number BETWEEN 1 AND 5
UNION ALL
SELECT CAST(Token+SUBSTRING(@s, Number, 1) AS VARCHAR(5)) AS Token,
CAST(Permutation+CAST(Number AS CHAR(1))+'.' AS VARCHAR(11)) AS
Permutation,
s.Iteration + 1 AS Iteration
FROM Subsets s JOIN dbo.Numbers n ON s.Permutation NOT LIKE
'%.'+CAST(Number AS CHAR(1))+'.%' AND s.Iteration < 5 AND Number
BETWEEN 1 AND 5
--AND s.Iteration = (SELECT MAX(Iteration) FROM Subsets)
)
SELECT * FROM Subsets
WHERE Iteration = 5
ORDER BY Permutation
Token Permutation Iteration
----- ----------- -----------
ABCDE .1.2.3.4.5. 5
ABCED .1.2.3.5.4. 5
ABDCE .1.2.4.3.5. 5
(snip)
EDBCA .5.4.2.3.1. 5
EDCAB .5.4.3.1.2. 5
EDCBA .5.4.3.2.1. 5
first posted a while ago here
不久前首次发布在这里
However, it would be better to do it in a better language such as C# or C++.
但是,最好使用更好的语言(例如 C# 或 C++)来完成。
回答by Dave
Just using SQL, without any code, you could do it if you can crowbar yourself another column into the table. Clearly you need to have one joined table for each of the values to be permuted.
只需使用 SQL,无需任何代码,如果您可以将另一列插入表中,您就可以做到。很明显,您需要为每个要排列的值创建一个连接表。
with llb as (
select 'A' as col,1 as cnt union
select 'B' as col,3 as cnt union
select 'C' as col,9 as cnt union
select 'D' as col,27 as cnt
)
select a1.col,a2.col,a3.col,a4.col
from llb a1
cross join llb a2
cross join llb a3
cross join llb a4
where a1.cnt + a2.cnt + a3.cnt + a4.cnt = 40
回答by Tegiri Nenashi
Am I correctly understanding that you built Cartesian product n x n x n x n, and then filter out unwanted stuff? The alternative would be generating all the numbers up to n! and then using factorial number systemto map them via element encoding.
我是否正确理解您构建了笛卡尔积 nxnxnxn,然后过滤掉不需要的东西?另一种方法是生成最多 n 的所有数字!然后使用阶乘数系统通过元素编码来映射它们。
回答by Peter Radocchia
Simpler than a recursive CTE:
比递归 CTE 更简单:
declare @Number Table( Element varchar(MAX), Id varchar(MAX) )
Insert Into @Number Values ( 'A', '01')
Insert Into @Number Values ( 'B', '02')
Insert Into @Number Values ( 'C', '03')
Insert Into @Number Values ( 'D', '04')
select a.Element, b.Element, c.Element, d.Element
from @Number a
join @Number b on b.Element not in (a.Element)
join @Number c on c.Element not in (a.Element, b.Element)
join @Number d on d.Element not in (a.Element, b.Element, c.Element)
order by 1, 2, 3, 4
For an arbitrary number of elements, script it out:
对于任意数量的元素,编写脚本:
if object_id('tempdb..#number') is not null drop table #number
create table #number (Element char(1), Id int, Alias as '_'+convert(varchar,Id))
insert #number values ('A', 1)
insert #number values ('B', 2)
insert #number values ('C', 3)
insert #number values ('D', 4)
insert #number values ('E', 5)
declare @sql nvarchar(max)
set @sql = '
select '+stuff((
select char(13)+char(10)+'+'+Alias+'.Element'
from #number order by Id for xml path (''), type
).value('.','NVARCHAR(MAX)'),3,1,' ')
set @sql += '
from #number '+(select top 1 Alias from #number order by Id)
set @sql += (
select char(13)+char(10)+'join #number '+Alias+' on '+Alias+'.Id not in ('
+stuff((
select ', '+Alias+'.Id'
from #number b where a.Id > b.Id
order by Id for xml path ('')
),1,2,'')
+ ')'
from #number a where Id > (select min(Id) from #number)
order by Element for xml path (''), type
).value('.','NVARCHAR(MAX)')
set @sql += '
order by 1'
print @sql
exec (@sql)
To generate this:
要生成这个:
select
_1.Element
+_2.Element
+_3.Element
+_4.Element
+_5.Element
from #number _1
join #number _2 on _2.Id not in (_1.Id)
join #number _3 on _3.Id not in (_1.Id, _2.Id)
join #number _4 on _4.Id not in (_1.Id, _2.Id, _3.Id)
join #number _5 on _5.Id not in (_1.Id, _2.Id, _3.Id, _4.Id)
order by 1
回答by Keith Gresham
This method uses a binary mask to select the correct rows:
此方法使用二进制掩码来选择正确的行:
;with src(t,n,p) as (
select element, index, power(2,index-1)
from table
)
select s1.t+s2.t+s3.t+s4.t
from src s1, src s2, src s3, src s4
where s1.p+s2.p+s3.p+s4.p=power(2,4)-1
My original post:
我的原帖:
declare @t varchar(4) = 'ABCD'
;with src(t,n,p) as (
select substring(@t,1,1),1,power(2,0)
union all
select substring(@t,n+1,1),n+1,power(2,n)
from src
where n < len(@t)
)
select s1.t+s2.t+s3.t+s4.t
from src s1, src s2, src s3, src s4
where s1.p+s2.p+s3.p+s4.p=power(2,len(@t))-1
This is one of those problems that haunts you. I liked the simplicity of my original answer but there was this issue where I was still building all the possible solutions and then selecting the correct ones. One more try to make this process more efficient by only building the solutions that were correct yielded this answer. Add a character to the string only if that character didn't exist in the string. Patindex seemed like the perfect companion for a CTE solution. Here it is.
这是困扰您的问题之一。我喜欢我原始答案的简单性,但有一个问题,我仍在构建所有可能的解决方案,然后选择正确的解决方案。再一次尝试通过仅构建正确的解决方案来使此过程更有效,从而产生此答案。仅当字符串中不存在该字符时才将字符添加到字符串中。Patindex 似乎是 CTE 解决方案的完美伴侣。这里是。
declare @t varchar(10) = 'ABCDEFGHIJ'
;with s(t,n) as (
select substring(@t,1,1),1
union all
select substring(@t,n+1,1),n+1
from s where n<len(@t)
)
,j(t) as (
select cast(t as varchar(10)) from s
union all
select cast(j.t+s.t as varchar(10))
from j,s where patindex('%'+s.t+'%',j.t)=0
)
select t from j where len(t)=len(@t)
I was able to build all 3.6 million solutions in 3 minutes and 2 seconds. Hopefully this solution will not get missed just because it's not the first.
我能够在 3 分 2 秒内构建所有 360 万个解决方案。希望这个解决方案不会因为它不是第一个而被错过。
回答by Paul Bamford
--Hopefully this is a quick solution, just change the values going into #X
--希望这是一个快速的解决方案,只需更改进入 #X 的值
IF OBJECT_ID('tempdb.dbo.#X', 'U') IS NOT NULL DROP TABLE #X; CREATE table #X([Opt] [nvarchar](10) NOT NULL)
Insert into #X values('a'),('b'),('c'),('d')
declare @pSQL NVarChar(max)='select * from #X X1 ', @pN int =(select count(*) from #X), @pC int = 0;
while @pC<@pN begin
if @pC>0 set @pSQL = concat(@pSQL,' cross join #X X', @pC+1);
set @pC = @pC +1;
end
execute(@pSQL)
--or as single column result
-- 或作为单列结果
IF OBJECT_ID('tempdb.dbo.#X', 'U') IS NOT NULL DROP TABLE #X; CREATE table #X([Opt] [nvarchar](10) NOT NULL)
Insert into #X values('a'),('b'),('c'),('d')
declare @pSQL NVarChar(max)=' as R from #X X1 ',@pSelect NVarChar(Max)=' ',@pJoin NVarChar(Max)='', @pN int =(select count(*) from #X), @pC int = 0;
while @pC<@pN begin
if @pC>0 set @pJoin = concat(@pJoin ,' cross join #X X', @pC+1) set @pSelect = concat(@pSelect ,'+ X', @pC+1,'.Opt ')
set @pC = @pC +1;
end
set @pSQL = concat ('select X1.Opt', @pSelect,@pSQL ,@pJoin)
exec(@pSQL)
回答by SDReyes
Current solution using a recursive CTE.
当前使用递归 CTE 的解决方案。
-- The base elements
Declare @Number Table( Element varchar(MAX), Id varchar(MAX) )
Insert Into @Number Values ( 'A', '01')
Insert Into @Number Values ( 'B', '02')
Insert Into @Number Values ( 'C', '03')
Insert Into @Number Values ( 'D', '04')
-- Number of elements
Declare @ElementsNumber int
Select @ElementsNumber = COUNT(*)
From @Number;
-- Permute!
With Permutations( Permutation, -- The permutation generated
Ids, -- Which elements where used in the permutation
Depth ) -- The permutation length
As
(
Select Element,
Id + ';',
Depth = 1
From @Number
Union All
Select Permutation + ' ' + Element,
Ids + Id + ';',
Depth = Depth + 1
From Permutations,
@Number
Where Depth < @ElementsNumber And -- Generate only the required permutation number
Ids Not like '%' + Id + ';%' -- Do not repeat elements in the permutation (this is the reason why we need the 'Ids' column)
)
Select Permutation
From Permutations
Where Depth = @ElementsNumber
回答by brettlyman
Assuming your table is named Elements and has 4 rows, this is as simple as:
假设您的表名为 Elements 并且有 4 行,这很简单:
select e1.Element + e2.Element + e3.Element + e4.Element
from Elements e1
join Elements e2 on e2.Element != e1.Element
join Elements e3 on e3.Element != e2.Element AND e3.Element != e1.Element
join Elements e4 on e4.Element != e3.Element AND e4.Element != e2.Element AND e4.Element != e1.Element
回答by Matthew Dunn
Way too much rust on my SQL skills, but i took a different tack for a similar problem and thought it worth sharing.
我的 SQL 技能生锈太多了,但我对类似的问题采取了不同的策略,并认为值得分享。
Table1 - X strings in a single field Uno
Table2 - Y strings in a single field Dos
(SELECT Uno, Dos
FROM Table1
CROSS JOIN Table2 ON 1=1)
UNION
(SELECT Dos, Uno
FROM Table1
CROSS JOIN Table2 ON 1=1)
Same principle for 3 tables with an added CROSS JOIN
增加了 CROSS JOIN 的 3 个表的相同原则
(SELECT Tres, Uno, Dos
FROM Table1
CROSS JOIN Table2 ON 1=1
CROSS JOIN Table3 ON 1=1)
although it takes 6 cross-join sets in the union.
尽管它在联合中需要 6 个交叉连接集。