vba 宏在excel表中的各个列中进行所有可能的数据组合?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/12335323/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Macro to make all possible combinations of data in various columns in excel sheet?
提问by user1657410
In an another post, the user Excellllprovided a macro address the aforementioned question.
在另一篇文章中,用户Excelll提供了一个宏地址来解决上述问题。
I have a worksheet which has data as below:
我有一个工作表,其中包含如下数据:
A B C
abc,def,ghi,jkl 1,2,3 a1,e3,h5,j8
The following solutionturns it into
在下面的解决方案把它变成
abc 1 a1
abc 2 a1
abc 3 a1
abc 1 e3
abc 2 e3
abc 3 h5
However, I wanted to know how the macro can be modified as the number of columns of data grow from 3 columns of data to 10 columns of data.
但是,我想知道如何在数据列数从 3 列数据增长到 10 列数据时修改宏。
I tried modifying the macro a number of times based upon the patterns in the code that I saw, but I kept getting an error.
我尝试根据我看到的代码中的模式多次修改宏,但我一直收到错误消息。
回答by chris neilsen
Here's a generalised solution that uses Recursion to handle any number of columns (greater than 1)
这是一个通用的解决方案,它使用递归来处理任意数量的列(大于 1)
Sub Combinations()
Dim aSrc As Variant
' Get Data into an array
' This section is an example to get the source data into an array
' Replace this section if your data is sourced differently.
' The required format of aSrc is Array(1 To NumberOfColumnsOfData)
' where each element aSrc(n) is Array(1 To NumberOfRowsInColumnN, 1 To 1) of Variant
Dim rSrc As Range, colR As Range
Dim sh As Worksheet
Dim a As Variant
Dim i As Long
Set sh = ActiveSheet ' <-- Adjust to suit
Set rSrc = sh.[A:D] ' <-- Adjust to suit
ReDim aSrc(1 To rSrc.Columns.Count)
With sh
For i = 1 To rSrc.Columns.Count
Set colR = rSrc.Columns(i)
aSrc(i) = .Range(colR.Cells(1, 1), colR.Cells(.Rows.Count, 1).End(xlUp))
Next
End With
' Generate output
' This populates aDst(1 To lSize, 1 To NumberOfSourceColumns)
' where lSize is total number of combinations
Dim aDst As Variant
Dim lSize As Long
Dim n As Long
Dim aBase() As String
lSize = 1
For i = 1 To UBound(aSrc)
lSize = lSize * UBound(aSrc(i), 1)
Next
ReDim aDst(1 To lSize, 1 To UBound(aSrc))
ReDim aBase(0 To UBound(aSrc) - 1)
n = 1
aBase = Split(String(UBound(aSrc) - 1, ","), ",")
aBase(0) = aSrc(1)(1, 1)
Generate aSrc, aDst, aBase, 1, n
' Place output into sheet
' Starting at cell rDst
Dim rDst As Range
Set rDst = [E1] ' <-- Adjust to suit
Set rDst = rDst.Resize(UBound(aDst, 1), UBound(aDst, 2))
rDst = aDst
End Sub
Private Sub Generate(ByRef aSrc As Variant, ByRef aDst As Variant, ByRef aBase As Variant, ByVal pCol As Long, ByRef pDst As Long)
Dim i As Long, j As Long
If pCol = UBound(aSrc) Then
' If iterating the last source column, output to aDst
For i = 1 To UBound(aSrc(pCol), 1)
For j = 1 To UBound(aBase)
aDst(pDst, j) = aBase(j - 1)
Next
aDst(pDst, j) = aSrc(pCol)(i, 1)
pDst = pDst + 1
Next
Else
' If NOT iterating the last source column, aBase and call Generate again
For i = 1 To UBound(aSrc(pCol), 1)
aBase(pCol - 1) = aSrc(pCol)(i, 1)
Generate aSrc, aDst, aBase, pCol + 1, pDst
Next
End If
End Sub
回答by Tony Dallimore
I am a fan of recursion but only if I believe it provides the simpliest solution. I do not believe it is appropriate for this problem.
我是递归的粉丝,但前提是我相信它提供了最简单的解决方案。我不认为它适合这个问题。
In the original question, UJ9 had:
在最初的问题中,UJ9 有:
Column A B C
Row 1 abc,def,ghi,jkl 1,2,3 a1,e3,h5,j8
and wanted:
并想要:
Column A B C
Row 1 abc 1 a1
Row 2 abc 2 a1
Row 3 abc 3 a1
Row 4 abc 1 e3
Row 5 abc 2 e3
Row 6 abc 3 h5
:
Row 48 jkl 3 j8
user1657410 wants the same but with 10 columns.
user1657410 想要相同但有 10 列。
The solutions for the original problem use three (one per column) nested for-loops. Adapting those solutions for ten nested for-loops is possible but not an easy implementation. Let us consider the principle behind those solutions and then look for a different implementation strategy.
原始问题的解决方案使用三个(每列一个)嵌套的 for 循环。为十个嵌套的 for 循环调整这些解决方案是可能的,但不是一个简单的实现。让我们考虑这些解决方案背后的原则,然后寻找不同的实施策略。
If we index the values in each column we get:
如果我们索引每一列中的值,我们会得到:
Column A B C
Row 1 abc,def,ghi,jkl 1,2,3 a1,e3,h5,j8
Index 0 1 2 3 0 1 2 0 1 2 3
What the solutions do is generate every combination of index: 000 001 002 003 010 011 012 013 020 021 021 023 100 ... 323 and use the digits to select the appropriate substring from the appropriate string.
解决方案的作用是生成索引的每个组合:000 001 002 003 010 011 012 013 020 021 021 023 100 ... 323 并使用数字从适当的字符串中选择适当的子字符串。
To adapt this approach for a larger number of columns we need to switch from nested for-loops to arrays with one entry per column. One array hold the maximum value of the index for the column and the other holds the currently selected index. The initial state would be something like:
为了使这种方法适用于更多的列,我们需要从嵌套的 for 循环切换到每列一个条目的数组。一个数组保存列索引的最大值,另一个数组保存当前选定的索引。初始状态类似于:
Column A B C D E F G H I J
Maximum index array 4 3 4 4 3 2 6 3 4 2
Current index array 0 0 0 0 0 0 0 0 0 0
We now need a loop that will increment the Current index array like a speedometer except each column has its own maximum. That is, we want to add one to the rightmost element of the Current index array unless it is already at its maximum value. If it is at its maximum value, it is reset to zero and the next column to the left is incremented unless it is at its maximum value. This continues until the loop wants to increment the leftmost index past its maximum value. That is, we need a loop which will set the Current index array to the following values:
我们现在需要一个循环来像速度计一样递增 Current 索引数组,但每列都有自己的最大值。也就是说,我们要向 Current 索引数组的最右边元素加 1,除非它已经达到最大值。如果它处于最大值,则将其重置为零并且向左的下一列递增,除非它处于最大值。这一直持续到循环想要将最左边的索引增加到超过其最大值为止。也就是说,我们需要一个循环将当前索引数组设置为以下值:
Column A B C D E F G H I J
Maximum index array 4 3 4 4 3 2 6 3 4 2
Current index array 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 2
0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 1 1
0 0 0 0 0 0 0 0 1 2
0 0 0 0 0 0 0 0 2 0
0 0 0 0 0 0 0 0 2 1
0 0 0 0 0 0 0 0 2 2
0 0 0 0 0 0 0 0 3 0
0 0 0 0 0 0 0 0 3 1
0 0 0 0 0 0 0 0 3 2
0 0 0 0 0 0 0 1 0 0
: :
4 3 4 4 3 2 6 3 4 2
For each different value of the Current index array, you select the appropriate substring from each column and generate a row containing the substrings.
对于 Current 索引数组的每个不同值,从每一列中选择适当的子字符串并生成包含子字符串的行。
Before we go any further, are you sure you want to generate a row per combination of sub-string? With the maximum index values I selected for my example, you would get 2,520,000 rows.
在我们继续之前,您确定要为每个子字符串组合生成一行吗?使用我为示例选择的最大索引值,您将获得 2,520,000 行。
The code below assumes the source row is row 1. It outputs the generated rows starting at row 3. This code generates a table like the one above so you can properly understand how the code works. Below this code are instructions to amend it to output substrings. The code adjusts to the number of columns in the source row. The code does not check that your version of Excel can support the number of rows generated.
下面的代码假定源行是第 1 行。它从第 3 行开始输出生成的行。此代码生成一个类似于上面的表,以便您可以正确理解代码的工作原理。此代码下方是修改它以输出子字符串的说明。代码会根据源行中的列数进行调整。该代码不会检查您的 Excel 版本是否可以支持生成的行数。
Sub Combinations()
Dim ColCrnt As Long
Dim ColMax As Long
Dim IndexCrnt() As Long
Dim IndexMax() As Long
Dim RowCrnt As Long
Dim SubStrings() As String
Dim TimeStart As Single
TimeStart = Timer
With Worksheets("Combinations")
' Use row 1 as the source row. Find last used column.
ColMax = .Cells(1, Columns.Count).End(xlToLeft).Column
' Size Index arrays according to number of columns
' Use one based arrays so entry number matches column number
ReDim IndexCrnt(1 To ColMax)
ReDim IndexMax(1 To ColMax)
' Initialise arrays
For ColCrnt = 1 To ColMax
SubStrings = Split(.Cells(1, ColCrnt).Value, ",")
' SubStrings is a zero-based array with one entry
' per comma separated value.
IndexMax(ColCrnt) = UBound(SubStrings)
IndexCrnt(ColCrnt) = 0
Next
RowCrnt = 3 ' Output generated values starting at row 3
Do While True
' Use IndexCrnt() here.
' For this version I output the index values
For ColCrnt = 1 To ColMax
' This will generate an error if RowCrnt exceeds the maximum number
' of columns for your version of Excel.
.Cells(RowCrnt, ColCrnt).Value = IndexCrnt(ColCrnt)
Next
RowCrnt = RowCrnt + 1
' Increment values in IndexCrnt() from right to left
For ColCrnt = ColMax To 1 Step -1
If IndexCrnt(ColCrnt) < IndexMax(ColCrnt) Then
' This column's current index can be incremented
IndexCrnt(ColCrnt) = IndexCrnt(ColCrnt) + 1
Exit For
End If
If ColCrnt = 1 Then
' Leftmost column has overflowed.
' All combinations of index value have been generated.
Exit Do
End If
IndexCrnt(ColCrnt) = 0
' Loop to increment next column
Next
Loop
End With
Debug.Print Format(Timer - TimeStart, "#,###.##")
End Sub
If you are happy that you understand the above code, replace:
如果您很高兴理解上述代码,请替换:
' For this version I output the index values
For ColCrnt = 1 To ColMax
.Cells(RowCrnt, ColCrnt).Value = IndexCrnt(ColCrnt)
Next
by:
经过:
For ColCrnt = 1 To ColMax
SubStrings = Split(.Cells(1, ColCrnt).Value, ",")
.Cells(RowCrnt, ColCrnt).Value = SubStrings(IndexCrnt(ColCrnt))
Next
This revised code output the appropriate substring for each combination but it will be slow with large numbers of combination because it extracts the required substring from the source cell for every generated row. For example, it generates 27,648 rows in 12.66 seconds. The code below takes 9.15 seconds but uses a more advanced technique.
此修改后的代码为每个组合输出适当的子字符串,但是对于大量组合,它会很慢,因为它从源单元格中为每个生成的行提取所需的子字符串。例如,它在 12.66 秒内生成 27,648 行。下面的代码需要 9.15 秒,但使用了更高级的技术。
Step 1, replace:
第一步,替换:
Dim SubStrings() As String
by:
经过:
Dim SubStrings() As Variant
With Dim SubStrings() As String
, SubString(N) can only contain a string. With Dim SubStrings() As Variant
, SubString(N) can contain a string or an integer or a floating-point value. This is not good in most situations because a variant is slower to process than a string or a long and you will not be warned if you set it to the wrong sort of value for your code. However, I am going to store an array in SubString(N). I will be using what is called a ragged array because each row has a different number of columns.
使用 时Dim SubStrings() As String
,SubString(N) 只能包含一个字符串。使用Dim SubStrings() As Variant
,SubString(N) 可以包含字符串或整数或浮点值。这在大多数情况下都不好,因为变体的处理速度比字符串或长整型慢,而且如果您为代码将其设置为错误的值类型,您将不会收到警告。但是,我将在 SubString(N) 中存储一个数组。我将使用所谓的不规则数组,因为每一行都有不同的列数。
Step 2, replace:
第二步,替换:
ReDim IndexCrnt(1 To ColMax)
ReDim IndexMax(1 To ColMax)
by:
经过:
ReDim IndexCrnt(1 To ColMax)
ReDim IndexMax(1 To ColMax)
ReDim SubStrings(1 To ColMax)
Step 3, replace:
第三步,替换:
' Initialise arrays
For ColCrnt = 1 To ColMax
SubStrings = Split(.Cells(1, ColCrnt).Value, ",")
' SubStrings is a zero-based array with one entry
' per comma separated value.
IndexMax(ColCrnt) = UBound(SubStrings)
IndexCrnt(ColCrnt) = 0
Next
by:
经过:
' Initialise arrays
For ColCrnt = 1 To ColMax
SubStrings(ColCrnt) = Split(.Cells(1, ColCrnt).Value, ",")
IndexMax(ColCrnt) = UBound(SubStrings(ColCrnt))
IndexCrnt(ColCrnt) = 0
Next
With the first version, I overwrite the array SubStrings everytime I split a cell. With the second version, I save each column's substrings. With the values used by UJ9 in the original question, the new SubString looks like:
在第一个版本中,每次拆分单元格时我都会覆盖数组 SubStrings。在第二个版本中,我保存了每一列的子字符串。使用 UJ9 在原始问题中使用的值,新的 SubString 如下所示:
---- Columns -----
Row 0 1 2 3
1 abc def ghi jkl
2 1 2 3
3 a1 e3 h5 j8
Step 4: replace:
第四步:替换:
For ColCrnt = 1 To ColMax
SubStrings = Split(.Cells(1, ColCrnt).Value, ",")
.Cells(RowCrnt, ColCrnt).Value = SubStrings(IndexCrnt(ColCrnt))
Next
by:
经过:
For ColCrnt = 1 To ColMax
.Cells(RowCrnt, ColCrnt).Value = SubStrings(ColCrnt)(IndexCrnt(ColCrnt))
Next
With the revised code I do not split a source cell for every generated value. I extract the substring I require from the array.
使用修改后的代码,我不会为每个生成的值拆分源单元格。我从数组中提取我需要的子字符串。
Note: if you have ever used two dimensional arrays, you will have written something like MyArray(Row,Column)
. Ragged arrays are different; you write MyArray(Row)(Column)
.
注意:如果你曾经使用过二维数组,你会写出类似MyArray(Row,Column)
. 参差不齐的数组是不同的;你写MyArray(Row)(Column)
。