在 Excel VBA 中需要最快的搜索方法

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/19840955/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-08 17:08:15  来源:igfitidea点击:

Need Fastest Search Method in Excel VBA

excelvbaexcel-vba

提问by Solution Seeker

Consider a scenario, I have 2 columns (Column "A" & Column "B").

考虑一个场景,我有 2 列(列“A”和列“B”)。

Column A has around 130000 rows/Strings Column B has around 10000 rows/Strings

A 列大约有 130000 行/字符串 B 列大约有 10000 行/字符串

I would like to search each string of Column "B" from Column "A".

我想从“A”列中搜索“B”列的每个字符串。

As you can see the volume of data is very high. I have already tried with Range.Find() method. But it's taking lot of time to complete. I am searching for a method/way that will give me result in very less turnaround time.

如您所见,数据量非常大。我已经尝试过 Range.Find() 方法。但是要花很多时间才能完成。我正在寻找一种方法/方式,可以让我的周转时间非常短。

* Some more Clarification on my requirement *

* 关于我的要求的更多说明*

(1) Column A & B contains string values, NOT NUMBERS. And the string can be very large

(1) A & B 列包含字符串值,而不是数字。并且字符串可以非常大

(2) For each cell in column "B", There can be many occurrence in column "A"

(2)对于“B”列中的每个单元格,“A”列中可以出现多次

(3) I would like to fetch all the occurrence of column "B" in column "A" with Row Number

(3)我想用行号获取“A”列中“B”列的所有出现

(4) For a string present in column "B". It can be found as a Substring of any cell in column "A"

(4) 对于“B”列中存在的字符串。它可以作为“A”列中任何单元格的子字符串找到



Download file link - wikisend.com/download/431054/StackOverFlow_Sample.xlsx *

下载文件链接 - wikisend.com/download/431054/StackOverFlow_Sample.xlsx *

Any Suggestions ?

有什么建议 ?

Feel free incase you need any extra details to solve above problem !

如果您需要任何额外的细节来解决上述问题,请随意!

回答by Siddharth Rout

Try this.

尝试这个。

This took 3seconds for 130000 rows in Col Aand 10000 rows in Col B. The output is generated in Col C.

这花了3几秒钟130000 rows in Col A10000 rows in Col B。输出在Col C.

NOTE: I have taken the worstcase scenario where all 10000values in Col B are present in Col A

注意:我采用了Col Bworst中的所有10000值都存在于 Col A 中的情况

This is how my data looks.

这就是我的数据的样子。

enter image description here

在此处输入图片说明

Sub Sample()
    Debug.Print Now

    Dim col As New Collection
    Dim ws As Worksheet
    Dim i As Long

    Set ws = ThisWorkbook.Sheets("Sheet1")

    Application.ScreenUpdating = False

    With ws
        .Range("C1:C10000").Value = "No"

        For i = 1 To 130000
            On Error Resume Next
            col.Add .Range("A" & i).Value, CStr(.Range("A" & i).Value)
            On Error GoTo 0
        Next i

        On Error Resume Next
        For i = 1 To 10000
            col.Add .Range("B" & i).Value, CStr(.Range("B" & i).Value)
            If Err.Number <> 0 Then .Range("C" & i).Value = "Yes"
            Err.Clear
        Next i
    End With

    Application.ScreenUpdating = True

    Debug.Print Now
End Sub

And this was the result

这就是结果

enter image description here

在此处输入图片说明

回答by usncahill

NEWColumn A 130000 100-character strings, Column B 10000 30-character strings, 27 minutes.

A 列 130000 个 100 个字符的字符串,B 列 10000 个 30 个字符的字符串,27 分钟。

Column C is populated with row locations of occurrences of Column B string. Column D is populated with number of occurrences of Column B string.

C 列填充了 B 列字符串出现的行位置。D 列填充了 B 列字符串的出现次数。

Public Sub searchcells()
    Dim arrA(1 To 130000) As String, arrB(1 To 10000) As String, t As Date, nLen As Integer
    t = Now
    Me.Range("c:d") = ""

    For i = 1 To 130000
        arrA(i) = Me.Cells(i, 1)
    Next
    For i = 1 To 10000
        arrB(i) = Me.Cells(i, 2)
    Next

    For i = 1 To 130000
        nLen = Len(arrA(i))
        For j = 1 To 10000
            If InStrRev(arrA(i), arrB(j), nLen - Len(arrB(j)) + 1) > 0 Then Me.Cells(j, 4) = Me.Cells(j, 4) + 1: Me.Cells(j, 3) = Me.Cells(j, 3) & i & "; "
        Next
        Me.Cells(1, 5) = i
    Next

    Debug.Print CDbl(Now - t) * 24 * 3600 & " seconds"
End Sub

The cells can be populated easily with the following, changing i and j limits for the desired number of strings and string lengths in each section.

可以使用以下内容轻松填充单元格,更改每个部分中所需字符串数量和字符串长度的 i 和 j 限制。

Public Sub fillcells()
    Dim temp As String
    Randomize

    For i = 1 To 13000
        temp = ""
        For j = 1 To 100
            temp = temp & Chr(70 + Int(10 * Rnd()))
        Next
        Me.Cells(i, 1) = temp
    Next
    For i = 1 To 10000
        temp = ""
        For j = 1 To 30
            temp = temp & Chr(70 + Int(10 * Rnd()))
        Next
        Me.Cells(i, 2) = temp
    Next
End Sub

I am unable to download your spreadsheet at work, so disregard this if it missed the mark.

我无法在工作时下载您的电子表格,因此如果它没有达到标记,请忽略此内容。