Excel 查找速度与 VBA 二进制搜索?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/1857404/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-11 11:04:51  来源:igfitidea点击:

Excel Find Speed vs. VBA binary Search?

vbaexcel-vbabinary-searchsortingexcel

提问by ExcelCyclist

How good/fast is Excel VBA's Find vs. binary search? My platform is Office 11|2003 and I'll be searching for strings against Column A on three sheets of values. Total number of rows ~140,000

Excel VBA 的查找与二分搜索的性能/速度如何?我的平台是 Office 11|2003,我将针对三张值的 A 列搜索字符串。总行数 ~140,000

If worth it which Library & functions should I reference to do the sorting and then the binary search? Binary searching strings/text reportedly has potential problems.

如果值得,我应该参考哪些库和函数来进行排序然后进行二分搜索?据报道,二进制搜索字符串/文本存在潜在问题。

... one thing must be noted. Using binary search formulas with sortedtextrequires caution. Aladin A., Excel MVP

……必须注意一件事。对 sortedtext 使用二分搜索公式需要谨慎。 Aladin A.,Excel MVP

Excel Find:

Excel 查找:

Worksheets(1).Range("A:A").Find("PN-String-K9", LookIn:=xlValues, LookAt:=xlWhole)

回答by mikemay

Much against my intuition a VBA binary search strongly outperforms an Excel Find. At least with the scenario below where 120,000 6 character strings are distributed evenly over 3 worksheets.

与我的直觉大相径庭的是,VBA 二进制搜索的性能明显优于 Excel 查找。至少在下面的场景中,120,000 个 6 字符串均匀分布在 3 个工作表上。

Excel Find takes 1 minute 58 seconds,
VBA binary search takes 36 seconds on my particular machine.

Excel Find 需要 1 分 58 秒,
VBA 二进制搜索在我的特定机器上需要 36 秒。

The advantage of knowing that the text is in order obviously outweighs Excel's natural advantage. Note Aladin A's warning about sort order.

知道文字有序的优势显然超过了Excel的天然优势。注意 Aladin A 关于排序顺序的警告。

Option Explicit

' Call Search to look for a thousand random strings
' in 3 worksheets of a workbook

' requires a workbook with 3 sheets and
' column A populated with values between "00001" to "120000"
' split evenly 40,000 to a worksheet in ascending order.
' They must be text, not numbers.

Private Const NUM_ROWS As Long = 120000
Private Const SHEET_1 As String = "Sheet1"
Private Const SHEET_2 As String = "Sheet2"
Private Const SHEET_3 As String = "Sheet3"

' This uses VBA Binary Search
Public Sub Search()
    Worksheets(SHEET_1).Range("B:B").ClearContents
    Worksheets(SHEET_2).Range("B:B").ClearContents
    Worksheets(SHEET_3).Range("B:B").ClearContents
    DoSearch True       ' change to False to test Excel search
End Sub

' Searches for a thousand values using binary  or excel search depending on
' value of bBinarySearch
Public Sub DoSearch(ByVal bBinarySearch As Boolean)
    Debug.Print Now
    Dim ii As Long

    For ii = 1 To 1000
        Dim rr As Long
        rr = Int((NUM_ROWS) * Rnd + 1)
        If bBinarySearch Then
            Dim strSheetName As String
            Dim nRow As Long
            If BinarySearch(MakeSearchArg(rr), strSheetName, nRow) Then
                Worksheets(strSheetName).Activate
                Cells(nRow, 1).Activate
            End If
        Else
            If Not ExcelSearch(SHEET_1, MakeSearchArg(rr)) Then
                If Not ExcelSearch(SHEET_2, MakeSearchArg(rr)) Then
                    ExcelSearch SHEET_3, MakeSearchArg(rr)
                End If
            End If
        End If
        ActiveCell.Offset(0, 1).Value = "FOUND"
    Next
    Debug.Print Now

End Sub

' look for one cell value using Excel Find
Private Function ExcelSearch(ByVal strWorksheet As String _
  , ByVal strSearchArg As String) As Boolean
    On Error GoTo Err_Exit
    Worksheets(strWorksheet).Activate
    Worksheets(strWorksheet).Range("A:A").Find(What:=strSearchArg, LookIn:=xlValues, LookAt:= 
        xlWhole, SearchOrder:=xlByRows, SearchDirection:=xlNext, MatchCase:=True 
        , SearchFormat:=False).Activate
    ExcelSearch = True
    Exit Function
Err_Exit:
    ExcelSearch = False
End Function

' Look for value using a vba based binary search
' returns true if the search argument is found in the workbook
' strSheetName contains the name of the worksheet on exit and nRow gives the row
Private Function BinarySearch(ByVal strSearchArg As String _
  , ByRef strSheetName As String, ByRef nRow As Long) As Boolean
    Dim nFirst As Long, nLast As Long
    nFirst = 1
    nLast = NUM_ROWS
    Do While True
        Dim nMiddle As Long
        Dim strValue As String
        If nFirst > nLast Then
            Exit Do     ' Failed to find search arg
        End If
        nMiddle = Round((nLast - nFirst) / 2 + nFirst)
        SheetNameAndRowFromIdx nMiddle, strSheetName, nRow
        strValue = Worksheets(strSheetName).Cells(nRow, 1)
        If strSearchArg < strValue Then
            nLast = nMiddle - 1
        ElseIf strSearchArg > strValue Then
            nFirst = nMiddle + 1
        Else
            BinarySearch = True
            Exit Do
        End If
    Loop
End Function

' convert 1 -> "000001", 120000 -> "120000", etc
Private Function MakeSearchArg(ByVal nArg As Long) As String
    MakeSearchArg = Right(CStr(nArg + 1000000), 6)
End Function

' converts some number to a worksheet name and a row number
' This is depenent on the worksheets being named sheet1, sheet2, sheet3

' and containing an equal number of vlaues in each sheet where
' the total number of values is NUM_ROWS
Private Sub SheetNameAndRowFromIdx(ByVal nIdx As Long _
  , ByRef strSheetName As String, ByRef nRow As Long)
    If nIdx <= NUM_ROWS / 3 Then

        strSheetName = SHEET_1
        nRow = nIdx
    ElseIf nIdx > (NUM_ROWS / 3) * 2 Then
        strSheetName = SHEET_3
        nRow = nIdx - (NUM_ROWS / 3) * 2
    Else
        strSheetName = SHEET_2
        nRow = nIdx - (NUM_ROWS / 3)
    End If
End Sub

回答by El Batman

I find using the AutoFilter works a lot faster than manually searching the records with any method.

我发现使用自动筛选比使用任何方法手动搜索记录要快得多。

I filter, check to see if there are any results, then move on. If any are found (by checking the count of results), I can search the small portion that are filtered manually or return them all.

我过滤,检查是否有任何结果,然后继续。如果找到(通过检查结果计数),我可以搜索手动过滤的小部分或全部返回。

I used it on around 44,000 records, searching for a list of 100+ parts against it.

我在大约 44,000 条记录上使用了它,并针对它搜索了 100 多个部分的列表。

Binary searches can easily get stuck in infinite loops if you're not careful.

如果您不小心,二进制搜索很容易陷入无限循环。

回答by Zig-Peru

If you use vlookup with the sorted option, it will likely be faster than your vba.

如果您使用带有 sorted 选项的 vlookup,它可能会比您的 vba 更快。

回答by user8068006

I have become interested in this because I was using the .Find function, and on one PC it failed to work on some lookups, but on another it was OK! So I've done some testing on timings - I've got a sheet with 985 names sorted into order, and I've written a small subroutine to run through them, and look each one up in the same list using a different method (the times are in milliseconds):

我对此很感兴趣,因为我正在使用 .Find 函数,在一台 PC 上它无法进行某些查找,但在另一台上它没问题!所以我对时间做了一些测试 - 我有一张按顺序排列的 985 个名字,我写了一个小子程序来运行它们,并使用不同的方法在同一个列表中查找每个名字(时间以毫秒为单位):

  1. Brute force: 2000
  2. .Find: 750
  3. Application.VLookup: 265
  4. Binary Search: 234
  1. 蛮力:2000
  2. .查找:750
  3. 应用程序.Vlookup:265
  4. 二分查找:234

The problem with VLookup is that it can't return the row number unless you include it in your table.

VLookup 的问题在于它无法返回行号,除非您将其包含在表中。

Here is my code for the Binary search, I've assumed that the sheet has a header row, but you could easily modify the header and code to pass that information. The optional Col parameter is used to indicated if you want the row number, or a cell's value. The function returns 0 (zero) if the find fails.

这是我的二进制搜索代码,我假设工作表有一个标题行,但您可以轻松修改标题和代码以传递该信息。可选的 Col 参数用于指示您想要行号还是单元格的值。如果查找失败,该函数返回 0(零)。

Function Find(Sheet As Worksheet, What As String, Optional Col As Long = 0) As Variant
Dim Top As Long
Dim Mid As Long
Dim Bot As Long 'Bottom
Dim S As String
Dim T As String

   With Sheet
     Top = 2 'Sheet has a header row
     Bot = .UsedRange.Rows.Count
     S = LCase(What)
     Do
       Mid = (Top + Bot) / 2
       T = LCase(.Cells(Mid, 1))
       Select Case True
       Case T > S
         Bot = Mid - 1
       Case T < S
         Top = Mid + 1
       Case Else 'T = S
         If Col = 0 Then
           Find = Mid  'Return the row
         Else
           Find = .Cells(Mid, Col).Value2 'Return the cell's value
         End If
         Exit Function
       End Select
     Loop Until Bot < Top
   End With
   Find = 0
End Function