VBA-Excel 和大数据集导致程序崩溃

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/5737655/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-11 13:04:43  来源:igfitidea点击:

VBA-Excel and large data sets causes program to crash

exceloptimizationvba

提问by Sham

First time poster and new to programming in general. I have a project in which i have to build a financial model to mine for data in excel. I have succeeded in building said model on VBA. I have ran tests on 3,000 line dataset and it was successful. I will briefly explain what it does.

第一次海报和一般的编程新手。我有一个项目,我必须建立一个财务模型来挖掘 Excel 中的数据。我已经成功地在 VBA 上构建了所​​述模型。我已经对 3,000 行数据集进行了测试,并且成功了。我将简要解释它的作用。

I track a given stock on a given day on multiple exchanges. I download the data (roughly 935,000 lines) The first step is copy all the data for a given exchange (roughly 290,000) onto a new sheet (this takes roughly 8min), then I create a new column to log the bid ask spread (12secs), the next step is what Im having trouble with, I basically rank each line of data twice, one column for Bid size and one column for Ask size. I created a function which uses excel Percentile function and ranks based on where the given bid and ask size lands. As of right now, I have been running the Macro for the last 35min and has yet to execute. I cant attempt the other macros since each macro depends on the previous one.

我在给定的一天在多个交易所跟踪给定的股票。我下载数据(大约 935,000 行)第一步是将给定交易所的所有数据(大约 290,000)复制到一个新表上(这大约需要 8 分钟),然后我创建一个新列来记录买卖价差(12 秒) ),下一步就是我遇到的问题,我基本上对每一行数据进行了两次排序,一列是投标大小,一列是要价大小。我创建了一个函数,它使用 excel Percentile 函数,并根据给定的买入价和卖出价的位置进行排名。截至目前,我在过去 35 分钟内一直在运行宏,但尚未执行。我无法尝试其他宏,因为每个宏都依赖于前一个宏。

So my basic issue is that since my data set is large, my model keeps crashing.The code seems to be fine when working with the test data, and it doesn't throw any errors when I run the program, but with the larger data set it just crashes. Does anyone have any suggestions? Is this normal with such large amounts of data?

所以我的基本问题是,由于我的数据集很大,我的模型不断崩溃。 使用测试数据时代码似乎很好,并且在我运行程序时不会抛出任何错误,但是使用较大的数据设置它只是崩溃。有没有人有什么建议?如此大量的数据,这正常吗?

Thanks in advance. Sham

提前致谢。假

Here is the sub and function thats giving me the trouble, the sub takes in the required inputs to run the function and then pops into the assigned cell. The code is suppose to repeat the process for three separate sheets. For now, Id like it to work on one sheet, hence used the comments to not include the loop

这是给我带来麻烦的 sub 和 function,sub 接受所需的输入来运行该函数,然后弹出到指定的单元格中。该代码假设对三个单独的工作表重复该过程。现在,我喜欢它在一张纸上工作,因此使用注释不包括循环

Sub Bucketting()

Dim firstRow As Long
Dim lastRow As Long
Dim counter As Long
Dim bidRange As Range
Dim offerRange As Range
Dim bidScroll As Range
Dim offerScroll As Range
Dim Ex As String
Dim i As Integer

'For i = 1 To 1 Step 1 'Sheet Selection Process
 '   If i = 1 Then
  '      Ex = "Z"
   ' ElseIf i = 2 Then
    '    Ex = "P"
   ' Else
    '    Ex = "T"
   ' End If

Sheets("Z").Select 'Sheet selected

With ActiveSheet

    firstRow = .UsedRange.Cells(1).Row + 1
    lastRow = .UsedRange.Rows.Count

   Set bidRange = .Range("F2:F" & lastRow)
   Set offerRange = .Range("G2:G" & lastRow)

    For counter = lastRow To firstRow Step -1

        Set bidScroll = .Range("F" & counter)
        Set offerScroll = .Range("G" & counter)

        With .Cells(counter, "J")
        .Value = DECILE_RANK(bidRange, bidScroll)
        End With

        With .Cells(counter, "K")
        .Value = DECILE_RANK(offerRange, offerScroll)
        End With

    Next counter

End With

Range("J1").Select
ActiveCell = "Bid Rank"

ActiveCell.Offset(0, 1) = "Offer Rank"

'Next i

End Sub

 Function DECILE_RANK(DataRange, RefCell)

    'Credit: BJRaid 
    'DECILE_RANK(The Range of data)
    'Declares the function that can be called in the spreadsheet cell  - enter '=DECILE_RANK(A5:A50,A5)

    'Using the percentile worksheet function calculate where the 10th, 20th etc percentile of the reference range are

    DEC1 = Application.WorksheetFunction.Percentile(DataRange, 0.1)
    DEC2 = Application.WorksheetFunction.Percentile(DataRange, 0.2)
    DEC3 = Application.WorksheetFunction.Percentile(DataRange, 0.3)
    DEC4 = Application.WorksheetFunction.Percentile(DataRange, 0.4)
    DEC5 = Application.WorksheetFunction.Percentile(DataRange, 0.5)
    DEC6 = Application.WorksheetFunction.Percentile(DataRange, 0.6)
    DEC7 = Application.WorksheetFunction.Percentile(DataRange, 0.7)
    DEC8 = Application.WorksheetFunction.Percentile(DataRange, 0.8)
    DEC9 = Application.WorksheetFunction.Percentile(DataRange, 0.9)


    ' Calculate the Decile rank that the reference cell value sits within

    If (RefCell <= DEC1) Then DECILE_RANK = 1
    If (RefCell > DEC1) And (RefCell <= DEC2) Then DECILE_RANK = 2
    If (RefCell > DEC2) And (RefCell <= DEC3) Then DECILE_RANK = 3
    If (RefCell > DEC3) And (RefCell <= DEC4) Then DECILE_RANK = 4
    If (RefCell > DEC4) And (RefCell <= DEC5) Then DECILE_RANK = 5
    If (RefCell > DEC5) And (RefCell <= DEC6) Then DECILE_RANK = 6
    If (RefCell > DEC6) And (RefCell <= DEC7) Then DECILE_RANK = 7
    If (RefCell > DEC7) And (RefCell <= DEC8) Then DECILE_RANK = 8
    If (RefCell > DEC8) And (RefCell <= DEC9) Then DECILE_RANK = 9
    If (RefCell > DEC9) Then DECILE_RANK = 10

End Function

采纳答案by Lance Roberts

The problem is that your looping through each row individually, the Excel way is to try and work with whole ranges at once whenever possible. I would load the ranges into arrays, then modify your DECILE_RANK code to work with the items in the array.

问题是您单独循环遍历每一行,Excel 方法是尽可能尝试一次处理整个范围。我会将范围加载到数组中,然后修改您的 DECILE_RANK 代码以处理数组中的项目。

Note that variant arrays that read ranges in are 2-D.

请注意,读取范围的变体数组是二维的。

Here is the fully functioning code including my custom VBA array slicer. Note that it was only tested on a small dataset:

这是功能齐全的代码,包括我的自定义 VBA 数组切片器。请注意,它仅在一个小数据集上进行了测试:

Sub Bucketting()

Dim lastRow As Long
Dim bidArray As Variant
Dim offerArray As Variant

Sheets("Sheet1").Select 'Sheet selected

With ActiveSheet

  lastRow = .UsedRange.Rows.Count + 1

  bidArray = .Range("F2:F" & lastRow)
  offerArray = .Range("G2:G" & lastRow)

  Range("J2:J" & lastRow).Value = GetArraySlice2D(DECILE_RANK(bidArray), "column", 1, 1, 0)
  Range("K2:K" & lastRow).Value = GetArraySlice2D(DECILE_RANK(offerArray), "column", 1, 1, 0)

End With

Range("J1").Select
ActiveCell = "Bid Rank"

ActiveCell.Offset(0, 1) = "Offer Rank"

End Sub

Function DECILE_RANK(DataRange As Variant) As Variant

' Credit:     BJRaid
' DECILE_RANK(The Range of data)
' Declares the function that can be called in the spreadsheet cell  - enter '=DECILE_RANK(A5:A50,A5)

Dim DEC(0 To 10) As Variant
Dim i As Integer, j As Integer

'Using the percentile worksheet function calculate where the 10th, 20th etc percentile of the reference range are
DEC(0) = 0
For i = 1 To 9
  DEC(i) = Application.WorksheetFunction.Percentile(DataRange, 0.1 * i)
Next i
DEC(10) = Application.WorksheetFunction.Max(DataRange)

' Calculate the Decile rank that the reference cell value sits within
For i = 1 To UBound(DataRange, 1)
  For j = 1 To 10
    If ((DataRange(i, 1) > DEC(j - 1)) And (DataRange(i, 1) <= DEC(j))) Then
      DataRange(i, 1) = j
      Exit For
    End If
  Next j
Next i

DECILE_RANK = DataRange

End Function

Public Function GetArraySlice2D(Sarray As Variant, Stype As String, Sindex As Integer, Sstart As Integer, Sfinish As Integer) As Variant

' this function returns a slice of an array, Stype is either row or column
' Sstart is beginning of slice, Sfinish is end of slice (Sfinish = 0 means entire
' row or column is taken), Sindex is the row or column to be sliced (NOTE:
' 1 is always the first row or first column)
' an Sindex value of 0 means that the array is one dimensional 3/20/09 Lance Roberts

Dim vtemp() As Variant
Dim i As Integer

On Err GoTo ErrHandler

Select Case Sindex
    Case 0
        If Sfinish - Sstart = UBound(Sarray) - LBound(Sarray) Then
            vtemp = Sarray
        Else
            ReDim vtemp(1 To Sfinish - Sstart + 1)
            For i = 1 To Sfinish - Sstart + 1
                vtemp(i) = Sarray(i + Sstart - 1)
            Next i
        End If
    Case Else
        Select Case Stype
            Case "row"
                If Sfinish = 0 Or (Sstart = LBound(Sarray, 2) And Sfinish = UBound(Sarray, 2)) Then
                    vtemp = Application.WorksheetFunction.Index(Sarray, Sindex, 0)
                Else
                    ReDim vtemp(1 To Sfinish - Sstart + 1)
                    For i = 1 To Sfinish - Sstart + 1
                        vtemp(i) = Sarray(Sindex, i + Sstart - 1)
                    Next i
                End If
            Case "column"
                If Sfinish = 0 Or (Sstart = LBound(Sarray, 1) And Sfinish = UBound(Sarray, 1)) Then
                    vtemp = Application.WorksheetFunction.Index(Sarray, 0, Sindex)
                Else
                    ReDim vtemp(1 To Sfinish - Sstart + 1)
                    For i = 1 To Sfinish - Sstart + 1
                        vtemp(i) = Sarray(i + Sstart - 1, Sindex)
                    Next i
                End If
        End Select
End Select
GetArraySlice2D = vtemp
Exit Function

ErrHandler:
    Dim M As Integer
    M = MsgBox("Bad Array Input", vbOKOnly, "GetArraySlice2D")

End Function

回答by Jody

935,000 lines is a lot for excel. Like, really a lot. Barring saying using a real database, If your application is literally putting a =Percentile(...) in each cell, I would recommend Trying to use another tool for that. Perhaps something within VBA itself. More generally, use something outside of a cell - then store the result value in the cell. There is a lot of overhead in maintaining those formulas that are interdependent on 935k rows of data.

935,000 行对于 excel 来说已经很多了。喜欢,真的很多。除非说使用真实的数据库,如果您的应用程序实际上是在每个单元格中放置一个 =Percentile(...),我会建议尝试使用其他工具。也许是 VBA 本身的东西。更一般地,使用单元格之外的东西 - 然后将结果值存储在单元格中。维护那些与 935k 行数据相互依赖的公式有很多开销。

回答by Phil.Wheeler

I'm not sure if this will directly address your problem, but have you considered using Application.ScreenUpdating = False? Don't forget to set it back to true once your data has processed.

我不确定这是否会直接解决您的问题,但您是否考虑过使用Application.ScreenUpdating = False? 处理完数据后,不要忘记将其设置回 true。