Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me), citing the original address. StackOverflow original: http://stackoverflow.com/questions/18387447/

Date: 2020-09-11 22:54:40  Source: igfitidea

VBA, File System Object, speed/advantages/disadvantages

Tags: performance, file, vba, networking, system

Asked by Finch042

This turned into a rather long post, and there's not really an "answer" per se. I'm looking for an explanation rather than a silver bullet to fix the problem. As such, any aspect you'd like to answer would be much appreciated. Thanks in advance!

I'm running into what may be a "problem" with the File System Object, and that's led to a question about how the File System Object in VBA works compared with "something else" in .NET etc. (I don't know if there's an alternative I can use in Excel for what I'm doing). I don't know of a better place to ask, and I'm not sure what to look into to research it for myself. So here I am!

So! To the problem. The short explanation is that I iterate through folders, gathering file information (name, extension, full path, etc.) and place it into a spreadsheet. I eventually use this information to copy the files to a new location. On a large scale (1,000+ files) this works just fine locally, but it is considerably slower on a network location (at work). It will chew through about 1,500 files, wait a while, do 1,500 more, and so on, whether it is listing or copying the files. Again, this is not the case when done locally, where it just runs through without issue, so I assume it probably has nothing to do with my code. It's almost as if the network is opening and closing a gate intermittently.

Alternatively, using other programs from an end user perspective (I tried it against the same files I was using with my program, on our work network) it is MUCH faster without any of the aforementioned delays. I'm assuming the alternative program is using some version of .net, if it matters. Long story short, I don't think I can inherently blame our network for the speed issues I'm running into.

So my question/curiosity/issue comes down to a few key points:

-What's the difference between the FSO in VBA and the default libraries in .NET, and could that difference be the cause of the issue I'm running into? Clearly it's possible to read this sort of data much more quickly than my code is doing it.

-Is the FSO not intended to be used this way (over a network, with large amounts of remote data, or... ?)? Is it just dated/outmoded? And is there an alternative that can be used through VBA?

-I only nebulously understand that our network functions in a different way than a local drive. It stores many terabytes of data, etc., and I'm not sure what the difference is at a very deep level between accessing a local drive and a network location. I know I'm not giving details on the network that would probably be very beneficial in diagnosis; unfortunately, I just don't have the information. I guess I'd just ask whether it is "potentially" an explanation that using the FSO in this way with some or all sorts of networks is just not how it's meant to be used. Is it possible that the network is set up in a way that limits how I'm trying to interact with it?

-Even though I haven't run into any issues doing this locally, is it possible that something in my code is much more taxing to a network location vs. a local drive?

Thanks for any insight you can provide.

Accepted answer by pstraton

Finch042 acknowledges that he is only "nebulous" about the specifics of what is different when accessing a network server's file system vs. a local file system, and that his question is really about the relative speed difference between those two circumstances. All of the other posts here assume that the issue is with his design choices and/or coding techniques, but I think the underlying question has gone unanswered: why is it that network file operations can be so much slower?

The short answer is that a networked file system is on a different computer's disc at the end of a LAN cable (or, worse, a Wi-Fi signal), and such intermediary technology is much more limited in its data-transfer bandwidth than the electronics between a computer's processor and its local disc. It is true that modern LAN capacities are, relative to the stone age, blindingly fast, but they are still way, way slower than the disc-interface electronics on a PC's motherboard. So you will always experience some level of performance degradation when accessing remote files.

Furthermore, many modern server farm systems may include mirroring (i.e. storage redundancy) for data-integrity maintenance and may also include automatic version-backup capabilities, both of which can add access time to some server operations, especially when writing new files or updating existing ones.

As for the fluctuations in the data transfer rates to/from the server, which Finch042 describes as an apparent "gating" of the data flow: whenever you are using a common-access technology, such as LAN systems and shared servers, you are usually competing with others who are trying to do similar stuff. For example, LAN technologies such as traditional Ethernet actually allow the various users to stomp all over each other's transmission attempts and, when that does result in a failed attempt, the sender retries until it succeeds. It is a design that trades a (usually) minor loss in throughput speed for simplicity and, thereby, ultimate overall reliability. But when the demand on the network is high, it can result in a dramatic degradation in throughput for all users.

Similarly, a file server has a limited capacity to service file-system access requests and it, too, can become overloaded at times of high demand.

I suspect that Finch042's experience is likely related to those kinds of issues, especially if his organization's network and server system grew incrementally, and therefore in a non-optimized way, over a long time, and/or is at or near its capacity limit. And his experience of inconsistent data-transfer rates is likely just the ebb and flow of demand on the common, shared network/server systems.

Also, be aware that virus protection systems can interfere with file access speeds, especially for network server files.
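
One way to see whether the pauses come from the network rather than from the spreadsheet code is to time the bare directory listing in batches. The following is only a rough sketch, not part of the original answer: the procedure name, the batch size of 500 and the folder path passed in are all assumptions, and Timer resets at midnight, so a run that crosses midnight will print a misleading value.

```vba
Option Explicit

' Sketch: print how long each batch of 500 directory entries takes to list.
' No worksheet is touched, so any stall seen here is not caused by Excel writes.
Sub TimeListingInBatches(ByVal folderPath As String)
    Const BATCH_SIZE As Long = 500      ' assumed batch size, adjust freely
    Dim entry As String
    Dim n As Long
    Dim t As Single

    If Right$(folderPath, 1) <> "\" Then folderPath = folderPath & "\"

    t = Timer
    entry = Dir(folderPath & "*")       ' first file in the folder
    Do While Len(entry) > 0
        n = n + 1
        If n Mod BATCH_SIZE = 0 Then
            Debug.Print n & " files, last batch took " & Format$(Timer - t, "0.00") & " s"
            t = Timer
        End If
        entry = Dir()                   ' next file
    Loop
    Debug.Print "Total files listed: " & n
End Sub
```

If the batch times printed in the Immediate window jump around on the network share but stay flat on a local folder, that is at least consistent with the shared-demand explanation above.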

Answer by Andy G

(I'm posting as an answer as the following is too long for a comment.)

I get the impression you might be feeding values into Excel cells one at a time, or maybe a row at a time. I would use an array, Dim arr(100, 4) As String, fill it with values, then fill a large range in one go: Range("A1:E101") = arr. I would experiment with the size of 100, as I suspect it could be much larger. In preference to FSO I would use the VBA methods Dir, FileCopy and Kill, only using FSO where necessary.
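
The array approach could look roughly like the sketch below. It is an illustration only: the procedure name, the three columns chosen and the late-bound FileSystemObject are assumptions, and it sizes the array from the folder's file count rather than using a fixed block of 100 rows.

```vba
Option Explicit

' Sketch: gather file details in memory, then write them to the sheet
' with a single Range assignment instead of one cell at a time.
Sub ListFilesViaArray(ByVal folderPath As String, ByVal target As Worksheet)
    Dim fso As Object, fld As Object, fil As Object
    Dim arr() As Variant
    Dim n As Long, i As Long

    Set fso = CreateObject("Scripting.FileSystemObject")
    Set fld = fso.GetFolder(folderPath)

    n = fld.Files.Count
    If n = 0 Then Exit Sub

    ReDim arr(1 To n, 1 To 3)
    For Each fil In fld.Files
        i = i + 1
        arr(i, 1) = fil.Name
        arr(i, 2) = fso.GetExtensionName(fil.Path)
        arr(i, 3) = fil.Path
        If i = n Then Exit For          ' guard in case files were added meanwhile
    Next fil

    ' One write for all rows actually filled.
    If i > 0 Then target.Range("A1").Resize(i, 3).Value = arr
End Sub
```

It could be called as ListFilesViaArray "C:\Temp", ActiveSheet, where the path is just a placeholder.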

VB.NET has a number of other options, such as Lists (of a Class, perhaps), an in-memory Stream, or a StringBuilder. However, if Excel Interop is still needed, then the advantage of these approaches may be lost. In that case I might consider writing to a csv file, which can be opened directly by Excel. Excel Interop could still be used, but I would write to the csv and then open it (as a single statement) in Excel.
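
The same write-a-csv-then-open-it idea can also be sketched in plain VBA rather than VB.NET. This is a swapped-in illustration, not the answer's own code: the procedure name and both parameters are hypothetical, and VBA's text file output is ANSI, so file names with characters outside the system code page may not survive.

```vba
Option Explicit

' Sketch: write one path per line to a csv file, then open it in Excel.
Sub WritePathsToCsv(ByVal paths As Collection, ByVal csvPath As String)
    Dim f As Integer
    Dim p As Variant

    f = FreeFile
    Open csvPath For Output As #f
    For Each p In paths
        Write #f, CStr(p)       ' Write # wraps strings in quotes, so commas in paths are safe
    Next p
    Close #f

    Workbooks.Open csvPath      ' Excel parses the csv in a single call
End Sub
```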

Logically, I assume it would be more efficient to create this text-file in the same location as the network files, then move it afterwards - but someone might correct this assumption.

Answer by Graham Anderson

What do you mean by fast? For 1500 files on a network, I think the following implementation using FSO isn't too slow, but how fast were you hoping for?

Sub TestBuildFileStructure()
' Call to test GetFiles function.

Const sDIRECTORYTOCHECK As String = <enter path to check from as string>

Dim varItem         As Variant
Dim wkbOutputFile   As Workbook
Dim shtOutputSheet  As Worksheet
Dim sDate           As String
Dim sPath           As String
Dim lRowNumber      As Long
Dim vSplit          As Variant

sPath = ThisWorkbook.Path

sDate = CStr(Now)
vSplit = Split(sDate, "/")
sDate = vSplit(0) & vSplit(1) & vSplit(2)
vSplit = Split(sDate, ":")
sDate = vSplit(0) & vSplit(1) & vSplit(2)

sDate = "Check " & sDate

Set wkbOutputFile = Workbooks.Add
'wkbOutputFile.Name = sDate
Set shtOutputSheet = wkbOutputFile.Sheets.Add
shtOutputSheet.Name = "Output"

lRowNumber = 1


Call BuildFileStructure(sDIRECTORYTOCHECK, shtOutputSheet, lRowNumber, True)

wkbOutputFile.SaveAs (sPath & "\" & sDate)



Cleanup:

Set shtOutputSheet = Nothing
Set wkbOutputFile = Nothing

End Sub

Function BuildFileStructure(ByVal strPath As String, _
                ByRef shtOutputSheet As Worksheet, _
                ByRef lRowNumber As Long, _
                Optional ByVal blnRecursive As Boolean) As Boolean

   ' This procedure returns all the files in a directory into
   ' an excel file. If called recursively, it also returns
   ' all files in subfolders.

    Const iNAMECOLUMN As Integer = 1

    Dim fsoSysObj       As FileSystemObject
    Dim fdrFolder       As Folder
    Dim fdrSubFolder    As Folder
    Dim filFile         As File

    ' Return new FileSystemObject.
    Set fsoSysObj = New FileSystemObject

    On Error Resume Next
    ' Get folder.
    Set fdrFolder = fsoSysObj.GetFolder(strPath)

    If Err <> 0 Then
      ' Incorrect path.
        BuildFileStructure = False
        GoTo BuildFileStructure_End
    End If
    On Error GoTo 0

    ' Loop through Files collection, writing each file path to the output sheet.
    For Each filFile In fdrFolder.Files
        shtOutputSheet.Cells(lRowNumber, iNAMECOLUMN).Value = filFile.Path
        lRowNumber = lRowNumber + 1
    Next filFile

    ' If Recursive flag is true, call recursively.
    If blnRecursive Then
        For Each fdrSubFolder In fdrFolder.SubFolders
            Call BuildFileStructure(fdrSubFolder.Path, shtOutputSheet, lRowNumber, True)
        Next fdrSubFolder
    End If

    ' Return True if no error occurred.
    BuildFileStructure = True

BuildFileStructure_End:
    Set fdrSubFolder = Nothing
    Set fdrFolder = Nothing
    Set filFile = Nothing
    Set fsoSysObj = Nothing

    Exit Function
End Function

Answer by kpark

Instead of using FSO, I would use Dir() if I wanted more speed.
However, it is not as fail-safe, so you would need to run a couple of tests and make sure it works in all cases.
For example, you may need to check each parent folder individually to make sure it exists.

Anyway, Dir() should be faster because it is a native function.
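
A Dir()-only listing might be sketched as follows. The function name is hypothetical and this is not the answer's own code. Because Dir() keeps a single search state, subfolders are queued in a Collection and scanned later rather than recursed into mid-scan. Note that each GetAttr call is an extra file system request, and it can raise an error on entries it cannot read, so whether this beats FSO on a given network would need measuring.

```vba
Option Explicit

' Sketch: return the full path of every file under rootPath using Dir().
Function ListFilesWithDir(ByVal rootPath As String) As Collection
    Dim pending As New Collection       ' folders still to scan
    Dim found As New Collection         ' full paths of files found
    Dim currentFolder As String
    Dim entry As String

    If Right$(rootPath, 1) <> "\" Then rootPath = rootPath & "\"
    pending.Add rootPath

    Do While pending.Count > 0
        currentFolder = pending(1)
        pending.Remove 1

        ' vbDirectory makes Dir() return subfolders as well as files.
        entry = Dir(currentFolder & "*", vbDirectory Or vbHidden Or vbSystem Or vbReadOnly)
        Do While Len(entry) > 0
            If entry <> "." And entry <> ".." Then
                If (GetAttr(currentFolder & entry) And vbDirectory) = vbDirectory Then
                    pending.Add currentFolder & entry & "\"    ' scan later
                Else
                    found.Add currentFolder & entry
                End If
            End If
            entry = Dir()               ' next entry in the same folder
        Loop
    Loop

    Set ListFilesWithDir = found
End Function
```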

Another way of solving this would be to use a batch file (if you're on Windows, of course!) or the command line to simply copy the files from one location to another. You should see a dramatic increase in speed, and you don't need to worry about checking every single subfolder for existence!
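
As a simpler illustration of handing the copy to the command line, the sketch below calls robocopy through WScript.Shell. This is an assumed alternative, not what the answer itself uses; the function name and both folder arguments are hypothetical. Paths should be passed without a trailing backslash, since a backslash before the closing quote confuses robocopy's argument parsing.

```vba
Option Explicit

' Sketch: copy a whole folder tree with robocopy and wait for it to finish.
' Returns robocopy's exit code; values below 8 mean no copy failures.
Function CopyTreeWithRobocopy(ByVal sourceDir As String, ByVal destDir As String) As Long
    Dim sh As Object
    Dim cmd As String

    Set sh = CreateObject("WScript.Shell")

    ' /E copies subfolders, including empty ones.
    cmd = "robocopy """ & sourceDir & """ """ & destDir & """ /E"

    ' 0 = hidden window, True = wait for exit; Run then returns the exit code.
    CopyTreeWithRobocopy = sh.Run(cmd, 0, True)
End Function
```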

I happen to have some VBA code that uses the Windows command line to do what I want. I got it from the internet but tweaked some of the error handling to suit what I wanted to do:

Option Explicit
Option Base 0
Option Compare Text

Private Type SECURITY_ATTRIBUTES
    nLength As Long
    lpSecurityDescriptor As Long
    bInheritHandle As Long
End Type

Private Type PROCESS_INFORMATION
    hProcess As Long
    hThread As Long
    dwProcessId As Long
    dwThreadId As Long
End Type

Private Type STARTUPINFO
    cb As Long
    lpReserved As Long
    lpDesktop As Long
    lpTitle As Long
    dwX As Long
    dwY As Long
    dwXSize As Long
    dwYSize As Long
    dwXCountChars As Long
    dwYCountChars As Long
    dwFillAttribute As Long
    dwFlags As Long
    wShowWindow As Integer
    cbReserved2 As Integer
    lpReserved2 As Long
    hStdInput As Long
    hStdOutput As Long
    hStdError As Long
End Type

Private Const WAIT_INFINITE         As Long = (-1&)
Private Const STARTF_USESHOWWINDOW  As Long = &H1
Private Const STARTF_USESTDHANDLES  As Long = &H100
Private Const SW_HIDE               As Long = 0&

Private Declare Function CreatePipe Lib "kernel32" (phReadPipe As Long, phWritePipe As Long, lpPipeAttributes As SECURITY_ATTRIBUTES, ByVal nSize As Long) As Long
Private Declare Function CreateProcess Lib "kernel32" Alias "CreateProcessA" (ByVal lpApplicationName As Long, ByVal lpCommandLine As String, lpProcessAttributes As Any, lpThreadAttributes As Any, ByVal bInheritHandles As Long, ByVal dwCreationFlags As Long, lpEnvironment As Any, ByVal lpCurrentDriectory As String, lpStartupInfo As STARTUPINFO, lpProcessInformation As PROCESS_INFORMATION) As Long
Private Declare Function ReadFile Lib "kernel32" (ByVal hFile As Long, lpBuffer As Any, ByVal nNumberOfBytesToRead As Long, lpNumberOfBytesRead As Long, lpOverlapped As Any) As Long
Private Declare Function CloseHandle Lib "kernel32" (ByVal hObject As Long) As Long
Private Declare Function WaitForSingleObject Lib "kernel32" (ByVal hHandle As Long, ByVal dwMilliseconds As Long) As Long
Private Declare Function GetExitCodeProcess Lib "kernel32" (ByVal hProcess As Long, lpExitCode As Long) As Long
Private Declare Sub GetStartupInfo Lib "kernel32" Alias "GetStartupInfoA" (lpStartupInfo As STARTUPINFO)
Private Declare Function GetFileSize Lib "kernel32" (ByVal hFile As Long, lpFileSizeHigh As Long) As Long

Public Function Redirect(szBinaryPath As String, szCommandLn As String) As String

Dim tSA_CreatePipe              As SECURITY_ATTRIBUTES
Dim tSA_CreateProcessPrc        As SECURITY_ATTRIBUTES
Dim tSA_CreateProcessThrd       As SECURITY_ATTRIBUTES
Dim tSA_CreateProcessPrcInfo    As PROCESS_INFORMATION
Dim tStartupInfo                As STARTUPINFO
Dim hRead                       As Long
Dim hWrite                      As Long
Dim bRead                       As Long
Dim abytBuff()                  As Byte
Dim lngResult                   As Long
Dim szFullCommand               As String
Dim lngExitCode                 As Long
Dim lngSizeOf                   As Long

tSA_CreatePipe.nLength = Len(tSA_CreatePipe)
tSA_CreatePipe.lpSecurityDescriptor = 0&
tSA_CreatePipe.bInheritHandle = True

tSA_CreateProcessPrc.nLength = Len(tSA_CreateProcessPrc)
tSA_CreateProcessThrd.nLength = Len(tSA_CreateProcessThrd)

If (CreatePipe(hRead, hWrite, tSA_CreatePipe, 0&) <> 0&) Then
    tStartupInfo.cb = Len(tStartupInfo)
    GetStartupInfo tStartupInfo

    With tStartupInfo
        .hStdOutput = hWrite
        .hStdError = hWrite
        .dwFlags = STARTF_USESHOWWINDOW Or STARTF_USESTDHANDLES
        .wShowWindow = SW_HIDE
    End With

    szFullCommand = """" & szBinaryPath & """" & " " & szCommandLn
    lngResult = CreateProcess(0&, szFullCommand, tSA_CreateProcessPrc, tSA_CreateProcessThrd, True, 0&, 0&, vbNullString, tStartupInfo, tSA_CreateProcessPrcInfo)

    If (lngResult <> 0&) Then
        lngResult = WaitForSingleObject(tSA_CreateProcessPrcInfo.hProcess, WAIT_INFINITE)
        lngSizeOf = GetFileSize(hRead, 0&)
        If (lngSizeOf > 0) Then
            ReDim abytBuff(lngSizeOf - 1)
            If ReadFile(hRead, abytBuff(0), UBound(abytBuff) + 1, bRead, ByVal 0&) Then
                Redirect = StrConv(abytBuff, vbUnicode)
            End If
        End If
        Call GetExitCodeProcess(tSA_CreateProcessPrcInfo.hProcess, lngExitCode)
        CloseHandle tSA_CreateProcessPrcInfo.hThread
        CloseHandle tSA_CreateProcessPrcInfo.hProcess

        'If (lngExitCode <> 0&) Then Err.Raise vbObject + 1235&, "GetExitCodeProcess", "Non-zero Application exit code"

        CloseHandle hWrite
        CloseHandle hRead
    Else
        Err.Raise vbObject + 1236&, "CreateProcess", "CreateProcess Failed, Code: " & Err.LastDllError
    End If
End If
End Function

You would use the command line through
resp = Redirect("cmd", strCmd)
where cmd is equivalent to pressing Windows + R, and strCmd is the string you input into that Run prompt.
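
A call could look like the sketch below, which relies on the Redirect function listed above. The procedure name and both file paths are placeholders. The /c switch is an addition here: it tells cmd to run the command and then exit, which Redirect needs because it waits for the process to finish. Since Redirect reads the pipe only after the process exits, a command that prints a great deal of output may fill the pipe and stall, so this is probably best kept to commands with short output.

```vba
' Sketch: copy one file through cmd and show whatever cmd printed.
Sub CopyOneFileViaCmd()
    Dim strCmd As String
    Dim resp As String

    ' Hypothetical paths; /y overwrites the destination without prompting.
    strCmd = "/c copy /y ""C:\Temp\source.txt"" ""C:\Temp\dest.txt"""

    resp = Redirect("cmd", strCmd)
    Debug.Print resp
End Sub
```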

To further answer your question about the difference in performance between local drives and network drives, working with network drives will always be slower in any type of code. The background code that runs when we access network drive is complex but I don't know the specifics.

Hope it helps,
Cheers,
kpark
