How to avoid an "Out of Memory" exception when reading large files using File.ReadAllText(x) in VB.NET

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/20074963/

Time: 2020-09-17 15:53:11  Source: igfitidea

Tags: vb.net, visual-studio-2010, visual-studio-2012, memory-leaks

Asked by Fariz Luqman

This is my code for searching every file in every folder on drive "G:\" for the string "hello":

Dim path = "g:\"
Dim fileList As New List(Of String)

' GetAllAccessibleFiles (not shown here) is expected to fill fileList with the files under path
GetAllAccessibleFiles(path, fileList)

' Convert List(Of String) to a string array if you want
Dim files As String() = fileList.ToArray()

For Each s As String In fileList
    ' Loads the entire file into memory as a single string
    Dim text As String = File.ReadAllText(s)
    Dim index As Integer = text.IndexOf("hello")
    If index >= 0 Then
        MsgBox("FOUND!")
        ' String is in file, starting at character "index"
    End If
Next

This code also results in a memory leak/out-of-memory exception, because I read files as large as 5 GB. Presumably it loads the whole file into RAM and only then performs the string check.

' This file is as big as 5 GB!
Dim text As String = File.ReadAllText("C:\Users\Farizluqman\mybigmovie.mp4")
Dim index As Integer = text.IndexOf("hello")
If index >= 0 Then
    MsgBox("FOUND!")
    ' String is in file, starting at character "index"
End If

The problem is that this code is really DANGEROUS: it may lead to a memory leak or use 100% of the RAM. Is there any way to work around this? Perhaps by chunking, that is, reading part of the file and then disposing of it, to avoid the memory leak/out-of-memory condition? Or is there any way to minimize memory usage with this code? I feel responsible for the stability of other people's computers. Please help :)

Answered by varocarbas

You should use System.IO.StreamReader, which reads the file line by line instead of loading all the lines at once (there is a similar post for C#). Personally, I never use the ReadAll*** methods except under very specific conditions. Here is a sample adaptation of your code:

' Character offset of the match (line terminators are not counted), or -1 if not found.
' Long rather than Integer, so the running total cannot overflow on a multi-gigabyte file.
Dim index As Long = -1
Dim charsRead As Long = 0   ' total length of the lines read so far
Using reader As New System.IO.StreamReader("C:\Users\Farizluqman\mybigmovie.mp4")
    ' Only one line is held in memory at a time
    Dim line As String = reader.ReadLine()
    Do While line IsNot Nothing
        Dim pos As Integer = line.IndexOf("hello", StringComparison.Ordinal)
        If pos >= 0 Then
            index = charsRead + pos
            Exit Do
        End If
        charsRead += line.Length
        line = reader.ReadLine()
    Loop
End Using

If index >= 0 Then
    MsgBox("FOUND!")
    ' String is in file, starting at character "index"
End If
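
The question also asks about chunking. Reading line by line keeps memory low only when the file actually contains line breaks. A binary file such as an .mp4 may have very long stretches without one, so a single ReadLine call could still return a very large string. Below is a minimal sketch of a fixed-size chunk search, not part of the original answer. The names FindInFile and chunkSize and the 1 MB default are hypothetical, and it assumes the search text can be matched as single-byte ASCII. It returns a byte offset rather than a character index.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module ChunkedSearch

    ' Hypothetical helper: returns the byte offset of the first occurrence of
    ' needle in the file, or -1 if it is not found.
    ' Assumption: needle is matched as raw ASCII bytes.
    Function FindInFile(filePath As String, needle As String,
                        Optional chunkSize As Integer = 1024 * 1024) As Long
        Dim pattern As Byte() = Encoding.ASCII.GetBytes(needle)
        If pattern.Length = 0 Then Return 0

        Dim overlap As Integer = pattern.Length - 1
        ' Buffer layout: [tail carried over from the previous chunk][newly read bytes]
        Dim buffer(chunkSize + overlap - 1) As Byte
        Dim carried As Integer = 0     ' bytes carried over at the front of the buffer
        Dim bufferStart As Long = 0    ' file offset that corresponds to buffer(0)

        Using stream As New FileStream(filePath, FileMode.Open, FileAccess.Read)
            Do
                Dim bytesRead As Integer = stream.Read(buffer, carried, chunkSize)
                If bytesRead = 0 Then Exit Do
                Dim valid As Integer = carried + bytesRead

                ' Naive byte-by-byte scan of the valid part of the buffer
                For i As Integer = 0 To valid - pattern.Length
                    Dim j As Integer = 0
                    While j < pattern.Length AndAlso buffer(i + j) = pattern(j)
                        j += 1
                    End While
                    If j = pattern.Length Then Return bufferStart + i
                Next

                ' Keep the last (pattern.Length - 1) bytes so that a match
                ' spanning two chunks is not missed
                Dim keep As Integer = Math.Min(overlap, valid)
                If keep > 0 Then Array.Copy(buffer, valid - keep, buffer, 0, keep)
                bufferStart += valid - keep
                carried = keep
            Loop
        End Using

        Return -1
    End Function

End Module
```

With this sketch, a call such as FindInFile("C:\Users\Farizluqman\mybigmovie.mp4", "hello") holds roughly one chunk in memory at a time, regardless of the file size. The overlap between chunks is the important design choice: without it, an occurrence of "hello" split across a chunk boundary would go undetected.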