Notice: this page is derived from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/14699039/

Time: 2020-09-17 12:11:39

Out-of-memory error while reading very large text file in vb.net

.net, vb.net, out-of-memory

Asked by Spacehamster

I've been tasked with processing a 3.2GB fixed-width delimited text file. Each line is 1563 chars long, and there are approximately 2.1 million lines in the text file. After reading about 1 million lines, my program crashes with an out-of-memory exception error.

Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module TestFileCount
    ''' <summary>
    ''' Gets the total number of lines in a text file by reading a line at a time
    ''' </summary>
    ''' <remarks>Crashes when count reaches 1018890</remarks>
    Sub Main()
        Dim inputfile As String = "C:\Split\BIGFILE.txt"
        Dim count As Int32 = 0
        Dim lineoftext As String = ""

        If File.Exists(inputfile) Then
            Dim _read As New StreamReader(inputfile)
            Try
                While (_read.Peek <> -1)
                    lineoftext = _read.ReadLine()
                    count += 1
                End While

                Console.WriteLine("Total Lines in " & inputfile & ": " & count)
            Catch ex As Exception
                Console.WriteLine(ex.Message)
            Finally
                _read.Close()
            End Try
        End If
    End Sub
End Module

It's a pretty straightforward program that reads the text file one line at a time, so I assume it shouldn't take up too much memory in the buffer.

For the life of me, I can't figure out why it's crashing. Does anyone here have any ideas?

Accepted answer by Scott Chamberlain

I don't know if this will fix your problem, but don't use Peek; change your loop to the following (this is C#, but you should be able to translate it to VB):

while (_read.ReadLine() != null)
{
    count += 1;
}

If you need to use the line of text inside the loop instead of just counting lines, modify the code to:

while ((lineoftext = _read.ReadLine()) != null)
{
    count += 1;
    //Do something with lineoftext
}
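
For readers who want the VB version, here is a minimal sketch of the same loop. VB cannot assign inside the loop condition, so the read is done once before the loop and again at the end of each pass. The module name `CountLinesExample` and the `Using` block are illustrative choices, not part of the original answer; the file path is the one from the question.

```vbnet
Imports System.IO

Module CountLinesExample
    Sub Main()
        Dim inputfile As String = "C:\Split\BIGFILE.txt"
        Dim count As Long = 0

        ' Using closes the reader even if an exception is thrown
        Using reader As New StreamReader(inputfile)
            ' ReadLine returns Nothing at end of file, so Peek is not needed
            Dim lineoftext As String = reader.ReadLine()
            While lineoftext IsNot Nothing
                count += 1
                ' Do something with lineoftext here
                lineoftext = reader.ReadLine()
            End While
        End Using

        Console.WriteLine("Total Lines in " & inputfile & ": " & count)
    End Sub
End Module
```

Only one line is held in `lineoftext` at a time, so memory use should stay roughly constant as long as the file really does contain line endings the reader recognises.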


This is somewhat off topic and a bit of a cheat: if each line really is 1563 chars long (including the line ending) and the file is pure ASCII (so every char takes up one byte), you could just do the following (once again C#, but you should be able to translate it):

long bytesPerLine = 1563;
string inputfile = @"C:\Split\BIGFILE.txt"; //The @ symbol is so we don't have to escape the `\`
long length;

using(FileStream stream = File.Open(inputfile, FileMode.Open)) //This is the C# equivalent of the try/finally to close the stream when done.
{
    length = stream.Length;
}

Console.WriteLine("Total Lines in {0}: {1}", inputfile, (length / bytesPerLine));
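
A possible VB translation of the size-based shortcut is sketched below. It rests on the same assumptions as the C# version: every line, including its line ending, occupies exactly 1563 bytes, and the file is pure ASCII. The module name `EstimateLineCount`, the use of `FileInfo` instead of opening a stream, and the remainder check are additions made for illustration.

```vbnet
Imports System.IO

Module EstimateLineCount
    Sub Main()
        ' Assumption: each line, line ending included, is exactly this many bytes
        Const bytesPerLine As Long = 1563
        Dim inputfile As String = "C:\Split\BIGFILE.txt"

        ' FileInfo reports the size from file metadata without reading the contents
        Dim info As New FileInfo(inputfile)
        Dim length As Long = info.Length

        ' A non-zero remainder suggests the fixed-width assumption does not hold
        If length Mod bytesPerLine <> 0 Then
            Console.WriteLine("Warning: file size is not a multiple of " & bytesPerLine & " bytes")
        End If

        ' The \ operator is integer division in VB
        Console.WriteLine("Total Lines in {0}: {1}", inputfile, length \ bytesPerLine)
    End Sub
End Module
```

If the warning fires, the count from this shortcut should not be trusted, and reading the file line by line is the safer route.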

Answer by saysansay

Try using ReadAsync, or you can use DiscardBufferedData (but this is slow).

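Imports System.IO
Imports System.Threading.Tasks
Imports Microsoft.VisualBasic

Module ReadAsyncExample
    Sub Main()
        Dim inputfile As String = "C:\Example\existingfile.txt"
        Try
            ' Block until the asynchronous count completes
            Dim count As Long = CountLinesAsync(inputfile).GetAwaiter().GetResult()
            Console.WriteLine("Total Lines in " & inputfile & ": " & count)
        Catch ex As Exception
            Console.WriteLine(ex.Message)
        End Try
    End Sub

    ' Reads the file in fixed-size chunks (a single array cannot hold a 3.2GB file)
    ' and counts line feeds; a final line with no line ending is not counted
    Async Function CountLinesAsync(filePath As String) As Task(Of Long)
        Dim buffer(65535) As Char
        Dim count As Long = 0
        Using reader As StreamReader = File.OpenText(filePath)
            Dim charsRead As Integer = Await reader.ReadAsync(buffer, 0, buffer.Length)
            While charsRead > 0
                For i As Integer = 0 To charsRead - 1
                    If buffer(i) = ControlChars.Lf Then count += 1
                Next
                charsRead = Await reader.ReadAsync(buffer, 0, buffer.Length)
            End While
        End Using
        Return count
    End Function
End Module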