Java: BufferedReader for a large ByteBuffer?

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/1045632/

Date: 2020-08-11 22:42:06  Source: igfitidea

BufferedReader for large ByteBuffer?

java nio bufferedreader bytebuffer

Asked by Rob

Is there a way to read a ByteBuffer with a BufferedReader without having to turn it into a String first? I want to read through a fairly large ByteBuffer as lines of text and for performance reasons I want to avoid writing it to the disk. Calling toString on the ByteBuffer doesn't work because the resulting String is too large (it throws java.lang.OutOfMemoryError: Java heap space). I would have thought there would be something in the API to wrap a ByteBuffer in a suitable reader, but I can't seem to find anything suitable.

Here's an abbreviated code sample that illustrates what I am doing:

// input stream is from Process getInputStream()
public String read(InputStream istream) throws IOException
{
  ReadableByteChannel source = Channels.newChannel(istream);
  ByteArrayOutputStream ostream = new ByteArrayOutputStream(bufferSize);
  WritableByteChannel destination = Channels.newChannel(ostream);
  ByteBuffer buffer = ByteBuffer.allocateDirect(writeBufferSize);

  while (source.read(buffer) != -1)
  {
    buffer.flip();
    while (buffer.hasRemaining())
    {
      destination.write(buffer);
    }
    buffer.clear();
  }

  // this data can be up to 150 MB.. won't fit in a String.
  String result = ostream.toString();
  source.close();
  destination.close();
  return result;
}

// after the process is run, we call this method with the String
public void readLines(String text) throws IOException
{
  BufferedReader reader = new BufferedReader(new StringReader(text));
  String line;

  while ((line = reader.readLine()) != null)
  {
    // do stuff with line
  }
}

Accepted answer by Jon Skeet

It's not clear why you're using a byte buffer to start with. If you've got an InputStream and you want to read lines from it, why don't you just use an InputStreamReader wrapped in a BufferedReader? What's the benefit in getting NIO involved?
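
Reading lines straight from the stream might look like the following minimal sketch. The class name DirectLineReader, the method name countLines, and the choice of UTF-8 are assumptions made for illustration; the question does not say which encoding the process output uses.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original answer.
class DirectLineReader {

    // Reads the stream line by line, so the full content is never held in memory.
    // UTF-8 is an assumption; use whatever encoding the process actually emits.
    static int countLines(InputStream istream) throws IOException {
        int lines = 0;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(istream, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // do stuff with line
                lines++;
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the stream returned by Process.getInputStream().
        InputStream in = new ByteArrayInputStream(
                "first\nsecond\nthird\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(countLines(in)); // prints 3
    }
}
```

Only the BufferedReader's internal buffer and the current line are live at any time, which is why this approach sidesteps the heap problem from the question.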

Calling toString() on a ByteArrayOutputStream sounds like a bad idea to me even if you had the space for it: better to get it as a byte array and wrap it in a ByteArrayInputStream and then an InputStreamReader, if you really have to have a ByteArrayOutputStream. If you really want to call toString(), at least use the overload which takes the name of the character encoding to use - otherwise it'll use the system default, which probably isn't what you want.
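
As a rough sketch of the byte-array route with an explicit encoding: the class ByteArrayLineReader and its forEachLine method are hypothetical names, and passing the charset as a parameter is an assumption about how a caller would want to supply it.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Hypothetical helper, not part of the original answer.
class ByteArrayLineReader {

    // Decodes the bytes with the given charset and hands each line to the callback.
    static void forEachLine(byte[] data, Charset charset, Consumer<String> action)
            throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                action.accept(line);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Non-ASCII text, written with escapes, to show that the charset matters.
        byte[] data = "caf\u00e9\nna\u00efve\n".getBytes(StandardCharsets.UTF_8);
        forEachLine(data, StandardCharsets.UTF_8, System.out::println);
    }
}
```

Naming the charset at the reader keeps the decoding independent of the platform default, which is the concern raised about the no-argument toString().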

EDIT: Okay, so you really want to use NIO. You're still writing to a ByteArrayOutputStream eventually, so you'll end up with a BAOS with the data in it. If you want to avoid making a copy of that data, you'll need to derive from ByteArrayOutputStream, for instance like this:

public class ReadableByteArrayOutputStream extends ByteArrayOutputStream
{
    /**
     * Converts the data in the current stream into a ByteArrayInputStream.
     * The resulting stream wraps the existing byte array directly;
     * further writes to this output stream will result in unpredictable
     * behavior.
     */
    public InputStream toInputStream()
    {
        // buf and count are the protected fields inherited from ByteArrayOutputStream
        return new ByteArrayInputStream(buf, 0, count);
    }
}

Then you can create the input stream, wrap it in an InputStreamReader, wrap that in a BufferedReader, and you're away.
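
Put together, that chain could look roughly like this. ByteArrayOutputStream exposes its backing array through the protected fields buf and count, which is what the subclass relies on. The demo class ReadableBaosDemo, the method countLines, and the UTF-8 encoding are assumptions for illustration only.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class ReadableByteArrayOutputStream extends ByteArrayOutputStream {
    // Wraps the existing backing array without copying it.
    // Do not write to this output stream after calling this method.
    synchronized InputStream toInputStream() {
        return new ByteArrayInputStream(buf, 0, count);
    }
}

// Hypothetical demo class, not part of the original answer.
class ReadableBaosDemo {

    // InputStream -> InputStreamReader -> BufferedReader, as described above.
    static int countLines(ReadableByteArrayOutputStream ostream) throws IOException {
        int lines = 0;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(ostream.toInputStream(), StandardCharsets.UTF_8))) {
            while (reader.readLine() != null) {
                lines++;
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        ReadableByteArrayOutputStream ostream = new ReadableByteArrayOutputStream();
        byte[] bytes = "x\ny\n".getBytes(StandardCharsets.UTF_8);
        ostream.write(bytes, 0, bytes.length);
        System.out.println(countLines(ostream)); // prints 2
    }
}
```

The saving compared with toByteArray() is one full copy of the data, which matters at the 150 MB scale mentioned in the question.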

Answer by Matthew Flaschen

You can use NIO, but there's no real need here. As Jon Skeet suggested:

public byte[] read(InputStream istream) throws IOException
{
  ByteArrayOutputStream baos = new ByteArrayOutputStream();
  byte[] buffer = new byte[1024]; // Experiment with this value
  int bytesRead;

  while ((bytesRead = istream.read(buffer)) != -1)
  {
    baos.write(buffer, 0, bytesRead);
  }

  return baos.toByteArray();
}


// after the process is run, we call this method with the byte array
public void readLines(byte[] data) throws IOException
{
  BufferedReader reader = new BufferedReader(new InputStreamReader(new ByteArrayInputStream(data)));
  String line;

  while ((line = reader.readLine()) != null)
  {
    // do stuff with line
  }
}

Answer by user1079877

This is a sample:

public class ByteBufferBackedInputStream extends InputStream {

    ByteBuffer buf;

    public ByteBufferBackedInputStream(ByteBuffer buf) {
        this.buf = buf;
    }

    @Override
    public synchronized int read() throws IOException {
        if (!buf.hasRemaining()) {
            return -1;
        }
        return buf.get() & 0xFF;
    }

    @Override
    public int available() throws IOException {
        return buf.remaining();
    }

    @Override
    public synchronized int read(byte[] bytes, int off, int len) throws IOException {
        if (!buf.hasRemaining()) {
            return -1;
        }

        len = Math.min(len, buf.remaining());
        buf.get(bytes, off, len);
        return len;
    }
}

And you can use it like this:

    String text = "this is text";   // It can be Unicode text
    ByteBuffer buffer = ByteBuffer.wrap(text.getBytes("UTF-8"));

    InputStream is = new ByteBufferBackedInputStream(buffer);
    InputStreamReader r = new InputStreamReader(is, "UTF-8");
    BufferedReader br = new BufferedReader(r);