How do I peek at the first two bytes of an InputStream?

Date: 2020-03-06 14:52:06  Source: igfitidea

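This should be simple: I have an InputStream and I want to peek at (not read) the first two bytes, i.e. I want the "current position" of the InputStream to still be 0 after the peek. What is the best and safest way to do this?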

Answer: As I suspected, the solution is to wrap it in a BufferedInputStream, which supports marking. Thanks, Rasmus.

Solutions

For a general InputStream, I would wrap it in a BufferedInputStream and do something like this:

BufferedInputStream bis = new BufferedInputStream(inputStream);
bis.mark(2);
int byte1 = bis.read();
int byte2 = bis.read();
bis.reset();
// note: you must continue using the BufferedInputStream instead of the inputStream
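
The snippet above can be turned into a small self-contained program. This is only a sketch: the class name PeekTwoBytesDemo, the helper peekTwoBytes and the sample bytes are made up for illustration. It loops on read because a single call may return fewer bytes than requested, and it checks markSupported() before relying on mark/reset.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical demo class; not part of the original answer.
class PeekTwoBytesDemo {

    /** Returns up to two leading bytes without consuming them; needs mark/reset support. */
    static byte[] peekTwoBytes(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream does not support mark/reset");
        }
        in.mark(2);
        byte[] buf = new byte[2];
        int n = 0;
        // read(byte[], int, int) may return fewer bytes than requested, so loop
        while (n < buf.length) {
            int r = in.read(buf, n, buf.length - n);
            if (r == -1) {
                break; // the stream holds fewer than two bytes
            }
            n += r;
        }
        in.reset(); // rewind to the marked position
        return Arrays.copyOf(buf, n);
    }

    /** Peeks at sample data, then reads one byte to show that nothing was consumed. */
    static String demo() {
        byte[] data = {(byte) 0x1F, (byte) 0x8B, 0x08, 0x00}; // sample bytes, assumed for illustration
        try (InputStream in = new BufferedInputStream(new ByteArrayInputStream(data))) {
            byte[] head = peekTwoBytes(in);
            int next = in.read();
            return String.format("%02X %02X then %02X", head[0] & 0xFF, head[1] & 0xFF, next);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "1F 8B then 1F"
    }
}
```

The first byte read after the peek is again 0x1F, which shows that the position did not move.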

I found an implementation of a PeekableInputStream here:

http://www.heatonresearch.com/articles/147/page2.html

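The idea of the implementation shown in the article is that it keeps an internal array of "peeked" values. When you call read, values are returned from the peeked array first, and then from the input stream. When you call peek, the values are read and stored in the "peeked" array.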

Since the license of the sample code is LGPL, it can be attached to this post:

package com.heatonresearch.httprecipes.html;

import java.io.*;

/**
 * The Heaton Research Spider Copyright 2007 by Heaton
 * Research, Inc.
 * 
 * HTTP Programming Recipes for Java ISBN: 0-9773206-6-9
 * http://www.heatonresearch.com/articles/series/16/
 * 
 * PeekableInputStream: This is a special input stream that
 * allows the program to peek one or more characters ahead
 * in the file.
 * 
 * This class is released under the:
 * GNU Lesser General Public License (LGPL)
 * http://www.gnu.org/copyleft/lesser.html
 * 
 * @author Jeff Heaton
 * @version 1.1
 */
public class PeekableInputStream extends InputStream
{

  /**
   * The underlying stream.
   */
  private InputStream stream;

  /**
   * Bytes that have been peeked at.
   */
  private byte peekBytes[];

  /**
   * How many bytes have been peeked at.
   */
  private int peekLength;

  /**
   * The constructor accepts an InputStream to setup the
   * object.
   * 
   * @param is
   *          The InputStream to parse.
   */
  public PeekableInputStream(InputStream is)
  {
    this.stream = is;
    this.peekBytes = new byte[10];
    this.peekLength = 0;
  }

  /**
   * Peek at the next character from the stream.
   * 
   * @return The next character.
   * @throws IOException
   *           If an I/O exception occurs.
   */
  public int peek() throws IOException
  {
    return peek(0);
  }

  /**
   * Peek at a specified depth.
   * 
   * @param depth
   *          The depth to check.
   * @return The character peeked at.
   * @throws IOException
   *           If an I/O exception occurs.
   */
  public int peek(int depth) throws IOException
  {
    // does the size of the peek buffer need to be extended?
    if (this.peekBytes.length <= depth)
    {
      byte temp[] = new byte[depth + 10];
      for (int i = 0; i < this.peekBytes.length; i++)
      {
        temp[i] = this.peekBytes[i];
      }
      this.peekBytes = temp;
    }

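    // does more data need to be read?
    // a single read() call may return fewer bytes than requested,
    // so keep reading until the requested depth is buffered
    while (depth >= this.peekLength)
    {
      int offset = this.peekLength;
      int length = (depth - this.peekLength) + 1;
      int lengthRead = this.stream.read(this.peekBytes, offset, length);

      if (lengthRead == -1)
      {
        return -1;
      }

      this.peekLength += lengthRead;
    }

    // mask to 0..255 so that the byte 0xFF is not mistaken for end of stream
    return this.peekBytes[depth] & 0xFF;
  }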
  }

  /**
   * Read a single byte from the stream.
   * 
   * @return The byte that was read from the stream, or -1 at
   *         the end of the stream.
   * @throws IOException
   *           If an I/O exception occurs.
   */
  @Override
  public int read() throws IOException
  {
    if (this.peekLength == 0)
    {
      return this.stream.read();
    }

    // mask to 0..255, as the InputStream.read() contract requires
    int result = this.peekBytes[0] & 0xFF;
    this.peekLength--;
    for (int i = 0; i < this.peekLength; i++)
    {
      this.peekBytes[i] = this.peekBytes[i + 1];
    }

    return result;
  }

}
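
The same peek-buffer idea can also be sketched more compactly. The version below is not the article's code: PeekQueueInputStream is a hypothetical name, and it stores the peeked values in a List exactly as read() returned them (0..255), so no sign handling is needed. Removing from the front of an ArrayList is linear, which is fine for a handful of peeked bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name; a compact sketch, not the article's implementation.
class PeekQueueInputStream extends InputStream {
    private final InputStream in;
    // Peeked values exactly as read() returned them (0..255), oldest first
    private final List<Integer> peeked = new ArrayList<>();

    PeekQueueInputStream(InputStream in) {
        this.in = in;
    }

    /** Returns the byte at the given depth (0 = next byte) without consuming it, or -1 at end of stream. */
    int peek(int depth) throws IOException {
        while (peeked.size() <= depth) {
            int b = in.read();
            if (b == -1) {
                return -1;
            }
            peeked.add(b);
        }
        return peeked.get(depth);
    }

    /** Serves peeked bytes first, then falls through to the underlying stream. */
    @Override
    public int read() throws IOException {
        return peeked.isEmpty() ? in.read() : peeked.remove(0);
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    /** Peeks two bytes, then reads the whole stream to show that the peeked bytes come back. */
    static String demo() {
        byte[] data = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA}; // sample bytes, assumed for illustration
        try (PeekQueueInputStream s = new PeekQueueInputStream(new ByteArrayInputStream(data))) {
            int p0 = s.peek(0);
            int p1 = s.peek(1);
            int r0 = s.read();
            int r1 = s.read();
            int r2 = s.read();
            int end = s.read();
            return String.format("%02X %02X | %02X %02X %02X | %d", p0, p1, r0, r1, r2, end);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "CA FE | CA FE BA | -1"
    }
}
```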

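When using a BufferedInputStream, make sure the inputStream is not already buffered; double buffering will cause some seriously hard-to-find bugs.
You also need to handle Readers differently: converting to a StreamReader and buffering will cause bytes to be lost if the Reader is already buffered.
Also, if you are using a Reader, remember that you are not reading bytes but characters in the default encoding (unless an explicit encoding was set).
An example of a buffered input stream that you may not know about is URL url; url.openStream();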
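
To illustrate the point about characters versus bytes, here is a minimal sketch that peeks at two characters through a BufferedReader with an explicitly chosen charset. The class name PeekTwoCharsDemo, the helper peekTwoChars and the sample text are assumptions made for this example. The text contains one character that takes two bytes in UTF-8, so the character count and the byte count differ.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original answer.
class PeekTwoCharsDemo {

    /** Returns up to two leading characters without consuming them. */
    static String peekTwoChars(BufferedReader reader) throws IOException {
        reader.mark(2); // the limit is counted in characters, not bytes
        char[] buf = new char[2];
        int n = 0;
        while (n < buf.length) {
            int r = reader.read(buf, n, buf.length - n);
            if (r == -1) {
                break; // fewer than two characters available
            }
            n += r;
        }
        reader.reset();
        return new String(buf, 0, n);
    }

    /** Returns the peeked characters, the next character read, and the byte length of the input. */
    static String demo() {
        // "h", e-acute, "llo": 5 characters, but 6 bytes in UTF-8
        byte[] data = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8))) {
            String head = peekTwoChars(reader);
            char next = (char) reader.read();
            return head + "|" + next + "|" + data.length;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Passing the charset to InputStreamReader explicitly avoids depending on the platform default encoding.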

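I do not have any references for this information; it comes from debugging code.
The main case where the problem occurred for me was in code that read from a file into a compressed stream.
If I remember correctly, once you start debugging through the code, there are comments in the Java source saying that certain things do not always work correctly.
I do not remember where the information about using BufferedReader and BufferedInputStream came from, but I think that even the simplest test will not fail immediately.
To test this, remember that you need to mark more than the buffer size (which differs between BufferedReader and BufferedInputStream); the problems occur when the bytes being read reach the end of the buffer.
Note that the buffer size in the source code may differ from the buffer size you set in the constructor.
It has been a while since I did this, so my recollection of the details may be a bit off.
Testing was done using a FilterReader/FilterInputStream, adding one to the direct stream and one to the buffered stream to see the difference.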
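
A test along these lines might look like the sketch below. It is not the answerer's original test: MarkBeyondBufferDemo and CountingInputStream are hypothetical names, and the buffer size (4), the mark limit (16) and the data are chosen only for illustration. A counting FilterInputStream sits under the BufferedInputStream so you can see how many bytes were pulled from the source compared with how many the caller read.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical demo class; a sketch of the kind of test described above.
class MarkBeyondBufferDemo {

    /** Counts how many bytes have been pulled from the wrapped stream. */
    static class CountingInputStream extends FilterInputStream {
        long count;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b != -1) {
                count++;
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) {
                count += n;
            }
            return n;
        }
    }

    /** Reads up to howMany bytes one at a time; returns fewer if the stream ends early. */
    static byte[] readBytes(InputStream in, int howMany) throws IOException {
        byte[] out = new byte[howMany];
        for (int i = 0; i < howMany; i++) {
            int b = in.read();
            if (b == -1) {
                return Arrays.copyOf(out, i);
            }
            out[i] = (byte) b;
        }
        return out;
    }

    /** Marks with a limit larger than the buffer, reads past the buffer size, resets, and re-reads. */
    static boolean replayMatches() {
        byte[] data = new byte[20];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        CountingInputStream source = new CountingInputStream(new ByteArrayInputStream(data));
        try (InputStream in = new BufferedInputStream(source, 4)) { // deliberately tiny buffer
            in.mark(16); // read limit larger than the buffer size
            byte[] first = readBytes(in, 10);
            long pulled = source.count; // bytes taken from the source so far
            in.reset();
            byte[] second = readBytes(in, 10);
            return Arrays.equals(first, second) && pulled >= first.length;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(replayMatches());
    }
}
```

The count taken from the source can be larger than the number of bytes the caller read, because the buffered stream reads ahead.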

You might find PushbackInputStream useful:

http://docs.oracle.com/javase/6/docs/api/java/io/PushbackInputStream.html
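
A minimal sketch of how PushbackInputStream could be used for this: read two bytes, then push them back with unread so that the next read sees them again. The class name PushbackPeekDemo, the helper peekTwoBytes and the sample bytes are assumptions for illustration. The pushback buffer size passed to the constructor must be at least the number of bytes you intend to unread.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical demo class; not part of the original answer.
class PushbackPeekDemo {

    /** Reads up to two bytes and pushes them back, so the stream appears untouched. */
    static byte[] peekTwoBytes(PushbackInputStream in) throws IOException {
        byte[] buf = new byte[2];
        int n = 0;
        while (n < buf.length) {
            int r = in.read(buf, n, buf.length - n);
            if (r == -1) {
                break; // fewer than two bytes available
            }
            n += r;
        }
        if (n > 0) {
            in.unread(buf, 0, n); // needs a pushback buffer of at least n bytes
        }
        return Arrays.copyOf(buf, n);
    }

    /** Peeks at sample data, then reads one byte to show that nothing was consumed. */
    static String demo() {
        byte[] data = {0x50, 0x4B, 0x03, 0x04}; // sample bytes, assumed for illustration
        // the second constructor argument is the pushback buffer size
        try (PushbackInputStream in = new PushbackInputStream(new ByteArrayInputStream(data), 2)) {
            byte[] head = peekTwoBytes(in);
            int next = in.read();
            return String.format("%02X %02X then %02X", head[0] & 0xFF, head[1] & 0xFF, next);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "50 4B then 50"
    }
}
```

As with the BufferedInputStream approach, you must keep reading from the PushbackInputStream afterwards, not from the stream it wraps.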