Java DataInputStream and readLine() with UTF8

Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/6370808/

Date: 2020-10-30 15:37:51  Source: igfitidea

Tags: java, utf-8

Asked by martin

I'm having trouble sending a UTF8 string from a C socket to a Java socket. The following approach works fine:

BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream(), "UTF8"));
main.title = in.readLine();

But then I need an int java.io.InputStream.read(byte[] b, int offset, int length) method, which does not exist for a BufferedReader. So I tried a DataInputStream instead:

DataInputStream in2 = new DataInputStream(socket.getInputStream());

but everything it reads is just rubbish.

Then I tried to use the readLine() method from DataInputStream, but this doesn't give me the correct UTF8 string.

You see my dilemma. Can't I use two readers for one InputStream? Or can I take the DataInputStream.readLine() result and convert it to UTF8?

Thanks, Martin

Answered by McDowell

We know from the design of the UTF-8 encoding that the only usage of the value 0x0A is the LINE FEED ('\n'). Therefore, you can read until you hit it:

  /** Reads UTF-8 character data; lines are terminated with '\n' */
  public static String readLine(InputStream in) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    while (true) {
      int b = in.read();
      if (b < 0) {
        throw new IOException("Data truncated");
      }
      if (b == 0x0A) {
        break;
      }
      buffer.write(b);
    }
    return new String(buffer.toByteArray(), "UTF-8");
  }

I am making the assumption that your protocol uses \n as a line terminator. If it doesn't - well, it is generally useful to point out the constraints you're writing to.
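
To illustrate how this fits the original problem, here is a minimal sketch that reads one UTF-8 line and then raw bytes from the same stream. The class name LineThenBytesDemo, the sample message, and the use of a ByteArrayInputStream in place of socket.getInputStream() are assumptions made for the example; readLine follows the same byte-by-byte approach as above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class LineThenBytesDemo {

    /** Collects bytes up to '\n' (0x0A), then decodes them as UTF-8. */
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != 0x0A) {
            if (b < 0) {
                throw new IOException("Data truncated");
            }
            buffer.write(b);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Made-up message: a UTF-8 title line followed by three raw payload bytes.
        ByteArrayOutputStream message = new ByteArrayOutputStream();
        message.write("Gr\u00fc\u00dfe\n".getBytes(StandardCharsets.UTF_8));
        message.write(new byte[] {1, 2, 3});

        // Stand-in for socket.getInputStream(); no buffering wrapper is involved.
        InputStream in = new ByteArrayInputStream(message.toByteArray());

        String title = readLine(in);
        byte[] payload = new byte[3];
        int n = in.read(payload, 0, payload.length);

        System.out.println(title);
        System.out.println(n + " payload bytes read");
    }
}
```

Because readLine consumes exactly the bytes of the line and nothing more, the following read(byte[], int, int) starts at the first payload byte. On a real socket, read may return fewer bytes than requested, so a loop (or DataInputStream.readFully) would be needed there.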

Answered by welcomb

Do NOT use BufferedReader and DataInputStream on the same InputStream!! I did that and spent days trying to figure out why my code broke. BufferedReader can read more into its buffer than what you extract from it, so the data I was supposed to read with the DataInputStream was already "in the BufferedReader". The DataInputStream never saw that data, which caused my program to "hang" waiting for it to arrive.
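
The read-ahead can be made visible with a small sketch. The class and method names below are made up for the illustration, and a ByteArrayInputStream stands in for the socket stream; it counts how many bytes remain in the raw stream after a BufferedReader has returned a single line.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class BufferedReaderReadAheadDemo {

    /** Returns how many bytes are left in the raw stream after BufferedReader reads one line. */
    static int bytesLeftAfterReadLine(byte[] data) throws IOException {
        InputStream raw = new ByteArrayInputStream(data);
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(raw, StandardCharsets.UTF_8));
        reader.readLine();       // asks for one line only ...
        return raw.available();  // ... but the reader may already have pulled in more
    }

    public static void main(String[] args) throws IOException {
        // Made-up message: a short text line followed by data meant for another reader.
        byte[] data = "title\nBINARY-PAYLOAD".getBytes(StandardCharsets.UTF_8);
        System.out.println("bytes left for a second reader: " + bytesLeftAfterReadLine(data));
    }
}
```

With a message this small, the reader's internal buffer swallows the payload together with the line, so nothing is left for a DataInputStream wrapped around the same raw stream.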

Answered by AlexR

I believe that you should not mix the BufferedReader and DataInputStream here. DataInputStream has readLine() too, so use it. One more comment: I am not sure it is a problem, but avoid multiple calls of socket.getInputStream(). Call it once and then wrap the result as you want using other streams and readers.
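
One caveat when going this way: DataInputStream.readLine() is deprecated because it does not properly convert bytes to characters; each byte simply becomes one char. That is why the question sees a wrong string, and it also means the original bytes can be recovered. The sketch below shows this under the assumption that the sender really transmits UTF-8; the class and helper names are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class DataInputStreamReadLineDemo {

    /** Re-interprets a string produced by DataInputStream.readLine() as UTF-8. */
    static String repair(String garbled) {
        // readLine() widened each byte to one char, so ISO-8859-1 gives the original bytes back.
        byte[] original = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(original, StandardCharsets.UTF_8);
    }

    /** Reads one line with the deprecated readLine(); the result is not yet UTF-8 decoded. */
    @SuppressWarnings("deprecation")
    static String readLineRaw(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        return in.readLine();
    }

    public static void main(String[] args) throws IOException {
        // Made-up input: "héllo" encoded as UTF-8, terminated by '\n'.
        byte[] data = "h\u00e9llo\n".getBytes(StandardCharsets.UTF_8);

        String raw = readLineRaw(data);
        System.out.println("raw length:      " + raw.length());           // one char per byte
        System.out.println("repaired length: " + repair(raw).length());   // one char per character
    }
}
```

This stays on a single DataInputStream, so read(byte[], int, int) remains available for the binary part of the protocol.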

Answered by pap

Am I understanding it correctly that you are sending both text and binary data on the same socket, in the same "conversation"? There should be no problem creating two readers for the same input stream. The problem is knowing when (and how much) to read from which reader. Both will consume (and advance) the underlying stream when you read from them, and your data is of mixed types. You could just read the stream as bytes and then convert the bytes explicitly in your code (new String(bytes, "UTF-8") etc). Or you could split your communication onto two different sockets.
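
As a rough sketch of the "read bytes, convert explicitly" option: the code below assumes a framing that the question does not specify, namely that each text field is preceded by its byte length as a 4-byte big-endian integer. The class name, method name, and sample message are likewise made up for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class LengthPrefixedDemo {

    /** Reads a 4-byte big-endian length (assumed framing), then that many UTF-8 bytes. */
    static String readUtf8Field(DataInputStream in) throws IOException {
        int length = in.readInt();
        if (length < 0) {
            throw new IOException("Negative field length: " + length);
        }
        byte[] bytes = new byte[length];
        in.readFully(bytes);  // blocks until all bytes have arrived
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Build a made-up message: length-prefixed UTF-8 title, then two raw bytes.
        ByteArrayOutputStream message = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(message);
        byte[] title = "Gr\u00fc\u00dfe".getBytes(StandardCharsets.UTF_8);
        out.writeInt(title.length);
        out.write(title);
        out.write(new byte[] {42, 43});

        // Stand-in for new DataInputStream(socket.getInputStream()).
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(message.toByteArray()));
        System.out.println(readUtf8Field(in));
        System.out.println("first binary byte: " + in.readByte());
    }
}
```

With a single DataInputStream doing all the reading, there is no second reader that could buffer data away. A C sender would need to write the length in network byte order (for example via htonl) to match readInt().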
