C#: Read a line from a byte array (without converting the byte array to a string)
Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/492454/
Read line from byte array (not convert byte array to string)
Asked by jxpx777
I have a byte array that I am reading in from a NetworkStream. The first two bytes tell the length of the packet that follows, and then the packet is read into a byte array of that length. The data that I need to read from the NetworkStream/byte array has a few Strings, i.e. variable-length data terminated by newline characters, and some fixed-width fields like bytes and longs. So, something like this:
// I would have delimited these for clarity but I didn't want
// to imply that the stream was delimited because it's not.
StringbyteStringStringbytebytebytelonglongbytelonglong
I know (and have some say in) the format of the data packet that is coming across, and what I need to do is read a "line" for each string value, but read a fixed number of bytes for the bytes and longs. So far, my proposed solution is to use a while loop to read bytes into a temp byte array until there is a newline character. Then, convert the bytes to a string. This seems kludgy to me, but I don't see another obvious way. I realize I could use StreamReader.ReadLine(), but that would involve another stream and I already have a NetworkStream. But if that's the better solution, I'll give it a shot.
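The approach described in the question (scan for the newline, then decode only those bytes) can be sketched without a temporary array or a second stream. This is a minimal sketch, not the asker's code: the class name PacketFields and its methods are hypothetical, and it assumes that strings are terminated by a single '\n' byte and that longs are sent big-endian like the packet header.

```csharp
using System;
using System.Text;

// Hypothetical helper: walks a packet buffer field by field.
static class PacketFields
{
    // Decodes the bytes from offset up to (not including) the next '\n',
    // then advances offset past the terminator.
    public static string ReadLine(byte[] buffer, ref int offset, Encoding encoding)
    {
        int end = Array.IndexOf(buffer, (byte)'\n', offset);
        if (end < 0)
            throw new FormatException("No newline terminator found.");
        string line = encoding.GetString(buffer, offset, end - offset);
        offset = end + 1;
        return line;
    }

    // Reads one fixed-width byte field.
    public static byte ReadByte(byte[] buffer, ref int offset)
    {
        if (offset >= buffer.Length)
            throw new FormatException("Buffer too short for a byte field.");
        return buffer[offset++];
    }

    // Reads an 8-byte integer, most significant byte first (assumed byte order).
    public static long ReadInt64BigEndian(byte[] buffer, ref int offset)
    {
        if (buffer.Length - offset < 8)
            throw new FormatException("Buffer too short for a long field.");
        long value = 0;
        for (int i = 0; i < 8; i++)
            value = (value << 8) | buffer[offset + i];
        offset += 8;
        return value;
    }
}
```

A caller would keep a single int offset, starting at 0, and call ReadLine, ReadByte and ReadInt64BigEndian in the order the packet format dictates. The newline search only runs when a string field is expected, so a 0x0A byte inside a long does no harm.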
The other option I have considered is to have my backend team write a byte or two for those String values' lengths so I can read the length and then read the String based on the length specified.
So, as you can see, I have some options for how to go about this, and I'd like your input about what you would consider the best way to do it. Here's the code that I have right now for reading in the entire packet as a string. The next step is to break out the various fields of the packet and do the actual programming work that needs to be done, creating objects, updating UI, etc. based on the data in the packet.
string line = null;
while (stream.DataAvailable)
{
    //Get the packet length;
    UInt16 packetLength = 0;
    header = new byte[2];
    stream.Read(header, 0, 2);
    // Need to reverse the header array for BitConverter class if architecture is little endian.
    if (BitConverter.IsLittleEndian)
        Array.Reverse(header);
    packetLength = BitConverter.ToUInt16(header,0);
    buffer = new byte[packetLength];
    stream.Read(buffer, 0, BitConverter.ToUInt16(header, 0));
    line = System.Text.ASCIIEncoding.ASCII.GetString(buffer);
    Console.WriteLine(line);
}
Accepted answer by Binary Worrier
Personally I would
- Put an Int16 at the start of the strings, so you know how long they're going to be, and
- Use the IO.BinaryReader class to do the reading. It will "read" ints, strings, chars, etc. into variables; e.g. BinReader.ReadInt16() will read two bytes, return the Int16 they represent, and move two bytes on in the stream.
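The two suggestions above can be sketched together as follows. The class PrefixedStrings is hypothetical and only illustrates the idea. One thing to keep in mind: BinaryReader.ReadInt16() reads little-endian values, whereas the code in the question treats its header as big-endian, so this sketch assumes the sender writes the length prefix in little-endian order (for example with BinaryWriter).

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: strings stored as a 2-byte length followed by the bytes.
static class PrefixedStrings
{
    // Writes the encoded byte count as an Int16, then the encoded bytes.
    public static void Write(BinaryWriter writer, string value, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(value);
        if (bytes.Length > short.MaxValue)
            throw new ArgumentException("String too long for an Int16 prefix.");
        writer.Write((short)bytes.Length);
        writer.Write(bytes);
    }

    // Reads the Int16 prefix, then exactly that many bytes, and decodes them.
    public static string Read(BinaryReader reader, Encoding encoding)
    {
        short length = reader.ReadInt16();
        if (length < 0)
            throw new InvalidDataException("Negative string length.");
        byte[] bytes = reader.ReadBytes(length);
        if (bytes.Length != length)
            throw new EndOfStreamException("Stream ended inside a string.");
        return encoding.GetString(bytes);
    }
}
```

Because the length is known up front, the string may contain newline characters, and the fixed-width fields that follow can be read with ReadByte() and ReadInt64() on the same reader.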
Hope this helps.
P.S. Be careful using the ReadString method: it assumes the string is prefixed with its length as a custom 7-bit encoded integer, i.e. that it was written by the BinaryWriter class. The following is from this CodeGuru post:
The BinaryWriter class has two methods for writing strings: the overloaded Write() method and the WriteString() method. The former writes the string as a stream of bytes according to the encoding the class is using. The WriteString() method also uses the specified encoding, but it prefixes the string's stream of bytes with the actual length of the string. Such prefixed strings are read back in via BinaryReader.ReadString().
The interesting thing about the length value is that as few bytes as possible are used to hold this size; it is stored as a type called a 7-bit encoded integer. If the length fits in 7 bits a single byte is used, if it is greater than this then the high bit on the first byte is set and a second byte is created by shifting the value by 7 bits. This is repeated with successive bytes until there are enough bytes to hold the value. This mechanism is used to make sure that the length does not become a significant portion of the size taken up by the serialized string. BinaryWriter and BinaryReader have methods to read and write 7-bit encoded integers, but they are protected and so you can use them only if you derive from these classes.
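To make the 7-bit scheme described in the quote concrete, here is a hand-rolled sketch of the encoding and decoding. The class SevenBit is hypothetical and is written out by hand so that it does not depend on the accessibility of the framework's own methods. Note also that in the .NET class library the overload that writes the length prefix is BinaryWriter.Write(string); there is no method literally named WriteString().

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper illustrating the 7-bit encoded integer format.
static class SevenBit
{
    // Emits the low 7 bits first; the high bit of each byte flags "more bytes follow".
    public static byte[] Encode(int value)
    {
        if (value < 0)
            throw new ArgumentOutOfRangeException(nameof(value));
        var bytes = new List<byte>();
        uint remaining = (uint)value;
        while (remaining >= 0x80)
        {
            bytes.Add((byte)((remaining & 0x7F) | 0x80));
            remaining >>= 7;
        }
        bytes.Add((byte)remaining);
        return bytes.ToArray();
    }

    // Reassembles the value 7 bits at a time and advances offset past the prefix.
    public static int Decode(byte[] buffer, ref int offset)
    {
        int result = 0;
        for (int shift = 0; shift < 35; shift += 7)
        {
            if (offset >= buffer.Length)
                throw new FormatException("Truncated 7-bit encoded integer.");
            byte b = buffer[offset++];
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0)
                return result;
        }
        throw new FormatException("7-bit encoded integer is too long.");
    }
}
```

With this scheme a length of 127 takes one byte (0x7F), while 300 takes two (0xAC 0x02): the low seven bits of 300 with the continuation bit set, followed by the remaining bits.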
Answer by Jon Skeet
I would go with length-prefixed strings. It will make your life a lot simpler, and it means you can represent strings with line breaks in them. A few comments on your code though:
- Don't use Stream.DataAvailable. Just because there's no data available now doesn't mean you've read the end of the stream.
- Unless you're absolutely sure you'll never need text beyond ASCII, don't use ASCIIEncoding.
- Don't assume that Stream.Read will read all the data you ask it to. Always check the return value.
- BinaryReader makes a lot of this a lot easier (including length-prefixed strings and a Read that loops until it's read what you've asked it to)
- You're calling BitConverter.ToUInt16 twice on the same data. Why?
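Several of the points above can be illustrated with a rough sketch of a packet-reading loop that checks the return value of Stream.Read and converts the header only once. The class PacketReader and its method names are hypothetical; the two-byte big-endian length header is taken from the question.

```csharp
using System;
using System.IO;

// Hypothetical helper: reads length-prefixed packets from any Stream.
static class PacketReader
{
    // Loops until buffer is full. Returns false only if the stream ended
    // before a single byte was read; throws if it ends part-way through.
    static bool TryReadExactly(Stream stream, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int read = stream.Read(buffer, total, buffer.Length - total);
            if (read == 0)
            {
                if (total == 0)
                    return false;
                throw new EndOfStreamException("Stream ended mid-field.");
            }
            total += read;
        }
        return true;
    }

    // Reads one packet: a 2-byte big-endian length, then that many payload bytes.
    // Returns false when the stream ends cleanly between packets.
    public static bool TryReadPacket(Stream stream, out byte[] payload)
    {
        payload = Array.Empty<byte>();
        var header = new byte[2];
        if (!TryReadExactly(stream, header))
            return false;
        int length = (header[0] << 8) | header[1]; // no BitConverter, no reversal
        var body = new byte[length];
        if (!TryReadExactly(stream, body))
            throw new EndOfStreamException("Stream ended after the header.");
        payload = body;
        return true;
    }
}
```

A loop of the form while (PacketReader.TryReadPacket(stream, out byte[] packet)) { ... } would then take the place of while (stream.DataAvailable), ending only when the other side closes the connection. Decoding the payload with Encoding.UTF8 rather than ASCIIEncoding is one way to address the encoding point, provided the sender agrees on UTF-8.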