Compressing strings for client/server transport in Java
Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/1414037/
Asked by filsa
I work with a proprietary client/server message format that restricts what I can send over the wire. I can't send a serialized object; I have to store the data in the message as a String. The data I am sending are large comma-separated values, and I want to compress the data before I pack it into the message as a String.
I attempted to use Deflater/Inflater to achieve this, but somewhere along the line I am getting stuck.
I am using the two methods below to deflate/inflate. However, passing the result of the compressString() method to decompressString() returns a null result.
public String compressString(String data) {
    Deflater deflater = new Deflater();
    byte[] target = new byte[100];
    try {
        deflater.setInput(data.getBytes(UTF8_CHARSET));
        deflater.finish();
        int deflateLength = deflater.deflate(target);
        return new String(target);
    } catch (UnsupportedEncodingException e) {
        //TODO
    }
    return data;
}

public String decompressString(String data) {
    String result = null;
    try {
        byte[] input = data.getBytes();
        Inflater inflater = new Inflater();
        int inputLength = input.length;
        inflater.setInput(input, 0, inputLength);
        byte[] output = new byte[100];
        int resultLength = inflater.inflate(output);
        inflater.end();
        result = new String(output, 0, resultLength, UTF8_CHARSET);
    } catch (DataFormatException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } catch (UnsupportedEncodingException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
    return result;
}
Answered by Stephen C
From what I can tell, your current approach is:
1. Convert String to byte array using getBytes("UTF-8").
2. Compress byte array
3. Convert compressed byte array to String using new String(bytes, ..., "UTF-8").
4. Transmit compressed string
5. Receive compressed string
6. Convert compressed string to byte array using getBytes("UTF-8").
7. Decompress byte array
8. Convert decompressed byte array to String using new String(bytes, ..., "UTF-8").
The problem with this approach is in step 3. When you compress the byte array, you create a sequence of bytes which may no longer be valid UTF-8. Java's String constructor does not preserve such bytes (malformed sequences are typically replaced with the U+FFFD replacement character), so the compressed data is corrupted in step 3 and cannot be recovered in step 6.
The solution is to use a "bytes to characters" encoding scheme like Base64 to turn the compressed bytes into a transmissible string. In other words, replace step 3 with a call to a Base64 encode function, and step 6 with a call to a Base64 decode function.
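The replacement of steps 3 and 6 described above can be sketched as follows. This is a minimal illustration, assuming Java 8 or later for java.util.Base64; the class name StringCompressor and its method names are hypothetical and do not come from the original answer. The buffer loop, which lets input of any size through, is also an addition to the steps listed.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper class; the name and method signatures are illustrative.
final class StringCompressor {

    // Steps 1-3: UTF-8 encode, deflate, then Base64-encode the compressed bytes.
    static String compress(String data) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(data.getBytes(StandardCharsets.UTF_8));
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            // Loop so that output of any size fits, not just one buffer's worth.
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return Base64.getEncoder().encodeToString(out.toByteArray());
        } finally {
            deflater.end(); // release native zlib resources
        }
    }

    // Steps 6-8: Base64-decode, inflate, then decode the bytes as UTF-8.
    static String decompress(String encoded) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(Base64.getDecoder().decode(encoded));
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("Truncated or unsupported compressed data");
                }
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Input is not valid deflate data", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        String csv = "id,name,value\n1,alpha,10\n2,beta,20\n3,gamma,30";
        String packed = compress(csv);
        // Prints whether the round trip reproduced the original text.
        System.out.println(csv.equals(decompress(packed)));
    }
}
```

The string returned by compress() contains only Base64 characters, so it should survive any transport that accepts plain ASCII text.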
Notes:
- For small strings, compressing and encoding is likely to actually increase the size of the transmitted string.
- If the compressed String is going to be incorporated into a URL, you may want to pick an encoding other than standard Base64, one that avoids characters that need to be URL-escaped.
- Depending on the nature of the data you are transmitting, you may find that a domain-specific compression works better than a generic one. Consider compressing the data before creating the comma-separated string. Consider alternatives to comma-separated strings.
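The first two notes can be illustrated with a short sketch. It assumes Java 8 or later, whose java.util.Base64 offers a URL-safe variant; the class EncodingNotes and its method names are hypothetical. It encodes a very short string and prints both lengths, so the size overhead on small inputs is visible.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.Deflater;

// Hypothetical demo class; names are illustrative only.
final class EncodingNotes {

    // Deflates the UTF-8 bytes of the given string.
    static byte[] deflate(String data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        while (!deflater.finished()) {
            out.write(buffer, 0, deflater.deflate(buffer));
        }
        deflater.end();
        return out.toByteArray();
    }

    // URL-safe Base64 uses '-' and '_' instead of '+' and '/'; padding '=' is dropped.
    static String toUrlSafeString(byte[] compressed) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(compressed);
    }

    public static void main(String[] args) {
        String small = "a,b,c";
        String encoded = toUrlSafeString(deflate(small));
        // For short inputs the encoded form is longer than the original.
        System.out.println(small.length() + " chars in, " + encoded.length() + " chars out");
    }
}
```

The receiving side would need the matching Base64.getUrlDecoder() before inflating.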
Answered by Denis Tulskiy
The problem is that you convert compressed bytes to a string, which breaks the data. Your compressString and decompressString should work on byte[].
EDIT: Here is a revised version. It works.
EDIT2: And about Base64: you're sending bytes, not strings. You don't need Base64.
import java.nio.charset.Charset;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Wrapper class added so the snippet compiles; the name is arbitrary.
public class CompressionTest {

    public static void main(String[] args) {
        String input = "Test input";
        byte[] data = new byte[100];
        int len = compressString(input, data, data.length);
        String output = decompressString(data, len);
        if (!input.equals(output)) {
            System.out.println("Test failed");
        }
        System.out.println(input + " " + output);
    }

    // Compresses data into output and returns the number of compressed bytes written.
    public static int compressString(String data, byte[] output, int len) {
        Deflater deflater = new Deflater();
        deflater.setInput(data.getBytes(Charset.forName("utf-8")));
        deflater.finish();
        int written = deflater.deflate(output, 0, len);
        deflater.end(); // release native resources
        return written;
    }

    // Decompresses the first len bytes of input; returns null if the data is malformed.
    public static String decompressString(byte[] input, int len) {
        String result = null;
        try {
            Inflater inflater = new Inflater();
            inflater.setInput(input, 0, len);
            byte[] output = new byte[100]; //todo may overflow, find better solution
            int resultLength = inflater.inflate(output);
            inflater.end();
            result = new String(output, 0, resultLength, Charset.forName("utf-8"));
        } catch (DataFormatException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        return result;
    }
}
Answered by matt b
If you have a piece of code which seems to be silently failing, perhaps you shouldn't catch and swallow Exceptions:
catch (UnsupportedEncodingException e) {
//TODO
}
But the real reason why decompress returns null is that your exception handling doesn't specify what to do with result when you catch an exception - result is left as null. Are you checking the output to see if any Exceptions are occurring?
If I run your decompress() on a badly formatted String, Inflater throws me this DataFormatException:
java.util.zip.DataFormatException: incorrect header check
at java.util.zip.Inflater.inflateBytes(Native Method)
at java.util.zip.Inflater.inflate(Inflater.java:223)
at java.util.zip.Inflater.inflate(Inflater.java:240)
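One way to avoid the silent null is to rethrow instead of swallowing. The sketch below is only an illustration of that idea: the class StrictInflate, the method name, and the choice of IllegalArgumentException are assumptions, not part of the original code. It wraps the DataFormatException so the caller sees the real cause.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

// Hypothetical helper; the name and the exception choice are illustrative.
final class StrictInflate {

    // Inflates compressed bytes, failing loudly instead of returning null.
    static String inflateToString(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException(
                            "Compressed data is truncated or needs a preset dictionary");
                }
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (DataFormatException e) {
            // Wrap rather than swallow, so the caller sees the real cause.
            throw new IllegalArgumentException("Input is not valid deflate data", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] garbage = "not compressed".getBytes(StandardCharsets.UTF_8);
        try {
            inflateToString(garbage);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getCause());
        }
    }
}
```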
Answered by NawaMan
To me, writing a compression algorithm myself is difficult, but writing binary data to a string is not. So if I were you, I would serialize the object normally, zip it with compression (as provided by ZipFile), then convert it to a string using something like Base64 encode/decode.
I actually have Base64 encode/decode functions. If you want, I can post them here.
Answered by user1105392
Inflater/Deflater is not the right tool for compressing a string. I think GZIPInputStream and GZIPOutputStream are the proper tools for this.
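A minimal sketch of the stream-based approach suggested here, assuming Java 8 or later. The class GzipStrings and its method names are hypothetical. The Base64 step is not part of this answer; it is borrowed from the other answers so that the result can be carried as a String.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class; names are illustrative.
final class GzipStrings {

    static String compress(String data) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bytes)) {
            gzip.write(data.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed at this point, so the GZIP trailer has been written.
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    static String decompress(String encoded) {
        byte[] compressed = Base64.getDecoder().decode(encoded);
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int n;
            // Read until end of stream so the output size need not be known up front.
            while ((n = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String csv = "1,2,3,4,5,6,7,8,9,10";
        System.out.println(csv.equals(decompress(compress(csv))));
    }
}
```

The GZIP streams use the same deflate algorithm internally and add a header and checksum, so the difference from Deflater/Inflater is mostly one of convenience.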
Answered by C Deepak
I was facing a similar issue, which was resolved by Base64-decoding the input, i.e. instead of
data.getBytes(UTF8_CHARSET)
I tried
Base64.decodeBase64(data)
and it worked.

