Compression and decompression of string data in Java
Original question: http://stackoverflow.com/questions/16351668/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by rampuriyaaa
I am using the following code to compress and decompress string data. Compression completes without error, but the decompress method throws the following error:
Exception in thread "main" java.io.IOException: Not in GZIP format
public static void main(String[] args) throws Exception {
    String string = "I am what I am hhhhhhhhhhhhhhhhhhhhhhhhhhhhh"
            + "bjggujhhhhhhhhh"
            + "rggggggggggggggggggggggggg"
            + "esfffffffffffffffffffffffffffffff"
            + "esffffffffffffffffffffffffffffffff"
            + "esfekfgy enter code here`etd`enter code here wdd"
            + "heljwidgutwdbwdq8d"
            + "skdfgysrdsdnjsvfyekbdsgcu"
            + "jbujsbjvugsduddbdj";
    System.out.println("after compress:");
    String compressed = compress(string);
    System.out.println(compressed);
    System.out.println("after decompress:");
    String decomp = decompress(compressed);
    System.out.println(decomp);
}

public static String compress(String str) throws Exception {
    if (str == null || str.length() == 0) {
        return str;
    }
    System.out.println("String length : " + str.length());
    ByteArrayOutputStream obj = new ByteArrayOutputStream();
    GZIPOutputStream gzip = new GZIPOutputStream(obj);
    gzip.write(str.getBytes("UTF-8"));
    gzip.close();
    String outStr = obj.toString("UTF-8");
    System.out.println("Output String length : " + outStr.length());
    return outStr;
}

public static String decompress(String str) throws Exception {
    if (str == null || str.length() == 0) {
        return str;
    }
    System.out.println("Input String length : " + str.length());
    GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(str.getBytes("UTF-8")));
    BufferedReader bf = new BufferedReader(new InputStreamReader(gis, "UTF-8"));
    String outStr = "";
    String line;
    while ((line = bf.readLine()) != null) {
        outStr += line;
    }
    System.out.println("Output String length : " + outStr.length());
    return outStr;
}
I still can't figure out how to fix this issue.
Accepted answer by SudoRahul
This is because of this line:
String outStr = obj.toString("UTF-8");
Send the byte[] that you can get from your ByteArrayOutputStream and use it as such in your ByteArrayInputStream to construct your GZIPInputStream. The following changes need to be made in your code.
byte[] compressed = compress(string); // In the main method

public static byte[] compress(String str) throws Exception {
    ...
    ...
    return obj.toByteArray();
}

public static String decompress(byte[] bytes) throws Exception {
    ...
    GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(bytes));
    ...
}
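For readers who want something they can compile, here is a minimal sketch of the same idea: the compressed data stays a byte[] from compress to decompress and is never turned into a String. The class name GzipBytes, the buffer size, and wrapping IOException in UncheckedIOException are assumptions made for this illustration; they are not part of the original answer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class for illustration only.
final class GzipBytes {

    // Compresses the UTF-8 bytes of the text and returns the raw GZIP bytes.
    static byte[] compress(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The buffer is read only after the GZIP stream is closed,
        // so the GZIP trailer has been written.
        return out.toByteArray();
    }

    // Decompresses raw GZIP bytes and decodes the result as UTF-8 text.
    static String decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "I am what I am\nhhhhhhhhhhhhhhhhhhhhhhhhhhhhh";
        byte[] compressed = compress(original);
        System.out.println("round trip ok: " + original.equals(decompress(compressed)));
    }
}
```

Copying bytes with a buffer, rather than reading lines, also keeps any line breaks in the original text intact.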
Answer by Stephen C
The problem is this line:
String outStr = obj.toString("UTF-8");
The byte array in obj contains arbitrary binary data. You can't "decode" arbitrary binary data as if it were UTF-8. If you try, you will get a String that cannot then be "encoded" back to the same bytes. Or at least, the bytes you get will be different from what you started with, to the extent that they are no longer a valid GZIP stream.
The fix is to store or transmit the contents of the byte array as-is. Don't try to convert it into a String. It is binary data, not text.
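The effect described above can be reproduced in a few lines. The sketch below (class and method names are hypothetical) compresses a string, pushes the bytes through a UTF-8 String and back, and then checks whether GZIPInputStream still accepts them. The second GZIP magic byte, 0x8b, is not a valid start of a UTF-8 sequence, so decoding replaces it with U+FFFD and the header is already damaged.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class for illustration only.
final class Utf8RoundTripDemo {

    // Produces raw GZIP bytes for the given text.
    static byte[] gzip(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Decodes the bytes as UTF-8 text and encodes them again,
    // which is effectively what the code in the question does.
    static byte[] throughUtf8String(byte[] data) {
        return new String(data, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    // Returns true if GZIPInputStream accepts the header of the given bytes.
    static boolean opensAsGzip(byte[] data) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            return true;
        } catch (IOException e) {
            return false; // includes "Not in GZIP format"
        }
    }

    public static void main(String[] args) {
        byte[] original = gzip("I am what I am hhhhhhhhhhhhhhhhhhhh");
        byte[] mangled = throughUtf8String(original);
        System.out.println("bytes unchanged:  " + Arrays.equals(original, mangled));
        System.out.println("original is gzip: " + opensAsGzip(original));
        System.out.println("mangled is gzip:  " + opensAsGzip(mangled));
    }
}
```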
Answer by JeffersonZhang
If you ever need to transfer the zipped content over a network or store it as text, you need a binary-to-text encoding such as Base64 (for example, Apache Commons Codec Base64) to convert the byte array to a Base64 String, and then decode the string back to a byte array at the remote client. I found an example at "Use Zip Stream and Base64 Encoder to Compress Large String Data".
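As a rough sketch of that approach, the example below uses java.util.Base64 from the JDK (Java 8 and later) instead of the Apache Commons Codec class mentioned in the answer; that substitution, along with the class and method names, is an assumption made to keep the example dependency-free.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class for illustration only.
final class GzipBase64 {

    // GZIP-compresses the text, then encodes the binary result as Base64 text.
    static String compressToBase64(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(out.toByteArray());
    }

    // Decodes the Base64 text back to GZIP bytes, then decompresses them.
    static String decompressFromBase64(String base64) {
        byte[] compressed = Base64.getDecoder().decode(base64);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "I am what I am hhhhhhhhhhhhhhhhhhhhhhhhhhhhh";
        String encoded = compressToBase64(original);
        System.out.println(encoded);
        System.out.println(original.equals(decompressFromBase64(encoded)));
    }
}
```

Base64 adds roughly one third to the size of the compressed bytes, so this only pays off when the text compresses well.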
Answer by Arun Pratap Singh
The above answer solves the problem, but there is one more case to handle: if we try to decompress a byte[] that was never compressed (that is, not in GZIP format), we get the "Not in GZIP format" exception message.
To handle that, we can add the following check to our class:
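public static boolean isCompressed(final byte[] compressed) {
    // Guard against null or very short input before reading the two magic bytes
    return (compressed != null) && (compressed.length >= 2)
            && (compressed[0] == (byte) (GZIPInputStream.GZIP_MAGIC))
            && (compressed[1] == (byte) (GZIPInputStream.GZIP_MAGIC >> 8));
}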
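To see how such a check behaves, here is a small self-contained sketch. The names GzipMagicCheck and looksLikeGzip are hypothetical, and the null and length guards are an addition for safety. Note that the check inspects only the two-byte GZIP header, so a true result does not prove the rest of the stream is valid.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class for illustration only.
final class GzipMagicCheck {

    // True if the data starts with the two GZIP magic bytes (0x1f, 0x8b).
    static boolean looksLikeGzip(byte[] data) {
        return data != null
                && data.length >= 2
                && data[0] == (byte) GZIPInputStream.GZIP_MAGIC
                && data[1] == (byte) (GZIPInputStream.GZIP_MAGIC >> 8);
    }

    // Produces raw GZIP bytes for the given text.
    static byte[] gzip(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeGzip(gzip("hello")));
        System.out.println(looksLikeGzip("hello".getBytes(StandardCharsets.UTF_8)));
        System.out.println(looksLikeGzip(new byte[0]));
    }
}
```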
My complete compression class with compress/decompress looks like this:
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GZIPCompression {

    public static byte[] compress(final String str) throws IOException {
        if ((str == null) || (str.length() == 0)) {
            return null;
        }
        ByteArrayOutputStream obj = new ByteArrayOutputStream();
        // Closing the GZIP stream flushes it and writes the GZIP trailer
        try (GZIPOutputStream gzip = new GZIPOutputStream(obj)) {
            gzip.write(str.getBytes(StandardCharsets.UTF_8));
        }
        return obj.toByteArray();
    }

    public static String decompress(final byte[] compressed) throws IOException {
        if ((compressed == null) || (compressed.length == 0)) {
            return "";
        }
        if (!isCompressed(compressed)) {
            // Not GZIP data: return the bytes as text (assuming they are UTF-8)
            return new String(compressed, StandardCharsets.UTF_8);
        }
        final StringBuilder outStr = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(compressed)), StandardCharsets.UTF_8)) {
            // Read characters rather than lines so that line breaks are preserved
            final char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                outStr.append(buffer, 0, n);
            }
        }
        return outStr.toString();
    }

    public static boolean isCompressed(final byte[] compressed) {
        // Guard against null or very short input before reading the two magic bytes
        return (compressed != null) && (compressed.length >= 2)
                && (compressed[0] == (byte) (GZIPInputStream.GZIP_MAGIC))
                && (compressed[1] == (byte) (GZIPInputStream.GZIP_MAGIC >> 8));
    }
}
Answer by Andrey Badaev
You can't convert arbitrary binary data directly to a String. As a solution, you can encode the binary data with a text-safe encoding and then convert the result to a String. For example, see the question "How do you convert binary data to Strings and back in Java?"
Answer by Sergey Frolov
Another example of correct compression and decompression:
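// The original answer omitted its imports; the ones below are inferred from usage.
import static java.nio.charset.StandardCharsets.UTF_8;
import static java.util.Objects.isNull;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

import lombok.extern.slf4j.Slf4j;       // requires Lombok
import org.apache.commons.io.IOUtils;   // requires Apache Commons IO

@Slf4j
public class GZIPCompression {

    public static byte[] compress(final String stringToCompress) {
        if (isNull(stringToCompress) || stringToCompress.length() == 0) {
            return null;
        }
        try (final ByteArrayOutputStream baos = new ByteArrayOutputStream();
             final GZIPOutputStream gzipOutput = new GZIPOutputStream(baos)) {
            gzipOutput.write(stringToCompress.getBytes(UTF_8));
            // finish() writes the remaining compressed data and the trailer before the bytes are read
            gzipOutput.finish();
            return baos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException("Error while compression!", e);
        }
    }

    public static String decompress(final byte[] compressed) {
        if (isNull(compressed) || compressed.length == 0) {
            return null;
        }
        try (final GZIPInputStream gzipInput = new GZIPInputStream(new ByteArrayInputStream(compressed));
             final StringWriter stringWriter = new StringWriter()) {
            // Copies the decompressed bytes into the writer, decoding them as UTF-8
            IOUtils.copy(gzipInput, stringWriter, UTF_8);
            return stringWriter.toString();
        } catch (IOException e) {
            throw new UncheckedIOException("Error while decompression!", e);
        }
    }
}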
Answer by Yu Zhang
The client sends messages that need to be compressed, and the server (Kafka) decompresses the string message.
Below is my sample:
Compress:
public static String compress(String str, String inEncoding) {
    if (str == null || str.length() == 0) {
        return str;
    }
    try {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        GZIPOutputStream gzip = new GZIPOutputStream(out);
        gzip.write(str.getBytes(inEncoding));
        gzip.close();
        return URLEncoder.encode(out.toString("ISO-8859-1"), "UTF-8");
    } catch (IOException e) {
        e.printStackTrace();
    }
    return null;
}
Decompress:
public static String decompress(String str, String outEncoding) {
    if (str == null || str.length() == 0) {
        return str;
    }
    try {
        String decode = URLDecoder.decode(str, "UTF-8");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ByteArrayInputStream in = new ByteArrayInputStream(decode.getBytes("ISO-8859-1"));
        GZIPInputStream gunzip = new GZIPInputStream(in);
        byte[] buffer = new byte[256];
        int n;
        while ((n = gunzip.read(buffer)) >= 0) {
            out.write(buffer, 0, n);
        }
        return out.toString(outEncoding);
    } catch (IOException e) {
        e.printStackTrace();
    }
    return null;
}
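The sample above appears to rely on the fact that ISO-8859-1 maps every byte value to exactly one character and back, which is why the compressed bytes survive the trip through a String here but not with UTF-8. The sketch below (hypothetical class and method names) checks that property for all 256 byte values; it illustrates the charset behaviour only and leaves out the URL-encoding step.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class for illustration only.
final class Latin1RoundTrip {

    // Returns an array containing every possible byte value once.
    static byte[] allByteValues() {
        byte[] bytes = new byte[256];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) i;
        }
        return bytes;
    }

    // True if decoding the bytes with the charset and encoding them again
    // yields exactly the same bytes.
    static boolean survives(byte[] data, Charset charset) {
        byte[] roundTripped = new String(data, charset).getBytes(charset);
        return Arrays.equals(data, roundTripped);
    }

    public static void main(String[] args) {
        byte[] all = allByteValues();
        System.out.println("ISO-8859-1 lossless: " + survives(all, StandardCharsets.ISO_8859_1));
        System.out.println("UTF-8 lossless:      " + survives(all, StandardCharsets.UTF_8));
    }
}
```

Percent-encoding the resulting string makes it considerably longer than a Base64 encoding of the same bytes, so Base64 is usually the more compact choice when the compressed data has to travel as text.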