Java: convert a byte array or StringBuilder to UTF-8

Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/13602465/

Date: 2020-10-31 13:23:09  Source: igfitidea

Convert byte array or StringBuilder to UTF-8

java

Asked by Henrik

I want to convert the content to the UTF-8 charset before I return the string in the following method:

public static String getContentResult(URL url) throws IOException{
    InputStream in = url.openStream();
    StringBuilder sb = new StringBuilder();

    byte [] buffer = new byte[4096];

    while(true){
        int byteRead = in.read(buffer);
        if(byteRead == -1)
            break;
        for(int i = 0; i < byteRead; i++){
            sb.append((char)buffer[i]);
        }
    }
    in.close();
    return sb.toString();
}

How can I do that?

Thanks!

Answered by Jon Skeet

You don't want to convert to UTF-8. You want (I believe) to interpret the incoming stream of data as UTF-8.
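To make the difference concrete, here is a minimal sketch comparing the per-byte cast from the question with decoding the same bytes as UTF-8. The class and method names are hypothetical, chosen only for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative and not from the question.
class CastVersusDecode {

    // Mirrors the question's loop: every byte is cast to a char on its own,
    // so a multi-byte UTF-8 sequence is never reassembled into one character.
    static String castEachByte(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append((char) b);
        }
        return sb.toString();
    }

    // Interprets the whole byte sequence as UTF-8 text.
    static String decodeAsUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is encoded in UTF-8 as the two bytes 0xC3 0xA9.
        byte[] bytes = {(byte) 0xC3, (byte) 0xA9};

        String cast = castEachByte(bytes);
        String decoded = decodeAsUtf8(bytes);

        // The cast yields two unrelated chars, because Java bytes are signed
        // and the negative values are sign-extended before narrowing to char.
        System.out.println(cast.length() + " chars from the cast");
        // Decoding yields the single intended character.
        System.out.println(decoded.length() + " char from decoding");
    }
}
```

Because Java bytes are signed, the cast does not even produce the Latin-1 equivalents of the two bytes; only decoding with the right charset recovers the original text.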

Options:

  • Create an InputStreamReader wrapping your incoming stream, specifying UTF-8, and read blocks of characters at a time, appending to a StringBuilder:

    StringBuilder builder = new StringBuilder();
    char[] buffer = new char[4096];
    InputStreamReader reader = new InputStreamReader(in, "UTF-8");
    int charsRead;
    while ((charsRead = reader.read(buffer)) != -1) {
        builder.append(buffer, 0, charsRead);
    }
    
  • Use Guava to read the whole data as a byte array, then convert it in one go:

    byte[] data = ByteStreams.toByteArray(in);
    return new String(data, Charsets.UTF_8);
    
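If Guava is not available, the second option can be approximated with the JDK alone. The following is a sketch under that assumption, not part of the original answer; the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; a JDK-only stand-in for the Guava-based option.
class ReadAllThenDecode {

    // Collects every byte first, then decodes once, so a multi-byte
    // character can never be split across two read() calls.
    static String readUtf8(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int bytesRead;
        while ((bytesRead = in.read(buffer)) != -1) {
            out.write(buffer, 0, bytesRead);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

This version does not close the stream; the caller is expected to do that.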
    

In either case, you should use a finally block to close the stream, so that you close it even if an exception is thrown.
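As one possible way to put this together, the sketch below rewrites the question's method with the InputStreamReader approach. It swaps the explicit finally block for try-with-resources (Java 7 and later), which closes the reader and the underlying stream even when an exception is thrown. The class name is hypothetical.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical wrapper class around the method from the question.
class ContentReader {

    static String getContentResult(URL url) throws IOException {
        StringBuilder builder = new StringBuilder();
        char[] buffer = new char[4096];
        // try-with-resources closes the reader (and the wrapped stream)
        // on both normal completion and exceptions.
        try (Reader reader = new InputStreamReader(url.openStream(), StandardCharsets.UTF_8)) {
            int charsRead;
            while ((charsRead = reader.read(buffer)) != -1) {
                builder.append(buffer, 0, charsRead);
            }
        }
        return builder.toString();
    }
}
```

Using the StandardCharsets.UTF_8 constant instead of the "UTF-8" string also avoids the checked UnsupportedEncodingException.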

Answered by Harish Raj

Convert from String to byte[]:

String s = "some text here";
byte[] b = s.getBytes("UTF-8");

Convert from byte[] to String:

byte[] b = {(byte) 99, (byte)97, (byte)116};
String s = new String(b, "US-ASCII");
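The two conversions above can be combined into a round trip. This is a minimal sketch that uses the StandardCharsets constants (Java 7 and later) rather than charset-name strings, which avoids the checked UnsupportedEncodingException; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper showing a String -> byte[] -> String round trip.
class Utf8RoundTrip {

    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9";
        byte[] encoded = encode(original);
        // The accented character takes two bytes in UTF-8, so the byte
        // count is larger than the number of chars in the string.
        System.out.println(original.length() + " chars, " + encoded.length + " bytes");
        System.out.println(decode(encoded).equals(original));
    }
}
```

The round trip only restores the original text when the same charset is used in both directions.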

Answered by Captain Fantastic

If you want to add the actual byte value, don't use "US-ASCII"; just leave that parameter off, so that the platform's default charset is used:

byte[] abcd={'A','B','C','D',13,10,'E','F',(byte)255,'G','H',13,10,'J','K',0,'L','M'};
String s = new String(abcd);
StringBuilder sabcd=new StringBuilder();

sabcd.append(s);
System.out.println(sabcd);
for(int i=0;i<sabcd.length();i++) {
    char c=sabcd.charAt(i);
    System.out.println((int)c);
}

Result:

ABCD
EF?GH
JK
65
66
67
68
13
10
69
70
255
71
72
13
10
74
75
0
76
77
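Note that the 255 in this output depends on the default charset of the machine that ran the code; on a JVM whose default is UTF-8 (the default since Java 18), the same byte would decode differently. The sketch below names the charset explicitly to show this. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo of how the chosen charset changes the decoded value.
class ByteValueDecoding {

    // Decodes the bytes with the given charset and returns the first
    // char of the result as an int.
    static int firstCodeUnit(byte[] bytes, Charset charset) {
        return new String(bytes, charset).charAt(0);
    }

    public static void main(String[] args) {
        byte[] single = {(byte) 255};
        // ISO-8859-1 maps every byte 0..255 to the code point with the same value.
        System.out.println(firstCodeUnit(single, StandardCharsets.ISO_8859_1));
        // 0xFF is not valid in UTF-8 or US-ASCII, so the String constructor
        // substitutes the replacement character U+FFFD.
        System.out.println(firstCodeUnit(single, StandardCharsets.UTF_8));
        System.out.println(firstCodeUnit(single, StandardCharsets.US_ASCII));
    }
}
```

If the goal is to preserve raw byte values one to one, naming ISO-8859-1 explicitly is more predictable than relying on the platform default.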