Notice: this page is a Chinese-English translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/1522444/

Time: 2020-10-29 16:54:33  Source: igfitidea

How to redirect all console output to a Swing JTextArea/JTextPane with the right encoding?

Tags: java, encoding, stdout, nio

Asked by jgran

I've been trying to redirect the System.out PrintStream to a JTextPane. This works fine, except for the encoding of locale-specific special characters. I found a lot of documentation about it (see, for example, the mindprod encoding page), but I'm still fighting with it. Similar questions have been posted on Stack Overflow, but as far as I've seen the encoding wasn't addressed.

First solution:

String sUtf = new String(s.getBytes("cp1252"),"UTF-8");

Second solution should use java.nio. I don't understand how to use the Charset.

Charset defaultCharset = Charset.defaultCharset() ;
byte[] b = s.getBytes();
Charset cs = Charset.forName("UTF-8");
ByteBuffer bb = ByteBuffer.wrap( b );
CharBuffer cb = cs.decode( bb );
String stringUtf = cb.toString();
myTextPane.text = stringUtf

Neither solution works. Any ideas?

Thanks in advance, jgran

Accepted answer by Vilmantas Baranauskas

Try this code:

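import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.PrintStream;
import java.io.Reader;

public class MyOutputStream extends OutputStream {

    // Bytes written to this stream go into a pipe and are decoded as UTF-8 when flushed.
    // Note: everything runs on one thread here, so a single write that overfills the
    // pipe's internal buffer before a flush would block.
    private final PipedOutputStream out = new PipedOutputStream();
    private final Reader reader;

    public MyOutputStream() throws IOException {
        PipedInputStream in = new PipedInputStream(out);
        reader = new InputStreamReader(in, "UTF-8");
    }

    @Override
    public void write(int i) throws IOException {
        out.write(i);
    }

    @Override
    public void write(byte[] bytes, int i, int i1) throws IOException {
        out.write(bytes, i, i1);
    }

    @Override
    public void flush() throws IOException {
        char[] chars = new char[1024];
        // drain everything that is currently available in the pipe
        while (reader.ready()) {
            int n = reader.read(chars);
            if (n < 0) {
                break;
            }

            // this is your text
            String txt = new String(chars, 0, n);

            // write to System.err in this example
            System.err.print(txt);
        }
    }

    public static void main(String[] args) throws IOException {

        // auto-flushing PrintStream that encodes as UTF-8, matching the reader above
        PrintStream out = new PrintStream(new MyOutputStream(), true, "UTF-8");

        System.setOut(out);

        System.out.println("café résumé voilà");

    }

}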

Answer by eirikma

A String in Java does not have an encoding. Strings are backed by a character array, and the characters are always UTF-16 while they are treated as String and char values.

The encoding only comes into question when you export or import strings/chars to or from an external representation (or location). The transfer must take place using a sequence of bytes to represent the string.

I think the first solution is close, but also thoroughly confused. First you ask Java to translate the char values to their cp1252-encoded equivalents (the 'word' for the similarly shaped character in the cp1252 'language'). Then you create a string from this byte sequence, claiming that this sequence of cp1252 codes is in fact a sequence of UTF-8 codes that should be translated from UTF-8 to the standard in-memory representation (UTF-16).
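
To make that concrete, here is a minimal sketch of the encode-with-one-charset, decode-with-another pattern. The class and method names (RecodeDemo, recode) are made up for illustration, and it assumes the JRE ships the windows-1252 charset, which is not one of the guaranteed standard charsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RecodeDemo {

    // The pattern from the question: encode with one charset, decode with another.
    static String recode(String s, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = s.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        // Assumption: windows-1252 (cp1252) is available in this JRE.
        Charset cp1252 = Charset.forName("windows-1252");
        String original = "caf\u00e9"; // "café", escaped so the source file encoding does not matter

        // Mismatched charsets: the single cp1252 byte 0xE9 is not valid UTF-8 on its own,
        // so the decoder substitutes U+FFFD and the text is damaged.
        String damaged = recode(original, cp1252, StandardCharsets.UTF_8);
        System.out.println(original.equals(damaged)); // false

        // Matching charsets: the round trip returns the same string.
        String intact = recode(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(original.equals(intact)); // true
    }
}
```

The only round trip that preserves the text is the one where the bytes are decoded with the same charset that produced them.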

A string is never UTF-8 or cp1252 or anything like that: it is always characters. Only byte sequences are UTF-8 or cp1252. If you want to translate char values to a UTF-8 byte sequence, you could use:

byte[] utfs = myString.getBytes("UTF-8");

Actually, I think the problem lies elsewhere, probably inside the PrintStream and how it prints its input. You should try to avoid converting strings and chars to and from bytes, because that is always a major source of confusion and trouble. Perhaps you must override all of its methods in order to capture the character data before conversion.
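
One way to keep the byte handling in a single place, sketched below, is not the override-everything approach mentioned above but a simpler substitute: collect the bytes and decode the whole pending chunk on flush. TextSinkOutputStream and its sink callback are hypothetical names, and the sketch assumes every flush falls on a character boundary, which should hold for text sent through print/println but not for raw write(int) calls.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

class TextSinkOutputStream extends OutputStream {

    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final Consumer<String> sink; // hypothetical callback, e.g. one that appends to a text component

    TextSinkOutputStream(Consumer<String> sink) {
        this.sink = sink;
    }

    @Override
    public synchronized void write(int b) {
        buffer.write(b);
    }

    @Override
    public synchronized void write(byte[] bytes, int off, int len) {
        buffer.write(bytes, off, len);
    }

    @Override
    public synchronized void flush() {
        if (buffer.size() == 0) {
            return;
        }
        // Decode the whole pending chunk at once instead of byte by byte.
        String text = new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        buffer.reset();
        sink.accept(text);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        StringBuilder captured = new StringBuilder();
        // The PrintStream encoding must match the charset used in flush().
        PrintStream out = new PrintStream(
                new TextSinkOutputStream(s -> captured.append(s)), true, "UTF-8");
        out.print("caf\u00e9");
        out.flush();
        System.out.println(captured.toString().equals("caf\u00e9"));
    }
}
```

For Swing, the sink could be something like text -> SwingUtilities.invokeLater(() -> textArea.append(text)), so that the component is only touched on the event dispatch thread, and the stream would be installed with System.setOut.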

Answer by marcospereira

You should create the PrintStream with the right encoding: http://www.j2ee.me/j2se/1.5.0/docs/api/java/io/PrintStream.html#PrintStream(java.io.File, java.lang.String)
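
As a small illustration of what the encoding argument changes, the sketch below prints the same character through two PrintStreams and compares the bytes that come out. It uses the PrintStream(OutputStream, boolean, String) constructor rather than the File-based one linked above; PrintStreamEncodingDemo and encodeVia are illustrative names only.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class PrintStreamEncodingDemo {

    // Returns the bytes a PrintStream emits for the given text and charset name.
    static byte[] encodeVia(String text, String charsetName) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (PrintStream ps = new PrintStream(sink, true, charsetName)) {
            ps.print(text);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // the character é
        System.out.println(encodeVia(text, "UTF-8").length);      // 2
        System.out.println(encodeVia(text, "ISO-8859-1").length); // 1
    }
}
```

Whatever reads those bytes on the other side has to decode them with the same charset the PrintStream was created with.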

Could you please provide more code showing what you are trying to do?

Answer by Carsten

As you rightly assume, the problem is most likely in:

String s = Character.toString((char)i);

Since you encode with UTF-8, a character may be encoded with more than one byte, so appending each byte you read as a separate character won't work.
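
A short sketch of that failure mode, with made-up names (BytePerCharDemo, bytesAsChars) and each byte treated as an unsigned value:

```java
import java.nio.charset.StandardCharsets;

class BytePerCharDemo {

    // Mimics the problematic pattern: every single byte becomes its own char.
    static String bytesAsChars(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            int i = b & 0xFF; // treat the byte as an unsigned value 0..255
            sb.append(Character.toString((char) i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] utf8 = "\u00e9".getBytes(StandardCharsets.UTF_8); // é is encoded as 0xC3 0xA9
        String broken = bytesAsChars(utf8);
        System.out.println(broken.length());               // 2
        System.out.println(broken.equals("\u00c3\u00a9")); // true: the two chars Ã and ©
    }
}
```

One character went in and two unrelated characters came out, which is the typical garbled output seen when UTF-8 bytes are appended one at a time.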

To make it work, you can try writing all bytes into a ByteBuffer and using a CharsetDecoder (Charset.forName("UTF-8").newDecoder(), with "UTF-8" to match the PrintStream) to convert them into characters, which you then add to the panel.
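
The ByteBuffer plus CharsetDecoder idea could be sketched roughly as follows. This is an untested-in-Swing outline under stated assumptions: IncrementalUtf8Decoder and feed are hypothetical names, the 8192-byte buffer is an arbitrary limit chosen for the sketch, and malformed input is replaced rather than reported.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class IncrementalUtf8Decoder {

    private final CharsetDecoder decoder = Charset.forName("UTF-8").newDecoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE);

    // Assumed limit for this sketch; put() throws BufferOverflowException beyond it.
    private final ByteBuffer pending = ByteBuffer.allocate(8192);

    // Feeds more bytes and returns the characters that are complete so far.
    String feed(byte[] bytes, int off, int len) {
        pending.put(bytes, off, len);
        pending.flip();
        // UTF-8 never decodes to more chars than it has bytes, so this is large enough.
        CharBuffer chars = CharBuffer.allocate(pending.remaining());
        decoder.decode(pending, chars, false); // false: more input may follow
        pending.compact();                     // keep an incomplete trailing sequence for next time
        chars.flip();
        return chars.toString();
    }

    public static void main(String[] args) {
        IncrementalUtf8Decoder d = new IncrementalUtf8Decoder();
        byte[] utf8 = "\u00e9".getBytes(StandardCharsets.UTF_8); // 0xC3 0xA9
        System.out.println(d.feed(utf8, 0, 1).isEmpty());       // true: first byte alone is incomplete
        System.out.println(d.feed(utf8, 1, 1).equals("\u00e9")); // true: second byte completes é
    }
}
```

Inside an OutputStream, write(int) and write(byte[], int, int) would call feed and append the returned string to the text component, so a multi-byte character split across two writes is still decoded correctly.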

I haven't tried it to make sure it works, but I think it is worth a try.
