UTF-8 to EBCDIC in Java
Source: http://stackoverflow.com/questions/771054/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Question
Our requirement is to send EBCDIC text to a mainframe. We have some Chinese characters, hence the UTF-8 format. So, is there a way to convert UTF-8 characters to EBCDIC?
Thanks, Raj Mohan
Answer by Lawrence Dol
Assuming your target system is an IBM mainframe or midrange, it has full support for all of the EBCDIC encodings built into its JVM, as encodings named CPxxxx, corresponding to the IBM CCSIDs (CP stands for code page). You will need to do the translations on the host side, since the client side will not have the necessary encoding support.
Since Unicode goes beyond DBCS and covers every known character, you will likely be targeting multiple EBCDIC encodings, so you will probably need to make those encodings configurable in some way. Try to keep your client Unicode-only (UTF-8, UTF-16, etc.), with the translations being done as data arrives on the host and/or leaves the host system.
Other than needing to do the translations host-side, the mechanics are the same as for any Java translation; e.g. new String(bytes, encoding) and String.getBytes(encoding), plus the various NIO and writer classes. There's really no magic: it's no different from translating between, say, ISO 8859-x and Unicode, or any other SBCS (or limited DBCS).
For example:
byte[] ebcdta="Hello World".getBytes("CP037"); // get bytes for EBCDIC codepage 37
You can find more information on IBM's documentation website.
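To make the mechanics above concrete, here is a minimal sketch of a round trip between a Java String and EBCDIC bytes. It assumes the running JVM ships the IBM037 charset (code page 37); the class and method names are illustrative and not part of the original answer. Passing a Charset object instead of a name string avoids the checked UnsupportedEncodingException.

```java
import java.nio.charset.Charset;

class EbcdicRoundTrip {
    // Encode a Java String into the bytes of the named charset.
    static byte[] encode(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName));
    }

    // Decode bytes of the named charset back into a Java String.
    static String decode(byte[] data, String charsetName) {
        return new String(data, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Assumption: IBM037 is available in this JVM.
        byte[] ebcdic = encode("Hello World", "IBM037");
        // Print the first byte in hex; it differs from the ASCII value of 'H'.
        System.out.printf("first byte: 0x%02X%n", ebcdic[0] & 0xFF);
        System.out.println(decode(ebcdic, "IBM037"));
    }
}
```

Charset.forName throws an unchecked UnsupportedCharsetException if the charset is missing, so a caller that cannot guarantee availability may want to check Charset.isSupported first.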
Answer by Arne Burmeister
EBCDIC has many 8-bit code pages. Many of them are supported by the VM. Have a look at Charset.availableCharsets().keySet(); the EBCDIC pages are named IBM... (there are aliases like cp500 for IBM500, as you can see by Charset.forName("IBM500").aliases()).
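The lookup described above could be sketched as follows. The class and method names are illustrative. Note that a name containing "IBM" does not by itself mean EBCDIC: some IBM code pages (for example IBM437 or IBM850) are ASCII-based PC code pages, so the list still needs to be checked against the code page you actually want.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class ListIbmCharsets {
    // Collect the canonical names of all installed charsets whose name contains "IBM".
    static List<String> ibmCharsetNames() {
        List<String> names = new ArrayList<>();
        for (String name : Charset.availableCharsets().keySet()) {
            if (name.toUpperCase(Locale.ROOT).contains("IBM")) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Print each IBM charset together with its aliases (e.g. cp500 for IBM500).
        for (String name : ibmCharsetNames()) {
            System.out.println(name + " " + Charset.forName(name).aliases());
        }
    }
}
```

Running this on the target runtime is also a quick way to see which code pages that particular VM actually provides.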
There are two problems:
- If your text contains characters that belong to different EBCDIC code pages, a single charset will not help.
- I am not sure whether these charsets are available in every VM outside Windows.
For the first, have a look at this approach. For the second, try it on the desired target runtime ;-)
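One way to detect the first problem before sending data is to ask the charset's encoder whether it can represent the text at all. This is a sketch with illustrative names, assuming IBM037 is available; it matters because String.getBytes does not fail on unmappable characters but silently substitutes the charset's replacement bytes.

```java
import java.nio.charset.Charset;

class EncodableCheck {
    // True if every character of the text can be represented in the named charset.
    static boolean fitsIn(String text, String charsetName) {
        // A fresh encoder is created per call because encoders are not thread-safe.
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String latin = "Hello World";
        String chinese = "\u4E2D\u6587"; // two Chinese characters
        // Assumption: IBM037 (a single-byte Latin code page) is available in this JVM.
        System.out.println("latin fits IBM037:   " + fitsIn(latin, "IBM037"));
        System.out.println("chinese fits IBM037: " + fitsIn(chinese, "IBM037"));
    }
}
```

If the check fails for every candidate code page, the text would have to be split or sent in a multi-byte code page that covers all of its characters.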
Answer by Thorbjørn Ravn Andersen
For the midrange AS/400 (IBM i these days), the best bet is to use the IBM Toolbox for Java (jt400.jar), which does all these things transparently (perhaps with a slight hint).
Please note that inside Java a character is a 16-bit value (a UTF-16 code unit), not UTF-8, which is an encoding of bytes.
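In practice this means a UTF-8 to EBCDIC conversion is two steps: decode the UTF-8 bytes into a String, then encode that String in the target code page. The sketch below uses illustrative names. The charset name x-IBM935 (a Simplified Chinese EBCDIC code page) is an assumption about what the target needs and may not be present in every JVM, so the example checks for it first.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Utf8ToEbcdic {
    // Decode UTF-8 bytes into a String, then re-encode it in the target charset.
    static byte[] transcode(byte[] utf8, String targetCharset) {
        String text = new String(utf8, StandardCharsets.UTF_8);
        return text.getBytes(Charset.forName(targetCharset));
    }

    public static void main(String[] args) {
        byte[] utf8 = "\u4E2D\u6587".getBytes(StandardCharsets.UTF_8);
        // Assumed target code page; pick the one your host actually uses.
        String target = "x-IBM935";
        if (Charset.isSupported(target)) {
            System.out.println(transcode(utf8, target).length + " bytes in " + target);
        } else {
            System.out.println(target + " is not available in this JVM");
        }
    }
}
```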
Answer by Ahmad Y. Saleh
You can always make use of the IBM Toolbox for Java (JTOpen), specifically the com.ibm.as400.access.AS400Text class in jt400.jar.
It goes as follows:
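// Requires jt400.jar (IBM Toolbox for Java / JTOpen) on the classpath.
import com.ibm.as400.access.AS400Text;
int codePageNumber = 420;      // CCSID 420: Arabic EBCDIC
String codePage = "CP420";     // Java charset name for the same code page
String sourceUtfText = "???? ???? ????";
// Converter sized to the text length, targeting the given CCSID
AS400Text converter = new AS400Text(sourceUtfText.length(), codePageNumber);
byte[] bytesData = converter.toBytes(sourceUtfText);   // EBCDIC-encoded bytes
// Decode the EBCDIC bytes back into a Java String to verify the conversion
String resultedEbcdicText = new String(bytesData, codePage);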
I used code page 420 and its corresponding Java encoding name, CP420. This code page is used for Arabic text, so you should pick a suitable code page for Chinese text instead.