在java中将EBCDIC转换为ASCII

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/20349491/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-13 01:15:30  来源:igfitidea点击:

Converting EBCDIC to ASCII in java

javaasciiinputstreamcobolebcdic

提问by Robin-Hoodie

I am supposed to convert an EBCDIC file to ASCII by using Java. So far I have this code:

我应该使用 Java 将 EBCDIC 文件转换为 ASCII。到目前为止,我有这个代码:

public class Migration {
    InputStreamReader reader;
    StringBuilder builder;

    public Migration(){
        try {
            reader = new InputStreamReader(new FileInputStream("C:\TI3\Legacy Systemen\Week 3\Oefening 3\inputfile.dat"),
                   java.nio.charset.Charset.forName("ibm500") );
        } catch(FileNotFoundException e){
            e.printStackTrace();
        }
        builder = new StringBuilder();
    }

    public void read() throws IOException {
        int theInt;
        while((theInt = reader.read()) != -1){
            char theChar = (char) theInt;
            builder.append(theChar);

        }

        reader.close();
    }

    @Override
    public String toString(){
        return builder.toString();
    }
}

The file description is the following:

文件描述如下:

 02 KDGEX.
      05 B1-LENGTH PIC S9(04) USAGE IS COMP.
      05 B1-CODE PIC S9(04) USAGE IS COMP.
      05 B1-NUMBER PIC X(08).
      05 B1-PPR-NAME PIC X(06).
      05 B1-PPR-FED PIC 9(03).
      05 B1-PPR-RNR PIC S9(08) USAGE IS COMP.
      05 B1-DATA.
        10 B1-VBOND PIC 9(02).
        10 B1-KONST.
          20 B1-AFDEL PIC 9(03).
          20 B1-KASSIER PIC 9(03).
          20 B1-DATZIT-DM PIC 9(04).
        10 B1-BETWYZ PIC X(01).
        10 B1-RNR PIC X(13).
        10 B1-BETKOD PIC 9(02).
        10 B1-VOLGNR-INF PIC 9(02).
        10 B1-QUAL-PREST PIC 9(03).
        10 B1-REKNUM PIC 9(12).
        10 B1-REKNR REDEFINES B1-REKNUM.
          20 B1-REKNR-PART1 PIC 9(03).
          20 B1-REKNR-PART2 PIC 9(07).
          20 B1-REKNR-PART3 PIC 9(02).
        10 B1-VOLGNR-M30 PIC 9(03).
        10 B1-OMSCHR.
          15 B1-OMSCHR1 PIC X(14).
          15 B1-OMSCHR2 PIC X(14).
        10 B1-OMSCHR-INF REDEFINES B1-OMSCHR.
          15 B1-AANT-PREST PIC 9(02).
          15 B1-VERSTR PIC 9(01).
          15 B1-LASTDATE PIC 9(06).
          15 B1-HONOR PIC 9(06).
          15 B1-RIJKN PIC X(13).
        10 FILLER--1 PIC 9(02).
        10 B1-INFOREK PIC 9(01).
        10 B1-BEDRAG-EUR PIC 9(08).
        10 B1-BEDRAG-DV PIC X(01).
        10 B1-BEDRAG-RMG-DV REDEFINES B1-BEDRAG-DV PIC X(01).
      05 FILLER PIC X(5).

We can ignore the first 2 bytes on every line. The problem is the bytes where there's a USAGE IS COMP since the reader is not converting them properly, I think I am supposed to read these as bytes or something, though I have no idea how.

我们可以忽略每行的前 2 个字节。问题是有 USAGE IS COMP 的字节,因为阅读器没有正确转换它们,我想我应该将这些作为字节或其他东西来读取,尽管我不知道如何。

采纳答案by McDowell

If I am interpreting this format correctly you have a binary file format with fixed-length records. Some of these records are not character data (COBOL computational fields?)

如果我正确地解释了这种格式,那么您就有了一个带有固定长度记录的二进制文件格式。其中一些记录不是字符数据(COBOL 计算字段?)

So, you will have to read the records using a more low-level approach processing individual fields of each record:

因此,您将不得不使用处理每条记录的各个字段的更底层方法来读取记录:

import java.io.*;

public class Record {
  private byte[] kdgex = new byte[2]; // COMP
  private byte[] b1code = new byte[2]; // COMP
  private byte[] b1number = new byte[8]; // DISPLAY
  // other fields

  public void read(DataInput data) throws IOException {
    data.readFully(kdgex);
    data.readFully(b1code);
    data.readFully(b1number);
    // other fields
  }

  public void write(DataOutput out) throws IOException {
    out.write(kdgex);
    out.write(b1code);
    out.write(b1number);
    // other fields
  }
}

Here I've used byte arrays for the first three fields of the record but you could use other more suitable types where appropriate (like a shortfor the first field with readShort.) Note: my interpretation of the field widths is likely wrong; it is just an example.

在这里,我在记录的前三个字段中使用了字节数组,但您可以在适当的情况下使用其他更合适的类型(例如short带有readShort的第一个字段。)注意:我对字段宽度的解释可能是错误的;这只是一个例子。

The DataInputStreamis generally used as a DataInputimplementation.

DataInputStream类一般用作DataInput中的实现。

Since all characters in the source and target encodings use a one-octet-per code point you should be able to transcode the character data fields using a method like this:

由于源和目标编码中的所有字符都使用一个八位字节的代码点,因此您应该能够使用如下方法对字符数据字段进行转码:

public static byte[] transcodeField(byte[] source, Charset from, Charset to) {
  byte[] result = new String(source, from).getBytes(to);
  if (result.length != source.length) {
    throw new AssertionError(result.length + "!=" + source.length);
  }
  return result;
}

I suggest tagging your question with COBOL (assuming that is the source of this format) so that someone else can speak with more authority on the format of the data source.

我建议用 COBOL 标记您的问题(假设这是这种格式的来源),以便其他人可以更权威地讨论数据源的格式。

回答by Murali

I also faced same issue like converting EBCDIC to ASCII string. Please find the code below to convert a single EBCDIC to ASCII string.

我也遇到了同样的问题,比如将 EBCDIC 转换为 ASCII 字符串。请在下面找到将单个 EBCDIC 转换为 ASCII 字符串的代码。

public class EbcdicConverter
{
    public static void main(String[] args) 
        throws Exception
    {
        String ebcdicString =<your EBCDIC string>;
        // convert String into InputStream
        InputStream is = new ByteArrayInputStream(ebcdicString.getBytes());
        ByteArrayOutputStream baos=new ByteArrayOutputStream();

        int line;
         while((line = is.read()) != -1) {
             baos.write((char)line);
         }
         String str = baos.toString("Cp500");
         System.out.println(str);
    }
}