C#: Convert COMP-3 Packed Decimal to a Human-Readable Value
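I have a series of ASCII flat files coming in from a mainframe, to be processed by a C# application. A new feed has been introduced with a Packed Decimal (COMP-3) field, which needs to be converted to a numeric value.
The files are transferred via FTP, using ASCII transfer mode. I am concerned that the binary field may contain bytes that will be interpreted as very low ASCII codes or control characters instead of a value, or worse, that may be lost in the FTP process.
What's more, the fields are being read as strings. I may have the flexibility to work around this part (for example, with a stream of some sort), but the business will push back.
The requirement read "Convert from HEX to ASCII", but that clearly did not yield the correct values. Any help would be appreciated; it need not be language-specific as long as you can explain the logic of the conversion process.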
Solution
I apologize if I am way off base here, but perhaps the code sample I am pasting below can help. It comes from VBRocks...
    Imports System
    Imports System.IO
    Imports System.Text
    Imports System.Text.Encoding

    '4/20/07 submission includes a line spacing addition when a control character is used:
    '   The line spacing is calculated off of the 3rd control character.
    '
    '   Also includes the 4/18 modification of determining end of file.

    '4/26/07 submission includes an addition of 6 to the record length when the 4th control
    '   character is an 8.  This is because these records were being truncated.

    'Authored by Gary A. Lima, aka. VBRocks

    ''' <summary>
    ''' Translates an EBCDIC file to an ASCII file.
    ''' </summary>
    ''' <remarks></remarks>
    Public Class EBCDIC_to_ASCII_Translator

    #Region " Example"

        Private Sub Example()
            'Set your source file and destination file paths
            Dim sSourcePath As String = "c:\Temp\MyEBCDICFile"
            Dim sDestinationPath As String = "c:\Temp\TranslatedFile.txt"

            Dim trans As New EBCDIC_to_ASCII_Translator()

            'If your EBCDIC file uses Control records to determine the length of a record, then set this to True
            trans.UseControlRecord = True

            'If the first record of your EBCDIC file is filler (junk), then set this to True
            trans.IgnoreFirstRecord = True

            'EBCDIC files are written in block lengths, set your block length (Example:  134, 900, Etc.)
            trans.BlockLength = 900

            'This method will actually translate your source file and output it to the specified destination file path
            trans.TranslateFile(sSourcePath, sDestinationPath)

            'Here is an alternate example:
            'No Control record is used
            'trans.UseControlRecord = False

            'Translate the whole file, including the first record
            'trans.IgnoreFirstRecord = False

            'Set the block length
            'trans.BlockLength = 134

            'Translate...
            'trans.TranslateFile(sSourcePath, sDestinationPath)

            '*** Some additional methods that you can use are:

            'Trim off leading characters from left side of string (position 0 to...)
            'trans.LTrim = 15

            'Translate 1 EBCDIC character to an ASCII character
            'Dim strASCIIChar as String = trans.TranslateCharacter("S")

            'Translate an EBCDIC character array to an ASCII string
            'trans.TranslateCharacters(chrEBCDICArray)

            'Translates an EBCDIC string to an ASCII string
            'Dim strASCII As String = trans.TranslateString("EBCDIC String")
        End Sub

    #End Region    'Example

        'Translate characters from EBCDIC to ASCII
        Private ASCIIEncoding As Encoding = Encoding.ASCII
        Private EBCDICEncoding As Encoding = Encoding.GetEncoding(37)  'EBCDIC

        'Block Length:  Can be fixed (Ex:  134).
        Private miBlockLength As Integer = 0
        Private mbUseControlRec As Boolean = True        'If set to False, will return exact block length
        Private mbIgnoreFirstRecord As Boolean = True    'Will Ignore first record if set to true  (First record may be filler)
        Private miLTrim As Integer = 0

        ''' <summary>
        ''' Translates SourceFile from EBCDIC to ASCII.  Writes output to file path specified by DestinationFile parameter.
        ''' Set the BlockLength Property to designate block size to read.
        ''' </summary>
        ''' <param name="SourceFile">Enter the path of the Source File.</param>
        ''' <param name="DestinationFile">Enter the path of the Destination File.</param>
        ''' <remarks></remarks>
        Public Sub TranslateFile(ByVal SourceFile As String, ByVal DestinationFile As String)

            Dim iRecordLength As Integer     'Stores length of a record, not including the length of the Control Record (if used)
            Dim sRecord As String = ""       'Stores the actual record
            Dim iLineSpace As Integer = 1    'LineSpace: 1 for Single Space, 2 for Double Space, 3 for Triple Space...

            Dim iControlPosSix As Byte()     'Stores the 6th character of a Control Record (used to calculate record length)
            Dim iControlRec As Byte()        'Stores the EBCDIC Control Record (First 6 characters of record)
            Dim bEOR As Boolean              'End of Record Flag
            Dim bBOF As Boolean = True       'Beginning of file
            Dim iConsumedChars As Integer = 0                     'Stores the number of consumed characters in the current block
            Dim bIgnoreRecord As Boolean = mbIgnoreFirstRecord    'Ignores the first record if set.

            Dim ControlArray(5) As Char      'Stores Control Record (first 6 bytes)
            Dim chrArray As Char()           'Stores characters just after read from file

            Dim sr As New StreamReader(SourceFile, EBCDICEncoding)
            Dim sw As New StreamWriter(DestinationFile)

            'Set the RecordLength to the RecordLength Property (below)
            iRecordLength = miBlockLength

            'Loop through entire file
            Do Until sr.EndOfStream = True

                'If using a Control Record, then check record for valid data.
                If mbUseControlRec = True Then
                    'Read the Control Record (first 6 characters of the record)
                    sr.ReadBlock(ControlArray, 0, 6)

                    'Update the value of consumed (read) characters
                    iConsumedChars += ControlArray.Length

                    'Get the bytes of the Control Record Array
                    iControlRec = EBCDICEncoding.GetBytes(ControlArray)

                    'Set the line spacing  (position 3 divided by 64)
                    '   (64 decimal = Single Spacing; 128 decimal = Double Spacing)
                    iLineSpace = iControlRec(2) / 64

                    'Check the Control record for End of File
                    'If the Control record has a 8 or 10 in position 1, and a 1 in position 2, then it is the end of the file
                    If (iControlRec(0) = 8 OrElse iControlRec(0) = 10) AndAlso _
                        iControlRec(1) = 1 Then

                        If bBOF = False Then
                            Exit Do
                        Else
                            'The Beginning of file flag is set to true by default, so when the first
                            '   record is encountered, it is bypassed and the bBOF flag is set to False
                            bBOF = False
                        End If    'If bBOF = False

                    End If    'If (iControlRec(0) = 8 OrElse

                    'Set the default value for the End of Record flag to True
                    '   If the Control Record has all zeros, then it's True, else False
                    bEOR = True

                    'If the Control record contains all zeros, bEOR will stay True, else it will be set to False
                    For i As Integer = 0 To 5
                        If iControlRec(i) > 0 Then
                            bEOR = False
                            Exit For
                        End If    'If iControlRec(i) > 0
                    Next    'For i As Integer = 0 To 5

                    If bEOR = False Then
                        'Convert EBCDIC character to ASCII
                        'Multiply the 6th byte by 6 to get record length
                        '   Why multiply by 6?  Because it works.
                        iControlPosSix = EBCDICEncoding.GetBytes(ControlArray(5))

                        'If the 4th position of the control record is an 8, then add 6
                        '    to the record length to pick up remaining characters.
                        If iControlRec(3) = 8 Then
                            iRecordLength = CInt(iControlPosSix(0)) * 6 + 6
                        Else
                            iRecordLength = CInt(iControlPosSix(0)) * 6
                        End If

                        'Add the length of the record to the Consumed Characters counter
                        iConsumedChars += iRecordLength
                    Else
                        'If the Control Record had all zeros in it, then it is the end of the Block.

                        'Consume the remainder of the block so we can continue at the beginning of the next block.
                        ReDim chrArray(miBlockLength - iConsumedChars - 1)
                        'ReDim chrArray(iRecordLength - iConsumedChars - 1)

                        'Consume (read) the remaining characters in the block.
                        '   We are not doing anything with them because they are not actual records.
                        'sr.ReadBlock(chrArray, 0, iRecordLength - iConsumedChars)
                        sr.ReadBlock(chrArray, 0, miBlockLength - iConsumedChars)

                        'Reset the Consumed Characters counter
                        iConsumedChars = 0

                        'Set the Record Length to 0 so it will not be processed below.
                        iRecordLength = 0
                    End If    'If bEOR = False

                End If    'If mbUseControlRec = True

                If iRecordLength > 0 Then
                    'Resize our array, dumping previous data.  Because Arrays are Zero (0) based, subtract 1 from the Record length.
                    ReDim chrArray(iRecordLength - 1)

                    'Read the specified record length, without the Control Record, because we already consumed (read) it.
                    sr.ReadBlock(chrArray, 0, iRecordLength)

                    'Copy Character Array to String Array, Converting in the process, then Join the Array to a string
                    sRecord = Join(Array.ConvertAll(chrArray, New Converter(Of Char, String)(AddressOf ChrToStr)), "")

                    'If the record length was 0, then the Join method may return Nothing
                    If IsNothing(sRecord) = False Then

                        If bIgnoreRecord = True Then
                            'Do nothing - bypass record

                            'Reset flag
                            bIgnoreRecord = False
                        Else
                            'Write the line out, LTrimming the specified number of characters.
                            If sRecord.Length >= miLTrim Then
                                sw.WriteLine(sRecord.Remove(0, miLTrim))
                            Else
                                sw.WriteLine(sRecord.Remove(0, sRecord.Length))
                            End If    'If sRecord.Length >= miLTrim

                            'Write out the number of blank lines specified by the 3rd control character.
                            For i As Integer = 1 To iLineSpace - 1
                                sw.WriteLine("")
                            Next    'For i As Integer = 1 To iLineSpace
                        End If    'If bIgnoreRecord = True

                        'Obviously, if we have read more characters from the file than the designated size of the block,
                        '   then subtract the number of characters we have read into the next block from the block size.
                        If iConsumedChars > miBlockLength Then
                            'If iConsumedChars > iRecordLength Then
                            iConsumedChars = iConsumedChars - miBlockLength
                            'iConsumedChars = iConsumedChars - iRecordLength
                        End If

                    End If    'If IsNothing(sRecord) = False

                End If    'If iRecordLength > 0

                'Allow computer to process  (works in a class module, not in a dll)
                'Application.DoEvents()
            Loop

            'Destroy StreamReader (sr)
            sr.Close()
            sr.Dispose()

            'Destroy StreamWriter (sw)
            sw.Close()
            sw.Dispose()

        End Sub

        ''' <summary>
        ''' Translates 1 EBCDIC Character (Char) to an ASCII String
        ''' </summary>
        ''' <param name="chr"></param>
        ''' <returns></returns>
        ''' <remarks></remarks>
        Private Function ChrToStr(ByVal chr As Char) As String
            Dim sReturn As String = ""

            'Convert character into byte
            Dim EBCDICbyte As Byte() = EBCDICEncoding.GetBytes(chr)

            'Convert EBCDIC byte to ASCII byte
            Dim ASCIIByte As Byte() = Encoding.Convert(EBCDICEncoding, ASCIIEncoding, EBCDICbyte)

            sReturn = Encoding.ASCII.GetString(ASCIIByte)

            Return sReturn
        End Function

        ''' <summary>
        ''' Translates an EBCDIC String to an ASCII String
        ''' </summary>
        ''' <param name="sStringToTranslate"></param>
        ''' <returns>String</returns>
        ''' <remarks></remarks>
        Public Function TranslateString(ByVal sStringToTranslate As String) As String
            Dim i As Integer = 0
            Dim sReturn As New System.Text.StringBuilder()

            'Loop through the string and translate each character
            For i = 0 To sStringToTranslate.Length - 1
                sReturn.Append(ChrToStr(sStringToTranslate.Substring(i, 1)))
            Next

            Return sReturn.ToString()
        End Function

        ''' <summary>
        ''' Translates 1 EBCDIC Character (Char) to an ASCII String
        ''' </summary>
        ''' <param name="sCharacterToTranslate"></param>
        ''' <returns>String</returns>
        ''' <remarks></remarks>
        Public Function TranslateCharacter(ByVal sCharacterToTranslate As Char) As String
            Return ChrToStr(sCharacterToTranslate)
        End Function

        ''' <summary>
        ''' Translates an EBCDIC Character (Char) Array to an ASCII String
        ''' </summary>
        ''' <param name="sCharacterArrayToTranslate"></param>
        ''' <returns>String</returns>
        ''' <remarks>Remarks</remarks>
        Public Function TranslateCharacters(ByVal sCharacterArrayToTranslate As Char()) As String
            Dim sReturn As String = ""

            'Copy Character Array to String Array, Converting in the process, then Join the Array to a string
            sReturn = Join(Array.ConvertAll(sCharacterArrayToTranslate, _
                                New Converter(Of Char, String)(AddressOf ChrToStr)), "")

            Return sReturn
        End Function

        ''' <summary>
        ''' Block Length must be set.  You can set the BlockLength for specific block sizes (Ex:  134).
        ''' Set UseControlRecord = False for files with specific block sizes (Default is True)
        ''' </summary>
        ''' <value>0</value>
        ''' <returns>Integer</returns>
        ''' <remarks></remarks>
        Public Property BlockLength() As Integer
            Get
                Return miBlockLength
            End Get
            Set(ByVal value As Integer)
                miBlockLength = value
            End Set
        End Property

        ''' <summary>
        ''' Determines whether a ControlKey is used to calculate RecordLength of valid data
        ''' </summary>
        ''' <value>Default value is True</value>
        ''' <returns>Boolean</returns>
        ''' <remarks></remarks>
        Public Property UseControlRecord() As Boolean
            Get
                Return mbUseControlRec
            End Get
            Set(ByVal value As Boolean)
                mbUseControlRec = value
            End Set
        End Property

        ''' <summary>
        ''' Ignores first record if set (Default is True)
        ''' </summary>
        ''' <value>Default is True</value>
        ''' <returns>Boolean</returns>
        ''' <remarks></remarks>
        Public Property IgnoreFirstRecord() As Boolean
            Get
                Return mbIgnoreFirstRecord
            End Get
            Set(ByVal value As Boolean)
                mbIgnoreFirstRecord = value
            End Set
        End Property

        ''' <summary>
        ''' Trims the left side of every string the specified number of characters.  Default is 0.
        ''' </summary>
        ''' <value>Default is 0.</value>
        ''' <returns>Integer</returns>
        ''' <remarks></remarks>
        Public Property LTrim() As Integer
            Get
                Return miLTrim
            End Get
            Set(ByVal value As Integer)
                miLTrim = value
            End Set
        End Property

    End Class
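First of all, you must eliminate the end-of-line (EOL) translation problem caused by ASCII transfer mode. You are absolutely right to be concerned about data corruption when BCD values happen to coincide with EOL characters. The worst aspect of this problem is that it will occur rarely and unexpectedly.
The best solution is to change the transfer mode to BIN. This is appropriate because the data you are transferring is binary. If it is not possible to use the correct FTP transfer mode, you can undo the ASCII mode damage in code: all you have to do is convert \r\n pairs back to \n. If I were you, I would make sure this is well tested.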
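If the transfer mode really cannot be changed, the repair could look roughly like the sketch below. It assumes the FTP client expanded every LF (0x0A) into CR LF (0x0D 0x0A) and touched nothing else; the names `AsciiModeRepair` and `CollapseCrLf` are made up for illustration. Keep in mind that a packed field which legitimately contained 0x0D followed by 0x0A gets collapsed as well, so this cannot recover every file.

```csharp
using System;
using System.Collections.Generic;

public static class AsciiModeRepair
{
    // Collapses every CR LF pair (0x0D 0x0A) back to a single LF (0x0A).
    // Assumption: the transfer turned each LF into CR LF and changed nothing else.
    public static byte[] CollapseCrLf(byte[] input)
    {
        var output = new List<byte>(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            bool isCrBeforeLf = input[i] == 0x0D
                                && i + 1 < input.Length
                                && input[i + 1] == 0x0A;
            if (!isCrBeforeLf)
            {
                // Every byte is kept except a CR that directly precedes an LF.
                output.Add(input[i]);
            }
        }
        return output.ToArray();
    }
}
```

A call would be something like `byte[] repaired = AsciiModeRepair.CollapseCrLf(File.ReadAllBytes(path));`, where `path` is whatever file you received.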
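Once you have dealt with the EOL problem, the COMP-3 conversion is fairly straightforward. I found an article in the MS Knowledge Base (KB 65323) with sample code in BASIC. See below for a VB.NET port of this code.
Since you are dealing with COMP-3 values, the file format you are reading almost certainly has fixed record sizes with fixed field lengths. If I were you, I would get my hands on the file format specification before going any further. You should be using a BinaryReader to work with this data. If someone pushes back on this point, I would walk away. Let them find someone else to indulge their folly.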
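As a rough illustration of the BinaryReader approach, here is a minimal sketch that reads fixed-length records as raw bytes and slices a field out by offset. The record length, offsets, and the names `FixedRecordReader`, `ReadRecords`, and `Slice` are all assumptions for the example; the real values have to come from the file format specification.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class FixedRecordReader
{
    // Reads consecutive fixed-length records as raw bytes until the stream ends.
    // Note: disposing the BinaryReader also closes the underlying stream.
    public static List<byte[]> ReadRecords(Stream source, int recordLength)
    {
        if (recordLength <= 0)
            throw new ArgumentOutOfRangeException(nameof(recordLength));

        var records = new List<byte[]>();
        using (var reader = new BinaryReader(source))
        {
            while (true)
            {
                // ReadBytes returns fewer bytes than requested only at end of stream.
                byte[] record = reader.ReadBytes(recordLength);
                if (record.Length == 0)
                    break;
                if (record.Length < recordLength)
                    throw new InvalidDataException("The last record is truncated.");
                records.Add(record);
            }
        }
        return records;
    }

    // Copies one field out of a record; offset and length come from the record layout.
    public static byte[] Slice(byte[] record, int offset, int length)
    {
        var field = new byte[length];
        Array.Copy(record, offset, field, 0, length);
        return field;
    }
}
```

The point of working on `byte[]` throughout is that no text decoding ever touches the packed fields; only the fields the layout marks as text should be passed to an `Encoding`.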
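Here is a VB.NET port of the BASIC sample code. I have not tested it, because I do not have access to a COMP-3 file. If it does not work, I would refer back to the original MS sample code for guidance, or to the references in the other answers to this question.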
    Imports System
    Imports Microsoft.VisualBasic

    Module Module1

        'Sample COMP-3 conversion code
        'Adapted from http://support.microsoft.com/kb/65323
        'This code has not been tested

        Sub Main()

            Dim Digits%(15)       'Holds the digits for each number (max = 16).
            Dim Basiceqv#(1000)   'Holds the Basic equivalent of each COMP-3 number.

            'Added to make code compile
            '(Decimal is a reserved word in VB.NET, so the variable is named DecimalPlaces)
            Dim MyByte As Char, HighPower%, HighNibble%
            Dim LowNibble%, Digit%, E%, DecimalPlaces%, FileName$, Power%

            'Get the filename and the amount of decimal places
            'desired for each number, and open the file for binary input:
            FileName$ = InputBox("Enter the COBOL data file name: ")
            DecimalPlaces% = InputBox("Enter the number of decimal places desired: ")

            FileOpen(1, FileName$, OpenMode.Binary)

            Do Until EOF(1)   'Loop until the end of the file is reached.
                Input(1, MyByte)
                If MyByte = Chr(0) Then     'Check if byte is 0 (ASC won't work on 0).
                    'Make next two digits 0. Increment the high power to reflect
                    'the number of digits in the number plus 1.
                    Digits%(HighPower%) = 0
                    Digits%(HighPower% + 1) = 0
                    HighPower% = HighPower% + 2
                Else
                    'Extract the high and low nibbles from the byte.
                    'The high nibble will always be a digit.
                    HighNibble% = Asc(MyByte) \ 16
                    LowNibble% = Asc(MyByte) And &HF
                    Digits%(HighPower%) = HighNibble%

                    If LowNibble% <= 9 Then
                        'If low nibble is a digit, assign it and
                        'increment the high power accordingly.
                        Digits%(HighPower% + 1) = LowNibble%
                        HighPower% = HighPower% + 2
                    Else
                        'Low nibble was not a digit but a + or - sign,
                        'which signals the end of the number.
                        HighPower% = HighPower% + 1
                        Digit% = 0

                        'Start at the highest power of 10 for the number and multiply
                        'each digit by the power of 10 place it occupies.
                        For Power% = (HighPower% - 1) To 0 Step -1
                            Basiceqv#(E%) = Basiceqv#(E%) + (Digits%(Digit%) * (10 ^ Power%))
                            Digit% = Digit% + 1
                        Next

                        'If the sign read was negative (&HD), make the number negative.
                        If LowNibble% = 13 Then
                            Basiceqv#(E%) = Basiceqv#(E%) - (2 * Basiceqv#(E%))
                        End If

                        'Give the number the desired amount of decimal places, print
                        'the number, increment E% to point to the next number to be
                        'converted, and reinitialize the highest power.
                        Basiceqv#(E%) = Basiceqv#(E%) / (10 ^ DecimalPlaces%)
                        Console.WriteLine(Basiceqv#(E%))
                        E% = E% + 1
                        HighPower% = 0
                    End If
                End If
            Loop

            FileClose()   'Close the COBOL data file, and end.
        End Sub
    End Module
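If the original data was in EBCDIC, your COMP-3 field has been garbled. The FTP process has performed an EBCDIC-to-ASCII translation of the byte values in the COMP-3 field, which is not what you want. To correct this you can:
1) Use BINARY mode for the transfer so that you get the raw EBCDIC data. Then convert the COMP-3 field to a number and translate any other EBCDIC text in the record to ASCII. A packed field stores each digit in a half byte, with the last half byte holding the sign (C or F is positive and D is negative; A and E are also accepted as positive and B as negative). The decimal point is implied by the PIC clause and is not stored. In a PIC S999V99 USAGE COMP-3 field, 123.4 is stored as X'12340C' (three bytes) and -123 as X'12300D'.
2) Have the sender convert the field to a USAGE IS DISPLAY SIGN IS LEADING (or TRAILING) SEPARATE numeric field. This stores the number as a string of EBCDIC digits, with the sign as a separate character. All digits and the sign translate correctly to their ASCII equivalents in the FTP transfer.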
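For option 1, the unpacking step could be sketched in C# as follows. This is only an illustration under stated assumptions: the field arrives as raw bytes from a binary transfer, the number of implied decimal places (`scale`) is taken from the PIC clause, and the names `PackedDecimal` and `Unpack` are invented for the example.

```csharp
using System;

public static class PackedDecimal
{
    // Decodes a COMP-3 field. `scale` is the number of implied decimal places
    // (the digits after V in the PIC clause); it is not stored in the data.
    public static decimal Unpack(byte[] field, int scale)
    {
        if (field == null || field.Length == 0)
            throw new ArgumentException("Field is empty.", nameof(field));

        decimal value = 0m;
        for (int i = 0; i < field.Length; i++)
        {
            int high = field[i] >> 4;    // upper half byte
            int low = field[i] & 0x0F;   // lower half byte

            if (high > 9)
                throw new FormatException("Invalid digit nibble in byte " + i + ".");
            value = value * 10 + high;

            if (i < field.Length - 1)
            {
                // Every byte except the last holds two digits.
                if (low > 9)
                    throw new FormatException("Invalid digit nibble in byte " + i + ".");
                value = value * 10 + low;
            }
            else
            {
                // The low nibble of the last byte is the sign.
                if (low < 0x0A)
                    throw new FormatException("Missing sign nibble.");
                if (low == 0x0D || low == 0x0B)
                    value = -value;
            }
        }

        // Apply the implied decimal point.
        for (int i = 0; i < scale; i++)
            value /= 10m;

        return value;
    }
}
```

With two implied decimal places, the bytes 0x12 0x34 0x0C decode to 123.4 and 0x12 0x30 0x0D decode to -123. Using `decimal` rather than `double` keeps the digits exact, which matters for money fields.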
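Some useful links for EBCDIC translation:
A translation table, useful for checking some of the values in the packed decimal fields:
http://www.simotime.com/asc2ebc1.htm
The list of code pages on MSDN:
http://msdn.microsoft.com/zh-cn/library/dd317756(VS.85).aspx
And a piece of code to convert the byte array fields in C#: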
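    // 500 is the code page for IBM EBCDIC International.
    // Encoding is abstract, so get an instance through GetEncoding.
    // (On .NET Core and later, register CodePagesEncodingProvider.Instance first.)
    System.Text.Encoding enc = System.Text.Encoding.GetEncoding(500);
    string value = enc.GetString(byteArrayField);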
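Packed fields are the same in EBCDIC and ASCII. Do not run the EBCDIC-to-ASCII conversion on them. In .NET, dump them into a byte[].
You use bitwise masks and shifts to pack and unpack.
But bitwise operations only apply to integer types in .NET, so you need to jump through some hoops!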
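To show what those hoops look like for the packing direction, here is a hedged sketch. In C#, shifting or masking a `byte` produces an `int`, so each result has to be cast back to `byte`. The names `Comp3Packer` and `Pack` are invented for the example, and the value is assumed to be already scaled to an integer (for instance 123.40 with two implied decimals is passed as 12340).

```csharp
using System;

public static class Comp3Packer
{
    // Packs an already-scaled integer into `length` bytes of COMP-3,
    // writing sign nibble 0xC for positive and 0xD for negative values.
    public static byte[] Pack(long unscaled, int length)
    {
        if (length <= 0)
            throw new ArgumentOutOfRangeException(nameof(length));

        var field = new byte[length];
        bool negative = unscaled < 0;
        long magnitude = Math.Abs(unscaled); // throws OverflowException for long.MinValue

        // Last byte: high nibble is the final digit, low nibble is the sign.
        int sign = negative ? 0x0D : 0x0C;
        field[length - 1] = (byte)(((int)(magnitude % 10) << 4) | sign);
        magnitude /= 10;

        // Remaining bytes, right to left: two digits per byte.
        for (int i = length - 2; i >= 0; i--)
        {
            int low = (int)(magnitude % 10);
            magnitude /= 10;
            int high = (int)(magnitude % 10);
            magnitude /= 10;
            field[i] = (byte)((high << 4) | low); // the int result must be cast back to byte
        }

        if (magnitude != 0)
            throw new OverflowException("Value does not fit in the requested field length.");

        return field;
    }
}
```

For example, `Comp3Packer.Pack(12340, 3)` yields the bytes 0x12 0x34 0x0C, and `Comp3Packer.Pack(-12300, 3)` yields 0x12 0x30 0x0D.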
A good COBOL or C artist can point you in the right direction.
Find one of the old guys and pay your dues (about three beers should do it).