Original source: http://stackoverflow.com/questions/24991803/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Where is hex code of the "EOF" character?
Asked by
As far as I know, at the end of all files, especially text files, there is a hex code for an EOF or NULL character. When we write a program that reads the contents of a text file, we keep calling the read function until we receive that EOF hex code.
My question: I downloaded some tools to see a hex view of a text file, but I can't see any hex code for EOF (End Of File/NULL) or EOT (End Of Text).
ASCII/Hex code tables :
This is output of Hex viewer tools:
Note: My input file is a text file whose content is "Where is hex code of "EOF"?"
Appreciate your time and consideration.
Accepted answer by David Xu
There is no such thing as an EOF character. The operating system knows exactly how many bytes a file contains (this is stored alongside other metadata like permissions, creation date, and the name), and hence can tell programs that try to read the eleventh byte of a ten-byte file: you've reached the end of the file, there are no more bytes to read.
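The point about the size living in metadata can be checked from Java. The following is only a minimal sketch: the class name FileSizeDemo and the helper inspect are made up for illustration, and it assumes a writable temporary directory. It writes ten bytes, asks the file system for the size, and then reads one byte past the end.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of the original answer.
class FileSizeDemo {
    // Writes the text to a temporary file and returns
    // { size reported by the file system, bytes actually read, result of one extra read() }.
    static long[] inspect(String text) {
        try {
            Path p = Files.createTempFile("eof-demo", ".txt");
            try {
                Files.write(p, text.getBytes(StandardCharsets.US_ASCII));
                long size = Files.size(p); // comes from file metadata, not from the content
                try (InputStream in = Files.newInputStream(p)) {
                    long count = 0;
                    while (in.read() != -1) {
                        count++;
                    }
                    int pastEnd = in.read(); // reading past the end again
                    return new long[] { size, count, pastEnd };
                }
            } finally {
                Files.deleteIfExists(p);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] r = inspect("0123456789");
        // prints: size=10 read=10 past-end=-1
        System.out.println("size=" + r[0] + " read=" + r[1] + " past-end=" + r[2]);
    }
}
```

The reported size and the number of bytes read match exactly, so there is no room for a hidden terminator byte; the -1 is produced by the API, not stored in the file.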
In fact, the "EOF" value returned, for example, by C functions like getchar is explicitly an int value outside the range of a byte, so it cannot possibly be stored in a file!
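Java's InputStream.read() follows the same design as getchar, so the idea can be sketched in Java as well. The class ReadReturnsIntDemo and its helper readAll are illustrative names only. The sketch reads a stream containing the byte 0xFF and shows that it arrives as 255, which stays distinguishable from the end-of-stream value -1.

```java
import java.io.ByteArrayInputStream;
import java.util.Arrays;

// Hypothetical demo class, not part of the original answer.
class ReadReturnsIntDemo {
    // Returns every value produced by read(): one per data byte,
    // plus the value returned once the data is exhausted.
    static int[] readAll(byte[] data) {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        int[] out = new int[data.length + 1];
        for (int i = 0; i < out.length; i++) {
            out[i] = in.read(); // 0..255 for data, -1 at end of stream
        }
        return out;
    }

    public static void main(String[] args) {
        int[] values = readAll(new byte[] { 0x41, (byte) 0xFF });
        System.out.println(Arrays.toString(values)); // prints [65, 255, -1]
    }
}
```

If read() returned a byte instead of an int, 0xFF and "no more data" would collide; the wider return type is what keeps the end-of-file signal out of the data range.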
Sometimes, certain file formats insist on adding NUL terminators (probably because that's how strings are usually stored in C), though usually these delimit multiple records in a single file, not the file as a whole. And such decoration usually disqualifies a file from being considered a "text file".
ASCII codes like ETX and NUL date back to the days of teletypewriters and friends. NUL is used in C for in-memory strings, but this has no bearing on file systems.
Answer by David Xu
There is no such thing as EOF. EOF is just a value returned by file reading functions to tell you the file pointer reached the end of the file.
Answer by Joop Eggen
There once were even different EOF characters (for different operating systems). They are no longer seen. (Typically files were stored in blocks of 128 bytes.) They were a PITA to code for, much like BOMs are nowadays.
Instead there is still an int read() that normally delivers a byte value, but for EOF delivers -1.
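The NUL character is a string terminator in C. In Java you can have a NUL character in the middle of a string. To cooperate with C, Java's modified UTF-8 (used, for example, by DataOutputStream.writeUTF and JNI) uses a multi-byte encoding both for Unicode characters > 127 and for NUL. Standard UTF-8, as produced by String.getBytes, still encodes NUL as a single zero byte.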
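The multi-byte NUL applies to Java's modified UTF-8, as written by DataOutputStream.writeUTF, rather than to standard UTF-8 from String.getBytes. A small sketch comparing the two follows; the class NulEncodingDemo and its helpers are illustrative names only.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original answer.
class NulEncodingDemo {
    // Standard UTF-8: NUL becomes a single 0x00 byte.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // writeUTF emits a 2-byte length prefix followed by modified UTF-8,
    // in which NUL is written as the two bytes 0xC0 0x80.
    static byte[] modifiedUtf8(String s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buf)) {
                out.writeUTF(s);
            }
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Formats bytes as space-separated lowercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String s = "a\u0000b"; // a NUL in the middle of a Java string
        System.out.println(hex(standardUtf8(s))); // prints 61 00 62
        System.out.println(hex(modifiedUtf8(s))); // prints 00 04 61 c0 80 62
    }
}
```

In the second line of output, 00 04 is the length prefix added by writeUTF; the payload that follows contains no zero byte, which is what makes it safe to hand to C code that treats NUL as a terminator.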
(Some of this is probably known already.)
Answer by OldCurmudgeon
There was, a long, long time ago, an End Of File marker, but it hasn't been used in files for many years.
You can demonstrate a distant echo of it on Windows using:
C:\>copy con junk.txt
Hello
Hello again
- Press <Ctrl> and <z>
C:\>dump junk.txt
junk.txt:
00000000 4865 6c6c 6f0d 0a48 656c 6c6f 2061 6761 Hello..Hello aga
00000010 696e 0d0a in..
C:\>
Note the use of Ctrl-Z as an EOT marker.
However, notice also that the Ctrl-Z does not appear in the file any more. It used to appear as a 0x1a, but only on some operating systems, and even then not consistently.
Use of ETX (0x03) stopped even before those dim and distant times.
Answer by Carlos R Canas
You need the end-of-file character in certain instances, for example when sending a file to a printer from a Unix computer. Most Windows/DOS-enabled printers expect the end-of-file marker before they print the file stored in their memory. If no end-of-file marker is sent, the printer just sits until it times out (normally 2 minutes) and then prints the file. If you use lpr to print from Unix, you should make sure to include the end-of-file marker. Windows/DOS attach it automatically, and the printers are designed to wait for it.
Answer by kralyk
The EOT byte (0x04) is used to this day by Unix tty terminals to indicate end of input. You type it with Ctrl+D (i.e. ^D) to end input to shells or any other program reading from stdin.
However, as others have pointed out, this is distinct from EOF, which is a condition rather than a piece of data per se.
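The difference between the EOT byte and the EOF condition can be illustrated with a short Java sketch. The class EotIsDataDemo and its helper scan are made-up names, and an in-memory stream stands in for a file here. A 0x04 byte in the middle of the data is read like any other byte; reading only stops when read() reports -1.

```java
import java.io.ByteArrayInputStream;

// Hypothetical demo class, not part of the original answer.
class EotIsDataDemo {
    static final int EOT = 0x04; // ASCII End Of Transmission, typed as Ctrl+D on a tty

    // Reads until read() reports end of stream.
    // Returns { number of bytes read, how many of them were EOT bytes }.
    static int[] scan(byte[] data) {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        int count = 0;
        int eotSeen = 0;
        int b;
        while ((b = in.read()) != -1) { // -1 is the EOF condition, not a byte
            count++;
            if (b == EOT) {
                eotSeen++; // ordinary data as far as the stream is concerned
            }
        }
        return new int[] { count, eotSeen };
    }

    public static void main(String[] args) {
        int[] r = scan(new byte[] { 'A', 0x04, 'B' });
        System.out.println("bytes=" + r[0] + " eot=" + r[1]); // prints bytes=3 eot=1
    }
}
```

On a terminal it is the tty driver that turns Ctrl+D into an end-of-input condition; a 0x04 byte stored in a file or buffer carries no such meaning.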
Answer by mckenzm
In the 7-bit Wintel world it is 0x1A or chr(26).
It is still commonly found in older text files and archives and is still produced by some file transmission protocols. In particular, text files downloaded from BBS systems were commonly terminated with this character.
There are other such sentinel values for older systems, and, like the EOL variants (CR, LF, CR+LF), they need to be anticipated from time to time.
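Anticipating a legacy 0x1A terminator might look like the sketch below. The class LegacyEofDemo and the helper stripLegacyEof are hypothetical, and cutting at the first 0x1A is an assumption that only makes sense for old text files; in binary data 0x1A is an ordinary byte.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the original answer.
class LegacyEofDemo {
    static final byte SUB = 0x1A; // Ctrl-Z, the old DOS/CP/M end-of-text-file marker

    // Returns the content before the first 0x1A byte,
    // or the array unchanged if no such byte is present.
    // Assumption: the input is legacy text, so 0x1A never occurs as real content.
    static byte[] stripLegacyEof(byte[] data) {
        for (int i = 0; i < data.length; i++) {
            if (data[i] == SUB) {
                return Arrays.copyOf(data, i);
            }
        }
        return data;
    }

    public static void main(String[] args) {
        // "Hi", CR, LF, followed by two padding 0x1A bytes
        byte[] raw = { 'H', 'i', '\r', '\n', 0x1A, 0x1A };
        System.out.println(stripLegacyEof(raw).length); // prints 4
    }
}
```

Modern files normally contain no such byte, in which case the helper returns its input untouched.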
It can be a source of annoyance to see it still being used, on the same level as return(0) for instance.