Note: this page is a copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/17645/

Time: 2020-09-15 11:06:42  Source: igfitidea

Differences between unix and windows files

Tags: java, windows, unix, file

Asked by svrist

Am I correct in assuming that the only difference between "windows files" and "unix files" is the linebreak?

We have a system that has been moved from a windows machine to a unix machine and are having troubles with the format.

I need to automate the translation between unix/windows before the files get delivered to the system in our "transportsystem". I'll probably need something to determine the current format and something to transform it into the other format. If it's just the newline that's the big difference, then I'm considering just reading the files with the java.io classes. As far as I know, they are able to handle both with readLine. And then just write each line back with

while (line = readline)
    print(line + NewlineInOtherFormat)
....
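
The loop above could be sketched in Java roughly as follows. This is a minimal sketch, not a definitive implementation: the class name LineEndingConverter and the newline parameter are hypothetical names chosen for illustration. It relies on BufferedReader.readLine() accepting LF, CR, and CRLF as line terminators.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical helper for illustration; not part of the original question.
final class LineEndingConverter {

    /** Copies text line by line, terminating every line with the given newline. */
    static void convert(Reader in, Writer out, String newline) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        String line;
        // readLine() treats \n, \r and \r\n as terminators and strips them.
        while ((line = reader.readLine()) != null) {
            out.write(line);
            out.write(newline);
        }
        out.flush();
    }

    /** In-memory convenience wrapper around the streaming version. */
    static String convert(String text, String newline) {
        StringWriter out = new StringWriter();
        try {
            convert(new StringReader(text), out, newline);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

For example, LineEndingConverter.convert(text, "\r\n") would produce Windows-style text. One side effect of this approach: a final line that had no terminator gains one, so the output is not always byte-for-byte reversible.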


Summary:

samjudson:

This is only a difference in text files, where UNIX uses a single Line Feed (LF) to signify a new line, Windows uses a Carriage Return/Line Feed (CRLF) and Mac uses just a CR.

to which Cebjyre elaborates:

OS X uses LF, the same as UNIX - MacOS 9 and below did use CR though

Mo

There could also be a difference in character encoding for national characters. There is no "unix-encoding" but many linux-variants use UTF-8 as the default encoding. Mac OS (which is also a unix) uses its own encoding (macroman). I am not sure what the default Windows encoding is.

McDowell

In addition to the new-line differences, the byte-order mark can cause problems if files are treated as Unicode on Windows.

Cheekysoft

However, another set of problems that you may come across can be related to single/multi-byte character encodings. If you see strange unexpected chars (not at end-of-line) then this could be the reason. Especially if you see square boxes, question marks, upside-down question marks, extra characters or unexpected accented characters.

Sadie

On unix, files that start with a . are hidden. On windows, it's a filesystem flag that you probably don't have easy access to. This may result in files that are supposed to be hidden now becoming visible on the client machines.

File permissions vary between the two. You will probably find, when you copy files onto a unix system, that the files now belong to the user that did the copying and have limited rights. You'll need to use chown/chmod to make sure the correct users have access to them.

There exist tools to help with the problem:

pauldoo

If you are just interested in the content of text files, then yes the line endings are different. Take a look at something like dos2unix, it may be of help here.

Cheekysoft

As pauldoo suggests, tools like dos2unix can be very useful. Note that these may be on your linux/unix system as fromdos or tofrodos, or perhaps even as the general purpose toolbox recode.

Help for Java coding:

Cheekysoft

When writing to files or reading from files (that you are in control of), it is often worth specifying the encoding to use, as most Java methods allow this. However, also ensuring that the system locale matches can save a lot of pain.

Accepted answer by samjudson

This is only a difference in text files, where UNIX uses a single Line Feed (LF) to signify a new line, Windows uses a Carriage Return/Line Feed (CRLF) and Mac uses just a CR.

For binary files there should be no difference (i.e. a JPEG on a windows machine will be byte for byte the same as the same JPEG on a unix box).
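
To determine which text format a file currently uses, one option is to count the terminators it contains. The following is a minimal sketch under that assumption; the class name LineEndingDetector and the Style enum are hypothetical, and the text is assumed to be already decoded into a String.

```java
// Hypothetical helper for illustration; names are not from the original answer.
final class LineEndingDetector {

    enum Style { CRLF, LF, CR, MIXED, NONE }

    /** Classifies text by the line terminators it contains. */
    static Style detect(String text) {
        int crlf = 0, lf = 0, cr = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\r') {
                if (i + 1 < text.length() && text.charAt(i + 1) == '\n') {
                    crlf++;
                    i++; // skip the LF that belongs to this CRLF pair
                } else {
                    cr++;
                }
            } else if (c == '\n') {
                lf++;
            }
        }
        int kinds = (crlf > 0 ? 1 : 0) + (lf > 0 ? 1 : 0) + (cr > 0 ? 1 : 0);
        if (kinds == 0) return Style.NONE;
        if (kinds > 1) return Style.MIXED;
        if (crlf > 0) return Style.CRLF;
        return lf > 0 ? Style.LF : Style.CR;
    }
}
```

A MIXED result usually means the file was edited on more than one platform, in which case normalizing every line to one terminator is probably safer than trusting the first line seen.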

Answer by Mo.

There could also be a difference in character encoding for national characters. There is no "unix-encoding" but many linux-variants use UTF-8 as the default encoding. Mac OS (which is also a unix) uses its own encoding (macroman). I am not sure what the default Windows encoding is.

But this could be another source of trouble (apart from the different linebreaks).

What are your problems? The linebreak-related problems can be easily corrected with the programs dos2unix or unix2dos on the unix machine.

Answer by pauldoo

If you are just interested in the content of text files, then yes the line endings are different. Take a look at something like dos2unix, it may be of help here.

(Of course there are many other things that make unix and windows files different, but I don't think you're interested in those other differences right now.)

Answer by Marcus Downing

In addition to the answers given, you may find issues with the different file systems:

  • On unix, files that start with a . are hidden. On windows, it's a filesystem flag that you probably don't have easy access to. This may result in files that are supposed to be hidden now becoming visible on the client machines.

  • File permissions vary between the two. You will probably find, when you copy files onto a unix system, that the files now belong to the user that did the copying and have limited rights. You'll need to use chown/chmod to make sure the correct users have access to them.
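
If the transfer is driven from Java, the permission and hidden-file points above can be checked with java.nio.file. This is a minimal sketch only: the class name CopiedFileAttributes and its method names are hypothetical, and changing the owner (the chown part) is left out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermissions;

// Hypothetical helper for illustration; names are not from the original answer.
final class CopiedFileAttributes {

    /**
     * Applies a chmod-style permission string such as "rw-r-----".
     * Returns false when the file system offers no POSIX attribute view,
     * as is typical on Windows.
     */
    static boolean applyPermissions(Path file, String permissions) throws IOException {
        if (!file.getFileSystem().supportedFileAttributeViews().contains("posix")) {
            return false;
        }
        Files.setPosixFilePermissions(file, PosixFilePermissions.fromString(permissions));
        return true;
    }

    /** Uses the platform's own rule: leading dot on unix, DOS hidden attribute on Windows. */
    static boolean isHidden(Path file) throws IOException {
        return Files.isHidden(file);
    }
}
```

Checking the return value of applyPermissions lets the same code run on both platforms without throwing UnsupportedOperationException on a file system that has no POSIX permissions.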

Answer by McDowell

In addition to the new-line differences, the byte-order mark can cause problems if files are treated as Unicode on Windows.
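
In Java specifically, the UTF-8 decoder does not drop a leading byte-order mark; it shows up as the character U+FEFF at the start of the decoded text. A minimal sketch of removing it, assuming the input is UTF-8 (the class name BomStripper is hypothetical):

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the original answer.
final class BomStripper {

    /** Removes a single leading U+FEFF, if present. */
    static String stripBom(String text) {
        if (!text.isEmpty() && text.charAt(0) == '\uFEFF') {
            return text.substring(1);
        }
        return text;
    }

    /** Decodes bytes as UTF-8 and drops the byte-order mark that some Windows editors prepend. */
    static String decodeUtf8(byte[] bytes) {
        return stripBom(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

Without this step, the first line of a file may compare unequal to an otherwise identical string, which is an easy problem to miss because U+FEFF is invisible in most viewers.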

Answer by Cheekysoft

As pauldoo suggests, tools like dos2unix can be very useful. Note that these may be on your linux/unix system as fromdos or tofrodos, or perhaps even as the general purpose toolbox recode.

However, another set of problems that you may come across can be related to single/multi-byte character encodings. If you see strange unexpected chars (not at end-of-line) then this could be the reason. Especially if you see square boxes, question marks, upside-down question marks, extra characters or unexpected accented characters.

Running the command locale on your *nix box will tell you what the system locale is. If this is different to the encoding used in the text files that have been transferred over from the windows machine, then this can sometimes cause issues, depending on the usage of those files. You can use the very powerful recode command to try and convert between the different charsets as well as any line ending issues. recode -l will show you all of the formats and encodings that the tool can convert between. It is likely to be a VERY long list.
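
From inside the JVM, Charset.defaultCharset() reports the default encoding in effect. To test whether transferred bytes are actually valid in a given encoding, a strict decoder can be used. This is a minimal sketch; the class name CharsetCheck and the method decodesCleanly are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper for illustration; not part of the original answer.
final class CharsetCheck {

    /** Returns true if the bytes decode in the given charset without any malformed or unmappable input. */
    static boolean decodesCleanly(byte[] data, Charset charset) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

A file that fails this check for UTF-8 was probably written in a single-byte Windows code page. The reverse does not hold: single-byte charsets such as ISO-8859-1 accept every byte sequence, so a clean decode there proves nothing about the real encoding.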

When writing to files or reading from files (that you are in control of), it is often worth specifying the encoding to use, as most Java methods allow this. However, also ensuring that the system locale matches can save a lot of pain.
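
As a minimal sketch of specifying the encoding explicitly, the following copies a file while decoding with one charset and encoding with another, so the result does not depend on the system locale. The class name ExplicitEncodingCopy and its parameter names are hypothetical; which charsets to pass depends on how the files were actually produced.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of the original answer.
final class ExplicitEncodingCopy {

    /** Re-encodes a text file; line terminators are passed through unchanged. */
    static void copy(Path source, Charset sourceCharset,
                     Path target, Charset targetCharset) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(source, sourceCharset);
             BufferedWriter writer = Files.newBufferedWriter(target, targetCharset)) {
            char[] buffer = new char[8192];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        }
    }
}
```

Readers created by Files.newBufferedReader throw an exception on malformed input instead of silently substituting replacement characters, so a wrong guess about the source charset tends to fail early.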
