Java: Are line endings different on Windows and Linux?

Question

Asked by jbu

I am trying to parse the Linux /etc/passwd file in Java. I'm currently reading each line with the java.util.Scanner class and then using java.lang.String.split(String) to split each line into fields.

The problem is that the line:

list:x:38:38:Mailing List Manager:/var/list:/bin/sh

is treated by the scanner as 3 different lines:

list:x:38:38:Mailing
List
Manager...

When I type this out into a new file that I didn't get from Linux, Scanner parses it properly.

Is there something I'm not understanding about new lines in Linux?

Obviously a workaround is to parse it without using Scanner, but that wouldn't be elegant. Does anyone know of an elegant way to do it?

Is there a way to convert the file into one that would work with Scanner?

Not even two days ago: Historical reason behind different line ending at different platforms

EDIT

Note from the original author:

"I figured out I have a different error that is causing the problem. Disregard question"

Answer 1

Answered by Michael Haren

From Wikipedia:

LF: Multics, Unix and Unix-like systems (GNU/Linux, AIX, Xenix, Mac OS X, FreeBSD, etc.), BeOS, Amiga, RISC OS, and others
CR+LF: DEC RT-11 and most other early non-Unix, non-IBM OSes, CP/M, MP/M, DOS, OS/2, Microsoft Windows, Symbian OS
CR: Commodore machines, Apple II family, Mac OS up to version 9, and OS-9

I translate this into these line endings in general:

Windows: '\r\n'
Mac (OS 9-): '\r'
Mac (OS 10+): '\n'
Unix/Linux: '\n'

You need to make your scanner/parser handle the Unix version, too.
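
As a minimal sketch of what "handling every version" can look like, the example below feeds a string with mixed line endings to Scanner.nextLine(), which is documented to stop at any line separator. The class name LineEndingDemo, the helper readLines, and the sample text are assumptions made up for illustration; they are not from the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class; the name and the sample input are illustrative only.
class LineEndingDemo {
    // Collects every line of the text using Scanner.nextLine(),
    // which treats \r\n, \n and \r as line separators.
    static List<String> readLines(String text) {
        List<String> lines = new ArrayList<>();
        try (Scanner sc = new Scanner(text)) {
            while (sc.hasNextLine()) {
                lines.add(sc.nextLine());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // One Windows, one Unix and one classic Mac line ending in the same input.
        String mixed = "windows\r\nunix\nclassic-mac\rlast";
        System.out.println(readLines(mixed));
    }
}
```

If this behaves the same on your JVM, the line endings of a file copied from Linux should not by themselves break line-by-line reading with Scanner.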

Answer 2

Answered by davetron5000

Why not use LineNumberReader?

If you can't do that, what does the code look like?

The only difference I can think of is that you are splitting on a bad regex, and that when you edit the file yourself you get DOS newlines that somehow pass your regex.

Still, for reading things one line at a time, it seems like overkill to use Scanner.
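
A rough sketch of the LineNumberReader approach, assuming the goal is simply to read each record and split it on colons. The class name PasswdLines, the method parse, and the sample records are hypothetical; readLine() returns null at end of input and accepts \n, \r, or \r\n as a line terminator.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names and sample data are illustrative only.
class PasswdLines {
    // Reads colon-separated records line by line.
    static List<String[]> parse(Reader source) {
        List<String[]> records = new ArrayList<>();
        try (LineNumberReader in = new LineNumberReader(source)) {
            String line;
            // readLine() returns null once the input is exhausted.
            while ((line = in.readLine()) != null) {
                // Limit -1 keeps trailing empty fields, e.g. an empty shell entry.
                records.add(line.split(":", -1));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        String sample = "list:x:38:38:Mailing List Manager:/var/list:/bin/sh\n"
                      + "root:x:0:0:root:/root:/bin/bash\n";
        for (String[] fields : parse(new StringReader(sample))) {
            System.out.println(fields[0] + " -> " + fields[4]);
        }
    }
}
```

To read the real file, a java.io.FileReader for /etc/passwd could be passed in place of the StringReader.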

Of course, why you are parsing /etc/passwd is a whole other discussion :)

Answer 3

Answered by Kevin Haines

The scanner is breaking at the spaces.

EDIT: The 'Scanning' Java Tutorial states:

By default, a scanner uses white space to separate tokens. (White space characters include blanks, tabs, and line terminators. For the full list, refer to the documentation for Character.isWhitespace.)

You can use the useDelimiter() method to change these defaults.
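
The sketch below illustrates the difference, assuming the question's code called next() rather than nextLine(). With the default delimiter the sample line splits at its two spaces into three tokens, matching the symptom in the question; with a line-break delimiter it comes back as a single token. The class DelimiterDemo, the constant LINE_BREAK, and the method tokens are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class; names are illustrative only.
class DelimiterDemo {
    // Matches a Windows, classic Mac or Unix line break.
    static final String LINE_BREAK = "\\r\\n|[\\r\\n]";

    // Returns all tokens produced by Scanner.next(); a null regex
    // keeps the default whitespace delimiter.
    static List<String> tokens(String text, String delimiterRegex) {
        List<String> out = new ArrayList<>();
        try (Scanner sc = new Scanner(text)) {
            if (delimiterRegex != null) {
                sc.useDelimiter(delimiterRegex);
            }
            while (sc.hasNext()) {
                out.add(sc.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "list:x:38:38:Mailing List Manager:/var/list:/bin/sh\n";
        System.out.println(tokens(line, null).size());       // default: splits at spaces
        System.out.println(tokens(line, LINE_BREAK).size()); // one token per line
    }
}
```

Calling nextLine() instead of next() is another way to get whole lines without changing the delimiter.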

Answer 4

Answered by nEJC

This works for me on Ubuntu:

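import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class test {
  public static void main(String[] args) {
    // try-with-resources closes the Scanner and the underlying file
    try (Scanner sc = new Scanner(new File("/etc/passwd"))) {
      // nextLine() never returns null; it throws NoSuchElementException
      // at end of input, so loop on hasNextLine() instead
      while (sc.hasNextLine()) {
        String l = sc.nextLine();
        String[] p = l.split(":");  // passwd fields are colon-separated
        for (String pi : p) System.out.print(pi + "\t:\t");
        System.out.println();
      }
    } catch (FileNotFoundException e) { e.printStackTrace(); }
  }
}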

Answer 5

Answered by Chase Seibert

You can get the standard line ending for your current OS from:

System.getProperty("line.separator")
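
A small sketch of how that property can be inspected, with the caveat that it describes the platform the JVM is running on, not the platform a given file came from. The class SeparatorDemo and the helper describe are hypothetical; System.lineSeparator() is the shorthand available since Java 7.

```java
// Hypothetical demo class; names are illustrative only.
class SeparatorDemo {
    // Makes CR and LF visible by replacing them with their escape sequences.
    static String describe(String sep) {
        return sep.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        String viaProperty = System.getProperty("line.separator");
        String viaMethod = System.lineSeparator();
        // Typically \r\n on Windows and \n on Linux.
        System.out.println("line.separator    = " + describe(viaProperty));
        System.out.println("lineSeparator()   = " + describe(viaMethod));
    }
}
```

This value is mainly useful when writing text; when reading, a reader that accepts all line terminators is usually the safer choice.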
