Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/3776923/

Time: 2020-08-14 04:50:38  Source: igfitidea

How can I normalize the EOL character in Java?

java line-endings

Asked by bilal

I have a Linux server and many clients running different operating systems. The server takes an input file from the clients. Linux uses LF as its end-of-line character, while older Mac systems use CR and Windows uses CR+LF.

The server needs LF as the end-of-line character. Using Java, I want to ensure that the file always uses the Linux EOL character LF. How can I achieve this?

Accepted answer by lalli

Combining the two answers (by Visage & eumiro):

EDIT: After reading the comment, System.getProperty("line.separator") is of no use here.
Before sending the file to the server, open it, replace all the EOLs, and write it back.
Make sure to use DataStreams to do so, and write in binary.

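// needs java.io.DataOutputStream, java.io.FileOutputStream
// and java.nio.charset.StandardCharsets
String fileString;
//..
// read from the file
//..
// Windows first: CR+LF must be replaced as a pair, before lone CRs
fileString = fileString.replaceAll("\r\n", "\n");
// then any remaining lone CR
fileString = fileString.replaceAll("\r", "\n");
//..
// write to file in binary mode.. something like:
try (DataOutputStream os = new DataOutputStream(new FileOutputStream("fname.txt"))) {
    // UTF-8 is an assumption here; use the encoding the file was read with
    os.write(fileString.getBytes(StandardCharsets.UTF_8));
}
//..
// send file
//..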

The replaceAll method takes two arguments: the first is the string to replace and the second is the replacement. However, the first one is treated as a regular expression, so '\' is interpreted accordingly. So:

"\\r\\n" is passed on as \r\n by Java and converted to CR+LF by Regex
"\r\n" is converted to CR+LF by Java
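To make the two escaping levels concrete, here is a small sketch. The class name EscapeLevels and its method names are made up for illustration. It checks that a pattern written as "\\r\\n" and one written as "\r\n" end up matching the same CR+LF pair, so either spelling should work with replaceAll.

```java
final class EscapeLevels {
    // The Java compiler turns "\r\n" into the two characters CR, LF,
    // and the regex engine matches those two characters literally.
    static String viaJavaEscape(String s) {
        return s.replaceAll("\r\n", "\n");
    }

    // Here the regex engine receives the four characters \ r \ n
    // and interprets the escapes \r and \n itself as CR and LF.
    static String viaRegexEscape(String s) {
        return s.replaceAll("\\r\\n", "\n");
    }

    public static void main(String[] args) {
        String input = "one\r\ntwo\r\n";
        // prints "true": both spellings produce "one\ntwo\n"
        System.out.println(viaJavaEscape(input).equals(viaRegexEscape(input)));
    }
}
```

Note that replaceAll returns a new string, so the result has to be assigned or returned; the original string is never modified.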

Answer by eumiro

Could you try this?

content.replaceAll("\r\n?", "\n")
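As a quick check of what this pattern does: \r\n? matches a CR optionally followed by an LF, so it covers CR+LF and lone CR in one pass and never touches a lone LF. A minimal sketch follows; the class name SinglePassNormalize and the second method are my own additions, not part of the answer. The second method uses the \R linebreak matcher available since Java 8, which also matches other Unicode line terminators and so may be broader than what the question asks for.

```java
final class SinglePassNormalize {
    // CR optionally followed by LF becomes LF; a lone LF is never matched.
    static String normalize(String content) {
        return content.replaceAll("\r\n?", "\n");
    }

    // Java 8+ alternative: \R matches CR+LF as a unit, plus CR, LF and
    // other line terminators (VT, FF, NEL, U+2028, U+2029).
    static String normalizeAnyLinebreak(String content) {
        return content.replaceAll("\\R", "\n");
    }

    public static void main(String[] args) {
        String mixed = "win\r\nmac\runix\n";
        // prints "true"
        System.out.println(normalize(mixed).equals("win\nmac\nunix\n"));
    }
}
```

Since strings are immutable, the call in the answer only has an effect if its return value is kept, for example content = content.replaceAll("\r\n?", "\n").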

Answer by PaulJWilliams

Use

System.getProperty("line.separator")

That will give you the (local) EOL character(s). You can then analyse the incoming file to determine what 'flavour' it is and convert accordingly.

Alternatively, get your clients to standardise!
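The "analysis of the incoming file" could look something like the sketch below. It is only one possible reading of the suggestion: the class EolSniffer, the enum Eol and the method count are hypothetical names, and it counts every kind of line ending rather than stopping at the first one, so that files with mixed endings can be spotted as well.

```java
import java.util.EnumMap;
import java.util.Map;

final class EolSniffer {
    enum Eol { LF, CRLF, CR }

    // Counts how often each kind of line ending occurs in the text.
    static Map<Eol, Integer> count(CharSequence text) {
        Map<Eol, Integer> counts = new EnumMap<>(Eol.class);
        for (Eol e : Eol.values()) {
            counts.put(e, 0);
        }
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\r') {
                boolean crlf = i + 1 < text.length() && text.charAt(i + 1) == '\n';
                counts.merge(crlf ? Eol.CRLF : Eol.CR, 1, Integer::sum);
                if (crlf) {
                    i++; // skip the LF that belongs to this CR
                }
            } else if (c == '\n') {
                counts.merge(Eol.LF, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("a\r\nb\rc\nd\n"));
    }
}
```

For the server in the question, the counts are mostly diagnostic: if only LF is present the file can be left alone, otherwise it needs converting.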

Answer by tyjen

Had to do this for a recent project. The method below normalizes the line endings in the given file to the line ending of the OS the JVM is running on. So if your JVM is running on Linux, this will normalize all line endings to LF (\n).

It also works on very large files, because the file is streamed line by line through buffered streams instead of being loaded into memory.

public static void normalizeFile(File f) {      
    File temp = null;
    BufferedReader bufferIn = null;
    BufferedWriter bufferOut = null;        

    try {           
        if(f.exists()) {
            // Create a new temp file to write to
            temp = new File(f.getAbsolutePath() + ".normalized");
            temp.createNewFile();

            // Get a stream to read from the un-normalized file
            FileInputStream fileIn = new FileInputStream(f);
            DataInputStream dataIn = new DataInputStream(fileIn);
            bufferIn = new BufferedReader(new InputStreamReader(dataIn));

            // Get a stream to write to the normalized file
            FileOutputStream fileOut = new FileOutputStream(temp);
            DataOutputStream dataOut = new DataOutputStream(fileOut);
            bufferOut = new BufferedWriter(new OutputStreamWriter(dataOut));

            // For each line in the un-normalized file
            String line;
            while ((line = bufferIn.readLine()) != null) {
                // Write the original line plus the operating-system dependent newline
                bufferOut.write(line);
                bufferOut.newLine();                                
            }

            bufferIn.close();
            bufferOut.close();

            // Remove the original file
            f.delete();

            // And rename the new file to the original one
            temp.renameTo(f);
        } else {
            // If the file doesn't exist...
            log.warn("Could not find file to open: " + f.getAbsolutePath());
        }
    } catch (Exception e) {
        log.warn(e.getMessage(), e);
    } finally {
        // Clean up, temp should never exist
        FileUtils.deleteQuietly(temp);
        IOUtils.closeQuietly(bufferIn);
        IOUtils.closeQuietly(bufferOut);
    }
}
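A few properties of the method above are worth keeping in mind: newLine() writes the separator of the platform the JVM runs on, the readers and writers use the platform default charset, and readLine() always produces a final line terminator even if the input had none. If those matter, a character-level variant might look like the following sketch. The class LfNormalizer and its method names are hypothetical, the charset is passed in by the caller, and the output always uses LF regardless of platform.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class LfNormalizer {
    // Copies in to out, turning CR+LF and lone CR into LF.
    // Nothing is appended at the end of the input.
    static void copyWithLf(Reader in, Writer out) throws IOException {
        boolean prevCr = false;
        int ch;
        while ((ch = in.read()) != -1) {
            if (ch == '\r') {
                out.write('\n');
                prevCr = true;
            } else {
                // An LF directly after a CR was already written as part of CR+LF
                if (!(ch == '\n' && prevCr)) {
                    out.write(ch);
                }
                prevCr = false;
            }
        }
    }

    // Rewrites the file via a temp file created in the same directory.
    static void normalizeFile(Path file, Charset charset) throws IOException {
        Path temp = Files.createTempFile(file.toAbsolutePath().getParent(), "eol", ".tmp");
        try {
            try (Reader in = Files.newBufferedReader(file, charset);
                 Writer out = Files.newBufferedWriter(temp, charset)) {
                copyWithLf(in, out);
            }
            Files.move(temp, file, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // No-op after a successful move; cleans up if something failed
            Files.deleteIfExists(temp);
        }
    }
}
```

A call would be along the lines of LfNormalizer.normalizeFile(Paths.get("fname.txt"), StandardCharsets.UTF_8). Files.newBufferedReader throws on bytes that are not valid in the given charset, so a wrong charset guess fails loudly instead of silently corrupting the file.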

Answer by Matthias

Here is a comprehensive helper class to deal with EOL issues. It is partially based on the solution posted by tyjen.

import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;

import org.apache.commons.io.FileUtils;
import org.apache.commons.io.IOUtils;

/**
 * Helper class to deal with end-of-line markers in text files.
 * 
 * Loosely based on these examples:
 *  - http://stackoverflow.com/a/9456947/1084488 (cc by-sa 3.0)
 *  - http://svn.apache.org/repos/asf/tomcat/trunk/java/org/apache/tomcat/buildutil/CheckEol.java (Apache License v2.0)
 * 
 * This file is posted here to meet the "ShareAlike" requirement of cc by-sa 3.0:
 *    http://stackoverflow.com/a/27930311/1084488
 * 
 * @author Matthias Stevens
 */
public class EOLUtils
{

    /**
     * Unix-style end-of-line marker (LF)
     */
    private static final String EOL_UNIX = "\n";

    /**
     * Windows-style end-of-line marker (CRLF)
     */
    private static final String EOL_WINDOWS = "\r\n";

    /**
     * "Old Mac"-style end-of-line marker (CR)
     */
    private static final String EOL_OLD_MAC = "\r";

    /**
     * Default end-of-line marker on current system
     */
    private static final String EOL_SYSTEM_DEFAULT = System.getProperty( "line.separator" );

    /**
     * The supported end-of-line marker modes
     */
    public static enum Mode
    {
        /**
         * Unix-style end-of-line marker ("\n")
         */
        LF,

        /**
         * Windows-style end-of-line marker ("\r\n") 
         */
        CRLF,

        /**
         * "Old Mac"-style end-of-line marker ("\r")
         */
        CR
    }

    /**
     * The default end-of-line marker mode for the current system
     */
    public static final Mode SYSTEM_DEFAULT = ( EOL_SYSTEM_DEFAULT.equals( EOL_UNIX ) ? Mode.LF : ( EOL_SYSTEM_DEFAULT
        .equals( EOL_WINDOWS ) ? Mode.CRLF : ( EOL_SYSTEM_DEFAULT.equals( EOL_OLD_MAC ) ? Mode.CR : null ) ) );
    static
    {
        // Just in case...
        if ( SYSTEM_DEFAULT == null )
        {
            throw new IllegalStateException( "Could not determine system default end-of-line marker" );
        }
    }

    /**
     * Determines the end-of-line {@link Mode} of a text file.
     * 
     * @param textFile the file to investigate
     * @return the end-of-line {@link Mode} of the first line ending found in the given file
     * @throws Exception if the file cannot be read or contains no end-of-line marker
     */
    public static Mode determineEOL( File textFile )
        throws Exception
    {
        if ( !textFile.exists() )
        {
            throw new IOException( "Could not find file to open: " + textFile.getAbsolutePath() );
        }

        FileInputStream fileIn = new FileInputStream( textFile );
        BufferedInputStream bufferIn = new BufferedInputStream( fileIn );
        try
        {
            int prev = -1;
            int ch;
            while ( ( ch = bufferIn.read() ) != -1 )
            {
                if ( ch == '\n' )
                {
                    if ( prev == '\r' )
                    {
                        return Mode.CRLF;
                    }
                    else
                    {
                        return Mode.LF;
                    }
                }
                else if ( prev == '\r' )
                {
                    return Mode.CR;
                }
                prev = ch;
            }
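            // A lone CR as the very last byte leaves the loop with prev == '\r'
            if ( prev == '\r' )
            {
                return Mode.CR;
            }
            throw new Exception( "Could not determine end-of-line marker mode" );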
        }
        catch ( IOException ioe )
        {
            throw new Exception( "Could not determine end-of-line marker mode", ioe );
        }
        finally
        {
            // Clean up:
            IOUtils.closeQuietly( bufferIn );
        }
    }

    /**
     * Checks whether the given text file has Windows-style (CRLF) line endings.
     * 
     * @param textFile the file to investigate
     * @return {@code true} if the first line ending found in the file is CRLF
     * @throws Exception
     */
    public static boolean hasWindowsEOL( File textFile )
        throws Exception
    {
        return Mode.CRLF.equals( determineEOL( textFile ) );
    }

    /**
     * Checks whether the given text file has Unix-style (LF) line endings.
     * 
     * @param textFile the file to investigate
     * @return {@code true} if the first line ending found in the file is LF
     * @throws Exception
     */
    public static boolean hasUnixEOL( File textFile )
        throws Exception
    {
        return Mode.LF.equals( determineEOL( textFile ) );
    }

    /**
     * Checks whether the given text file has "Old Mac"-style (CR) line endings.
     * 
     * @param textFile the file to investigate
     * @return {@code true} if the first line ending found in the file is CR
     * @throws Exception
     */
    public static boolean hasOldMacEOL( File textFile )
        throws Exception
    {
        return Mode.CR.equals( determineEOL( textFile ) );
    }

    /**
     * Checks whether the given text file has line endings that conform to the system default mode (e.g. LF on Unix).
     * 
     * @param textFile the file to investigate
     * @return {@code true} if the first line ending found in the file matches the system default
     * @throws Exception
     */
    public static boolean hasSystemDefaultEOL( File textFile )
        throws Exception
    {
        return SYSTEM_DEFAULT.equals( determineEOL( textFile ) );
    }

    /**
     * Convert the line endings in the given file to Unix-style (LF).
     * 
     * @param textFile the file to process
     * @throws IOException
     */
    public static void convertToUnixEOL( File textFile )
        throws IOException
    {
        convertLineEndings( textFile, EOL_UNIX );
    }

    /**
     * Convert the line endings in the given file to Windows-style (CRLF).
     * 
     * @param textFile the file to process
     * @throws IOException
     */
    public static void convertToWindowsEOL( File textFile )
        throws IOException
    {
        convertLineEndings( textFile, EOL_WINDOWS );
    }

    /**
     * Convert the line endings in the given file to "Old Mac"-style (CR).
     * 
     * @param textFile the file to process
     * @throws IOException
     */
    public static void convertToOldMacEOL( File textFile )
        throws IOException
    {
        convertLineEndings( textFile, EOL_OLD_MAC );
    }

    /**
     * Convert the line endings in the given file to the system default mode.
     * 
     * @param textFile the file to process
     * @throws IOException
     */
    public static void convertToSystemEOL( File textFile )
        throws IOException
    {
        convertLineEndings( textFile, EOL_SYSTEM_DEFAULT );
    }

    /**
     * Line endings conversion method.
     * 
     * @param textFile the file to process
     * @param eol the end-of-line marker to use (as a {@link String})
     * @throws IOException 
     */
    private static void convertLineEndings( File textFile, String eol )
        throws IOException
    {
        File temp = null;
        BufferedReader bufferIn = null;
        BufferedWriter bufferOut = null;

        try
        {
            if ( textFile.exists() )
            {
                // Create a new temp file to write to
                temp = new File( textFile.getAbsolutePath() + ".normalized" );
                temp.createNewFile();

                // Get a stream to read from the un-normalized file
                FileInputStream fileIn = new FileInputStream( textFile );
                DataInputStream dataIn = new DataInputStream( fileIn );
                bufferIn = new BufferedReader( new InputStreamReader( dataIn ) );

                // Get a stream to write to the normalized file
                FileOutputStream fileOut = new FileOutputStream( temp );
                DataOutputStream dataOut = new DataOutputStream( fileOut );
                bufferOut = new BufferedWriter( new OutputStreamWriter( dataOut ) );

                // For each line in the un-normalized file
                String line;
                while ( ( line = bufferIn.readLine() ) != null )
                {
                    // Write the original line plus the requested end-of-line marker
                    bufferOut.write( line );
                    bufferOut.write( eol ); // write EOL marker
                }

                // Close buffered reader & writer:
                bufferIn.close();
                bufferOut.close();

                // Remove the original file
                textFile.delete();

                // And rename the new file to the original one
                temp.renameTo( textFile );
            }
            else
            {
                // If the file doesn't exist...
                throw new IOException( "Could not find file to open: " + textFile.getAbsolutePath() );
            }
        }
        finally
        {
            // Clean up, temp should never exist
            FileUtils.deleteQuietly( temp );
            IOUtils.closeQuietly( bufferIn );
            IOUtils.closeQuietly( bufferOut );
        }
    }

}

Answer by Grigory Kislin

public static String normalize(String val) {
    return val.replace("\r\n", "\n")
            .replace("\r", "\n");
}

For HTML:

public static String normalize(String val) {
    return val.replace("\r\n", "<br/>")
            .replace("\n", "<br/>")
            .replace("\r", "<br/>");
}

Answer by Guy M

A solution that changes the line endings of files, searching a path recursively:

package handleFileLineEnd;

import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.OpenOption;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

public class handleFileEndingMain {

    static int carriageReturnTotal;
    static int newLineTotal;

    public static void main(String[] args)  throws IOException
    {       
        processPath("c:/temp/directories");

        System.out.println("carriageReturnTotal  (files have issue): " + carriageReturnTotal);

        System.out.println("newLineTotal: " + newLineTotal);
    }

    private static void processPath(String path) throws IOException
    {
        File dir = new File(path);
        File[] directoryListing = dir.listFiles();

        if (directoryListing != null) {
            for (File child : directoryListing) {
                if (child.isDirectory())                
                    processPath(child.toString());              
                else
                    checkFile(child.toString());
            }
        } 


    }

    private static void checkFile(String fileName) throws IOException
    {
        Path path = FileSystems.getDefault().getPath(fileName);

        byte[] bytes= Files.readAllBytes(path);

        for (int counter=0; counter<bytes.length; counter++)
        {
            if (bytes[counter] == 13)
            {
                carriageReturnTotal = carriageReturnTotal + 1;

                System.out.println(fileName);
                modifyFile(fileName);
                break;
            }
            if (bytes[counter] == 10)
            {
                newLineTotal = newLineTotal+ 1;
                //System.out.println(fileName);
                break;
            }
        }

    }

    private static void modifyFile(String fileName) throws IOException
    {

        Path path = Paths.get(fileName);
        Charset charset = StandardCharsets.UTF_8;

        String content = new String(Files.readAllBytes(path), charset);
        content = content.replaceAll("\r\n", "\n");
        content = content.replaceAll("\r", "\n");
        Files.write(path, content.getBytes(charset));
    }
}
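For comparison, the same recursive idea can be sketched with Files.walk instead of a hand-written directory recursion. This is not the author's code: the class TreeNormalizer and the method normalizeTree are hypothetical names, and the sketch assumes every regular file under the root is UTF-8 text. Running it over binary or differently encoded files would corrupt them, so a real tool would need a filter by extension or similar.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class TreeNormalizer {
    // Rewrites every regular file under root that contains a CR.
    // Returns the number of files that were changed.
    static int normalizeTree(Path root) throws IOException {
        List<Path> files;
        // Files.walk holds directory handles open, so close the stream promptly
        try (Stream<Path> paths = Files.walk(root)) {
            files = paths.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        int changed = 0;
        for (Path file : files) {
            // Assumption: the file is UTF-8 text
            String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            if (content.indexOf('\r') >= 0) {
                String fixed = content.replaceAll("\r\n?", "\n");
                Files.write(file, fixed.getBytes(StandardCharsets.UTF_8));
                changed++;
            }
        }
        return changed;
    }
}
```

Unlike checkFile above, which decides based on the first CR or LF byte it meets, this looks for a CR anywhere in the file, so a file whose first line ends in LF but whose later lines end in CR+LF is still rewritten.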

Answer by Eric B.

Although String.replaceAll() is simpler to code, this should perform better since it doesn't go through the regex infrastructure.

/**
 * Accepts a non-null string and returns the string with all end-of-lines
 * normalized to a \n.  This means \r\n and \r will both be normalized to \n.
 * <p>
 *     Impl Notes:  Although regex would have been easier to code, this approach
 *     will be more efficient since it is purpose-built for this use case.  Note we only
 *     construct a new StringBuilder and start appending to it if there are new end-of-lines
 *     to be normalized found in the string.  If there are no end-of-lines to be replaced
 *     found in the string, this will simply return the input value.
 * </p>
 *
 * @param inputValue !null, input value that may or may not contain new lines
 * @return the input value that has new lines normalized
 */
static String normalizeNewLines(String inputValue){
    StringBuilder stringBuilder = null;
    int index = 0;
    int len = inputValue.length();
    while (index < len){
        char c = inputValue.charAt(index);
        if (c == '\r'){
            if (stringBuilder == null){
                stringBuilder = new StringBuilder();
                // build up the string builder so it contains all the prior characters
                stringBuilder.append(inputValue.substring(0, index));
            }
            if ((index + 1 < len) &&
                inputValue.charAt(index + 1) == '\n'){
                // this means we encountered a \r\n  ... move index forward one more character
                index++;
            }
            stringBuilder.append('\n');
        }else{
            if (stringBuilder != null){
                stringBuilder.append(c);
            }
        }
        index++;
    }
    return stringBuilder == null ? inputValue : stringBuilder.toString();
}