Java: determine whether two files store the same content

Notice: this page reproduces a popular StackOverflow question and its answers (originally as a Chinese-English parallel translation) under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/27379059/

Time: 2020-08-11 04:15:41  Source: igfitidea

Determine if two files store the same content

Tags: java, file, comparison

Asked by principal-ideal-domain

How would you write a Java function boolean sameContent(Path file1, Path file2) which determines whether the two given paths point to files that store the same content? Of course, I would first check whether the file sizes are the same, since that is a necessary condition for storing the same content. Beyond that, I'd like to hear your approaches. If the two files are stored on the same hard drive (as in most of my cases), jumping back and forth between the two streams too many times is probably not the best way.

Answered by peterremec

This should help you with your problem:

package test;

import java.io.File;
import java.io.IOException;

import org.apache.commons.io.FileUtils;

public class CompareFileContents {

    public static void main(String[] args) throws IOException {

        File file1 = new File("test1.txt");
        File file2 = new File("test2.txt");
        File file3 = new File("test3.txt");

        boolean compare1and2 = FileUtils.contentEquals(file1, file2);
        boolean compare2and3 = FileUtils.contentEquals(file2, file3);
        boolean compare1and3 = FileUtils.contentEquals(file1, file3);

        System.out.println("Are test1.txt and test2.txt the same? " + compare1and2);
        System.out.println("Are test2.txt and test3.txt the same? " + compare2and3);
        System.out.println("Are test1.txt and test3.txt the same? " + compare1and3);
    }
}

Answered by SMA

This is exactly what the FileUtils.contentEquals method of Apache Commons IO does (the original answer links to its API documentation).

Try something like:

File file1 = new File("file1.txt");
File file2 = new File("file2.txt");
boolean isTwoEqual = FileUtils.contentEquals(file1, file2);

It does the following checks before actually doing the comparison:

  • Both files exist.
  • Both paths refer to regular files, not directories.
  • The lengths in bytes are the same; if they differ, the files cannot have the same content.
  • The two paths are not one and the same file; if they are, the contents are trivially equal.
  • Only then are the contents compared.
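
The checks listed above can also be written without Commons IO. The following is a minimal sketch using java.nio.file; it is not the library's actual implementation. The class name CheckedCompare is hypothetical, and treating two missing files as equal is an assumption made here for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

final class CheckedCompare {

    static boolean sameContent(Path a, Path b) throws IOException {
        // Existence: if only one of the two exists, they cannot be equal.
        boolean aExists = Files.exists(a);
        if (aExists != Files.exists(b)) {
            return false;
        }
        if (!aExists) {
            return true; // assumption: two missing files count as equal
        }
        // Only regular files are compared, not directories.
        if (Files.isDirectory(a) || Files.isDirectory(b)) {
            throw new IOException("Can only compare files, not directories");
        }
        // Different lengths rule out equal content without reading anything.
        if (Files.size(a) != Files.size(b)) {
            return false;
        }
        // Two paths to one and the same file are trivially equal.
        if (Files.isSameFile(a, b)) {
            return true;
        }
        // Full comparison; reading everything into memory suits small files only.
        return Arrays.equals(Files.readAllBytes(a), Files.readAllBytes(b));
    }
}
```

The cheap checks run first so that the expensive content comparison happens only when they cannot decide the answer.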

Answered by Chthonic Project

If you don't want to use any external libraries, simply read the files into byte arrays and compare them (this requires Java 7 or later):

byte[] f1 = Files.readAllBytes(file1);
byte[] f2 = Files.readAllBytes(file2);

The comparison itself is done with Arrays.equals(f1, f2).

If the files are large, then instead of reading the entire files into arrays, you should use BufferedInputStream and read the files chunk by chunk (the original answer links to an explanation).
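
As a rough illustration of the chunk-by-chunk idea, here is a minimal sketch assuming Java 9 or later. It uses InputStream.readNBytes on plain streams rather than BufferedInputStream, since each read already fetches a large block. The class name ChunkedCompare and the 16 KB chunk size are arbitrary choices for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

final class ChunkedCompare {

    private static final int CHUNK = 16 * 1024; // arbitrary chunk size

    static boolean sameContent(Path file1, Path file2) throws IOException {
        if (Files.size(file1) != Files.size(file2)) {
            return false;
        }
        try (InputStream in1 = Files.newInputStream(file1);
             InputStream in2 = Files.newInputStream(file2)) {
            byte[] buf1 = new byte[CHUNK];
            byte[] buf2 = new byte[CHUNK];
            while (true) {
                // readNBytes blocks until CHUNK bytes are read or the end of
                // the stream is reached, so short reads only happen at EOF.
                int n1 = in1.readNBytes(buf1, 0, CHUNK);
                int n2 = in2.readNBytes(buf2, 0, CHUNK);
                if (n1 != n2) {
                    return false; // one stream ended before the other
                }
                if (n1 == 0) {
                    return true; // both streams exhausted without a difference
                }
                if (!Arrays.equals(buf1, 0, n1, buf2, 0, n2)) {
                    return false;
                }
            }
        }
    }
}
```

Memory use stays bounded by two buffers regardless of file size, and the comparison stops at the first differing chunk.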

Answered by icza

If the files are small, you can read both into the memory and compare the byte arrays.

If the files are not small, you can either compute hashes of their content (e.g. MD5 or SHA-1) one after the other and compare the hashes (this still leaves a very small chance of a false match due to a hash collision), or compare their content directly, but for that you still have to alternate between reading the two streams.

Here is an example:

boolean sameContent(Path file1, Path file2) throws IOException {
    final long size = Files.size(file1);
    if (size != Files.size(file2))
        return false;

    if (size < 4096)
        return Arrays.equals(Files.readAllBytes(file1), Files.readAllBytes(file2));

    try (InputStream is1 = Files.newInputStream(file1);
         InputStream is2 = Files.newInputStream(file2)) {
        // Compare byte by byte.
        // Note that this can be sped up drastically by reading large chunks
        // (e.g. 16 KB), but care must be taken because InputStream.read(byte[])
        // does not necessarily fill the whole array!
        int data;
        while ((data = is1.read()) != -1)
            if (data != is2.read())
                return false;
    }

    return true;
}
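
The hash-based alternative mentioned in this answer might be sketched as follows. Each file is read sequentially from start to end, one after the other, so there is no alternating between the two streams. This example uses SHA-256 rather than the MD5 or SHA-1 named in the answer; the class name HashCompare and the buffer size are assumptions for illustration. Equal hashes make equal content overwhelmingly likely, but a collision remains possible in principle.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class HashCompare {

    // Streams the file through SHA-256 without loading it into memory.
    static byte[] sha256(Path file) throws IOException {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[16 * 1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n); // only the bytes actually read
            }
        }
        return md.digest();
    }

    static boolean sameContent(Path file1, Path file2) throws IOException {
        if (Files.size(file1) != Files.size(file2)) {
            return false;
        }
        return MessageDigest.isEqual(sha256(file1), sha256(file2));
    }
}
```

Unlike a direct comparison, this always reads both files completely, even when they differ in the first byte; it pays off mainly when a hash can be stored and reused across many comparisons.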

Answered by Nolequen

Since Java 12 there is the method Files.mismatch, which returns -1 if there is no mismatch in the content of the files. Thus the function would look like the following:

private static boolean sameContent(Path file1, Path file2) throws IOException {
    return Files.mismatch(file1, file2) == -1;
}
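
When the files differ, Files.mismatch returns the position of the first mismatching byte, or the size of the smaller file if one file is a prefix of the other. A small sketch that reports this position (the class name MismatchReport and the message wording are made up for this example; Java 12 or later is assumed):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class MismatchReport {

    // Describes where two files first differ, or that they are identical.
    static String describe(Path file1, Path file2) throws IOException {
        long pos = Files.mismatch(file1, file2);
        return pos == -1L ? "identical" : "first difference at byte " + pos;
    }
}
```

This can be handy in tests or diagnostics where knowing the offset of the first difference is more useful than a plain true or false.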

Answered by yolo

package test;

import org.junit.jupiter.api.Test;

import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;

import static org.junit.jupiter.api.Assertions.assertEquals;

public class CSVResultDIfference {

    @Test
    public void csvDifference() throws IOException {
        // Backslashes in Windows paths must be escaped inside Java string literals.
        Path file_F = FileSystems.getDefault().getPath("C:\\Projekts\\csvTestX", "yolo2.csv");
        long size_F = Files.size(file_F);
        Path file_I = FileSystems.getDefault().getPath("C:\\Projekts\\csvTestZ", "yolo2.csv");
        long size_I = Files.size(file_I);
        // Note: this only checks that the sizes match; equal size does not prove equal content.
        assertEquals(size_F, size_I);
    }
}

It worked for me :)

Answered by mcoolive

If it is for a unit test, AssertJ provides a method named hasSameContentAs. An example:

Assertions.assertThat(file1).hasSameContentAs(file2)