Split and join back a binary file in Java

Note: this page is a bilingual (Chinese/English) translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/4431945/

Retrieved: 2020-10-30 06:20:36 (source: igfitidea)

java file-io

Asked by Vinay Pandey

I am trying to divide a binary file (video, audio, or image) into chunks of 100 KB each and then join those chunks to get the original file back. My code seems to work, in the sense that it divides the file and joins the chunks, and the file I get back is the same size as the original. However, the contents appear truncated: a video file stops after 2 seconds, and for an image file only the upper part looks correct.

Here is the code I am using (I can post the entire code if you like):

For dividing:

File ifile = new File(fname); 
FileInputStream fis;
String newName;
FileOutputStream chunk;
int fileSize = (int) ifile.length();
int nChunks = 0, read = 0, readLength = Chunk_Size;
byte[] byteChunk;
try {
    fis = new FileInputStream(ifile);
    StupidTest.size = (int)ifile.length();
    while (fileSize > 0) {
        if (fileSize <= Chunk_Size) {
            readLength = fileSize;
        }
        byteChunk = new byte[readLength];
        read = fis.read(byteChunk, 0, readLength);
        fileSize -= read;
        assert(read==byteChunk.length);
        nChunks++;
        newName = fname + ".part" + Integer.toString(nChunks - 1);
        chunk = new FileOutputStream(new File(newName));
        chunk.write(byteChunk);
        chunk.flush();
        chunk.close();
        byteChunk = null;
        chunk = null;
    }
    fis.close();
    fis = null;

To join the file, I put the names of all chunks in a List, sort it by name, and then run the following code:

File ofile = new File(fname);
FileOutputStream fos;
FileInputStream fis;
byte[] fileBytes;
int bytesRead = 0;
try {
    fos = new FileOutputStream(ofile,true);             
    for (File file : files) {
        fis = new FileInputStream(file);
        fileBytes = new byte[(int) file.length()];
        bytesRead = fis.read(fileBytes, 0,(int)  file.length());
        assert(bytesRead == fileBytes.length);
        assert(bytesRead == (int) file.length());
        fos.write(fileBytes);
        fos.flush();
        fileBytes = null;
        fis.close();
        fis = null;
    }
    fos.close();
    fos = null;

Accepted answer by BalusC

I can spot only 2 potential mistakes in the code:

int fileSize = (int) ifile.length();

The above fails when the file is over 2GB, since an int cannot hold more.
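To make the overflow concrete, here is a minimal sketch of what the cast does to a size above 2GB, and how the chunk count could be computed in long arithmetic instead. The class and method names are made up for illustration and are not part of the original code.

```java
final class FileSizeDemo {
    // Narrowing a long to an int keeps only the low 32 bits,
    // so sizes above Integer.MAX_VALUE come out wrong (often negative).
    static int truncated(long size) {
        return (int) size;
    }

    // Number of chunks needed, computed entirely in long arithmetic (ceiling division).
    static long chunkCount(long fileSize, long chunkSize) {
        return (fileSize + chunkSize - 1) / chunkSize;
    }

    public static void main(String[] args) {
        long threeGiB = 3L * 1024 * 1024 * 1024;               // 3221225472 bytes
        System.out.println(truncated(threeGiB));               // prints -1073741824
        System.out.println(chunkCount(threeGiB, 100 * 1024));  // prints 31458
    }
}
```

With a negative fileSize the `while (fileSize > 0)` loop in the question would never run, so no chunks would be written at all.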

newName = fname + ".part" + Integer.toString(nChunks - 1);

A filename constructed like that has to be sorted in a very specific manner. With default string sorting, name.part10 comes before name.part2. You'd want to supply a custom Comparator which extracts and parses the part number as an int and then compares by that instead.
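A comparator along those lines might look like the sketch below. It assumes every name ends in ".part" followed by digits only, as in the question; the class name PartOrder and the sample file names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class PartOrder {
    // Extracts the number after the last ".part" in a chunk name.
    // Assumes the name really ends in ".part<digits>"; anything else throws.
    static int partNumber(String name) {
        int idx = name.lastIndexOf(".part");
        return Integer.parseInt(name.substring(idx + ".part".length()));
    }

    // Orders chunk names by their numeric part index rather than as strings.
    static final Comparator<String> BY_PART_NUMBER =
            Comparator.comparingInt(PartOrder::partNumber);

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList(
                "video.mp4.part10", "video.mp4.part2", "video.mp4.part0", "video.mp4.part1"));
        names.sort(BY_PART_NUMBER);
        // prints [video.mp4.part0, video.mp4.part1, video.mp4.part2, video.mp4.part10]
        System.out.println(names);
    }
}
```

For a List of File objects, the same idea should work with `Comparator.comparingInt(f -> PartOrder.partNumber(f.getName()))`.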

Answer by Karl Knechtel

And for joining file, I put the names of all chunks in a List, then sort it by name and then run the following code:

But your names are of the following form:

newName = fname + ".part" + Integer.toString(nChunks - 1);

Think carefully about what happens if you have 11 or more parts. Which string comes first in alphabetical order: ".part10" or ".part2"? (Answer: ".part10", since '1' comes before '2' in the character encoding.)
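You can see this for yourself with a few lines. This is only an illustration: it builds names the same way the question does (the base name movie.avi is made up) and sorts them as plain strings.

```java
import java.util.Arrays;

final class LexicographicOrderDemo {
    // Builds names the way the question does and sorts them as plain strings.
    static String[] sortedByName(int parts) {
        String[] names = new String[parts];
        for (int i = 0; i < parts; i++) {
            names[i] = "movie.avi" + ".part" + Integer.toString(i);
        }
        Arrays.sort(names); // natural String order, compared character by character
        return names;
    }

    public static void main(String[] args) {
        // With 12 parts, part10 and part11 land between part1 and part2.
        System.out.println(Arrays.toString(sortedByName(12)));
    }
}
```

Joining in that order puts the bytes of chunk 10 right after chunk 1, which would give a file of the right size whose contents go wrong after the first 200 KB or so.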

Answer by 18446744073709551615

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;

// Closer.closeSilently is an external helper; see the note after the code.
public class FileSplitter {
    private static final int BUFSIZE = 4*1024;

    public boolean needsSplitting(String file, int chunkSize) {
        return new File(file).length() > chunkSize;
    }

    private static boolean isASplitFileChunk(String file) {
        return chunkIndexLen(file) > 0;
    }

    private static int chunkIndexLen(String file) {
        int n = numberOfTrailingDigits(file);
        if (n > 0) {
            String zeroes = new String(new char[n]).replace("\0", "0");
            if (file.matches(".*\\.part[0-9]{"+n+"}?of[0-9]{"+n+"}?$")
                    && !file.endsWith(zeroes)
                    && !chunkNumberStr(file, n).equals(zeroes)) {
                return n;
            }
        }
        return 0;
    }

    private static String getWholeFileName(String chunkName) {
        int n = chunkIndexLen(chunkName);
        if (n>0) {
            return chunkName.substring(0, chunkName.length() - 7 - 2*n); // 7+2n: 1+4+n+2+n : .part012of345
        }
        return chunkName;
    }

    private static int getNumberOfChunks(String filename) {
        int n = chunkIndexLen(filename);
        if (n > 0) {
            try {
                String digits = chunksTotalStr(filename, n);
                return Integer.parseInt(digits);
            } catch (NumberFormatException x) {
                // should never happen
            }
        }
        return 1;
    }

    private static int getChunkNumber(String filename) {
        int n = chunkIndexLen(filename);
        if (n > 0) {
            try {
                // filename.part001of200
                String digits = chunkNumberStr(filename, n);
                return Integer.parseInt(digits)-1;
            } catch (NumberFormatException x) {
            }
        }
        return 0;
    }

    private static int numberOfTrailingDigits(String s) {
        int n=0, l=s.length()-1;
        while (l>=0 && Character.isDigit(s.charAt(l))) {
            n++;
            l--;
        }
        return n;
    }

    private static String chunksTotalStr(String filename, int chunkIndexLen) {
        return filename.substring(filename.length()-chunkIndexLen);
    }

    protected static String chunkNumberStr(String filename, int chunkIndexLen) {
        int p = filename.length() - 2 - 2*chunkIndexLen; // 123of456
        return filename.substring(p,p+chunkIndexLen);
    }

    // 0,8 ==> part1of8; 7,8 ==> part8of8
    private static String chunkFileName(String filename, int n, int total, int chunkIndexLength) {
        return filename+String.format(".part%0"+chunkIndexLength+"dof%0"+chunkIndexLength+"d", n+1, total);
    }

    public static String[] splitFile(String fname, long chunkSize) throws IOException {
        FileInputStream fis = null;
        ArrayList<String> res = new ArrayList<String>();
        byte[] buffer = new byte[BUFSIZE];
        try {
            long totalSize = new File(fname).length();
            int nChunks = (int) ((totalSize + chunkSize - 1) / chunkSize);
            int chunkIndexLength = String.format("%d", nChunks).length();
            fis = new FileInputStream(fname);
            long written = 0;
            for (int i=0; written<totalSize; i++) {
                String chunkFName = chunkFileName(fname, i, nChunks, chunkIndexLength);
                FileOutputStream fos = new FileOutputStream(chunkFName);
                try {
                    written += copyStream(fis, buffer, fos, chunkSize);
                } finally {
                    Closer.closeSilently(fos);
                }
                res.add(chunkFName);
            }
        } finally {
            Closer.closeSilently(fis);
        }
        return res.toArray(new String[0]);
    }

    public static boolean canJoinFile(String chunkName) {
        int n = chunkIndexLen(chunkName);
        if (n>0) {
            int nChunks = getNumberOfChunks(chunkName);
            String filename = getWholeFileName(chunkName);
            for (int i=0; i<nChunks; i++) {
                if (!new File(chunkFileName(filename, i, nChunks, n)).exists()) {
                    return false;
                }
            }
            return true;
        }
        return false;
    }

    public static void joinChunks(String chunkName) throws IOException {
        int n = chunkIndexLen(chunkName);
        if (n>0) {
            int nChunks = getNumberOfChunks(chunkName);
            String filename = getWholeFileName(chunkName);
            byte[] buffer = new byte[BUFSIZE];
            FileOutputStream fos = new FileOutputStream(filename);
            try {
                for (int i=0; i<nChunks; i++) {
                    FileInputStream fis = new FileInputStream(chunkFileName(filename, i, nChunks, n));
                    try {
                        copyStream(fis, buffer, fos, -1);
                    } finally {
                        Closer.closeSilently(fis);
                    }
                }
            } finally {
                Closer.closeSilently(fos);
            }
        }
    }

    public static boolean deleteAllChunks(String chunkName) {
        boolean res = true;
        int n = chunkIndexLen(chunkName);
        if (n>0) {
            int nChunks = getNumberOfChunks(chunkName);
            String filename = getWholeFileName(chunkName);
            for (int i=0; i<nChunks; i++) {
                File f = new File(chunkFileName(filename, i, nChunks, n));
                res &= (f.delete() || !f.exists());
            }
        }
        return res;
    }

    // Copies up to maxAmount bytes (or everything, if maxAmount < 0) and returns the count.
    private static long copyStream(FileInputStream fis, byte[] buffer, FileOutputStream fos, long maxAmount) throws IOException {
        long chunkSizeWritten;
        for (chunkSizeWritten=0; chunkSizeWritten<maxAmount || maxAmount<0; ) {
            int toRead = maxAmount < 0 ? buffer.length : (int)Math.min(buffer.length, maxAmount - chunkSizeWritten);
            int lengthRead = fis.read(buffer, 0, toRead);
            if (lengthRead < 0) {
                break;
            }
            fos.write(buffer, 0, lengthRead);
            chunkSizeWritten += lengthRead;
        }
        return chunkSizeWritten;
    }
}

Borrow Closer here or from org.apache.logging.log4j.core.util.

Answer by Haderlump

Are there more than 10 chunks? Then the program will concatenate *.part1 + *.part10 + *.part2 and so on.

Answer by Manglesh pareek

For splitting the file:

import java.io.*;

class split {
    public static void main(String args[]) throws IOException {
        Console con = System.console();
        System.out.println("Enter File Name: ");
        File f = new File(con.readLine());
        System.out.println("Enter Destination File Size: ");
        int b = Integer.parseInt(con.readLine());
        FileInputStream fis = new FileInputStream(f);
        long len = f.length();
        // Number of parts, rounded up.
        int c = (int) (len / b);
        if (len % b != 0)
            c++;
        for (int i = 0; i < c; i++) {
            // Parts are named 0.<name>, 1.<name>, ...
            File f1 = new File(i + "." + f);
            FileOutputStream fos = new FileOutputStream(f1);
            for (int j = 0; j < b; j++) {
                int ch;
                if ((ch = fis.read()) != -1)
                    fos.write(ch);
            }
            fos.close();
        }
        fis.close();
        System.out.println("Operation Successful");
    }
}

Answer by Devendra

It takes the name of the file to split and the destination file size (in bytes) from the user and splits the file into subfiles. It works for all types of files, such as .bin, .jpg, and .rar.

import java.io.*;

class Split {
    public static void main(String args[]) throws IOException {
        Console con = System.console();
        System.out.println("enter the file name");
        String path = con.readLine();
        File f = new File(path);
        int filesize = (int) f.length();
        FileInputStream fis = new FileInputStream(path);

        System.out.println("enter file size for split");
        int size = Integer.parseInt(con.readLine());

        byte b[] = new byte[size];
        int ch, c = 0;

        while (filesize > 0) {
            ch = fis.read(b, 0, size);
            if (ch < 0) {
                break; // end of file reached earlier than expected
            }
            filesize = filesize - ch;

            // Parts are named 0.<name>, 1.<name>, ...
            String fname = c + "." + f.getName();
            c++;
            FileOutputStream fos = new FileOutputStream(new File(fname));
            fos.write(b, 0, ch);
            fos.flush();
            fos.close();
        }
        fis.close();
    }
}

Another program merges all the split files. It takes only the split file name and merges all the files.

import java.io.*;

class merge {
    public static void main(String args[]) throws IOException {
        Console con = System.console();
        System.out.println("Enter File to be retrieved: ");
        File f = new File(con.readLine());
        FileOutputStream fos = new FileOutputStream(f, true);
        int i = 0;
        File part = new File(i + "." + f);
        // Append parts 0.<name>, 1.<name>, ... until the next part does not exist.
        while (part.exists()) {
            FileInputStream fis = new FileInputStream(part);
            int ch;
            while ((ch = fis.read()) != -1) {
                fos.write(ch);
            }
            fis.close();
            i++;
            part = new File(i + "." + f);
        }
        fos.close();
    }
}

Answer by Peter Lawrey

What happens when you do a binary comparison of the files, e.g. with diff? Do you see a difference after the first file?

Can you try breaking up a text (TXT) file? If bytes are out of place, it should be more obvious what is going wrong, e.g. a repeated block or file, or data full of NUL bytes.

EDIT: As others have noticed, you read the files in no particular order. What you can do is use a padded file number, like:

newName = String.format("%s.part%09d", fname, nChunks - 1);

This will give you up to 1 billion files in numeric order.

When you read the files, you need to ensure they are sorted.

Arrays.sort(files);
for (File file : files) {

Using a custom comparator, as others have suggested, would reduce the size of the padded numbers, but it can be nice to be able to sort by name to get the correct order, e.g. in Explorer.
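Putting the padded names, the sorted read, and the binary comparison together, a round trip could be sketched roughly like this. It is not the asker's code: the class PaddedSplitJoin and its helpers (partName, readFully, split, join, firstDifference) are invented for the example, and it assumes no unrelated file in the same directory starts with "<name>.part".

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.util.Arrays;

final class PaddedSplitJoin {
    // Zero-padded index, so that sorting by name gives numeric order.
    static String partName(String fname, int index) {
        return String.format("%s.part%09d", fname, index);
    }

    // Reads until buf is full or the stream ends; a single read() may return fewer bytes.
    static int readFully(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) break;
            total += n;
        }
        return total;
    }

    // Splits src into parts of at most chunkSize bytes; returns the number of parts written.
    static int split(File src, int chunkSize) throws IOException {
        byte[] buf = new byte[chunkSize];
        int parts = 0;
        try (InputStream in = Files.newInputStream(src.toPath())) {
            int n;
            while ((n = readFully(in, buf)) > 0) {
                File part = new File(partName(src.getPath(), parts));
                try (OutputStream out = Files.newOutputStream(part.toPath())) {
                    out.write(buf, 0, n);
                }
                parts++;
            }
        }
        return parts;
    }

    // Joins all "<src name>.part*" files next to src into dest, in sorted name order.
    static void join(File src, File dest) throws IOException {
        String prefix = src.getName() + ".part";
        File dir = src.getAbsoluteFile().getParentFile();
        File[] files = dir.listFiles((d, name) -> name.startsWith(prefix));
        if (files == null) throw new IOException("cannot list " + dir);
        Arrays.sort(files); // name order equals numeric order thanks to the padding
        try (OutputStream out = Files.newOutputStream(dest.toPath())) {
            for (File file : files) {
                Files.copy(file.toPath(), out);
            }
        }
    }

    // Index of the first differing byte, or -1 if both arrays are identical.
    static int firstDifference(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            if (a[i] != b[i]) return i;
        }
        return a.length == b.length ? -1 : n;
    }

    public static void main(String[] args) throws IOException {
        File src = File.createTempFile("splitdemo", ".bin");
        byte[] data = new byte[1234];
        for (int i = 0; i < data.length; i++) data[i] = (byte) (i * 31);
        Files.write(src.toPath(), data);

        int parts = split(src, 100);
        File dest = new File(src.getPath() + ".joined");
        join(src, dest);
        int diff = firstDifference(data, Files.readAllBytes(dest.toPath()));
        System.out.println(parts + " parts, first difference at " + diff);
    }
}
```

If the joined file were wrong, firstDifference would point at the offset where the two files diverge, which tells you which chunk went out of place.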