Java 如何在 Amazon S3 中读取文件的内容

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/27318587/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-11 04:09:51  来源:igfitidea点击:

How do I read the content of a file in Amazon S3

javaamazon-web-servicesfile-ioamazon-s3

提问by ZZzzZZzz

I have a file in Amazon S3in bucket ABCD. I have 3 objects ("folderA/folderB/folderC/abcd.csv")which are folders and in the final folder I have a .csvfile (abcd.csv). I have used a logic to convert it to JSONand load it back into another file which is a .txtfile in the same folder ("folderA/folderB/folderC/abcd.txt"). I had to download the file locally in order to do that. How would I read the file directly and write it back to the text file. The code which I have used to write to a file in S3 is below and I need to read a file from S3.

Amazon S3在 bucket 中有一个文件ABCD。我有 3 个对象("folderA/folderB/folderC/abcd.csv"),它们是文件夹,在最后一个.csv文件夹中,我有一个文件(abcd.csv). 我使用了一种逻辑将其转换为并将其JSON加载回另一个文件,该.txt文件是同一文件夹中的文件("folderA/folderB/folderC/abcd.txt")。我必须在本地下载文件才能做到这一点。我如何直接读取文件并将其写回文本文件。我用来在 S3 中写入文件的代码如下,我需要从 S3 读取文件。

 InputStream inputStream = new ByteArrayInputStream(json.getBytes(StandardCharsets.UTF_16));
 ObjectMetadata metadata = new ObjectMetadata();
 metadata.setContentLength(json.length());
 PutObjectRequest request = new PutObjectRequest(bucketPut, filePut, inputStream, metadata);
 s3.putObject(request);

采纳答案by ashokramcse

First you should get the object InputStreamto do your need.

首先,您应该让对象InputStream满足您的需求。

S3Object object = s3Client.getObject(new GetObjectRequest(bucketName, key));
InputStream objectData = object.getObjectContent();

Pass the InputStream, File Nameand the pathto the below method to download your stream.

InputStream,File Name和传递path给以下方法以下载您的流。

public void saveFile(String fileName, String path, InputStream objectData) throws Exception {
    DataOutputStream dos = null;
    OutputStream out = null;
    try {
        File newDirectory = new File(path);
        if (!newDirectory.exists()) {
            newDirectory.mkdirs();
        }

        File uploadedFile = new File(path, uploadFileName);
        out = new FileOutputStream(uploadedFile);
        byte[] fileAsBytes = new byte[inputStream.available()];
        inputStream.read(fileAsBytes);

        dos = new DataOutputStream(out);
        dos.write(fileAsBytes);
    } catch (IOException io) {
        io.printStackTrace();
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        try {
            if (out != null) {
                out.close();
            }
            if (dos != null) {
                dos.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

After you Download your object read the file and make it to JSONand write it to .txtfile after that you can upload the txtfile to the desired bucket in S3

下载对象后,读取文件并将其JSON写入.txt文件,然后您可以将txt文件上传到所需的存储桶中S3

回答by Oguz

You can use other java libs for downloading or reading files without downloading. Check the code please, I hope it is helpful for you. This example for PDF.

您可以使用其他 java 库来下载或读取文件而无需下载。请检查代码,我希望它对您有所帮助。此示例为 PDF。

import java.io.IOException;
import java.io.InputStream;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.List;
import javax.swing.JTextArea;
import java.io.FileWriter;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;
import org.apache.pdfbox.text.PDFTextStripperByArea;
import org.joda.time.DateTime;
import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.AmazonS3Exception;
import com.amazonaws.services.s3.model.CopyObjectRequest;
import com.amazonaws.services.s3.model.GetObjectRequest;
import com.amazonaws.services.s3.model.ListObjectsV2Request;
import com.amazonaws.services.s3.model.ListObjectsV2Result;
import com.amazonaws.services.s3.model.S3Object;
import com.amazonaws.services.s3.model.S3ObjectSummary;
import java.io.File; 
   //..
   // in your main class 
   private static AWSCredentials credentials = null;
   private static AmazonS3 amazonS3Client = null;

   public static void intializeAmazonObjects() {
        credentials = new BasicAWSCredentials(ACCESS_KEY, SECRET_ACCESS_KEY);
        amazonS3Client = new AmazonS3Client(credentials);
    }
   public void mainMethod() throws IOException, AmazonS3Exception{
        // connect to aws
        intializeAmazonObjects();

    ListObjectsV2Request req = new ListObjectsV2Request().withBucketName(bucketName);
    ListObjectsV2Result listObjectsResult;
do {

        listObjectsResult = amazonS3Client.listObjectsV2(req);
        int count = 0;
        for (S3ObjectSummary objectSummary : listObjectsResult.getObjectSummaries()) {
            System.out.printf(" - %s (size: %d)\n", objectSummary.getKey(), objectSummary.getSize());

            // Date lastModifiedDate = objectSummary.getLastModified();

            // String bucket = objectSummary.getBucketName();
            String key = objectSummary.getKey();
            String newKey = "";
            String newBucket = "";
            String resultText = "";

            // only try to read pdf files
            if (!key.contains(".pdf")) {
                continue;
            }

            // Read the source file as text
            String pdfFileInText = readAwsFile(objectSummary.getBucketName(), objectSummary.getKey());
            if (pdfFileInText.isEmpty())
                continue;
        }//end of current bulk

        // If there are more than maxKeys(in this case 999 default) keys in the bucket,
        // get a continuation token
        // and list the next objects.
        String token = listObjectsResult.getNextContinuationToken();
        System.out.println("Next Continuation Token: " + token);
        req.setContinuationToken(token);
    } while (listObjectsResult.isTruncated());
}

public String readAwsFile(String bucketName, String keyName) {
    S3Object object;
    String pdfFileInText = "";
    try {

        // AmazonS3 s3client = getAmazonS3ClientObject();
        object = amazonS3Client.getObject(new GetObjectRequest(bucketName, keyName));
        InputStream objectData = object.getObjectContent();

        PDDocument document = PDDocument.load(objectData);
        document.getClass();

        if (!document.isEncrypted()) {

            PDFTextStripperByArea stripper = new PDFTextStripperByArea();
            stripper.setSortByPosition(true);

            PDFTextStripper tStripper = new PDFTextStripper();

            pdfFileInText = tStripper.getText(document);

        }

    } catch (Exception e) {
        e.printStackTrace();
    }
    return pdfFileInText;
}