Java: How to read the contents of a file in Amazon S3

Question

Asked by ZZzzZZzz

I have a file in Amazon S3 in bucket ABCD. The key path ("folderA/folderB/folderC/abcd.csv") has three folders, and the final folder contains a .csv file (abcd.csv). I convert it to JSON and write the result to a .txt file in the same folder ("folderA/folderB/folderC/abcd.txt"). To do that, I had to download the file locally. How can I read the file directly from S3 and write the result back to the text file? The code I use to write a file to S3 is below; I need to read a file from S3.

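 // Encode once so the declared content length matches the bytes actually sent;
 // json.length() counts chars, not encoded bytes.
 byte[] jsonBytes = json.getBytes(StandardCharsets.UTF_16);
 InputStream inputStream = new ByteArrayInputStream(jsonBytes);
 ObjectMetadata metadata = new ObjectMetadata();
 metadata.setContentLength(jsonBytes.length);
 PutObjectRequest request = new PutObjectRequest(bucketPut, filePut, inputStream, metadata);
 s3.putObject(request);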

Answer 1

Accepted answer by ashokramcse

First, get the object's InputStream:

S3Object object = s3Client.getObject(new GetObjectRequest(bucketName, key));
InputStream objectData = object.getObjectContent();
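
Since the question asks to avoid a local download, the stream can also be read straight into memory. The following is a minimal sketch, assuming the object is small enough to fit in memory and is UTF-8 text. The class name StreamText and the method readAll are hypothetical; a ByteArrayInputStream stands in for object.getObjectContent(), which returns an InputStream subclass and could be passed in the same way.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class StreamText {
    /** Reads the stream until end-of-stream and decodes the bytes with the given charset. */
    static String readAll(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        // read() returns -1 only at end-of-stream, so this loop collects every byte.
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), charset);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for object.getObjectContent(); any InputStream works here.
        byte[] sample = "id,name\n1,Ann\n".getBytes(StandardCharsets.UTF_8);
        try (InputStream objectData = new ByteArrayInputStream(sample)) {
            System.out.print(readAll(objectData, StandardCharsets.UTF_8));
        }
    }
}
```

Closing the stream when done (here via try-with-resources) matters with S3 as well, because the object content stream holds an open HTTP connection.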

Pass the InputStream, the file name, and the path to the method below to download the stream.

// Requires: java.io.File, java.io.IOException, java.io.InputStream,
// java.nio.file.Files, java.nio.file.StandardCopyOption
public void saveFile(String fileName, String path, InputStream objectData) throws IOException {
    // Create the target directory if it does not exist yet.
    File newDirectory = new File(path);
    if (!newDirectory.exists()) {
        newDirectory.mkdirs();
    }

    File downloadedFile = new File(path, fileName);
    // Copy until end-of-stream. available() does not report the total size,
    // so sizing a single read from it can truncate the file.
    try (InputStream in = objectData) {
        Files.copy(in, downloadedFile.toPath(), StandardCopyOption.REPLACE_EXISTING);
    }
}

After downloading the object, read the file, convert it to JSON, and write the result to a .txt file. You can then upload the .txt file to the desired S3 bucket.
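
The conversion step can also run entirely in memory once the CSV text is available as a String. The following is a rough sketch under stated assumptions: the CSV has a header row, uses plain commas, and has no quoted fields or control characters. The class CsvToJson and its methods are hypothetical, and UTF-8 is chosen here for the upload bytes, which differs from the UTF-16 in the question.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class CsvToJson {
    /** Converts simple CSV (header row, comma-separated, no quoted fields) to a JSON array of objects. */
    static String convert(String csv) {
        String[] lines = csv.split("\\r?\\n");
        String[] headers = lines[0].split(",", -1);
        List<String> objects = new ArrayList<>();
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].isEmpty()) {
                continue; // skip blank lines
            }
            String[] cells = lines[i].split(",", -1);
            StringBuilder obj = new StringBuilder("{");
            for (int c = 0; c < headers.length; c++) {
                if (c > 0) {
                    obj.append(',');
                }
                // Missing trailing cells become empty strings.
                String value = c < cells.length ? cells[c] : "";
                obj.append(quote(headers[c])).append(':').append(quote(value));
            }
            objects.add(obj.append('}').toString());
        }
        return "[" + String.join(",", objects) + "]";
    }

    /** Minimal JSON string escaping: backslash and double quote only. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        String json = convert("id,name\n1,Ann\n2,Bob\n");
        // Encode once; body.length is the value to declare as the content length on upload.
        byte[] body = json.getBytes(StandardCharsets.UTF_8);
        System.out.println(json);
        System.out.println(body.length + " bytes");
    }
}
```

For real data, a CSV parser and a JSON library would handle quoting and escaping more safely than this hand-rolled version.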

Answer 2

Answered by Oguz

You can use other Java libraries to download files, or to read them without downloading. Please check the code below; I hope it helps. This example reads PDF files.

import java.io.IOException;
import java.io.InputStream;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.List;
import javax.swing.JTextArea;
import java.io.FileWriter;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;
import org.apache.pdfbox.text.PDFTextStripperByArea;
import org.joda.time.DateTime;
import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.AmazonS3Exception;
import com.amazonaws.services.s3.model.CopyObjectRequest;
import com.amazonaws.services.s3.model.GetObjectRequest;
import com.amazonaws.services.s3.model.ListObjectsV2Request;
import com.amazonaws.services.s3.model.ListObjectsV2Result;
import com.amazonaws.services.s3.model.S3Object;
import com.amazonaws.services.s3.model.S3ObjectSummary;
import java.io.File; 
//..
// In your main class:
private static AWSCredentials credentials = null;
private static AmazonS3 amazonS3Client = null;

public static void intializeAmazonObjects() {
    credentials = new BasicAWSCredentials(ACCESS_KEY, SECRET_ACCESS_KEY);
    amazonS3Client = new AmazonS3Client(credentials);
}

public void mainMethod() throws IOException, AmazonS3Exception {
    // Connect to AWS.
    intializeAmazonObjects();

    ListObjectsV2Request req = new ListObjectsV2Request().withBucketName(bucketName);
    ListObjectsV2Result listObjectsResult;
    do {

        listObjectsResult = amazonS3Client.listObjectsV2(req);
        int count = 0;
        for (S3ObjectSummary objectSummary : listObjectsResult.getObjectSummaries()) {
            System.out.printf(" - %s (size: %d)\n", objectSummary.getKey(), objectSummary.getSize());

            // Date lastModifiedDate = objectSummary.getLastModified();

            // String bucket = objectSummary.getBucketName();
            String key = objectSummary.getKey();
            String newKey = "";
            String newBucket = "";
            String resultText = "";

            // only try to read pdf files
            if (!key.contains(".pdf")) {
                continue;
            }

            // Read the source file as text
            String pdfFileInText = readAwsFile(objectSummary.getBucketName(), objectSummary.getKey());
            if (pdfFileInText.isEmpty())
                continue;
        }//end of current bulk

        // If the listing is truncated (one response returns at most 1,000 keys
        // by default), take the continuation token and request the next page.
        String token = listObjectsResult.getNextContinuationToken();
        System.out.println("Next Continuation Token: " + token);
        req.setContinuationToken(token);
    } while (listObjectsResult.isTruncated());
}

public String readAwsFile(String bucketName, String keyName) {
    String pdfFileInText = "";
    // try-with-resources closes the S3 object (and its HTTP connection),
    // the content stream, and the PDF document.
    try (S3Object object = amazonS3Client.getObject(new GetObjectRequest(bucketName, keyName));
         InputStream objectData = object.getObjectContent();
         PDDocument document = PDDocument.load(objectData)) {

        // Extract text only from unencrypted documents.
        if (!document.isEncrypted()) {
            PDFTextStripper tStripper = new PDFTextStripper();
            pdfFileInText = tStripper.getText(document);
        }

    } catch (Exception e) {
        e.printStackTrace();
    }
    return pdfFileInText;
}
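
One detail in the listing loop: key.contains(".pdf") also matches keys such as "report.pdf.bak" and skips "REPORT.PDF". A possible alternative is a case-insensitive suffix check. The helper below is a hypothetical sketch, not part of the answer's code.

```java
import java.util.Locale;

final class KeyFilter {
    /** Returns true when the object key ends with ".pdf", ignoring case. */
    static boolean isPdfKey(String key) {
        return key.toLowerCase(Locale.ROOT).endsWith(".pdf");
    }

    public static void main(String[] args) {
        System.out.println(isPdfKey("folderA/report.PDF"));
        System.out.println(isPdfKey("folderA/report.pdf.bak"));
    }
}
```

In the loop, the check would then read if (!KeyFilter.isPdfKey(key)) continue;.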
