Java: How to read the content of a file in Amazon S3
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/27318587/
How do I read the content of a file in Amazon S3
Asked by ZZzzZZzz
I have a file in Amazon S3 in bucket ABCD. The object key ("folderA/folderB/folderC/abcd.csv") has three nested folders, and the final folder contains a .csv file (abcd.csv). I convert it to JSON and write the result to a .txt file in the same folder ("folderA/folderB/folderC/abcd.txt"). To do that, I had to download the file locally. How can I read the file directly from S3 and write the result back to the text file? The code I use to write a file to S3 is below; I need the equivalent for reading a file from S3.
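// Encode once so the declared content length matches the bytes actually sent.
byte[] jsonBytes = json.getBytes(StandardCharsets.UTF_16);
InputStream inputStream = new ByteArrayInputStream(jsonBytes);
ObjectMetadata metadata = new ObjectMetadata();
// json.length() counts chars, not encoded bytes, so use the byte array length.
metadata.setContentLength(jsonBytes.length);
PutObjectRequest request = new PutObjectRequest(bucketPut, filePut, inputStream, metadata);
s3.putObject(request);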
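The content length passed to S3 is a byte count, while String.length() counts chars, and the two differ for most encodings. The sketch below only illustrates that difference with the standard library; the class and method names (ContentLengthDemo, encodedLength) are hypothetical and not part of the AWS SDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ContentLengthDemo {
    // Returns the number of bytes the text occupies in the given encoding.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String json = "{\"id\":1}";
        System.out.println("chars:  " + json.length());
        System.out.println("UTF-8:  " + encodedLength(json, StandardCharsets.UTF_8));
        // Java's UTF-16 encoder writes a 2-byte byte-order mark plus 2 bytes per char here.
        System.out.println("UTF-16: " + encodedLength(json, StandardCharsets.UTF_16));
    }
}
```

Whichever charset is used, the value given to setContentLength should be the length of the encoded byte array.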
Accepted answer by ashokramcse
First, get the object's InputStream so you can read its content:
S3Object object = s3Client.getObject(new GetObjectRequest(bucketName, key));
InputStream objectData = object.getObjectContent();
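Since getObjectContent() returns an ordinary InputStream, the content can also be read straight into memory without a local file. This is a minimal sketch, assuming the object is UTF-8 text small enough to hold in memory; StreamToText and readAll are hypothetical names, and a ByteArrayInputStream stands in for the S3 stream so the example runs without AWS credentials.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

class StreamToText {
    // Reads the whole stream as UTF-8 text, joining lines with '\n', and closes it afterwards.
    static String readAll(InputStream in) throws IOException {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return reader.lines().collect(Collectors.joining("\n"));
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for object.getObjectContent(); any InputStream works the same way.
        InputStream objectData =
            new ByteArrayInputStream("id,name\n1,alice\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(objectData));
    }
}
```

With the SDK, the call would be readAll(objectData) on the stream obtained above; closing the reader also closes the underlying stream.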
Pass the InputStream, the file name, and the path to the method below to save the stream to a local file.
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public void saveFile(String fileName, String path, InputStream objectData) throws IOException {
    // Create the target directory if it does not exist yet.
    File directory = new File(path);
    if (!directory.exists()) {
        directory.mkdirs();
    }
    File targetFile = new File(directory, fileName);
    // try-with-resources closes both streams, even if the copy fails.
    try (InputStream in = objectData;
         OutputStream out = new FileOutputStream(targetFile)) {
        // available() is only an estimate and a single read() may return fewer bytes
        // than requested, so copy in a loop until the end of the stream.
        byte[] buffer = new byte[8192];
        int bytesRead;
        while ((bytesRead = in.read(buffer)) != -1) {
            out.write(buffer, 0, bytesRead);
        }
    }
}
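As an alternative sketch, java.nio.file.Files.copy from the standard library can copy a stream to a file in one call. StreamSaver and save are hypothetical names, and the temporary directory in main is only there so the example runs on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

class StreamSaver {
    // Saves the stream to directory/fileName, creating the directory if needed.
    static Path save(InputStream data, String directory, String fileName) throws IOException {
        Path dir = Paths.get(directory);
        Files.createDirectories(dir);
        Path target = dir.resolve(fileName);
        try (InputStream in = data) {
            // Files.copy reads until end of stream; REPLACE_EXISTING overwrites an old copy.
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("s3-download");
        // Stand-in for the S3 object stream.
        InputStream objectData =
            new ByteArrayInputStream("id,name\n1,alice\n".getBytes(StandardCharsets.UTF_8));
        Path saved = save(objectData, tmp.toString(), "abcd.csv");
        System.out.println("Saved " + Files.size(saved) + " bytes to " + saved);
    }
}
```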
After downloading the object, read the file, convert it to JSON, and write the result to a .txt file. You can then upload the .txt file to the desired bucket in S3.
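The conversion step itself can run entirely in memory. The question does not show its CSV-to-JSON logic, so the following is only a hypothetical sketch: it assumes a header row, comma-separated values with no quoted fields, and emits every value as a JSON string. CsvToJson, convert, and quote are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;

class CsvToJson {
    // Wraps a value in double quotes, escaping backslashes and quotes.
    // Control characters are not handled in this sketch.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Converts CSV text with a header row into a JSON array of objects.
    static String convert(String csv) {
        String[] lines = csv.split("\\r?\\n");
        String[] headers = lines[0].split(",", -1);
        List<String> rows = new ArrayList<>();
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].isEmpty()) {
                continue; // skip blank lines
            }
            String[] cells = lines[i].split(",", -1);
            List<String> fields = new ArrayList<>();
            for (int c = 0; c < headers.length; c++) {
                // Missing trailing cells become empty strings.
                String value = c < cells.length ? cells[c] : "";
                fields.add(quote(headers[c]) + ":" + quote(value));
            }
            rows.add("{" + String.join(",", fields) + "}");
        }
        return "[" + String.join(",", rows) + "]";
    }

    public static void main(String[] args) {
        System.out.println(convert("id,name\n1,alice\n2,bob\n"));
    }
}
```

For real data with quoted fields or embedded commas, a CSV parser and a JSON library would be a safer choice than string splitting.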
Answer by Oguz
You can also use other Java libraries to read an object's content directly from its stream, without saving it to disk first. The code below is an example for PDF files using PDFBox; I hope it helps.
import java.io.IOException;
import java.io.InputStream;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.List;
import javax.swing.JTextArea;
import java.io.FileWriter;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;
import org.apache.pdfbox.text.PDFTextStripperByArea;
import org.joda.time.DateTime;
import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.AmazonS3Exception;
import com.amazonaws.services.s3.model.CopyObjectRequest;
import com.amazonaws.services.s3.model.GetObjectRequest;
import com.amazonaws.services.s3.model.ListObjectsV2Request;
import com.amazonaws.services.s3.model.ListObjectsV2Result;
import com.amazonaws.services.s3.model.S3Object;
import com.amazonaws.services.s3.model.S3ObjectSummary;
import java.io.File;
//..
// in your main class
private static AWSCredentials credentials = null;
private static AmazonS3 amazonS3Client = null;
public static void intializeAmazonObjects() {
credentials = new BasicAWSCredentials(ACCESS_KEY, SECRET_ACCESS_KEY);
amazonS3Client = new AmazonS3Client(credentials);
}
public void mainMethod() throws IOException, AmazonS3Exception{
// connect to aws
intializeAmazonObjects();
ListObjectsV2Request req = new ListObjectsV2Request().withBucketName(bucketName);
ListObjectsV2Result listObjectsResult;
do {
listObjectsResult = amazonS3Client.listObjectsV2(req);
int count = 0;
for (S3ObjectSummary objectSummary : listObjectsResult.getObjectSummaries()) {
System.out.printf(" - %s (size: %d)\n", objectSummary.getKey(), objectSummary.getSize());
// Date lastModifiedDate = objectSummary.getLastModified();
// String bucket = objectSummary.getBucketName();
String key = objectSummary.getKey();
String newKey = "";
String newBucket = "";
String resultText = "";
// only try to read pdf files
if (!key.toLowerCase().endsWith(".pdf")) {
continue;
}
// Read the source file as text
String pdfFileInText = readAwsFile(objectSummary.getBucketName(), objectSummary.getKey());
if (pdfFileInText.isEmpty())
continue;
}//end of current bulk
// If the bucket holds more keys than one response returns (up to 1,000 by default),
// get a continuation token
// and list the next objects.
String token = listObjectsResult.getNextContinuationToken();
System.out.println("Next Continuation Token: " + token);
req.setContinuationToken(token);
} while (listObjectsResult.isTruncated());
}
public String readAwsFile(String bucketName, String keyName) {
    String pdfFileInText = "";
    // try-with-resources closes the S3 object (releasing its HTTP connection),
    // the content stream, and the PDF document.
    try (S3Object object = amazonS3Client.getObject(new GetObjectRequest(bucketName, keyName));
         InputStream objectData = object.getObjectContent();
         PDDocument document = PDDocument.load(objectData)) {
        // Text can only be extracted here when the document is not encrypted.
        if (!document.isEncrypted()) {
            PDFTextStripper tStripper = new PDFTextStripper();
            pdfFileInText = tStripper.getText(document);
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    return pdfFileInText;
}