Uploading large files with the Azure SDK for Java and a limited heap

Posted by ix0qys7i on 2021-06-30 in Java

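We are developing a document microservice that needs to use Azure as the storage for file content. Azure Block Blob seemed like a reasonable choice. The document service has its heap limited to 512 MB (`-Xmx512m`).
I have not managed to get streaming file upload working within that heap limit using `azure-storage-blob:12.10.0-beta.1` (also tested on `12.9.0`).
The following approaches were attempted: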
Copy-paste from the documentation, using `BlockBlobClient`:

```java
BlockBlobClient blockBlobClient = blobContainerClient.getBlobClient("file").getBlockBlobClient();

File file = new File("file");

try (InputStream dataStream = new FileInputStream(file)) {
    blockBlobClient.upload(dataStream, file.length(), true /* overwrite file */);
}
```

Result: `java.io.IOException: mark/reset not supported`. The SDK tries to use mark/reset even though the file input stream reports that this feature is not supported.
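
The error message above matches the default behaviour of `java.io.InputStream.reset()`, which a plain `FileInputStream` inherits. The following is a minimal JDK-only sketch of that behaviour; the class name `MarkSupportDemo`, the helper `resetFailure`, and the temporary file are illustrative assumptions and not part of the original post or the Azure SDK.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class MarkSupportDemo {

    /**
     * Calls mark() followed by reset() on the stream.
     * Returns null if reset() succeeds, otherwise the IOException message.
     */
    static String resetFailure(InputStream in) {
        try {
            in.mark(16);  // a no-op on streams that do not support marking
            in.reset();
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical scratch file, used only so that a FileInputStream can be opened.
        Path tmp = Files.createTempFile("mark-demo", ".bin");
        Files.write(tmp, new byte[] {1, 2, 3, 4});
        try (InputStream raw = new FileInputStream(tmp.toFile());
             InputStream buffered = new BufferedInputStream(new FileInputStream(tmp.toFile()))) {
            System.out.println("FileInputStream markSupported:     " + raw.markSupported());
            System.out.println("FileInputStream reset failure:     " + resetFailure(raw));
            System.out.println("BufferedInputStream markSupported: " + buffered.markSupported());
            System.out.println("BufferedInputStream reset failure: " + resetFailure(buffered));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

A caller that needs to retry a read has to either wrap the stream in something that buffers (which costs memory in proportion to the mark limit) or reopen the source, which is the trade-off the approaches in this question run into.
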
Adding a `BufferedInputStream` to mitigate the mark/reset issue (per advice):

    BlockBlobClient blockBlobClient = blobContainerClient.getBlobClient("file").getBlockBlobClient();

    File file = new File("file");

    try (InputStream dataStream = new BufferedInputStream(new FileInputStream(file))) {
        blockBlobClient.upload(dataStream, file.length(), true /* overwrite file */);
    }

Result: `java.lang.OutOfMemoryError: Java heap space`. I assume the SDK attempted to load all 1.17 GB of file content into memory.
Replacing `BlockBlobClient` with `BlobClient` and removing the heap size limit (`-Xmx512m`):

    BlobClient blobClient = blobContainerClient.getBlobClient("file");

    File file = new File("file");

    try (InputStream dataStream = new FileInputStream(file)) {
        blobClient.upload(dataStream, file.length(), true /* overwrite file */);
    }

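Result: 1.5 GB of heap memory used; all file content is loaded into memory, plus some buffering on the Reactor side.
[Figure: heap usage from VisualVM]
Switching to streaming via `BlobOutputStream`: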

    long blockSize = DataSize.ofMegabytes(4L).toBytes();

    BlockBlobClient blockBlobClient = blobContainerClient.getBlobClient("file").getBlockBlobClient();

    // create / erase blob
    blockBlobClient.commitBlockList(List.of(), true);

    BlockBlobOutputStreamOptions options = (new BlockBlobOutputStreamOptions()).setParallelTransferOptions(
        (new ParallelTransferOptions()).setBlockSizeLong(blockSize).setMaxConcurrency(1).setMaxSingleUploadSizeLong(blockSize));

    try (InputStream is = new FileInputStream("file")) {
        try (OutputStream os = blockBlobClient.getBlobOutputStream(options)) {
            IOUtils.copy(is, os); // uses 8KB buffer
        }
    }

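Result: the file is corrupted during upload. The Azure web portal shows 1.09 GB instead of the expected 1.17 GB. Manually downloading the file from the Azure web portal confirms that the content was corrupted during upload. The memory footprint decreased significantly, but file corruption is a showstopper.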
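
A size mismatch like the one above can be caught on the client side by counting and hashing the bytes as they are copied. The sketch below uses only the JDK and is not tied to the Azure SDK; the class `VerifiedCopy`, its `Result` type, and the choice of SHA-256 are assumptions made for illustration. It keeps the same fixed 8 KB buffer, so memory use stays constant regardless of file size.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class VerifiedCopy {

    /** Number of bytes copied and the hex-encoded SHA-256 digest of those bytes. */
    static final class Result {
        final long bytes;
        final String sha256;

        Result(long bytes, String sha256) {
            this.bytes = bytes;
            this.sha256 = sha256;
        }
    }

    /** Copies in to out through a fixed 8 KB buffer, tracking length and digest. */
    static Result copy(InputStream in, OutputStream out) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required on every Java platform", e);
        }
        byte[] buffer = new byte[8 * 1024];
        long total = 0;
        try {
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                digest.update(buffer, 0, read);
                total += read;
            }
            out.flush();
        } catch (IOException e) {
            // Wrapped so the sketch can be called without checked exceptions.
            throw new UncheckedIOException(e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return new Result(total, hex.toString());
    }
}
```

Comparing `Result.bytes` with `file.length()` shows whether the local copy loop delivered every byte to the output stream; if it did, any loss happened further downstream. The digest can be compared against one computed the same way over a downloaded copy.
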
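Problem: I cannot come up with a working upload/download solution with a small memory footprint.
Any help would be greatly appreciated!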

Answer #1 (vhmi4jdf)

Please try the code below to upload and download large files. I tested it with a .zip file of about 1.1 GB.
Upload the file:

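    import com.azure.storage.blob.BlobClient;
    import com.azure.storage.blob.BlobServiceClient;
    import com.azure.storage.blob.BlobServiceClientBuilder;
    import com.azure.storage.blob.ProgressReceiver;
    import com.azure.storage.blob.models.AccessTier;
    import com.azure.storage.blob.models.BlobHttpHeaders;
    import com.azure.storage.blob.models.BlobRequestConditions;
    import com.azure.storage.blob.models.ParallelTransferOptions;
    import java.time.Duration;

    public static void uploadFilesByChunk() {
        String connString = "<conn str>";
        String containerName = "<container name>";
        String blobName = "UploadOne.zip";
        String filePath = "D:/temp/" + blobName;

        BlobServiceClient client = new BlobServiceClientBuilder().connectionString(connString).buildClient();
        BlobClient blobClient = client.getBlobContainerClient(containerName).getBlobClient(blobName);

        // Upload in 2 MB blocks, at most two at a time, and log progress.
        long blockSize = 2 * 1024 * 1024; // 2 MB
        ParallelTransferOptions parallelTransferOptions = new ParallelTransferOptions()
                .setBlockSizeLong(blockSize).setMaxConcurrency(2)
                .setProgressReceiver(new ProgressReceiver() {
                    @Override
                    public void reportProgress(long bytesTransferred) {
                        System.out.println("uploaded:" + bytesTransferred);
                    }
                });

        BlobHttpHeaders headers = new BlobHttpHeaders().setContentLanguage("en-US").setContentType("binary");

        // The SDK reads the file from disk itself; no metadata is set (null).
        blobClient.uploadFromFile(filePath, parallelTransferOptions, headers, null, AccessTier.HOT,
                new BlobRequestConditions(), Duration.ofMinutes(30));
    }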
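
To put the `blockSize` and `maxConcurrency` settings in perspective, the arithmetic can be sketched as below. This is a back-of-the-envelope estimate, not a description of the SDK's internals: the class `BlockPlan` is hypothetical, the 1.17 GB figure from the question is taken as 1,170,000,000 bytes, and the assumption that roughly one block per concurrent transfer is buffered at a time is a simplification. The 50,000 figure is the service limit on the number of blocks in a block blob.

```java
final class BlockPlan {

    /** Maximum number of blocks a single block blob may contain. */
    static final long MAX_BLOCKS_PER_BLOB = 50_000;

    /** Number of blocks needed for a file of the given size (ceiling division). */
    static long blockCount(long fileSize, long blockSize) {
        if (fileSize < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and blockSize > 0");
        }
        return (fileSize + blockSize - 1) / blockSize;
    }

    /**
     * Rough payload held in flight, ASSUMING one block buffer per concurrent
     * transfer. Real heap usage will be higher because of other allocations.
     */
    static long inFlightBytes(long blockSize, int maxConcurrency) {
        return blockSize * maxConcurrency;
    }

    public static void main(String[] args) {
        long fileSize = 1_170_000_000L;     // assumed size of the 1.17 GB file
        long blockSize = 2L * 1024 * 1024;  // 2 MB, as in the answer
        int maxConcurrency = 2;             // as in the answer

        long blocks = blockCount(fileSize, blockSize);
        System.out.println("blocks needed:   " + blocks);
        System.out.println("within limit:    " + (blocks <= MAX_BLOCKS_PER_BLOB));
        System.out.println("in-flight bytes: " + inFlightBytes(blockSize, maxConcurrency));
    }
}
```

Under these assumptions the upload needs 558 blocks and keeps about 4 MB of payload in flight, which is small next to a 512 MB heap.
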

Memory footprint: [screenshot not preserved]

Download the file:

    import com.azure.storage.blob.BlobClient;
    import com.azure.storage.blob.BlobServiceClient;
    import com.azure.storage.blob.BlobServiceClientBuilder;
    import com.azure.storage.blob.options.BlobDownloadToFileOptions;
    import java.time.Duration;

    public static void downLoadFilesByChunk() {
        String connString = "<conn str>";
        String containerName = "<container name>";
        String blobName = "UploadOne.zip";

        String filePath = "D:/temp/" + "DownloadOne.zip";

        BlobServiceClient client = new BlobServiceClientBuilder().connectionString(connString).buildClient();
        BlobClient blobClient = client.getBlobContainerClient(containerName).getBlobClient(blobName);

        // Download in 2 MB blocks, at most two at a time, and log progress.
        // Note: the download options use the classes from com.azure.storage.common.
        long blockSize = 2 * 1024 * 1024; // 2 MB
        com.azure.storage.common.ParallelTransferOptions parallelTransferOptions = new com.azure.storage.common.ParallelTransferOptions()
                .setBlockSizeLong(blockSize).setMaxConcurrency(2)
                .setProgressReceiver(new com.azure.storage.common.ProgressReceiver() {
                    @Override
                    public void reportProgress(long bytesTransferred) {
                        System.out.println("downloaded:" + bytesTransferred);
                    }
                });

        BlobDownloadToFileOptions options = new BlobDownloadToFileOptions(filePath)
                .setParallelTransferOptions(parallelTransferOptions);
        blobClient.downloadToFileWithResponse(options, Duration.ofMinutes(30), null);
    }

Memory footprint: [screenshot not preserved]

Result: [screenshot not preserved]
