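We store zip files containing XML files in HDFS. We need to unzip them programmatically in Java and stream out the XML files they contain. FileSystem.open returns an FSDataInputStream, but the ZipFile constructors only accept a File or a String. I would really prefer not to use FileSystem.copyToLocalFile. Is it possible to stream the contents of a zip file stored in HDFS without first copying the zip file to the local file system? If so, how?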
yh2wf1be1#
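Hi, please find the sample code below. It wraps the FSDataInputStream in a ZipInputStream, so nothing is copied to the local file system: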
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsZipReader { // class name is arbitrary

    // Reads every file entry of a zip stored in HDFS into memory, keyed by entry name.
    public static Map<String, byte[]> loadZipFileData(String hdfsFilePath) throws IOException {
        // Assumes the default Configuration on the classpath points at the cluster.
        FileSystem fileSystem = FileSystem.get(new Configuration());
        Map<String, byte[]> listOfFiles = new LinkedHashMap<>();
        byte[] buf = new byte[1024];
        try (ZipInputStream zipInputStream = readZipFileFromHDFS(fileSystem, new Path(hdfsFilePath))) {
            ZipEntry zipEntry;
            while ((zipEntry = zipInputStream.getNextEntry()) != null) {
                if (!zipEntry.isDirectory()) {
                    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
                    int bytesRead;
                    // read() returns -1 at the end of the current entry, not of the whole archive.
                    while ((bytesRead = zipInputStream.read(buf, 0, buf.length)) > -1) {
                        outputStream.write(buf, 0, bytesRead);
                    }
                    listOfFiles.put(zipEntry.getName(), outputStream.toByteArray());
                }
                zipInputStream.closeEntry();
            }
        }
        return listOfFiles;
    }

    // Opens the HDFS file and wraps it in a ZipInputStream; no local copy is made.
    protected static ZipInputStream readZipFileFromHDFS(FileSystem fileSystem, Path path) throws IOException {
        if (!fileSystem.exists(path)) {
            throw new IllegalArgumentException(path.getName() + " does not exist");
        }
        return new ZipInputStream(fileSystem.open(path));
    }
}
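The sample above buffers every entry into a byte array, which may be more memory than you want if the XML files are large. Since the question asks about streaming, here is a minimal sketch of a variant that hands each entry to a callback as an InputStream instead. The names ZipEntryStreamer, EntryHandler, and forEachEntry are made up for illustration and are not part of Hadoop or the JDK. The sketch takes a plain InputStream, on the assumption that you pass in the FSDataInputStream returned by FileSystem.open.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper for illustration; not part of Hadoop or the JDK.
final class ZipEntryStreamer {

    // Hypothetical callback: receives one entry's name and its decompressed content as a stream.
    interface EntryHandler {
        void handle(String entryName, InputStream content) throws IOException;
    }

    // Walks the archive sequentially and passes each file entry to the handler
    // without buffering the whole entry in memory.
    static void forEachEntry(InputStream rawZip, EntryHandler handler) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(rawZip)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                // Shield the archive stream: a handler that closes its input
                // (some XML parsers do) must not close the whole zip.
                InputStream entryContent = new FilterInputStream(zip) {
                    @Override
                    public void close() {
                        // intentionally a no-op
                    }
                };
                handler.handle(entry.getName(), entryContent);
            }
        }
    }
}
```

With HDFS this would be called as something like ZipEntryStreamer.forEachEntry(fileSystem.open(path), (name, in) -> parse(in)), where parse stands for whatever XML processing you do. Note that ZipInputStream reads entries strictly in archive order, so the handler has to consume each entry before the next one is requested.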