Parsing multiple gzipped JSON files contained in a zip input stream without saving any files to disk (because of Google App Engine)



I am trying to parse multiple gzipped JSON files contained in a zip file, reading from the InputStream of an HTTP connection.

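I can read the first file, but none of the others. Sometimes it fails before the whole (first) file has been read. I have checked the Content-Length header on the connection, and it is the same even when the whole file could not be read.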

I am using Google App Engine, which does not allow me to save files locally, and that is what most of the examples I have found do.

I am using ZipArchiveInputStream from https://commons.apache.org/proper/commons-compress/ to read the zip file.

This is the most relevant question I could find: How to read from a file containing multiple GzipStreams

private static ArrayList<RawEvent> parseAmplitudeEventArchiveData(HttpURLConnection connection)
        throws IOException, ParseException {
    String name, line;
    ArrayList<RawEvent> events = new ArrayList<>();
    try (ZipArchiveInputStream zipInput =
                 new ZipArchiveInputStream(connection.getInputStream(), null, false, true);) {
        ZipArchiveEntry zipEntry = zipInput.getNextZipEntry();
        if (zipEntry != null) {
            try(GZIPInputStream gzipInputStream = new GZIPInputStream(connection.getInputStream());
            BufferedReader reader = new BufferedReader(new InputStreamReader(gzipInputStream))) {
                name = zipEntry.getName();
                log.info("Parsing file: " + name);
                while ((line = reader.readLine()) != null) {
                    events.add(parseJsonLine(line));
                }
                log.info("Events size: " + events.size());
            }
        }
    }
    return events;
}

This worked for me:

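import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import org.apache.commons.compress.archivers.zip.ZipArchiveEntry;
import org.apache.commons.compress.archivers.zip.ZipArchiveInputStream;

public class UnzipZippedFiles {
    public static void main(String[] args) throws IOException {
        try (FileInputStream inputStream =
                     new FileInputStream("/home/me/dev/scratchpad/src/main/resources/files.zip")) {
            unzipFile(inputStream);
        }
    }

    private static void unzipFile(InputStream inputStream) throws IOException {
        try (ZipArchiveInputStream zipInput =
                     new ZipArchiveInputStream(inputStream, null, false, true)) {
            ZipArchiveEntry zipEntry;
            // Loop over every entry in the archive, not just the first one.
            while ((zipEntry = zipInput.getNextZipEntry()) != null) {
                System.out.println("File: " + zipEntry.getName());
                // Buffer the entry in memory, so closing the gzip stream cannot close zipInput.
                byte[] fileBytes = readDataFromZipStream(zipInput);
                unzipGzipArchiveAndPrint(new ByteArrayInputStream(fileBytes));
            }
        }
    }

    private static byte[] readDataFromZipStream(ZipArchiveInputStream zipStream) throws IOException {
        // A single read(byte[]) may return fewer bytes than requested, and the entry size
        // may be unknown (-1) for streamed entries, so read until the end of the entry.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int bytesRead;
        while ((bytesRead = zipStream.read(buffer)) != -1) {
            out.write(buffer, 0, bytesRead);
        }
        return out.toByteArray();
    }

    private static void unzipGzipArchiveAndPrint(InputStream inputStream) throws IOException {
        System.out.println("Content:");
        // UTF-8 is assumed here because the payload is JSON.
        try (GZIPInputStream gzipInputStream = new GZIPInputStream(inputStream);
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(gzipInputStream, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}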
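
If buffering each entry in a byte array is a concern (for example with large archives under App Engine memory limits), the gzip stream can probably be layered directly on top of the zip stream instead. The sketch below is only an illustration under a few assumptions: it uses the JDK's java.util.zip.ZipInputStream rather than Commons Compress so that it runs without extra dependencies, and the class name StreamingGzipInZip, the helper sampleZip and the entry names are made up for the demo. The important detail is the small wrapper whose close() does nothing, so that closing the reader for one entry does not close the zip stream before the next entry is read.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class StreamingGzipInZip {

    /** Reads every gzipped entry of a zip stream and returns entry name -> decoded lines. */
    static Map<String, List<String>> readAll(InputStream source) throws IOException {
        Map<String, List<String>> linesByEntry = new LinkedHashMap<>();
        try (ZipInputStream zipInput = new ZipInputStream(source)) {
            ZipEntry entry;
            while ((entry = zipInput.getNextEntry()) != null) {
                // Shield the zip stream: closing the gzip reader must not close it.
                InputStream entryData = new FilterInputStream(zipInput) {
                    @Override
                    public void close() {
                        // Intentionally empty; zipInput is closed by the outer try block.
                    }
                };
                List<String> lines = new ArrayList<>();
                // The gzip stream wraps the current zip entry, not the raw source stream.
                try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                        new GZIPInputStream(entryData), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        lines.add(line);
                    }
                }
                linesByEntry.put(entry.getName(), lines);
            }
        }
        return linesByEntry;
    }

    /** Hypothetical demo data: an in-memory zip holding two gzipped JSON-lines entries. */
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream zipBytes = new ByteArrayOutputStream();
        try (ZipOutputStream zipOut = new ZipOutputStream(zipBytes)) {
            addGzippedEntry(zipOut, "a.json.gz", "{\"id\":1}\n{\"id\":2}\n");
            addGzippedEntry(zipOut, "b.json.gz", "{\"id\":3}\n");
        }
        return zipBytes.toByteArray();
    }

    private static void addGzippedEntry(ZipOutputStream zipOut, String name, String text)
            throws IOException {
        ByteArrayOutputStream gzBytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gzOut = new GZIPOutputStream(gzBytes)) {
            gzOut.write(text.getBytes(StandardCharsets.UTF_8));
        }
        zipOut.putNextEntry(new ZipEntry(name));
        zipOut.write(gzBytes.toByteArray());
        zipOut.closeEntry();
    }

    public static void main(String[] args) throws IOException {
        Map<String, List<String>> result = readAll(new ByteArrayInputStream(sampleZip()));
        for (Map.Entry<String, List<String>> e : result.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

The same layering should carry over to ZipArchiveInputStream, since it is also an InputStream that signals end of entry with -1, but I have only sketched it with the JDK classes here. In the original question, the GZIPInputStream was built from connection.getInputStream() instead of from the zip stream, and the entry was handled in an if rather than a while, which would explain why only the first file was read and why it sometimes failed partway through.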
