How to convert a large XML document to a string in Java

Posted by wbgh16ku on 2021-06-21 in Pig


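As part of a Pig script, I need to retrieve XML that is generated by a UDF, and the XML is very large (about 1.5 GB). I am currently using the code below to convert the XML to a string: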

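// Serialize the whole dom4j document into an in-memory character buffer.
StringWriter sw = new StringWriter();
XMLWriter output = new XMLWriter(sw, xmlFormat);
try {
    output.write(document);
    output.close();
} catch (IOException e) {
    // The exception is silently ignored here.
}

// toString() copies the buffer into a new String.
return sw.toString();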

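This throws an OutOfMemoryError, because StringWriter uses a StringBuffer internally, which in turn is backed by an array. Since arrays are indexed by int, the length of the XML goes beyond the int range.
Is there a way to convert this large XML to a string and send it back to the Pig script? Or is there some other way to achieve this?
FYI: we are using dom4j (org.dom4j.Document) to process the XML.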
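The memory reasoning above can be checked with some rough arithmetic. The sketch below is only an estimate under stated assumptions: about 1.5 billion characters, 2 bytes per character in the buffer (as on JVMs without compact strings), and one extra copy made by toString(). The class and method names are made up for illustration. Under these assumptions the character count still fits within the int index range, so the heap size seems likely to be exhausted before the index limit is reached.

```java
final class XmlMemoryEstimate {
    /** Upper bound on Java array length; real VM limits are slightly lower. */
    static final long MAX_ARRAY_LENGTH = Integer.MAX_VALUE;

    /** True if `count` elements could be indexed by a single Java array. */
    public static boolean fitsInOneArray(long count) {
        return count <= MAX_ARRAY_LENGTH;
    }

    /**
     * Approximate peak heap in bytes for StringWriter followed by toString():
     * the internal buffer plus the String copy that toString() creates.
     * Temporary arrays created while the buffer grows are not counted.
     */
    public static long stringWriterPeakBytes(long charCount, int bytesPerChar) {
        long buffer = charCount * bytesPerChar; // internal StringBuffer storage
        long copy = charCount * bytesPerChar;   // String returned by toString()
        return buffer + copy;
    }

    public static void main(String[] args) {
        long chars = 1_500_000_000L; // assumption: ~1.5 GB of mostly single-byte XML
        System.out.println("fits in one array: " + fitsInOneArray(chars));
        System.out.println("approx. peak heap in GB: "
                + stringWriterPeakBytes(chars, 2) / 1e9);
    }
}
```

The in-memory dom4j document is not included in this estimate, so the real footprint would be higher still.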
Update 1: I tried the code below. I can now store 800 MB, but the 1.5 GB file still fails:

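// Serialize into an in-memory byte buffer instead of a character buffer.
ByteArrayOutputStream result = new ByteArrayOutputStream();
try {
    XMLWriter output = new XMLWriter(result, xmlFormat);
    output.write(document);
    output.close();
    // Decoding the bytes creates a second, full-size copy as a String.
    return result.toString("UTF-8");
} catch (IOException e) {
    // The exception is silently ignored here.
}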
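Both attempts above hold the complete serialized XML in one in-memory buffer and then copy it into a String. One possible alternative, shown here only as a sketch, is to stream the output to a file and hand back its path. The names XmlToFile, XmlSource, and writeToTempFile are hypothetical, and whether the Pig script can consume a file path instead of a string is an assumption that depends on the pipeline. Since the question's first snippet already passes a Writer to XMLWriter, the same constructor could presumably be used inside the callback.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class XmlToFile {
    /** Hypothetical callback: anything that can serialize itself to a Writer. */
    interface XmlSource {
        void writeTo(Writer out) throws IOException;
    }

    /**
     * Streams the XML to a temporary file and returns its path, so the full
     * text is never held in a single String or array.
     * With dom4j, the callback body might look like this (assumption, not run here):
     *   XMLWriter w = new XMLWriter(out, xmlFormat); w.write(document); w.flush();
     */
    public static Path writeToTempFile(XmlSource source) throws IOException {
        Path file = Files.createTempFile("udf-output-", ".xml");
        // try-with-resources flushes and closes the writer even on failure.
        try (Writer out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            source.writeTo(out);
        }
        return file;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the real document: a tiny hard-coded XML string.
        Path p = writeToTempFile(out -> out.write("<root><item>1</item></root>"));
        System.out.println(p + " (" + Files.size(p) + " bytes)");
    }
}
```

This only removes the extra serialized copies. The dom4j document itself still lives in memory, so the heap must be large enough for the DOM tree in any case.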

Java xml apache-pig dom4j

Source: https://stackoverflow.com/questions/42351841/how-to-convert-large-xml-to-string-in-java