Decompressing column data with a Hive UDF

piv4azn7  posted on 2021-06-26  in Hive

Context: decompress column data using the evaluate() method of a Hive UDF.
Exception:
Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public static org.apache.hadoop.io.Text test.UDFDecompressor.evaluate(java.lang.String) throws org.apache.hadoop.hive.ql.metadata.HiveException on object test.UDFDecompressor@1008df1e of class test.UDFDecompressor with arguments {x�� ... (unprintable compressed bytes) ... } of size 1
Source code:

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.InflaterInputStream;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaStringObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;

public class Decompress extends UDF {
    public static String evaluate(String data1) throws IOException, DataFormatException {
        ByteArrayInputStream bao = new ByteArrayInputStream(data1.getBytes());
        InflaterInputStream iis = new InflaterInputStream(bao);
        String out = "";
        byte[] bt = new byte[1024];
        int len = -1;
        while ((len = iis.read(bt)) != -1) {
            out += new String(Arrays.copyOf(bt, len));
        }
        JavaStringObjectInspector stringInspector;
        stringInspector = PrimitiveObjectInspectorFactory.javaStringObjectInspector;
        String ip = stringInspector.getPrimitiveJavaObject(out);
        //return new String(ip.getBytes(Charset.forName("UTF-8")));
        return ip;
    }
}

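I have tried several approaches to decompress the data with the gzip and zlib Java APIs, but I always hit the same error. Can anyone help me resolve this and suggest the correct way to decompress column data with a Hive UDF?
Thanks in advance.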

frebpwbc  #1

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.InflaterInputStream;

public class Decompress extends UDF {
    // Reused across rows so a new Text is not allocated on every call.
    private final Text r = new Text();

    // Accept the compressed column as binary (BytesWritable) rather than String.
    public Text evaluate(BytesWritable bw) throws IOException {
        if (bw == null) {
            return null;
        }
        // getBytes() returns the backing array, which may be longer than the
        // valid data, so limit the stream to the first getLength() bytes.
        ByteArrayInputStream zipped = new ByteArrayInputStream(bw.getBytes(), 0, bw.getLength());
        ByteArrayOutputStream unzipped = new ByteArrayOutputStream();
        try (InflaterInputStream inflater = new InflaterInputStream(zipped)) {
            byte[] bt = new byte[1024];
            int len;
            while ((len = inflater.read(bt)) != -1) {
                unzipped.write(bt, 0, len);
            }
        }
        r.clear();
        r.set(unzipped.toByteArray());
        return r;
    }
}
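
The key difference from the code in the question is the argument type: the compressed bytes arrive as binary instead of being passed through a String. The sketch below tries to illustrate why that matters, using only the JDK so it can run without Hive on the classpath. The class name InflateSketch and its helper methods are hypothetical names introduced for this illustration, and it assumes the column holds zlib-format data (which would be consistent with the argument in the error message starting with "x" followed by replacement characters).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.Deflater;
import java.util.zip.InflaterInputStream;

// Hypothetical, Hive-free illustration; not part of the UDF itself.
public class InflateSketch {

    // Compress bytes in zlib format, the default format InflaterInputStream reads.
    public static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Decompress only the valid region [offset, offset + length) of the array,
    // mirroring the use of getBytes() together with getLength() on BytesWritable.
    public static byte[] inflate(byte[] data, int offset, int length) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InflaterInputStream in =
                 new InflaterInputStream(new ByteArrayInputStream(data, offset, length))) {
            byte[] buf = new byte[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Simulate what happens when binary data is forced through a String:
    // byte sequences that are not valid UTF-8 are replaced and cannot be recovered.
    public static byte[] throughString(byte[] data) {
        return new String(data, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] zipped = deflate("hello hive".getBytes(StandardCharsets.UTF_8));

        // Binary path: the data survives and inflates cleanly.
        byte[] restored = inflate(zipped, 0, zipped.length);
        System.out.println(new String(restored, StandardCharsets.UTF_8)); // prints "hello hive"

        // String path: the bytes no longer match the compressed input.
        System.out.println(Arrays.equals(zipped, throughString(zipped))); // prints "false"
    }
}
```

If the column is declared as a string type in the table, the bytes may already be damaged before the UDF sees them, so it is probably worth checking that the column is stored as binary.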