Is there a Scalding source I can use for LZO-compressed binary data?

Posted by o2g1uqev on 2021-06-04 in Hadoop


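I am writing serialized Thrift records to a file using Elephant Bird's splittable LZO compression. To do this, I use their ThriftBlockWriter class. My Scalding job then uses the FixedPathLzoThrift source to process the records. This all works fine. The problem is that I am limited to records of a single Thrift class.
I want to start using RawBlockWriter instead of ThriftBlockWriter[MyThriftClass]. My input would then be LZO-compressed raw byte arrays rather than LZO-compressed Thrift records. My question is: what should I use in place of FixedPathLzoThrift[MyThriftClass]?
A note on the "protocol-buffers" tag: Elephant Bird uses the Protocol Buffers SerializedBlock class to wrap the raw input.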

hadoop protocol-buffers thrift scalding lzo

Source: https://stackoverflow.com/questions/27967953/is-there-a-scalding-source-i-can-use-for-lzo-compressed-binary-data

1 Answer


qhhrdooz1#

I solved this by creating a FixedPathLzoRaw class to use in place of FixedPathLzoThrift:

case class FixedPathLzoRaw(path: String*) extends FixedPathSource(path: _*) with LzoRaw

// Corresponds to LzoThrift trait
trait LzoRaw extends LocalTapSource with SingleMappable[Array[Byte]] with TypedSink[Array[Byte]] {
  override def setter[U <: Array[Byte]] = TupleSetter.asSubSetter[Array[Byte], U](TupleSetter.singleSetter[Array[Byte]])
  override def hdfsScheme = HadoopSchemeInstance((new LzoByteArrayScheme()).asInstanceOf[Scheme[_, _, _, _, _]])
}
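
For illustration, here is a minimal sketch of how such a source might be read in a job. It assumes the FixedPathLzoRaw definition above is in scope and that a Scalding version with the typed API (TypedPipe, TypedTsv) is on the classpath. The job name RawLzoSizesJob and the --input and --output arguments are hypothetical. Each record arrives as an Array[Byte], so deserializing it into the right Thrift class is left to the job.

```scala
import com.twitter.scalding._

// Hypothetical job: reads LZO-compressed raw byte-array records and
// writes the size in bytes of each record as a TSV column.
class RawLzoSizesJob(args: Args) extends Job(args) {
  TypedPipe.from(FixedPathLzoRaw(args("input"))) // TypedPipe[Array[Byte]]
    .map { bytes => bytes.length }               // replace with your own deserialization
    .write(TypedTsv[Int](args("output")))
}
```

In a real job, the map step would typically decode each byte array with whatever Thrift deserializer matches the record type that was written.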

Answered on 2021-06-04
