Apache Gora reducer for multi-table output with HBase

Posted by hts6caw3 on 2021-06-09 in HBase


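I have a small amount of data in an HBase table that was crawled with Nutch. Nutch uses Apache Gora as its ORM. I have found many MapReduce examples that process data in a single HBase table, but my problem is that I have to copy the data into multiple tables (in the reducer). Without Gora there is some guidance, for example this question, but how do I do this in my case?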

hbase mapreduce nutch gora

Source: https://stackoverflow.com/questions/58389352/apache-gora-reducer-for-multi-table-output-with-hbase

1 Answer

8wigbo561#

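I have never done what you are asking, but you may find the answer in the "Constructing the job" section of the Gora tutorial. It includes an example of reducer configuration, which reads: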

/* Mappers are initialized with GoraMapper.initMapper() or 
 * GoraInputFormat.setInput()*/
GoraMapper.initMapperJob(job, inStore, TextLong.class, LongWritable.class
    , LogAnalyticsMapper.class, true);

/* Reducers are initialized with GoraReducer#initReducer().
 * If the output is not to be persisted via Gora, any reducer 
 * can be used instead. */
GoraReducer.initReducerJob(job, outStore, LogAnalyticsReducer.class);

Then, instead of calling GoraReducer.initReducerJob(), you can simply configure your own reducer, as shown in the following link (if that answer is correct):

GoraMapper.initMapperJob(job, inStore, TextLong.class, LongWritable.class
    , LogAnalyticsMapper.class, true);
job.setOutputFormatClass(MultiTableOutputFormat.class);
job.setReducerClass(MyReducer.class);
job.setNumReduceTasks(2);
TableMapReduceUtil.addDependencyJars(job);
TableMapReduceUtil.addDependencyJars(job.getConfiguration());

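Note that in the previous example the mapper emits (TextLong, LongWritable) key-value pairs, so your reducer would look something like this, based on the link you posted and its answer: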

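import java.io.IOException;

import org.apache.gora.tutorial.log.TextLong; // key class from the Gora tutorial; adjust to your own
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.log4j.Logger;

// With MultiTableOutputFormat, the output key names the destination table.
public class MyReducer extends TableReducer<TextLong, LongWritable, ImmutableBytesWritable> {

    private static final Logger logger = Logger.getLogger( MyReducer.class );

    @Override
    protected void reduce( TextLong key, Iterable<LongWritable> data, Context context ) throws IOException, InterruptedException {
        logger.info( "Working on ---> " + key.getKey() );
        // Sum the values for this key, as the tutorial's reducer does.
        long sum = 0L;
        for ( LongWritable value : data ) {
            sum += value.get();
        }

        // Row key, column family "cf" and qualifier "count" are placeholders; adapt them to your schema.
        Put put = new Put( Bytes.toBytes( key.getKey().toString() ) );
        put.addColumn( Bytes.toBytes( "cf" ), Bytes.toBytes( "count" ), Bytes.toBytes( sum ) ); // HBase 1.0+

        // Repeat this write with another table name for each additional target table.
        ImmutableBytesWritable table = new ImmutableBytesWritable( Bytes.toBytes( "tableName" ) );
        context.write( table, put );
    }
}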
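
To make the routing idea concrete without a cluster, here is a minimal plain-Java sketch of the contract the reducer above relies on: each output record carries the name of its destination table, and the output side dispatches on that name. Nothing here is HBase or Gora API; the class MultiTableRoutingSketch, the Write type, and the table names "tableA" and "tableB" are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of table-name-keyed output; no HBase or Gora classes are used. */
public class MultiTableRoutingSketch {

    /** One reducer output record: destination table, row key, and cell value. */
    public static final class Write {
        public final String table;
        public final String row;
        public final long value;

        public Write(String table, String row, long value) {
            this.table = table;
            this.row = row;
            this.value = value;
        }
    }

    /** Sums the values for one key and emits the same result once per target table. */
    public static List<Write> reduce(String row, List<Long> values, List<String> targetTables) {
        long sum = 0L;
        for (long v : values) {
            sum += v;
        }
        List<Write> out = new ArrayList<>();
        for (String table : targetTables) {
            out.add(new Write(table, row, sum));
        }
        return out;
    }

    /** Stands in for the output format: groups writes by their destination table. */
    public static Map<String, Map<String, Long>> route(List<Write> writes) {
        Map<String, Map<String, Long>> tables = new LinkedHashMap<>();
        for (Write w : writes) {
            tables.computeIfAbsent(w.table, t -> new LinkedHashMap<>()).put(w.row, w.value);
        }
        return tables;
    }

    public static void main(String[] args) {
        List<Write> writes = reduce("example.com", Arrays.asList(3L, 4L), Arrays.asList("tableA", "tableB"));
        System.out.println(route(writes));
    }
}
```

In the real job, the Write.table field corresponds to the ImmutableBytesWritable key passed to context.write(), so copying data to several tables amounts to calling context.write() once per table name with the same Put.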

Again, I have never done this... so it may not work :\

2021-06-09
