I am using the code below to write to HBase:
jsonDStream.foreachRDD(new Function<JavaRDD<String>, Void>() {
    @Override
    public Void call(JavaRDD<String> rdd) throws Exception {
        // Parse each micro-batch of JSON strings and keep only the needed columns.
        DataFrame jsonFrame = sqlContext.jsonRDD(rdd);
        DataFrame selecteFieldFrame = jsonFrame.select("id_str", "created_at", "text");

        // HBase/ZooKeeper connection settings, rebuilt for every batch.
        Configuration config = HBaseConfiguration.create();
        config.set("hbase.zookeeper.quorum", "d-9543");
        config.set("zookeeper.znode.parent", "/hbase-unsecure");
        config.set("hbase.zookeeper.property.clientPort", "2181");

        // Old-API job setup: JobConf.setOutputFormat takes the
        // org.apache.hadoop.hbase.mapred.TableOutputFormat class.
        final JobConf jobConfig = new JobConf(config, SveAsHadoopDataSetExample.class);
        jobConfig.setOutputFormat(TableOutputFormat.class);
        jobConfig.set(TableOutputFormat.OUTPUT_TABLE, "tableName");

        // Convert each Row to an HBase Put and write it through the old Hadoop API.
        selecteFieldFrame.javaRDD().mapToPair(new PairFunction<Row, ImmutableBytesWritable, Put>() {
            @Override
            public Tuple2<ImmutableBytesWritable, Put> call(Row row) throws Exception {
                return convertToPut(row);
            }
        }).saveAsHadoopDataset(jobConfig);
        return null;
    }
});
But when I look at the zkdump output in ZooKeeper, the number of connections keeps increasing.
Any suggestions would be a great help!
1 Answer
I had the same problem. It is an HBase bug, and I fixed it as follows:
Change org.apache.hadoop.hbase.mapred.TableOutputFormat to org.apache.hadoop.hbase.mapreduce.TableOutputFormat, and use org.apache.hadoop.mapreduce.Job instead of org.apache.hadoop.mapred.JobConf.
Here is an example:
This may help you!
https://github.com/hortonworks-spark/shc/pull/20/commits/2074067c42c5a454fa4cdeec18c462b5367f23b9
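As a rough sketch of the change described above (not the answerer's own code), the write path might be set up with the new-API classes roughly as follows. The class name NewApiHBaseWriter and its two helper methods are hypothetical, the ZooKeeper settings are copied from the question, and the sketch assumes Spark's Java API plus the HBase and Hadoop client libraries are on the classpath.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
// New-API output format (mapreduce package, not mapred).
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.spark.api.java.JavaPairRDD;

// Hypothetical helper class, introduced only for illustration.
public class NewApiHBaseWriter {

    // Builds a Hadoop Configuration that routes output to the given HBase table
    // using org.apache.hadoop.mapreduce.Job instead of org.apache.hadoop.mapred.JobConf.
    public static Configuration buildOutputConfig(String tableName) throws IOException {
        Configuration config = HBaseConfiguration.create();
        // Connection settings taken from the question; adjust for your cluster.
        config.set("hbase.zookeeper.quorum", "d-9543");
        config.set("zookeeper.znode.parent", "/hbase-unsecure");
        config.set("hbase.zookeeper.property.clientPort", "2181");

        // Job.getInstance copies the configuration, so later settings must go
        // through job.getConfiguration(), not the original config object.
        Job job = Job.getInstance(config);
        job.setOutputFormatClass(TableOutputFormat.class);
        job.getConfiguration().set(TableOutputFormat.OUTPUT_TABLE, tableName);
        return job.getConfiguration();
    }

    // Writes the Puts with the new Hadoop API; this replaces the call to
    // saveAsHadoopDataset(jobConfig) in the question.
    public static void save(JavaPairRDD<ImmutableBytesWritable, Put> puts,
                            String tableName) throws IOException {
        puts.saveAsNewAPIHadoopDataset(buildOutputConfig(tableName));
    }
}
```

Inside foreachRDD, the last step of the question's code would then pass the result of mapToPair(...) to NewApiHBaseWriter.save(..., "tableName") rather than calling saveAsHadoopDataset(jobConfig).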