How do I read from Hive using Apache Beam? / How can Hive be used as a source in Apache Beam?
szqfcxe21#
HadoopInputFormatIO can be used to read from Hive, as follows:
Configuration conf = new Configuration();
// Tell HadoopInputFormatIO which InputFormat and key/value types to use.
conf.setClass("mapreduce.job.inputformat.class", HCatInputFormat.class, InputFormat.class);
conf.setClass("key.class", LongWritable.class, WritableComparable.class);
conf.setClass("value.class", DefaultHCatRecord.class, Writable.class);
conf.set("hive.metastore.uris", "...");
// Register the database, table and partition filter on the same Configuration.
HCatInputFormat.setInput(conf, "myDatabase", "myTable", "myFilter");
// The type arguments of read() must match the key/value classes set above.
PCollection<KV<LongWritable, DefaultHCatRecord>> data =
    p.apply(HadoopInputFormatIO.<LongWritable, DefaultHCatRecord>read().withConfiguration(conf));
bd1hkmkf2#
A pull request merged in July 2017 adds Hive support through HCatalog in Beam 2.1.0: https://issues.apache.org/jira/browse/beam-2357
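For illustration, here is a minimal sketch of what a read through that HCatalog connector (HCatalogIO, from the beam-sdks-java-io-hcatalog module) might look like. The metastore host and port, the database and table names, and the helper metastoreConfig are placeholders assumed for this example and are not taken from the answer above; check the calls against the Beam version you actually use.

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.hcatalog.HCatalogIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.values.PCollection;
import org.apache.hive.hcatalog.data.HCatRecord;

public class HiveReadSketch {

  // Hypothetical helper: collects the properties handed to the Hive metastore client.
  static Map<String, String> metastoreConfig(String metastoreUri) {
    Map<String, String> props = new HashMap<>();
    props.put("hive.metastore.uris", metastoreUri);
    return props;
  }

  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Read every row of myDatabase.myTable as an HCatRecord.
    // "thrift://metastore-host:9083" is a placeholder URI, not a real endpoint.
    PCollection<HCatRecord> records =
        p.apply(
            HCatalogIO.read()
                .withConfigProperties(metastoreConfig("thrift://metastore-host:9083"))
                .withDatabase("myDatabase")
                .withTable("myTable"));

    p.run().waitUntilFinish();
  }
}
```

If the table is partitioned, a partition filter can be added with withFilter(...) on the same builder. Compared with the HadoopInputFormatIO approach, this yields HCatRecord elements directly rather than key/value pairs.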