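The problem is that LATERAL VIEW explode does not work as expected from a HiveContext in the Spark shell. Below are a sample table and sample Spark code. The expected output from the Spark DataFrame vasOtherDF is 6, but it gives 8.
Hive table: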
CREATE EXTERNAL TABLE `aa`(
`col1` string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
'hdfs://nn1001.dev:8020/tmp/aa'
Sample data:
aaa|qq|ww|dd
aaa
aaa|bbb
ccc
Hive output:
select count(distinct vother) as vothers from aa LATERAL VIEW explode(split(col1,'\\|')) a as vother;
6
select distinct vother as vothers from rafm.aa LATERAL VIEW explode(split(col1,'\\|')) a as vother;
aaa
bbb
ccc
dd
qq
ww
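As a sanity check, the expected count of 6 can be reproduced in plain Scala without Hive or Spark. This is only a sketch: `rows` is a hand-copied version of the sample data above (an assumption that each sample line is one row of col1), and `java.lang.String.split` stands in for Hive's split.

```scala
// Hand-copied sample rows; assumes one row of col1 per line of the sample data.
val rows = Seq("aaa|qq|ww|dd", "aaa", "aaa|bbb", "ccc")

// "\\|" in Scala source is the two-character regex \| , which matches a literal pipe.
val tokens = rows.flatMap(_.split("\\|")).distinct.sorted

println(tokens.mkString(", ")) // aaa, bbb, ccc, dd, qq, ww
println(tokens.size)           // 6
```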
Spark output:
val vasOtherDF = hiveContext.sql("select count(distinct vother) as vothers from aa LATERAL VIEW explode(split(col1,'\\|')) a as vother")
output: 8
select distinct vother as vothers from rafm.aa LATERAL VIEW explode(split(col1,'\\|')) a as vother;
aaa
bbb
ccc
dd
qq
ww
val vasOtherDF = hiveContext.sql("select distinct vother as vothers from aa LATERAL VIEW explode(split(col1,'\\|')) a as vother")
scala> vasOtherDF.show
+-------+
|vothers|
+-------+
|      a|
|      b|
|      c|
|      d|
|      q|
|      w|
|      ||
|       |
+-------+
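The eight values shown by Spark (six single letters, a "|", and an empty string) look like what a split on the bare regex | would produce. One possible explanation, offered as an assumption rather than a confirmed diagnosis, is that the Scala string literal consumes one level of backslash escaping and the SQL parser consumes another, so the pattern that reaches split is | instead of \|. The sketch below mimics this in plain Scala with `java.lang.String.split` only. The name `sampleRows` is hypothetical, and the limit of -1 is an assumption about how the engine keeps trailing empty strings.

```scala
// Hand-copied sample rows; assumes one row of col1 per line of the sample data.
val sampleRows = Seq("aaa|qq|ww|dd", "aaa", "aaa|bbb", "ccc")

// A bare "|" is an empty alternation: it matches the empty string at every
// position, so each row is split into single characters plus a trailing "".
val unescaped = sampleRows.flatMap(_.split("|", -1)).distinct

// An escaped pipe splits only on the literal '|' character.
val escaped = sampleRows.flatMap(_.split("\\|", -1)).distinct

println(unescaped.size) // 8
println(escaped.size)   // 6
```

If that assumption holds, writing the pattern as '\\\\|' inside the regular Scala string, using a triple-quoted Scala string around the query, or using the character class '[|]' might pass the intended regex through. None of these has been verified against the Spark version used in the question.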