hive用子查询填充表

mm5n2pyu  于 2021-06-02  发布在  Hadoop
关注(0)|答案(1)|浏览(416)

我正在开发一个hadoop数据库,使用hive作为首选接口。我希望能够将几个select语句组合到一个查询中(有点像union,但每个查询填充一个不同的列)。下面的查询将在一个列中返回我需要的所有结果,但我希望能够使用每个查询填充一个单独的列。任何关于如何做到这一点的帮助都是非常棒的——某种等同于价值观的Hive可能会做到这一点。干杯。

  1. INSERT OVERWRITE TABLE tstr_tmp SELECT * FROM
  2. (SELECT time_stamp FROM http WHERE ext_hostname = 'exotichorse' AND dt = '01/07/2015' AND ext_url = 'http://lucy.info' ORDER BY time_stamp asc limit 1) as last_visit_of_day
  3. UNION ALL
  4. SELECT * FROM (SELECT CAST(COUNT(hr) as string) FROM http WHERE ext_hostname = 'exotichorse' AND dt = '01/07/2015' AND ext_url = 'http://lucy.info' group by ext_url) as n_hour_bins
  5. UNION ALL
  6. SELECT * FROM (SELECT time_stamp FROM http WHERE ext_hostname = 'exotichorse' AND dt = '01/07/2015' AND ext_url = 'http://lucy.info' ORDER BY time_stamp desc limit 1) as first_visit_of_day
  7. UNION ALL
  8. SELECT * FROM (SELECT ext_url FROM http WHERE ext_url = 'http://lucy.info' group by ext_url) as domain_name
  9. UNION ALL
  10. SELECT * FROM (SELECT CAST(count(*) as string) FROM http WHERE ext_hostname = 'exotichorse' AND dt = '01/07/2015' AND ext_url = 'http://lucy.info' group by ext_url) as n_http_requests
  11. UNION ALL
  12. SELECT * FROM (SELECT int_ip FROM http WHERE ext_hostname = 'exotichorse' group by int_ip) as internal_ip;

按照下面的请求,每个查询将返回一个字符串形式的值。对于这个特定的查询集,将返回以下结果:;

  1. 00:08:00
  2. 2
  3. 07:00:00
  4. http://lucy.info
  5. 2
  6. 192.168.0.22

我正在开发一个数据库,它将告诉我有关用户流量的信息,所以这个子集将填充下表;

  1. CREATE TABLE metric_http_domain_time_summary( last_visit_of_day string, n_hour_bins string, first_visit_of_day string, domain_name string, n_http_requests string, internal_ip string) PARTITIONED BY (dt string, hr string, origin string, cl string, st string);

我知道我需要对输入的数据进行分区,但我对此相当有信心,一旦我成功地运行了未分区的查询,就会对其进行编辑。我能力上的差距是将子查询串在一起以填充表。

owfi6suc

owfi6suc1#

在离开并思考了很长一段时间之后,我找到了答案。工会是没有必要的,实际上是在妨碍。此查询将根据需要返回上述输出。如果其他人也有同样的问题,我会把这个留着。由于堆栈溢出信誉的限制,我不得不去掉ext\uurl,但是这个概念可以用。

  1. SELECT * FROM
  2. (SELECT time_stamp FROM http WHERE referrer_hostname = 'exotichorse' AND dt = '01/07/2015' AND ext_url = 'ext_url_here' ORDER BY time_stamp asc limit 1) as last_visit_of_day,
  3. (SELECT CAST(COUNT(hr) as string) FROM http WHERE referrer_hostname = 'exotichorse' AND dt = '01/07/2015' AND ext_url = 'ext_url_here' group by ext_url) as n_hour_bins,
  4. (SELECT time_stamp FROM http WHERE referrer_hostname = 'exotichorse' AND dt = '01/07/2015' AND ext_url = 'ext_url_here' ORDER BY time_stamp desc limit 1) as first_visit_of_day,
  5. (SELECT ext_url FROM http WHERE ext_url = 'http://lucy.info' group by ext_url) as domain_name,
  6. (SELECT CAST(count(*) as string) FROM http WHERE referrer_hostname = 'exotichorse' AND dt = '01/07/2015' AND ext_url = 'ext_url_here' group by ext_url) as n_http_requests,
  7. (SELECT int_ip FROM http WHERE referrer_hostname = 'exotichorse' group by int_ip) as internal_ip;

相关问题