如何计算hive中时间戳('yyyy-mm-dd hh:mm:ss')的平均值(mean)?

pobjuy32  于 2021-05-31  发布在  Hadoop
关注(0)|答案(2)|浏览(755)

我有一张table。看起来像this:-

user_name               idle_hours            working_hours       start_time       stop_time
sahil24c@gmail.com     2019-10-24 05:05:00  2019-10-24 05:50:00  2019-10-24 08:30:02  2019-10-24 19:25:02
magadum@gmail.com      2019-10-24 02:15:00  2019-10-24 08:39:59  2019-10-24 08:30:02  2019-10-24 19:25:01
yathink3@gmail.com     2019-10-24 01:30:00  2019-10-24 09:24:59  2019-10-24 08:30:02  2019-10-24 19:25:01
shelkeva@gmail.com     2019-10-24 00:30:00  2019-10-24 09:10:01  2019-10-24 08:45:01  2019-10-24 18:25:02
puruissim@gmail.com    2019-10-24 03:15:00  2019-10-24 07:19:59  2019-10-24 08:50:02  2019-10-24 19:25:01
sangita.awa@gmail.com  2019-10-24 01:55:00  2019-10-24 08:40:00  2019-10-24 08:50:01  2019-10-24 19:25:01
vaishusawan@gmail.com  2019-10-24 00:35:00  2019-10-24 09:55:00  2019-10-24 08:55:01  2019-10-24 19:25:01
you@example.com        2019-10-24 02:35:00  2019-10-24 08:04:59  2019-10-24 08:45:02  2019-10-24 19:25:01
samadhanma@gmail.com   2019-10-24 01:10:00  2019-10-24 08:39:59  2019-10-24 09:00:02  2019-10-24 18:50:01

我想找出平均工作时间。

select * from workinglogs where unix_timestamp(working_hours) < AVG(unix_timestamp(working_hours));

当我运行这个查询时,它不起作用。
错误showing:- failed:semanticexception[error 10128]:行1:64尚不支持udaf“avg”的位置

brjng4g3

brjng4g31#

因为您使用的是udaf,所以必须使用group by。选择每列(不要使用*),然后按所选列分组。

Select col1, col2, col3,......coln from workinglogs group by col1, col2, col3,......coln having unix_timestamp(working_hours) < AVG(unix_timestamp(working_hours));
jslywgbw

jslywgbw2#

你可以遵循这个方法
子查询用于计算平均值,查询用于过滤输出
以你的数据为例

+------------------------+-------------------------+----------------------------+-------------------------+------------------------+--+
| workinglogs.user_name  | workinglogs.idle_hours  | workinglogs.working_hours  | workinglogs.start_time  | workinglogs.stop_time  |
+------------------------+-------------------------+----------------------------+-------------------------+------------------------+--+
| magadum@gmail.com      | 2019-10-24 02:15:00.0   | 2019-10-24 08:39:59.0      | 2019-10-24 08:30:02.0   | 2019-10-24 19:25:01.0  |
| yathink3@gmail.com     | 2019-10-24 01:30:00.0   | 2019-10-24 09:24:59.0      | 2019-10-24 08:30:02.0   | 2019-10-24 19:25:01.0  |
| shelkeva@gmail.com     | 2019-10-24 00:30:00.0   | 2019-10-24 09:10:01.0      | 2019-10-24 08:45:01.0   | 2019-10-24 18:25:02.0  |
| puruissim@gmail.com    | 2019-10-24 03:15:00.0   | 2019-10-24 07:19:59.0      | 2019-10-24 08:50:02.0   | 2019-10-24 19:25:01.0  |
| sangita.awa@gmail.com  | 2019-10-24 01:55:00.0   | 2019-10-24 08:40:00.0      | 2019-10-24 08:50:01.0   | 2019-10-24 19:25:01.0  |
| vaishusawan@gmail.com  | 2019-10-24 00:35:00.0   | 2019-10-24 09:55:00.0      | 2019-10-24 08:55:01.0   | 2019-10-24 19:25:01.0  |
| you@example.com        | 2019-10-24 02:35:00.0   | 2019-10-24 08:04:59.0      | 2019-10-24 08:45:02.0   | 2019-10-24 19:25:01.0  |
| samadhanma@gmail.com   | 2019-10-24 01:10:00.0   | 2019-10-24 08:39:59.0      | 2019-10-24 09:00:02.0   | 2019-10-24 18:50:01.0  |
+------------------------+-------------------------+----------------------------+-------------------------+------------------------+--+

带子查询的查询

WITH t AS(
SELECT ROUND(AVG(unix_timestamp(working_hours)),2) as average
FROM workinglogs)
SELECT w.user_name,w.idle_hours,w.working_hours,w.start_time,w.stop_time 
FROM workinglogs AS w,t
WHERE unix_timestamp(w.working_hours) < t.average;

输出

+------------------------+------------------------+------------------------+------------------------+------------------------+--+
|      w.user_name       |      w.idle_hours      |    w.working_hours     |      w.start_time      |      w.stop_time       |
+------------------------+------------------------+------------------------+------------------------+------------------------+--+
| magadum@gmail.com      | 2019-10-24 02:15:00.0  | 2019-10-24 08:39:59.0  | 2019-10-24 08:30:02.0  | 2019-10-24 19:25:01.0  |
| puruissim@gmail.com    | 2019-10-24 03:15:00.0  | 2019-10-24 07:19:59.0  | 2019-10-24 08:50:02.0  | 2019-10-24 19:25:01.0  |
| sangita.awa@gmail.com  | 2019-10-24 01:55:00.0  | 2019-10-24 08:40:00.0  | 2019-10-24 08:50:01.0  | 2019-10-24 19:25:01.0  |
| you@example.com        | 2019-10-24 02:35:00.0  | 2019-10-24 08:04:59.0  | 2019-10-24 08:45:02.0  | 2019-10-24 19:25:01.0  |
| samadhanma@gmail.com   | 2019-10-24 01:10:00.0  | 2019-10-24 08:39:59.0  | 2019-10-24 09:00:02.0  | 2019-10-24 18:50:01.0  |
+------------------------+------------------------+------------------------+------------------------+------------------------+--+

相关问题