Group by and order by in Spark SQL

Posted by o7jaxewo on 2021-05-27 in Spark


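I am trying to access S3 data from a Spark application, using Spark SQL to retrieve the data. The query is rejected as soon as I add a GROUP BY clause.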

DataFrame summaryQuery = sql.sql("Select score from summary order by updationDate desc");
summaryQuery.groupBy("sessionId").count().show();
summaryQuery.show();

I also tried putting the GROUP BY directly into the query:

DataFrame summaryQuery = sql.sql("Select score from summary group by sessionId order by updationDate desc");
summaryQuery.show();

In both cases I get a SQL exception:

Exception in thread "main" org.apache.spark.sql.AnalysisException: expression 'score' is neither present in the group by, nor is it an aggregate function. Add to group by or wrap in first() (or first_value) if you don't care which value you get.;

Please advise how I should query the data.

apache-spark apache-spark-sql

Source: https://stackoverflow.com/questions/40154617/group-by-and-order-by-in-spark-sql

1 Answer


4uqofj5v1#

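In Spark SQL, a selected column that does not appear in the GROUP BY clause has to be wrapped in first(column_name), last(column_name), or some other aggregate function. first and last return the first or last value, respectively, from the rows fetched for each group.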

DataFrame summaryQuery = sql.sql("Select first(score) from summary group by sessionId order by updationDate desc");
summaryQuery.show();
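To make the "one value per group" idea concrete, here is a minimal plain-Java sketch that does not need a Spark cluster. The class name, the Row type, and the sample data are all made up for illustration; only the column names sessionId, score, and updationDate come from the question. It sorts rows by updationDate descending and keeps the first score seen for each sessionId, which is presumably what the question is after. The QUERY constant is an assumed query shape that has not been run against the asker's data: it applies the same rule to the ORDER BY column, since once rows are grouped by sessionId, updationDate also has to be aggregated (here with max) before it can be used for sorting.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FirstPerGroupSketch {

    // Assumed query shape (not run against a cluster): every selected or ordered
    // column is either listed in GROUP BY or wrapped in an aggregate function.
    static final String QUERY =
        "SELECT sessionId, first(score) AS score, max(updationDate) AS lastUpdated "
      + "FROM summary GROUP BY sessionId ORDER BY lastUpdated DESC";

    // Hypothetical row type mirroring the three columns used in the question.
    static final class Row {
        final String sessionId;
        final int score;
        final String updationDate; // ISO-8601 dates, so string order equals date order

        Row(String sessionId, int score, String updationDate) {
            this.sessionId = sessionId;
            this.score = score;
            this.updationDate = updationDate;
        }
    }

    // Made-up sample data: two sessions, one of them with two scores.
    static List<Row> sampleRows() {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("s1", 10, "2016-10-01"));
        rows.add(new Row("s1", 30, "2016-10-03"));
        rows.add(new Row("s2", 20, "2016-10-02"));
        return rows;
    }

    // Sorts by updationDate descending, then keeps the first score seen per sessionId.
    // This is an in-memory illustration of picking one value per group, not Spark code.
    static Map<String, Integer> latestScorePerSession(List<Row> rows) {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparing((Row r) -> r.updationDate).reversed());

        Map<String, Integer> result = new LinkedHashMap<>();
        for (Row r : sorted) {
            result.putIfAbsent(r.sessionId, r.score); // later (older) rows are ignored
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(latestScorePerSession(sampleRows())); // prints {s1=30, s2=20}
    }
}
```

Without an explicit ordering inside each group, which row counts as "first" is not something to rely on, which is why the sketch sorts before picking.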

Answered 2021-05-27
