查询未找到列,建议配置单元sql中的同一列

z9ju0rcb  于 2021-06-26  发布在  Hive
关注(0)|答案(1)|浏览(386)

我在sql中有以下查询:

  1. select midquery.account, midquery.name, midquery.label, midquery.labelfrequency
  2. from(
  3. -- Count the appearance of each label.
  4. select count(*) as labelfrequency, account, name, label
  5. from(
  6. select account, name, label from myTable
  7. ) innerquery
  8. group by account, name, label
  9. ) midquery
  10. -- Select most frequent values only.
  11. where rank() over
  12. (partition by midquery.account, midquery.name
  13. order by midquery.labelfrequency desc) = 1

这样做的目的是找到每个名称帐户集最常见的标签。运行此查询时,出现以下错误:

  1. Error while compiling statement: FAILED: SemanticException [Error 10002]: Line 12:74 Invalid column reference 'labelfrequency': (possible column names are: labelfrequency, account, name, label)

我不太明白为什么翻译没有找到列labelfrequency,但可以建议它。你对如何解决这个问题有什么建议吗?
编辑:如果我将rank()移动到select部分,就会得到结果。

  1. select midquery.account, midquery.name, midquery.label, midquery.labelfrequency,
  2. rank() over (partition by midquery.account, midquery.name
  3. order by midquery.labelfrequency desc)
  4. from(
  5. -- Count the appearance of each label.
  6. select count(*) as labelfrequency, account, name, label
  7. from(
  8. select account, name, label from myTable
  9. ) innerquery
  10. group by account, name, label
  11. ) midquery
vxbzzdmp

vxbzzdmp1#

窗口函数在 WHERE 条款。这有很好的理由,但您可以将其视为sql的另一条规则——类似于无法识别列别名。
(真正的原因是指定在存在多个过滤条件时窗口函数的操作方式。要想出一套连贯的规则(几乎?)是不可能的
话虽如此,您可以简化您的查询:

  1. select t.account, t.name, t.label, t.labelfrequency
  2. from (select count(*) as labelfrequency, account, name, label,
  3. rank() over (partition by account, name
  4. order by count(*) desc
  5. ) as seqnum
  6. from myTable t
  7. group by account, name, label
  8. ) t
  9. where seqnum = 1;

也就是说,可以组合窗口函数和聚合函数。而且您不需要子查询来只指定少数几个列。

相关问题