查询未找到列,建议配置单元sql中的同一列

z9ju0rcb  于 2021-06-26  发布在  Hive
关注(0)|答案(1)|浏览(387)

我在sql中有以下查询:

select midquery.account, midquery.name, midquery.label,  midquery.labelfrequency
from(

    -- Count the appearance of each label.

    select count(*) as labelfrequency, account, name, label
    from(

        select account, name, label from myTable 

    ) innerquery

    group by account, name, label
) midquery

-- Select most frequent values only.
where rank() over 
    (partition by midquery.account, midquery.name 
     order by midquery.labelfrequency desc) = 1

这样做的目的是找到每个名称帐户集最常见的标签。运行此查询时,出现以下错误:

Error while compiling statement: FAILED: SemanticException [Error 10002]: Line 12:74 Invalid column reference 'labelfrequency': (possible column names are: labelfrequency, account, name, label)

我不太明白为什么翻译没有找到列labelfrequency,但可以建议它。你对如何解决这个问题有什么建议吗?
编辑:如果我将rank()移动到select部分,就会得到结果。

select midquery.account, midquery.name, midquery.label,  midquery.labelfrequency, 
    rank() over (partition by midquery.account, midquery.name 
     order by midquery.labelfrequency desc)
from(

    -- Count the appearance of each label.

    select count(*) as labelfrequency, account, name, label
    from(

        select account, name, label from myTable 

    ) innerquery

    group by account, name, label
) midquery
vxbzzdmp

vxbzzdmp1#

窗口函数在 WHERE 条款。这有很好的理由,但您可以将其视为sql的另一条规则——类似于无法识别列别名。
(真正的原因是指定在存在多个过滤条件时窗口函数的操作方式。要想出一套连贯的规则(几乎?)是不可能的
话虽如此,您可以简化您的查询:

select t.account, t.name, t.label, t.labelfrequency
from (select count(*) as labelfrequency, account, name, label,
             rank() over (partition by account, name
                          order by count(*) desc
                         ) as seqnum
      from myTable t
      group by account, name, label
     ) t
where seqnum = 1;

也就是说,可以组合窗口函数和聚合函数。而且您不需要子查询来只指定少数几个列。

相关问题