AWS EMR Hive: Not yet supported place for UDAF 'COUNT'

Posted by woobm2wo on 2021-06-27 in Hive

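I have a fairly complex query that I am trying to convert to run on Hive.
Specifically, I am running it as a Hive "step" in an AWS EMR cluster.
I have trimmed the query down for this post, leaving only the essence of the problem.
The full error message is: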

FAILED: SemanticException [Error 10128]: Line XX:XX Not yet supported place for UDAF 'COUNT'

The line number points to the COUNT at the bottom of the SELECT statement:

INSERT INTO db.new_table (
        new_column1,
        new_column2,
        new_column3,
        ... ,
        new_column20
    ) 
    SELECT MD5(COALESCE(TBL1.col1," ")||"_"||COALESCE(new_column5," ")||"_"||...) AS 
        new_col1,
        TBL1.col2,
        TBL1.col3,
        TBL1.col3 AS new_column3,
        TBL1.col4,
        CASE
            WHEN TBL1.col5 = …
            ELSE "some value"
        END AS new_column5,
        TBL1.col6,
        TBL1.col7,
        TBL1.col8,
        CASE
            WHEN TBL1.col9 = …
            ELSE "some value"
        END AS new_column9,
        CASE 
            WHEN TBL1.col10 = …
            ELSE "value"
        END AS new_column10,
        TBL1.col11,
        "value" AS new_column12,
        TBL2.col1,
        TBL2.col2,
        from_unixtime(…) AS new_column13,
        CAST(…) AS new_column14,
        CAST(…) AS new_column15,
        CAST(…) AS new_column16,
        COUNT(DISTINCT TBL1.col17) AS new_column17
    FROM db.table1 TBL1
    LEFT JOIN 
        db.table2 TBL2
            ON TBL1.col311 = TBL2.col311
    WHERE TBL1.col14 BETWEEN "low" AND "high"
        AND TBL1.col44 = "Y"
        AND TBL1.col55 = "N"
    GROUP BY 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20;

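Let me know if I have left out too much.
Thanks for your help!
Update
It turns out I did leave out too much information. Apologies to those who have already tried to help...
I have updated the query above.
Removing the 20th GROUP BY column, e.g.: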

GROUP BY 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19;

produces: Expression not in GROUP BY key '' ''
Latest: removing the 20th GROUP BY column and adding the first, e.g.:

GROUP BY 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19;

produces:

Line XX:XX Invalid table alias or column reference 'new_column5':(possible column
 names are: TBL1.col1, TBL1.col2, (looks like all columns of TBL1), 
TBL2.col1, TBL2.col2, TBL2.col311)

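The line number refers to the line containing the SELECT statement. Only those three TBL2 columns are listed in the error output.
The error seems to point at COALESCE(new_column5). Note that I have a CASE statement in the TBL1 select that I alias with AS new_column5.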

Hive amazon-emr amazon-web-services hiveql

Source: https://stackoverflow.com/questions/54544188/aws-emr-hive-not-yet-supported-place-for-udaf-count

1 answer

iq0todco1#

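You are referencing the computed column alias new_column5 at the same subquery level where it is computed. This is not possible in Hive. Replace the alias with the calculation itself, or compute it in an inner subquery and reference it from the outer query.
Use this: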

MD5(COALESCE(TBL1.col1," ")||"_"||COALESCE(CASE WHEN TBL1.col5 = … ELSE "some value" END," ")||"_"||...) AS new_col1,

instead of this:

MD5(COALESCE(TBL1.col1," ")||"_"||COALESCE(new_column5," ")||"_"||...) AS 
        new_col1,
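
For the other option (the subquery), here is a minimal sketch of what that might look like. It is heavily trimmed and not the asker's full query: only a handful of columns are kept, the derived-table alias s is made up for illustration, and the WHEN branch uses a placeholder literal "x" because the real condition is elided in the question. The idea is that new_column5 is computed in the inner query, so by the time the outer query runs it is an ordinary column that MD5, COALESCE and GROUP BY can all reference.

```sql
-- Sketch only: the alias "s" and the literal "x" are hypothetical placeholders.
SELECT MD5(COALESCE(s.col1, " ") || "_" || COALESCE(s.new_column5, " ")) AS new_col1,
       s.col2,
       s.new_column5,
       COUNT(DISTINCT s.col17) AS new_column17
FROM (
    -- Inner level: compute the CASE expression once and give it a name.
    SELECT TBL1.col1,
           TBL1.col2,
           TBL1.col17,
           CASE
               WHEN TBL1.col5 = "x" THEN TBL1.col5  -- placeholder condition
               ELSE "some value"
           END AS new_column5
    FROM db.table1 TBL1
) s
-- Outer level: group by the plain columns; the aggregate is not a grouping key.
GROUP BY s.col1, s.col2, s.new_column5;
```

Note that the GROUP BY here names the non-aggregated columns explicitly and leaves the COUNT column out of the grouping keys.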

Answered 2021-06-27
