Hive: joining tables where the label has duplicate entries but the ID column does not

Posted by kse8i1jr on 2022-11-05 in Hive

I have the following table, Fact_Sales:

ProductSK DateSK SalesAmount SalesNumber 
1         2019   300         150 
2         2019   500         190 
.....

And the following table, DimProduct:

ProductSK CategoryLabel 
1         ABC 
2         ABC 
....

I want to compute the sales amount per category label, but when I join the tables as shown below, the query produces a Cartesian product:

SELECT CategoryLabel, SUM(SalesAmount)
FROM    Fact_Sales,     DimProduct

Answer 1 (7gcisfzg)

You can LEFT JOIN to the distinct values of the product table:

SELECT DP.CategoryLabel, SUM(FS.SalesAmount)
FROM Fact_Sales AS FS
LEFT JOIN (
    SELECT DISTINCT ProductSK, CategoryLabel FROM DimProduct
) AS DP
ON DP.ProductSK=FS.ProductSK
GROUP BY DP.CategoryLabel
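
To see what the DISTINCT subquery and the LEFT JOIN each contribute, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. SQLite only stands in for Hive here, and two things are assumptions added for illustration that do not appear in the question: a duplicated row in DimProduct and a sale for ProductSK 3 that has no matching product.

```python
import sqlite3

# SQLite stands in for Hive; this is an illustrative sketch, not a Hive session.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Fact_Sales (ProductSK INT, DateSK INT, SalesAmount INT, SalesNumber INT);
CREATE TABLE DimProduct (ProductSK INT, CategoryLabel TEXT);
INSERT INTO Fact_Sales VALUES (1, 2019, 300, 150), (2, 2019, 500, 190);
-- Hypothetical rows, not in the question: an unmatched sale and a duplicated dimension row.
INSERT INTO Fact_Sales VALUES (3, 2019, 50, 10);
INSERT INTO DimProduct VALUES (1, 'ABC'), (1, 'ABC'), (2, 'ABC');
""")

# Joining straight to DimProduct counts product 1 once per duplicated dimension row.
plain_join = dict(conn.execute("""
    SELECT DP.CategoryLabel, SUM(FS.SalesAmount)
    FROM Fact_Sales AS FS
    LEFT JOIN DimProduct AS DP ON DP.ProductSK = FS.ProductSK
    GROUP BY DP.CategoryLabel
""").fetchall())

# Joining to the DISTINCT pairs matches each sales row at most once.
distinct_join = dict(conn.execute("""
    SELECT DP.CategoryLabel, SUM(FS.SalesAmount)
    FROM Fact_Sales AS FS
    LEFT JOIN (
        SELECT DISTINCT ProductSK, CategoryLabel FROM DimProduct
    ) AS DP ON DP.ProductSK = FS.ProductSK
    GROUP BY DP.CategoryLabel
""").fetchall())

print(plain_join)     # ABC is inflated to 1100 by the duplicate row
print(distinct_join)  # ABC is 800; the unmatched sale sits under the NULL label
```

In this sketch the LEFT JOIN keeps the unmatched sale under a NULL category instead of dropping it. If the dimension table is already unique on ProductSK, the DISTINCT step changes nothing.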

Answer 2 (bzzcjhmw)

You should use INNER JOIN ... ON instead, so that each sales row is matched only to its own product:

SELECT D.CategoryLabel, SUM(F.SalesAmount)
FROM Fact_Sales AS F
INNER JOIN DimProduct AS D ON F.ProductSK = D.ProductSK
GROUP BY D.CategoryLabel
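
As a rough check of the difference, the sketch below loads the two sample rows from the question into an in-memory SQLite database (used here only as a stand-in for Hive) and compares the comma join with the INNER JOIN. The GROUP BY on the comma-join query is an assumption, since the query in the question is shown without one.

```python
import sqlite3

# SQLite stands in for Hive; the sample rows are the ones shown in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Fact_Sales (ProductSK INT, DateSK INT, SalesAmount INT, SalesNumber INT);
CREATE TABLE DimProduct (ProductSK INT, CategoryLabel TEXT);
INSERT INTO Fact_Sales VALUES (1, 2019, 300, 150), (2, 2019, 500, 190);
INSERT INTO DimProduct VALUES (1, 'ABC'), (2, 'ABC');
""")

# Comma join with no join condition: every sales row pairs with every product row.
# The GROUP BY is assumed; the question's query does not show one.
cross_rows = conn.execute("""
    SELECT CategoryLabel, SUM(SalesAmount)
    FROM Fact_Sales, DimProduct
    GROUP BY CategoryLabel
""").fetchall()

# INNER JOIN ... ON: each sales row pairs only with the product it references.
inner_rows = conn.execute("""
    SELECT D.CategoryLabel, SUM(F.SalesAmount)
    FROM Fact_Sales AS F
    INNER JOIN DimProduct AS D ON F.ProductSK = D.ProductSK
    GROUP BY D.CategoryLabel
""").fetchall()

print(cross_rows)  # [('ABC', 1600)]: 2 sales rows x 2 product rows
print(inner_rows)  # [('ABC', 800)]: 300 + 500
```

The comma join doubles the total because both products carry the label ABC, so each sale is counted once per product row. The ON condition removes those extra pairings.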
