在pig中连接数据后的multipy

kb5ga3dv  于 2021-06-25  发布在  Pig
关注(0)|答案(1)|浏览(400)

我尝试将两个字段相乘,并在pig中加入三个表后取它们的和。然而,我不断得到这个错误: <file loyalty_program.pig, line 30, column 74> (Name: Multiply Type: null Uid: null)incompatible types in Multiply Operator left hand side:bag :tuple(new_details1::new_details::potential_customers::num_of_orders:long) right hand side:bag :tuple(products::price:int) ```
-- load the data sets
orders = LOAD '/dualcore/orders' AS (order_id:int,
cust_id:int,
order_dtm:chararray);

details = LOAD '/dualcore/order_details' AS (order_id:int,
prod_id:int);

products = LOAD '/dualcore/products' AS (prod_id:int,
brand:chararray,
name:chararray,
price:int,
cost:int,
shipping_wt:int);
recent = FILTER orders by order_dtm matches '2012-.*$';

customer = GROUP recent by cust_id;

cust_orders = FOREACH customer GENERATE group as cust_id, (int)COUNT(recent) as num_of_orders;

potential_customers = FILTER cust_orders by num_of_orders>=5;

new_details = join potential_customers by cust_id, recent by cust_id;
new_details1 = join new_details by order_id, details by order_id;
new_details2 = join new_details1 by prod_id, products by prod_id;
--DESCRIBE new_details2;

final_details = FOREACH new_details2 GENERATE potential_customers::cust_id, potential_customers::num_of_orders as num_of_orders,recent::order_id as order_id,recent::order_dtm,details::prod_id,products::brand,products::name,products::price as price,products::cost,products::shipping_wt;

grouped_data = GROUP final_details by cust_id;

member = FOREACH grouped_data GENERATE SUM(final_details.num_of_orders * final_details.price) ;
lim = limit member 10;
dump lim;

我甚至把count的结果转换成int。它仍然不断地向我抛出这个错误。我不知道该怎么办。
mwecs4sa

mwecs4sa1#

好 啊。。我认为,首先,你想把购买数量乘以每种产品的价格,然后你需要乘以这个值的总和。。
尽管这是一个奇怪的要求,但你可以用下面的方法。。
你所需要做的就是在每个语句本身的最后一个细节中计算乘法,然后简单地应用乘法的和。。
根据你的load语句,我创建了以下输入文件
主订单.txt

6666,100,2012-01-01
7777,101,2012-09-02
8888,100,2012-01-09
9999,101,2012-12-08
6666,101,2012-09-02
9999,100,2012-07-12
9999,100,2012-08-01
6666,100,2012-01-02
7777,100,2012-09-09

订单详情.txt

6666,6000
7777,7000
8888,8000
9999,9000

主要产品.txt

6000,Nike,Shoes,3000,3000,1
7000,Adidas,Cap,1000,1000,1
8000,Rebook,Shoes,4000,4000,1
9000,Puma,Shoes,25000,2500,1

下面是代码

orders = LOAD '/user/cloudera/inputfiles/main_orders.txt'  USING PigStorage(',') AS (order_id:int,cust_id:int,order_dtm:chararray);

details = LOAD '/user/cloudera/inputfiles/orders_details.txt'  USING PigStorage(',') AS (order_id:int,prod_id:int);

products = LOAD '/user/cloudera/inputfiles/main_products.txt' USING PigStorage(',') AS(prod_id:int,brand:chararray,name:chararray,price:int,cost:int,shipping_wt:int);

recent = FILTER orders by order_dtm matches '2012-.*';

customer = GROUP recent by cust_id;

cust_orders = FOREACH customer GENERATE group as cust_id, (int)COUNT(recent) as num_of_orders;

potential_customers = FILTER cust_orders by num_of_orders>=5;

new_details = join potential_customers by cust_id, recent by cust_id;
new_details1 = join new_details by order_id, details by order_id;
new_details2 = join new_details1 by prod_id, products by prod_id;
DESCRIBE new_details2;

final_details = FOREACH new_details2 GENERATE potential_customers::cust_id, potential_customers::num_of_orders as num_of_orders,recent::order_id as order_id,recent::order_dtm,details::prod_id,products::brand,products::name,products::price as price,products::cost,products::shipping_wt, (potential_customers::num_of_orders * products::price ) as multiplied_price;// multiplication is achived in last variable
dump final_details;

 grouped_data = GROUP final_details by cust_id;

member = FOREACH grouped_data GENERATE SUM(final_details.multiplied_price)  ; 
lim = limit member 10;
dump lim;

为了清楚起见,我也将每个语句的最终细节输出转储。

(100,6,6666,2012-01-01,6000,Nike,Shoes,3000,3000,1,18000)
(100,6,6666,2012-01-02,6000,Nike,Shoes,3000,3000,1,18000)
(100,6,7777,2012-09-09,7000,Adidas,Cap,1000,1000,1,6000)
(100,6,8888,2012-01-09,8000,Rebook,Shoes,4000,4000,1,24000)
(100,6,9999,2012-07-12,9000,Puma,Shoes,25000,2500,1,150000)
(100,6,9999,2012-08-01,9000,Puma,Shoes,25000,2500,1,150000)

最终输出低于

(366000)

此代码可能对您有所帮助,但请再次澄清您的要求

相关问题