Pig Latin syntax error

ct3nt3jp  posted on 2021-06-02  in Hadoop

I have the following data:

AGE,EDU,SEX,SALARY
67,10th,Male,<=50K
17,10th,Female,<=50K
40,Assoc-voc,Male,>50K
35,Assoc-voc,Male,<=50K
57,Assoc-voc,Male,<=50K
49,Assoc-voc,Male,>50K
42,Bachelors,Male,>50K
30,Bachelors,Male,>50K
23,Bachelors,Female,<=50K

========================================================
My Pig Latin script is:

sensitive = LOAD '/mdsba' using PigStorage(',') as (AGE,EDU,SEX,SALARY);
--Filter the data by salary bracket
Data_filter1 = FILTER sensitive by (SALARY matches '<=50K');
Data_filter2 = FILTER sensitive by (SALARY matches '>50K');
BA= group  Data_filter1 by (EDU,SEX) ; 

BB= foreach BA generate group as EDU, COUNT (Data_filter1) as cn:int;

BC= FILTER BB by (cn == 4);

Dump BC ;

Error message:
java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Long
Any help would be appreciated.

1aaf6o9v1#

The problem is that COUNT returns a long, but the script declares the result as int. The code should look like this:

BB= foreach BA generate group as EDU, COUNT (Data_filter1) as cn;

or

BB= foreach BA generate group as EDU, COUNT (Data_filter1) as cn:long;
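
For context, here is a minimal sketch of the whole script with the second form applied. It reuses the path and aliases from the question and adds a DESCRIBE step, which prints the schema of BB so the type of cn can be checked before the filter runs. It has not been run against a particular Pig release, so treat it as an illustration rather than a verified fix.

```pig
sensitive = LOAD '/mdsba' USING PigStorage(',') AS (AGE, EDU, SEX, SALARY);
-- Keep only the rows whose salary bracket is <=50K
Data_filter1 = FILTER sensitive BY (SALARY matches '<=50K');
-- The group key is the tuple (EDU, SEX)
BA = GROUP Data_filter1 BY (EDU, SEX);
-- COUNT returns a long, so declare cn as long (or leave the type off)
BB = FOREACH BA GENERATE group AS EDU, COUNT(Data_filter1) AS cn:long;
-- Print the schema of BB; cn should be reported as long
DESCRIBE BB;
BC = FILTER BB BY (cn == 4);
DUMP BC;
```

Note that group here is a tuple of (EDU, SEX), so the alias EDU names the whole key, not just the education column.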

raogr8fs2#

The problem you are running into is that you are mixing the int and long data types.
You need to convert the int to long explicitly.
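
One way to read this advice, sketched under the assumption that the load and group steps from the question stay as they are: either keep cn as long and compare it with a long literal, or cast the COUNT result to int explicitly so both sides of the comparison are int. The aliases BB_long, BC_long, BB_int and BC_int are hypothetical names introduced only for this illustration.

```pig
sensitive = LOAD '/mdsba' USING PigStorage(',') AS (AGE, EDU, SEX, SALARY);
Data_filter1 = FILTER sensitive BY (SALARY matches '<=50K');
BA = GROUP Data_filter1 BY (EDU, SEX);

-- Option 1: keep the count as long and compare with a long literal (4L)
BB_long = FOREACH BA GENERATE group AS EDU, COUNT(Data_filter1) AS cn:long;
BC_long = FILTER BB_long BY (cn == 4L);

-- Option 2: cast the long returned by COUNT to int explicitly,
-- so the declared type matches the actual value
BB_int = FOREACH BA GENERATE group AS EDU, (int)COUNT(Data_filter1) AS cn:int;
BC_int = FILTER BB_int BY (cn == 4);

DUMP BC_long;
```

Option 1 avoids a narrowing cast, which is probably the safer choice when counts could be large.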
