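Here is my problem for today. I created a relation as the result of a couple of transformations after reading a relation from Hive. I want to store the final relation, after some analysis, back into Hive, but I can't. The code below should make this clearer.
The first snippet is where I load from Hive and transform the result: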
-- Load the Hive table through HCatalog
july = LOAD 'POC.july' USING org.apache.hive.hcatalog.pig.HCatLoader();
-- Keep only the fields needed, extracting the day of the month from start_date
july_cl = FOREACH july GENERATE GetDay(ToDate(start_date)) as day:int,start_station,duration;
jul_cl_fl = FILTER july_cl BY day==31;
july_gr = GROUP jul_cl_fl BY (day,start_station);
-- Aggregate per (day, start_station) group
july_result = FOREACH july_gr {
    total_dura = SUM(jul_cl_fl.duration);
    avg_dura = AVG(jul_cl_fl.duration);
    qty_trips = COUNT(jul_cl_fl);
    GENERATE FLATTEN(group),total_dura,avg_dura,qty_trips;
};
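Now, when I try to store the resulting relation, I can't, because the schema has changed and I think it is no longer compatible with Hive:
STORE july_result INTO 'poc.july_analysis' USING org.apache.hive.hcatalog.pig.HCatStorer();
Even when I tried to define a specific schema for the final relation, I couldn't get it to work: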
july_result = FOREACH july_gr {
    total_dura = SUM(jul_cl_fl.duration);
    avg_dura = AVG(jul_cl_fl.duration);
    qty_trips = COUNT(jul_cl_fl);
    GENERATE FLATTEN(group) as (day:int),total_dura as (total_dura:int),avg_dura as (avg_dura:int),qty_trips as (qty_trips:int);
};
1 Answer
After researching in the Hortonworks community, I found out how to define the output format for a grouped relation in Pig. My new code looks like this:
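A minimal sketch of what such a definition can look like, not necessarily the exact code from this answer: name every field that FLATTEN(group) produces (the group key has two fields, day and start_station, so both need an alias) and give each aggregate an explicit cast and alias, so that the names and types of the relation line up with the columns of the target Hive table. The column layout of poc.july_analysis assumed in the comment is hypothetical; adjust the casts to whatever the real table declares.

```pig
-- Sketch only. Assumes poc.july_analysis already exists in Hive with columns
-- (day int, start_station <same type as in POC.july>, total_dura bigint,
--  avg_dura double, qty_trips bigint). These column types are an assumption.
july_result = FOREACH july_gr GENERATE
    FLATTEN(group) AS (day, start_station),            -- alias both group keys
    (long)SUM(jul_cl_fl.duration)   AS total_dura,     -- Pig long   -> Hive bigint
    (double)AVG(jul_cl_fl.duration) AS avg_dura,       -- Pig double -> Hive double
    COUNT(jul_cl_fl)                AS qty_trips;      -- COUNT returns long

STORE july_result INTO 'poc.july_analysis' USING org.apache.hive.hcatalog.pig.HCatStorer();
```

Running DESCRIBE july_result; before the STORE shows the schema Pig will hand to HCatStorer, which makes it easy to compare against the Hive table definition. The script needs the HCatalog jars on the classpath, for example by starting Pig with pig -useHCatalog.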
Thank you all.