I followed the steps below, using this Python script, zeropad.py:
#!/usr/bin/python
from org.apache.pig.scripting import *
@outputSchema('time:int')
def zero():
    time.zfill(4)
=======================================
REGISTER 'zeropad.py' USING org.apache.pig.scripting.jython.JythonScriptEngine AS myfuncs;
Airlines_data_schema = LOAD 'AirlinesData_sample-1.csv' USING PigStorage('\t') AS (Year,Month,DayofMonth,DayofWeek,DepTime_actual:int,CRSDeptime:int,Arrtime_actual:int,CRSArrtime:int,UniqueCarrier,FlightNum,TailNum_Plane,ActualElapsedTime,CRSElapsedTime,Airtime,Arrdelay,Depdelay,Origin,Dest,Distance,Taxiin,Taxiout,Cancelled,CancellationCode,Diverted,CarrierDelay,WeatherDelay,NASDelay,SecurityDelay,LateAircraftDelay);
===================================================
airlines_new = FOREACH Airlines_data_schema GENERATE Year,Month,DayofMonth,DayofWeek,myfuncs.zero.DepTime_actual AS DepTime_actual_new,myfuncs.zero.CRSDeptime AS CRSDeptime_new,myfuncs.zero.Arrtime_actual AS Arrtime_actual_new,myfuncs.zero.CRSArrtime AS CRSArrtime_new,UniqueCarrier,FlightNum,TailNum_Plane,ActualElapsedTime,CRSElapsedTime,Airtime,Arrdelay,Depdelay,Origin,Dest,Distance,Taxiin,Taxiout,Cancelled,CancellationCode,Diverted,CarrierDelay,WeatherDelay,NASDelay,SecurityDelay,LateAircraftDelay ;
I get the following error:
2017-02-26 19:37:19,606 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1025:
Invalid field projection. Projected field [myfuncs] does not exist in schema: year:bytearray,month:bytearray,dayofmonth:bytearray,dayofweek:bytearray,deptime_actual:int,crsdeptime:int,arrtime_actual:int,crsarrtime:int,uniquecarrier:bytearray,flightnum:bytearray,tailnum_plane:bytearray,actualelapsedtime:bytearray,crselapsedtime:bytearray,airtime:bytearray,arrdelay:bytearray,depdelay:bytearray,origin:bytearray,dest:bytearray,distance:bytearray,taxiin:bytearray,taxiout:bytearray,cancelled:bytearray,cancellationcode:bytearray,diverted:bytearray,carrierdelay:bytearray,weatherdelay:bytearray,nasdelay:bytearray,securitydelay:bytearray,lateaircraftdelay:bytearray.
I would like to know why I can't use a Python function to manipulate my column values.
2 Answers
Answer 1
It worked! The small corrections are shown below:
Answer 2
Try using the following syntax:
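A minimal sketch of what a working version might look like, offered as an illustration rather than as the answerer's exact code. Pig reads `myfuncs.zero.DepTime_actual` as a field projection on a field named `myfuncs`, which is why ERROR 1025 says that field is missing from the schema; a UDF has to be called with parentheses, as in `myfuncs.zero(DepTime_actual)`. The sketch also assumes the Python function should accept the column value as an argument and return the padded result, and that the output schema should be `chararray` so the leading zeros survive. The fallback definition of `outputSchema` is a hypothetical stand-in, added only so the file can also be run outside Pig.

```python
# zeropad.py: a sketch under the assumptions stated above.
# Intended Pig usage (remaining columns as in the question):
#   REGISTER 'zeropad.py' USING org.apache.pig.scripting.jython.JythonScriptEngine AS myfuncs;
#   airlines_new = FOREACH Airlines_data_schema GENERATE
#       myfuncs.zero(DepTime_actual) AS DepTime_actual_new, ...;

try:
    outputSchema  # provided by Pig's Jython script engine at registration time
except NameError:
    # Hypothetical stand-in so the module also imports under plain Python.
    def outputSchema(schema):
        def wrap(func):
            return func
        return wrap


@outputSchema('time:chararray')
def zero(time):
    """Left-pad a time value such as 530 to four characters ('0530')."""
    if time is None:
        # Pass nulls through unchanged instead of failing on empty fields.
        return None
    return str(time).zfill(4)
```

Declaring the result as `chararray` is a design choice: an `int` column cannot hold leading zeros, so a padded value would lose them again if it were cast back to a number.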