Hive Python UDF

Posted by nhaq1z21 on 2021-05-29 in Hadoop

I am using this Python UDF script:

import sys
import collections 
import datetime
import re
try:
    for line in sys.stdin: 
        line=line.strip()
        number,sd=line.split('\t')
        sd=sd.lower()
        sd=sd.split(' ')
        new_sd_list=collections.OrderedDict(collections.Counter(sd))
        new_sd=' '.join(new_sd_list)
        print('\t'.join([str(number),str(new_sd])))
except:
    print(sys.exc_info())

When I run the following command in PuTTY:

SELECT TRANSFORM(number,shortdescription) USING 'python name.py' \
   AS (number,shortdescription) FROM table;

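I get this error:
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"number":"00548","shortdescription":"Master data inconsistency check in India optimizer"}
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 4   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec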

hadoop Hive python hive-udf

Source: https://stackoverflow.com/questions/46031510/hive-python-udf

1 answer

csbfibhn1#

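import sys
import collections
import datetime
import re
try:
    for line in sys.stdin:
        line=line.strip()
        # Hive passes the selected columns as one tab-separated line
        number,sd=line.split('\t')
        sd=sd.lower()
        sd=sd.split(' ')
        # The Counter keys are the distinct words, so joining them drops repeats
        new_sd_list=collections.OrderedDict(collections.Counter(sd))
        new_sd=' '.join(new_sd_list)
        # Fixed: the question's script had str(new_sd]))), a misplaced bracket (syntax error)
        print('\t'.join([str(number),str(new_sd)]))
except:
    print(sys.exc_info())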
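
The Hive error in the question never mentions Python. It only reports that a row could not be processed, most likely because the script died on the syntax error before reading any input. One way to catch this before submitting the query is to compile the script locally. The following is a minimal sketch, not part of the original answer; `check_syntax` is a hypothetical helper name, and the two strings are the broken and corrected `print` lines from this thread.

```python
def check_syntax(source, filename="name.py"):
    """Return None if the source compiles, else a short error description."""
    try:
        # compile() only parses the code; nothing is executed
        compile(source, filename, "exec")
    except SyntaxError as exc:
        return "{}:{}: {}".format(filename, exc.lineno, exc.msg)
    return None


# The print line as posted in the question (misplaced bracket)
broken = "print('\\t'.join([str(number),str(new_sd])))\n"
# The same line as corrected in the answer
fixed = "print('\\t'.join([str(number),str(new_sd)]))\n"

print(check_syntax(broken))  # a SyntaxError description; wording depends on the Python version
print(check_syntax(fixed))   # None
```

The same check can be run from a shell with `python -m py_compile name.py`, which reports the syntax error without involving Hive at all.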

2021-05-29
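
The corrected script still has two weak spots: it stops at the first row that does not split into exactly two tab-separated fields, and it prints the exception to stdout, where Hive would read it as an output row. Below is a sketch of a more defensive variant, assuming Python 3.7+ (where `dict` preserves insertion order). `dedupe_words`, `transform_line` and `main` are hypothetical names, skipping malformed rows is only one possible policy, and the sample row is made up for illustration.

```python
import io
import sys


def dedupe_words(text):
    """Lower-case the text and drop repeated words, keeping first-seen order."""
    return " ".join(dict.fromkeys(text.lower().split(" ")))


def transform_line(line):
    """Turn one tab-separated input row into one output row, or None if malformed."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 2:
        # Assumption: report malformed rows on stderr and emit nothing for them
        sys.stderr.write("skipping malformed row: {!r}\n".format(line))
        return None
    number, sd = fields
    return "\t".join([number, dedupe_words(sd)])


def main(stream):
    """Read rows from stream and print one transformed row per valid input row."""
    for line in stream:
        row = transform_line(line)
        if row is not None:
            print(row)


# Made-up sample row standing in for what Hive would send on stdin
sample = io.StringIO("00548\tCheck the check in India india\n")
main(sample)  # prints "00548", a tab, then "check the in india"
```

When the script is used with `TRANSFORM`, the last two lines would be replaced by `main(sys.stdin)`.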
