I am using this Python UDF script:
import sys
import collections
import datetime
import re
try:
    for line in sys.stdin:
        line=line.strip()
        number,sd=line.split('\t')
        sd=sd.lower()
        sd=sd.split(' ')
        new_sd_list=collections.OrderedDict(collections.Counter(sd))
        new_sd=' '.join(new_sd_list)
        print('\t'.join([str(number),str(new_sd])))
except:
    print(sys.exc_info())
I run the following command in PuTTY:
SELECT TRANSFORM(number,shortdescription) USING 'python name.py'
AS (number,shortdescription) FROM table;
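I get this error:
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {in the india optimizer.'}
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 4 HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec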
import sys
import collections
import datetime
import re
try:
    for line in sys.stdin:
        line=line.strip()
        # Hive passes the selected columns to the script separated by tabs
        number,sd=line.split('\t')
        sd=sd.lower()
        sd=sd.split(' ')
        new_sd_list=collections.OrderedDict(collections.Counter(sd))
        new_sd=' '.join(new_sd_list)
        # this line had the syntax error: it was written as str(new_sd]))) with the ] misplaced
        print('\t'.join([str(number),str(new_sd)]))
except:
    print(sys.exc_info())
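To catch this kind of mistake before the job reaches Hive, it can help to move the per-row logic into a function that can be run locally. The following is only a minimal sketch, not the original author's script: the names `dedupe_words` and `transform_lines` are hypothetical, and it assumes every input row has the form number<TAB>shortdescription. It also swaps `Counter` for `OrderedDict.fromkeys`, which keeps the first occurrence of each word in order.

```python
import sys
from collections import OrderedDict


def dedupe_words(line):
    """Lowercase the description and drop repeated words, keeping first-seen order."""
    # Assumption: columns arrive tab-separated; split only on the first tab.
    number, sd = line.rstrip("\n").split("\t", 1)
    words = sd.lower().split(" ")
    # OrderedDict.fromkeys keeps one entry per word, in order of first appearance.
    unique_words = OrderedDict.fromkeys(words)
    return "\t".join([number, " ".join(unique_words)])


def transform_lines(lines):
    """Yield one output row per well-formed input row."""
    for line in lines:
        try:
            yield dedupe_words(line)
        except ValueError:
            # A row without a tab cannot be unpacked; report it on stderr so
            # the message does not end up in the output columns.
            sys.stderr.write("skipping malformed row: %r\n" % line)


# To use this as the TRANSFORM script, the file would end with:
#     for row in transform_lines(sys.stdin):
#         print(row)
```

Writing errors to stderr instead of stdout keeps tracebacks out of the rows that Hive reads back, and a syntax error shows up immediately when the file is run or imported locally rather than as a failed MapReduce stage.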