Unable to add a UDF in Hive

Posted by 8yoxcaq7 on 2021-05-29 in Hadoop

I need to add the following UDF to Hive:

package com.hadoopbook.hive;

import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class Strip extends UDF {
  private Text result = new Text();

  public Text evaluate(Text str) {
    if (str == null) {
      return null;
    }
    result.set(StringUtils.strip(str.toString()));
    return result;
  }

  public Text evaluate(Text str, String stripChars) {
    if (str == null) {
      return null;
    }
    result.set(StringUtils.strip(str.toString(), stripChars));
    return result;
  }
}

This is an example from the book Hadoop: The Definitive Guide.
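To make the intended behaviour of the two evaluate overloads concrete, here is a small dependency-free sketch. StripSketch and its methods are hypothetical stand-ins written for illustration; they approximate what StringUtils.strip does for these inputs and are not the commons-lang implementation or the UDF itself.

```java
class StripSketch {

    // Hypothetical stand-in: strips whitespace when stripChars is null,
    // otherwise strips any of the characters in stripChars from both ends.
    static String strip(String str, String stripChars) {
        if (str == null) {
            return null;
        }
        int start = 0;
        int end = str.length();
        while (start < end && shouldStrip(str.charAt(start), stripChars)) {
            start++;
        }
        while (end > start && shouldStrip(str.charAt(end - 1), stripChars)) {
            end--;
        }
        return str.substring(start, end);
    }

    // Decides whether one character belongs to the set being stripped.
    static boolean shouldStrip(char c, String stripChars) {
        return stripChars == null
                ? Character.isWhitespace(c)
                : stripChars.indexOf(c) >= 0;
    }

    public static void main(String[] args) {
        // Mirrors evaluate(Text): whitespace is removed from both ends.
        System.out.println("[" + strip("  bee  ", null) + "]");
        // Mirrors evaluate(Text, String): leading and trailing 'a' and 'b' are removed.
        System.out.println("[" + strip("banana", "ab") + "]");
    }
}
```

Under these assumptions the first call yields "bee" and the second yields "nan", which is the behaviour the registered Hive function is expected to expose.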
I compiled the Java file above into a .class file with the following command:

hduser@nb-VPCEH35EN:~/Hadoop-tutorial/hadoop-book-master/ch17-hive/src/main/java/com/hadoopbook/hive$ javac Strip.java

Then I created the jar file with the following command:

hduser@nb-VPCEH35EN:~/Hadoop-tutorial/hadoop-book-master/ch17-hive/src/main/java/com/hadoopbook/hive$ jar cvf Strip.jar Strip Strip.class 
Strip : no such file or directory
added manifest
adding: Strip.class(in = 915) (out= 457)(deflated 50%)

I copied the generated jar file to an HDFS directory:

hduser@nb-VPCEH35EN:~/Hadoop-tutorial/hadoop-book-master/ch17-hive/src/main/java/com/hadoopbook/hive$ hadoop dfs -copyFromLocal /home/hduser/Hadoop-tutorial/hadoop-book-master/ch17-hive/src/main/java/com/hadoopbook/hive/Strip.jar /user/hduser/input

I tried to create the function with the following command:

hive> create function strip as 'com.hadoopbook.hive.Strip' using jar 'hdfs://localhost/user/hduser/input/Strip.jar';

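But I got the following error:

converting to local hdfs://localhost/user/hduser/input/Strip.jar
Added [/tmp/hduser_resources/Strip.jar] to class path
Added resources: [hdfs://localhost/user/hduser/input/Strip.jar]
Failed to register default.strip using class com.hadoopbook.hive.Strip
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask

I also tried creating a temporary function. First, I added the jar file to Hive: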

hive> add jar hdfs://localhost/user/hduser/input/Strip.jar;
converting to local hdfs://localhost/user/hduser/input/Strip.jar
Added [/tmp/hduser_resources/Strip.jar] to class path
Added resources: [hdfs://localhost/user/hduser/input/Strip.jar]

Then I tried to create the temporary function:

hive> create temporary function strip as 'com.hadoopbook.hive.Strip';

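But I got the following error:

FAILED: Class com.hadoopbook.hive.Strip not found
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask

The jar file was created and added to Hive successfully, yet Hive still reports that the class was not found. Can anyone tell me what is going wrong?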

Answer #1, by xcitsw88

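Yes. This is easier with an IDE such as Eclipse than building the jar from the command line.
To build the jar from the command line, follow these steps.
First, create these directories under the project directory ch17-hive:
bin: stores the compiled .class files (Strip.class)
lib: stores the required external jars
target: stores the jar you will build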

[ch17-hive]$ mkdir bin lib target
[ch17-hive]$ ls
bin  lib  src  target

Copy the required external jars into the ch17-hive/lib directory:

[ch17-hive]$ cp /usr/lib/hive/lib/hive-exec.jar lib/.
[ch17-hive]$ cp /usr/lib/hadoop/hadoop-common.jar lib/.

Now compile com.hadoopbook.hive.Strip from the source root, which in your case is ch17-hive/src/main/java:

[java]$ pwd
/home/cloudera/ch17-hive/src/main/java
[java]$ javac  -d ../../../bin -classpath ../../../lib/hive-exec.jar:../../../lib/hadoop-common.jar com/hadoopbook/hive/Strip.java

Create the manifest file as follows:

[ch17-hive]$ cat MENIFEST.MF 
Main-Class: com.hadoopbook.hive.Strip
Class-Path: lib/hadoop-common.jar  lib/hive-exec.jar

Create the jar:

[ch17-hive]$ jar cvfm target/strip.jar MENIFEST.MF -C bin .
added manifest
adding: com/(in = 0) (out= 0)(stored 0%)
adding: com/hadoopbook/(in = 0) (out= 0)(stored 0%)
adding: com/hadoopbook/hive/(in = 0) (out= 0)(stored 0%)
adding: com/hadoopbook/hive/Strip.class(in = 915) (out= 456)(deflated 50%)

The project structure should now look like this:

[ch17-hive]$ ls *
MENIFEST.MF

bin:
com

lib:
hadoop-common.jar  hive-exec.jar

src:
main

target:
strip.jar

Copy the resulting jar to HDFS:

hadoop fs -put /home/cloudera/ch17-hive/target/strip.jar /user/cloudera/.

Use it in Hive:

hive> create function strip_new as 'com.hadoopbook.hive.Strip' using jar 'hdfs:/user/cloudera/strip.jar';
converting to local hdfs:/user/cloudera/strip.jar
Added [/tmp/05a13d23-8051-431f-a354-793abac66160_resources/strip.jar] to class path
Added resources: [hdfs:/user/cloudera/strip.jar]
OK
Time taken: 0.071 seconds
hive>
