Hive custom UDF: class not found

Problem description

I am building my UDF as follows. test_udf.java:

package test;

import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class Strip extends UDF{
  private Text result = new Text();

  public Text evaluate(Text str){
    if ( str == null){
      return null;
    }

    result.set(StringUtils.strip(str.toString()));
    return result;
  }

  public Text evaluate(Text str,String stripChars){
    if ( str == null){
      return null;
    }

    result.set(StringUtils.strip(str.toString(),stripChars));
    return result;
  }
}

pom.xml

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>sdf.dennis.com</groupId>
  <artifactId>test_udf</artifactId>
  <version>1.0</version>

  <name>hive</name>
  <url>http://maven.apache.org</url>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <dependencies>
    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-exec</artifactId>
        <version>2.1.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>3.0.0</version>
    </dependency>
  </dependencies>
  <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>2.2</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
 </build>
</project>

I ran mvn clean package, which generated test_udf-1.0.jar in the target folder.

However, when I add the jar to Hive and try to create a temporary function, I get:

hive> add jar file:///home/dennis/java/target/test_udf-1.0.jar;
Added [file:///home/dennis/java/target/test_udf-1.0.jar] to class path
Added resources: [file:///home/dennis/java/target/test_udf-1.0.jar]
hive> create temporary function test_s as 'test.strip';
FAILED: Class test.strip not found
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask
hive> create temporary function test_s as 'test.Strip';
FAILED: Class test.Strip not found
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask

I can't figure out what mistake I made.

Contents of the unpacked jar:

about_files                     groovyjarjarantlr            jodd                         module.properties                                     plugin.properties
about.html                      groovyjarjarasm              junit                        mozilla                                               plugin.xml
antlr                           groovyjarjarcommonscli       keytab.txt                   net                                                   properties.dtd
assets                          hdfs-default.xml             krb5-template.conf           org                                                   PropertyList-1.0.dtd
au                              hive-exec-log4j2.properties  krb5_udp-template.conf       org-apache-calcite-jdbc.properties                    schema
ccache.txt                      hive-log4j2.properties       license                      org.apache.hadoop.application-classloader.properties  shaded
codegen                         images                       LICENSE.txt                  org.codehaus.commons.compiler.properties              stylesheet.css
com                             javaewah                     log4j2.component.properties  overview.html                                         templates
common-version-info.properties  javax                        Log4j-config.xsd             overviewj.html                                        testpool.jocl
core-default.xml                javolution                   Log4j-events.dtd             package.jdo                                           tez-container-log4j2.properties
fr                              jersey                       Log4j-events.xsd             parquet                                               webapps
google                          jetty-dir.css                Log4j-levels.xsd             parquet-logging.properties                            yarn-default.xml
groovy                          jline                        META-INF                     parquet.thrift                                        yarn-version-info.properties

Tags: java, maven, hive

Solution


You can try using a macro instead.

CREATE TEMPORARY MACRO fn_maskNull(input decimal(25,3))
    CASE
        WHEN input IS NULL THEN 0 else input
    END;
-- usage
select fn_maskNull(null), fn_maskNull(101);

More information: https://medium.com/@gchandra/create-user-defined-functions-in-hive-beeline-ff965285d735
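
If you would rather keep the Java UDF, it may be worth checking first whether the class was packaged at all. The unpacked jar listing in the question shows no top-level test directory, which suggests that test/Strip.class is not in the jar and that only the shaded dependencies were bundled. By default Maven compiles only sources under src/main/java, and javac requires a public class Strip to live in a file named Strip.java, so a likely layout is src/main/java/test/Strip.java; since the question does not show the project layout, treat this as an assumption. The sketch below uses the standard java.util.jar.JarFile API to test for the class entry. The class name JarClassCheck and its helper methods are hypothetical names chosen for illustration, and the default jar path is only a guess based on the file name in the question.

```java
import java.io.IOException;
import java.util.jar.JarFile;

public class JarClassCheck {

    /** Converts a fully qualified class name into the entry path used inside a jar. */
    static String entryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns true if the jar at jarPath has a .class entry for className (case-sensitive). */
    static boolean containsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed defaults; pass the real jar path and class name as arguments.
        String jarPath = args.length > 0 ? args[0] : "target/test_udf-1.0.jar";
        String className = args.length > 1 ? args[1] : "test.Strip";
        System.out.println(entryName(className) + " present: "
                + containsClass(jarPath, className));
    }
}
```

If the entry is missing, Hive cannot find the class no matter how the name is spelled. If it is present, note that the lookup is case-sensitive, so 'test.Strip' is the name to use rather than 'test.strip'.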

