Version compatibility between Spark and the HBase client

6pp0gazn, posted on 2021-06-10 in HBase

I want to write a Spark batch job, package it as a jar, and run it with spark-submit. My program runs fine in spark-shell, but when I run it with spark-submit I get the following error:

Exception in thread "main" java.lang.NoSuchMethodError: scala.Predef$.$conforms()Lscala/Predef$$less$colon$less;
    at HBaseBulkload$.saveAsHFile(ThereInLocationGivenTimeInterval.scala:103)
    at HBaseBulkload$.toHBaseBulk(ThereInLocationGivenTimeInterval.scala:178)
    at ThereInLocationGivenTimeInterval$.main(ThereInLocationGivenTimeInterval.scala:241)
    at ThereInLocationGivenTimeInterval.main(ThereInLocationGivenTimeInterval.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

According to this answer, the problem stems from a version incompatibility. I also found this, but my Spark version is 1.6.0. Here is my .sbt file:

name := "HbaseBulkLoad"

version := "1.0"

scalaVersion := "2.10.5"

resolvers += "Cloudera Repository" at "https://repository.cloudera.com/artifactory/cloudera-repos/"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.0"
//libraryDependencies += "org.apache.hbase" % "hbase-common" % "1.2.0-cdh5.9.0"
//libraryDependencies += "org.apache.hbase" % "hbase-client" % "1.2.0-cdh5.9.0"
//libraryDependencies += "org.apache.hbase" % "hbase-server" % "1.2.0-cdh5.9.0"

libraryDependencies += "org.apache.hbase" % "hbase-client" % "1.1.2"
libraryDependencies += "org.apache.hbase" % "hbase-server" % "1.1.2"
libraryDependencies += "org.apache.hbase" % "hbase-common" % "1.1.2"

My imports and the code segment that causes the error are as follows:

/* SimpleApp.scala */
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

// HBaseBulkLoad imports
import java.util.UUID

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.permission.FsPermission
import org.apache.hadoop.fs.{Path, FileSystem}
import org.apache.hadoop.hbase.{KeyValue, TableName}
import org.apache.hadoop.hbase.client._
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.{HFileOutputFormat2, LoadIncrementalHFiles}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.mapreduce.Job
import org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner
import org.apache.spark.rdd.RDD
import org.apache.spark.Partitioner
import org.apache.spark.storage.StorageLevel

import scala.collection.JavaConversions._
import scala.reflect.ClassTag

// Hbase admin imports
import org.apache.hadoop.hbase.{HBaseConfiguration, HColumnDescriptor, HTableDescriptor}
import org.apache.hadoop.hbase.client.{HBaseAdmin, HTable, Put}
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import java.util.Calendar

val now = Calendar.getInstance.getTimeInMillis
// val filteredRdd = myRdd.filter(...
val resultRdd = filteredRdd.map { row =>
  (
    row(0).asInstanceOf[String].getBytes(),
    scala.collection.immutable.Map(
      "batchResults" -> Array(("batchResult1", ("true", now)))
    )
  )
}
println(resultRdd.count)

Answer #1, by gojuced7

The working .sbt file is as follows:

name := "HbaseBulkLoad"

version := "1.0"

scalaVersion := "2.10.5"

resolvers += "Cloudera Repository" at "https://repository.cloudera.com/artifactory/cloudera-repos/"

libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.6.0-cdh5.9.0" 
libraryDependencies += "org.apache.hbase" % "hbase-common" % "1.2.0-cdh5.9.0"
libraryDependencies += "org.apache.hbase" % "hbase-client" % "1.2.0-cdh5.9.0"
libraryDependencies += "org.apache.hbase" % "hbase-server" % "1.2.0-cdh5.9.0"
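
For context, the error names `scala.Predef$.$conforms`, a method that exists in Scala 2.11 and later but not in 2.10, so it usually means some class on the classpath was compiled against a different Scala binary version than the one running the job. The following is only a sketch of the same build with the version strings factored out, so the Spark and HBase artifacts stay on one CDH release. The `val` names are hypothetical and not part of the original answer:

```scala
// build.sbt (sketch): the val names below are illustrative assumptions
val cdhVersion   = "cdh5.9.0"
val sparkVersion = s"1.6.0-$cdhVersion"
val hbaseVersion = s"1.2.0-$cdhVersion"

name := "HbaseBulkLoad"

version := "1.0"

scalaVersion := "2.10.5"

resolvers += "Cloudera Repository" at "https://repository.cloudera.com/artifactory/cloudera-repos/"

// %% appends the Scala binary suffix (_2.10 here), so this resolves to spark-core_2.10
libraryDependencies += "org.apache.spark" %% "spark-core" % sparkVersion

// All three HBase modules share a single version string
libraryDependencies ++= Seq("hbase-common", "hbase-client", "hbase-server")
  .map(artifact => "org.apache.hbase" % artifact % hbaseVersion)
```

Keeping the versions in one place makes it harder for a later edit to mix upstream and CDH artifacts again.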

If you are using Cloudera, you can find the JARs and their corresponding versions in the following directory:

/opt/cloudera/parcels/CDH-5.9.0-1.cdh5.9.0.p0.23/jars
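
To see which Scala version a job actually runs on, a small diagnostic can be packaged into the jar and run with spark-submit, then compared with the `scalaVersion` in the build. This is a minimal sketch: `VersionCheck` and `binaryVersion` are hypothetical names, and the only library call used is `scala.util.Properties.versionNumberString` from the Scala standard library.

```scala
object VersionCheck {
  // Reduce a full version such as "2.10.5" to its binary version "2.10"
  def binaryVersion(full: String): String =
    full.split('.').take(2).mkString(".")

  def main(args: Array[String]): Unit = {
    // Version of the Scala library found on the runtime classpath
    val runtime = scala.util.Properties.versionNumberString
    println(s"Scala runtime: $runtime (binary version ${binaryVersion(runtime)})")
  }
}
```

If the printed binary version differs from the suffix of the artifacts in the jar (for example `_2.10`), a `NoSuchMethodError` like the one in the question is a likely outcome.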
