java.lang.NoClassDefFoundError on Spark job executors

2ekbmq32 · posted 2021-05-29 in Hadoop

I am trying to write every record of a Hive table to DynamoDB via a Spark job. The detailed error is:

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 12 in stage 2.0 failed 4 times, most recent failure: Lost task 12.3 in stage 2.0 (TID 775, ip-10-0-0-xx.eu-west-1.compute.internal, executor 1): java.lang.NoClassDefFoundError: Could not initialize class com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder

The code snippet is as follows:

import scala.util.Random

import com.amazonaws.auth.{AWSStaticCredentialsProvider, BasicAWSCredentials}
import com.amazonaws.regions.Regions
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder
import com.amazonaws.services.dynamodbv2.document.{DynamoDB, Item}
import org.apache.spark.sql.SparkSession

object ObjName {

  def main(args: Array[String]): Unit = {
    // Print which jar the class is loaded from on the driver
    print(classOf[com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder]
      .getProtectionDomain().getCodeSource().getLocation().toURI().getPath())

    val session = SparkSession.builder()
      .appName("app_name")
      .enableHiveSupport()
      .getOrCreate()
    session.sparkContext.setLogLevel("WARN")

    session.sql("""
          select
              email,
              name
          from db.tbl
          """).rdd.repartition(40)
      .foreachPartition(iter => {
        // This closure runs on the executors; one client per partition
        val random = new Random()
        val client = AmazonDynamoDBClientBuilder.standard
          .withRegion(Regions.EU_WEST_1)
          .withCredentials(new AWSStaticCredentialsProvider(
            new BasicAWSCredentials("access key", "secret key")))
          .build()
        val dynamoDB = new DynamoDB(client)
        val table = dynamoDB.getTable("table_name")
        iter.foreach(row => {
          // ts: current millis scaled up, plus a random 0-999 suffix to avoid collisions
          val item = new Item()
            .withPrimaryKey("email", row.getString(0))
            .withNumber("ts", System.currentTimeMillis * 1000 + random.nextInt(1000))
            .withString("name", row.getString(1))
          table.putItem(item)
        })
      })
  }
}

Maven dependencies:

<dependencies>
    <!-- https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-dynamodb -->
    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk-dynamodb</artifactId>
        <version>1.11.170</version>
    </dependency>
    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk-core</artifactId>
        <version>1.11.170</version>
    </dependency>
    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk-s3</artifactId>
        <version>1.11.170</version>
    </dependency>

    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>jmespath-java</artifactId>
        <version>1.11.170</version>
    </dependency>

    <dependency>
        <groupId>org.apache.httpcomponents</groupId>
        <artifactId>httpclient</artifactId>
        <version>4.5.2</version>
    </dependency>
    <dependency>
        <groupId>org.apache.httpcomponents</groupId>
        <artifactId>httpcore</artifactId>
        <version>4.4.4</version>
    </dependency>

    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11 -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.1.0</version>
        <scope>provided</scope>
    </dependency>

    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-hive_2.11 -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-hive_2.11</artifactId>
        <version>2.1.0</version>
        <scope>provided</scope>
    </dependency>

    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql_2.11 -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.1.0</version>
        <scope>provided</scope>
    </dependency>
</dependencies>
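
(An aside on packaging: the pom pins httpclient 4.5.2 / httpcore 4.4.4 explicitly, so clashes with the HTTP jars already present on the cluster are conceivable. If the fat jar is built with maven-shade-plugin, relocating the HTTP packages is one way to rule that out. The plugin configuration below is only a sketch, not configuration from the post:)

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>3.2.4</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals><goal>shade</goal></goals>
            <configuration>
                <relocations>
                    <!-- Relocate the bundled Apache HTTP classes so they cannot
                         clash with the versions already on the cluster -->
                    <relocation>
                        <pattern>org.apache.http</pattern>
                        <shadedPattern>shaded.org.apache.http</shadedPattern>
                    </relocation>
                </relocations>
            </configuration>
        </execution>
    </executions>
</plugin>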

At the beginning of the main method I print the location of the jar file that the class com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder is loaded from. That succeeds, which means the class loads fine on the driver node.
Also, I have run jar tvf package.jar | grep -i AmazonDynamoDBClientBuilder --color and confirmed that the class is present in my packaged jar.
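
Note that "Could not initialize class" (as opposed to a plain ClassNotFoundException) usually means the executor did find the class, but its static initializer failed on an earlier attempt. A stripped-down probe along the same lines as the driver-side check could run on the executors and surface that original failure. This is only a sketch of mine, not code from the job; the object and app names are placeholders:

import org.apache.spark.sql.SparkSession

object ExecutorClasspathProbe {
  def main(args: Array[String]): Unit = {
    val session = SparkSession.builder().appName("classpath_probe").getOrCreate()
    // 20 partitions so every executor gets at least one task
    session.sparkContext.parallelize(1 to 20, 20).foreachPartition { _ =>
      try {
        // Class.forName triggers static initialization with the executor's classloader
        val cls = Class.forName("com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder")
        // Prints the source jar to the executor's stdout log (visible in the Spark UI)
        println(cls.getProtectionDomain.getCodeSource.getLocation)
      } catch {
        // An ExceptionInInitializerError here would be the real root cause behind
        // the later "Could not initialize class" NoClassDefFoundError
        case t: Throwable => t.printStackTrace()
      }
    }
    session.stop()
  }
}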
The command used to submit the Spark job is below. With or without --jars, it complains about the same error above either way. Any suggestions? Thanks.

spark-submit --class MainClassName --jars /mnt/home/hadoop/aws-java-sdk-dynamodb-1.11.170.jar,/mnt/home/hadoop/aws-java-sdk-core-1.11.170.jar,/mnt/home/hadoop/aws-java-sdk-s3-1.11.170.jar,/mnt/home/hadoop/jmespath-java-1.11.170.jar --driver-memory 3G --num-executors 20 --executor-memory 4G --executor-cores 4 package.jar
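
One more variant sometimes tried in this situation (again a sketch of mine, not from the post): putting the SDK jars directly on the executor classpath with spark.executor.extraClassPath, which prepends them ahead of the cluster's own jars. Unlike --jars, this assumes the listed files already exist at the same paths on every worker node:

spark-submit --class MainClassName \
  --conf spark.executor.extraClassPath=/mnt/home/hadoop/aws-java-sdk-dynamodb-1.11.170.jar:/mnt/home/hadoop/aws-java-sdk-core-1.11.170.jar:/mnt/home/hadoop/aws-java-sdk-s3-1.11.170.jar:/mnt/home/hadoop/jmespath-java-1.11.170.jar \
  --driver-memory 3G --num-executors 20 --executor-memory 4G --executor-cores 4 \
  package.jar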
