Error in spark-shell: falling back to uploading libraries under SPARK_HOME

xmq68pz9 posted on 2021-05-29 in Hadoop

I am trying to bring up a spark-shell on Amazon EMR (Hadoop), but it keeps giving the following error and I don't know how to fix it or what is missing to configure: `spark.yarn.jars`, `spark.yarn.archive`.

```
spark-shell --jars /usr/share/aws/emr/ddb/lib/emr-ddb-hadoop.jar
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
16/08/12 07:47:26 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
16/08/12 07:47:28 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
```
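The warning itself is not fatal, but it means Spark zips and re-uploads everything under SPARK_HOME to HDFS on every launch. A minimal sketch of one way to silence it, assuming Spark lives under /usr/lib/spark (the usual EMR location) and using an HDFS path chosen here only for illustration:

```
# Package the local Spark jars once and publish them to HDFS
# (all paths below are assumptions; adjust to your cluster layout)
cd /usr/lib/spark/jars
zip -r /tmp/spark-jars.zip .
hdfs dfs -mkdir -p /user/spark/share
hdfs dfs -put /tmp/spark-jars.zip /user/spark/share/

# Point spark-shell at the archive so it stops falling back to SPARK_HOME
spark-shell --conf spark.yarn.archive=hdfs:///user/spark/share/spark-jars.zip \
            --jars /usr/share/aws/emr/ddb/lib/emr-ddb-hadoop.jar
```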

Thanks!!!
Error 1
I tried to run a very simple SQL query:

```
val sqlDF = spark.sql("SELECT col1 FROM tabl1 limit 10")
sqlDF.show()
```

This only produced the warning:

```
WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
```
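This warning usually means YARN never granted the application any executors, either because the cluster has no free memory/vcores or because the requested executor size exceeds what a single node can offer. One thing worth trying is to request explicitly modest resources; the numbers below are illustrative assumptions to adapt to your node sizes:

```
spark-shell --master yarn \
            --num-executors 2 \
            --executor-memory 2g \
            --executor-cores 1 \
            --jars /usr/share/aws/emr/ddb/lib/emr-ddb-hadoop.jar
```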
Error 2
Then I tried to run a Scala script, something simple taken from: https://blogs.aws.amazon.com/bigdata/post/tx2d93gzrhu3tes/using-spark-sql-for-etl

```
import org.apache.hadoop.io.Text
import org.apache.hadoop.dynamodb.DynamoDBItemWritable
import com.amazonaws.services.dynamodbv2.model.AttributeValue
import org.apache.hadoop.dynamodb.read.DynamoDBInputFormat
import org.apache.hadoop.dynamodb.write.DynamoDBOutputFormat
import org.apache.hadoop.mapred.JobConf
import org.apache.hadoop.io.LongWritable
import java.util.HashMap

// Hadoop job configuration for the DynamoDB connector
val ddbConf = new JobConf(sc.hadoopConfiguration)
ddbConf.set("dynamodb.output.tableName", "tableDynamoDB")
ddbConf.set("dynamodb.throughput.write.percent", "0.5")
ddbConf.set("mapred.input.format.class", "org.apache.hadoop.dynamodb.read.DynamoDBInputFormat")
ddbConf.set("mapred.output.format.class", "org.apache.hadoop.dynamodb.write.DynamoDBOutputFormat")

val genreRatingsCount = sqlContext.sql("SELECT col1 FROM table1 LIMIT 1")

// Convert each row into the (Text, DynamoDBItemWritable) pair the output
// format expects; .rdd is needed on Spark 2.x, where Dataset.map would
// require an Encoder for DynamoDBItemWritable
val ddbInsertFormattedRDD = genreRatingsCount.rdd.map(a => {
  val ddbMap = new HashMap[String, AttributeValue]()

  val col1 = new AttributeValue()
  col1.setS(a.get(0).toString)
  ddbMap.put("col1", col1)

  val item = new DynamoDBItemWritable()
  item.setItem(ddbMap)

  (new Text(""), item)
})

ddbInsertFormattedRDD.saveAsHadoopDataset(ddbConf)
```

This failed with:

```
scala.reflect.internal.Symbols$CyclicReference: illegal cyclic reference involving
  at scala.reflect.internal.Symbols$Symbol$$anonfun$info$3.apply(Symbols.scala:1502)
  at scala.reflect.internal.Symbols$Symbol$$anonfun$info$3.apply(Symbols.scala:1500)
  at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
```
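CyclicReference errors from spark-shell on multi-line closures like this one are often an artifact of the REPL compiling the code line by line. A workaround worth trying (a guess for this exact case, not a confirmed fix) is to compile the whole block as a single unit with :paste:

```
scala> :paste
// Entering paste mode (ctrl-D to finish)

<paste the whole DynamoDB block here, then press Ctrl-D>
```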
dphi5xsq1#

It looks like the Spark UI is not starting. Try launching spark-shell and check that the Spark UI at localhost:4040 is running correctly.
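Note that the log in the question shows port 4040 was already taken and the UI fell back to 4041, so on that cluster the check would presumably be against 4041, e.g.:

```
# Quick check from the driver node that the Spark UI is answering
# (port 4041 per the "Attempting port 4041" line in the log above)
curl -sI http://localhost:4041
```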
