Scala error when connecting to Elasticsearch from Spark Streaming

e0bqpujr  posted on 2021-05-29  in Hadoop

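I get an error when connecting to Elasticsearch. From what I found on Google it looks like an Elasticsearch problem; can anyone help? I am doing Twitter sentiment analysis: I consume the data from Kafka, run sentiment analysis on it, and then create an Elasticsearch index from the results in JSON format.

I am using the following build.sbt: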

++++++

libraryDependencies ++= Seq("org.apache.spark" %% "spark-core" % "1.6.1",
  "org.apache.spark" %% "spark-mllib" % "1.6.1",
  "org.apache.spark" % "spark-streaming_2.10" % "1.6.1",
  "org.apache.spark" % "spark-streaming-kafka_2.10" % "1.6.1",
  "org.apache.spark" % "spark-streaming-twitter_2.11" % "1.5.2",
  "org.twitter4j" % "twitter4j-core" % "4.0.4",
  "org.twitter4j" % "twitter4j-stream" % "4.0.4",
  "org.elasticsearch" % "elasticsearch-spark_2.10" % "2.2.0-m1" % "compile",
  "edu.stanford.nlp" % "stanford-corenlp" % "3.5.2" artifacts (Artifact("stanford-corenlp", "models"), Artifact("stanford-corenlp")))

+++++++++++++++++++

Below is the error log:

    17/04/16 22:37:40 ERROR Executor: Exception in task 1.0 in stage 6.0 (TID 20)

org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot find node with id [AieO64NJRQO-vkTU_K56dA] (is HTTP enabled?) from shard [Shard[state=STARTED, primary=true, node=AieO64NJRQO-vkTU_K56dA, name=0, index=twitter2]] in nodes [[{state=STARTED, primary=true, node=AieO64NJRQO-vkTU_K56dA, relocating_node=null, shard=0, index=twitter2, allocation_id={id=0sjxOAtLQr-V9aGCq6xexg}}]]; layout [[[{state=STARTED, primary=true, node=AieO64NJRQO-vkTU_K56dA, relocating_node=null, shard=0, index=twitter2, allocation_id={id=0sjxOAtLQr-V9aGCq6xexg}}], [{state=STARTED, primary=true, node=AieO64NJRQO-vkTU_K56dA, relocating_node=null, shard=1, index=twitter2, allocation_id={id=9sl-iD3wQaeZyLpSXYhCRg}}], [{state=STARTED, primary=true, node=AieO64NJRQO-vkTU_K56dA, relocating_node=null, shard=2, index=twitter2, allocation_id={id=p4kg_9WWS467ac0KHGRIOQ}}], [{state=STARTED, primary=true, node=AieO64NJRQO-vkTU_K56dA, relocating_node=null, shard=3, index=twitter2, allocation_id={id=DEnyCsYWRZaHDyZexo39UA}}], [{state=STARTED, primary=true, node=AieO64NJRQO-vkTU_K56dA, relocating_node=null, shard=4, index=twitter2, allocation_id={id=Pqn82NKITVSofZrang7sHg}}]]]
hmmo2u0o  #1

Using the following in build.sbt should help (it targets Spark 2.1.1, Scala 2.11, and Elasticsearch 5.5.2):

// Versions
val sparkVersion = "2.1.1"
val ElasticV = "5.5.2"

libraryDependencies ++= Seq(
  "org.elasticsearch" % "elasticsearch-spark-20_2.11" % ElasticV,
  "org.scala-lang" % "scala-library" % "2.11.11",
  "org.apache.spark" % "spark-core_2.11" % sparkVersion,
  "org.apache.spark" % "spark-sql_2.11" % sparkVersion
)

// Elasticsearch *copies* joda-time code and patches it into its own codebase, which causes several issues.
// The rule below shades the conflicting classes (assemblyShadeRules comes from the sbt-assembly plugin).
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("org.joda.time.base.**" -> "org.elasticsearch.joda.time.@1")
    .inLibrary("org.elasticsearch" % "elasticsearch" % ElasticV)
    .inProject
)
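
With those dependencies on the classpath, a write to Elasticsearch might look roughly like the sketch below. It is only an illustration under assumptions, not the asker's actual job: the object name `TweetIndexer`, the `toDoc` helper, the field names, the `tweet` type, and the `localhost:9200` address are all hypothetical, while `twitter2` is the index name from the error log. The connector talks to Elasticsearch over HTTP, so `es.port` should be the HTTP port rather than the transport port.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._ // adds saveToEs to RDDs

object TweetIndexer {
  // Hypothetical document shape; the field names are illustrative only.
  def toDoc(text: String, sentiment: String): Map[String, String] =
    Map("text" -> text, "sentiment" -> sentiment)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("tweet-indexer")
      .setMaster("local[*]")               // assumption: local test run
      .set("es.nodes", "localhost")        // assumption: Elasticsearch host
      .set("es.port", "9200")              // assumption: default HTTP port
      .set("es.index.auto.create", "true") // create the index if it is missing

    val sc = new SparkContext(conf)
    try {
      val docs = sc.parallelize(Seq(
        toDoc("great day", "POSITIVE"),
        toDoc("terrible service", "NEGATIVE")))
      // Resource is "index/type"; the connector serializes each Map to JSON.
      docs.saveToEs("twitter2/tweet")
    } finally {
      sc.stop()
    }
  }
}
```

In a streaming job the same `saveToEs` call can be made on each batch inside `foreachRDD`; this sketch uses a plain RDD only to keep the example short.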
