Cannot run a Kafka stream with PySpark in a Jupyter notebook

ryevplcw · posted 2021-06-08 in Kafka

I am trying to receive a stream of Kafka messages in a PySpark application inside a Jupyter notebook. I added the required Spark-Kafka package to PYSPARK_SUBMIT_ARGS from within the notebook itself, but when I try to listen for Kafka messages I get the following error:

Spark Streaming's Kafka libraries not found in class path. Try one of the following.

1. Include the Kafka library and its dependencies with in the
   spark-submit command as

   $ bin/spark-submit --packages org.apache.spark:spark-streaming-kafka-0-8:2.3.1 ...

2. Download the JAR of the artifact from Maven Central http://search.maven.org/,
   Group Id = org.apache.spark, Artifact Id = spark-streaming-kafka-0-8-assembly,
   Version = 2.3.1.
   Then, include the jar in the spark-submit command as

   $ bin/spark-submit --jars <spark-streaming-kafka-0-8-assembly.jar>
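
I notice the error names version 2.3.1 while my PYSPARK_SUBMIT_ARGS (below) requests 2.0.2, so the versions may simply be mismatched. A minimal sketch of the corrected setting, assuming the installed Spark is 2.3.1 built against Scala 2.11 (both assumptions taken from the error text):

  import os

  # Assumption: Spark 2.3.1 / Scala 2.11, matching the version named in the error.
  # This must be set before the first pyspark import, or the JVM starts without the package.
  os.environ['PYSPARK_SUBMIT_ARGS'] = (
      '--packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.3.1 '
      'pyspark-shell'
  )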

My code is:

  import os

  # Add the Kafka streaming package before the first pyspark import.
  os.environ['PYSPARK_SUBMIT_ARGS'] = (
      '--packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.0.2 '
      'pyspark-shell'
  )

  from pyspark import SparkContext
  from pyspark.streaming import StreamingContext
  from pyspark.streaming.kafka import KafkaUtils

  sc = SparkContext(master='local[2]')   # 'local[2]', not '--local[2]'
  ssc = StreamingContext(sc, 3)          # 3-second batch interval
  # createStream(ssc, zkQuorum, groupId, topics); topics maps name -> partition count (int)
  kk = KafkaUtils.createStream(ssc, 'localhost:2181', 'demo', {'hello': 1})
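
For completeness, the stream would also need to be started before anything is received; a minimal sketch of consuming it (same topic and group as above):

  # Print the message values from each 3-second batch.
  kk.map(lambda kv: kv[1]).pprint()

  ssc.start()             # launch the Kafka receiver
  ssc.awaitTermination()  # block until the streaming context is stopped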

No answers yet.