PySpark Kafka streaming on HDInsight - TypeError: 'JavaPackage' object is not callable

Posted by xj3cbfub on 2021-05-18 in Spark

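This is an old question, but none of the existing answers solves my problem. I use Azure HDInsight for my Spark cluster and Event Hubs as the Kafka message broker.
For example, I used exactly the same approach as in this answer, but I still get the error: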

    'JavaPackage' object is not callable
    Traceback (most recent call last):
      File "/usr/hdp/current/spark2-client/python/pyspark/streaming/kafka.py", line 78, in createStream
        helper = KafkaUtils._get_helper(ssc._sc)
      File "/usr/hdp/current/spark2-client/python/pyspark/streaming/kafka.py", line 217, in _get_helper
        return sc._jvm.org.apache.spark.streaming.kafka.KafkaUtilsPythonHelper()
    TypeError: 'JavaPackage' object is not callable

My code is:

    import os
    sparkVersion = '2.3.2'  # update this accordingly
    os.environ['PYSPARK_SUBMIT_ARGS'] = '--packages org.apache.spark:spark-streaming-kafka-0-8:{} pyspark-shell'.format(sparkVersion)
    # import Spark core
    from pyspark.sql import SparkSession
    from pyspark.streaming import StreamingContext
    # import extra packages
    from pyspark.streaming.kafka import KafkaUtils
    ssc = StreamingContext(sc, 60)
    kafkaStream = KafkaUtils.createStream(ssc, 'hellos.servicebus.windows.net:9093', 'test-group', {'twitter':1})
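
One thing worth checking: py4j typically returns a `JavaPackage` when a class name cannot be resolved on the JVM classpath, which suggests the Kafka connector jar was never loaded. The Maven coordinate in the snippet above has no Scala version suffix, whereas the published artifact is named `spark-streaming-kafka-0-8_<scala version>`. The sketch below builds the submit arguments with that suffix. The helper name `kafka_submit_args` is hypothetical, and the default Scala version `2.11` is an assumption based on the usual Spark 2.3.x builds; verify it against the cluster.

```python
import os


def kafka_submit_args(spark_version, scala_version="2.11"):
    """Build a PYSPARK_SUBMIT_ARGS value for the Kafka 0.8 streaming connector.

    scala_version defaults to 2.11, which is an assumption for Spark 2.3.x.
    """
    coordinate = "org.apache.spark:spark-streaming-kafka-0-8_{}:{}".format(
        scala_version, spark_version
    )
    return "--packages {} pyspark-shell".format(coordinate)


# This must run before the SparkContext (and its JVM) is created;
# PySpark reads the variable only when it launches the JVM gateway.
os.environ["PYSPARK_SUBMIT_ARGS"] = kafka_submit_args("2.3.2")
print(os.environ["PYSPARK_SUBMIT_ARGS"])
```

If `sc` already exists when the variable is set, as appears to be the case in the snippet above, the setting is probably read too late to have any effect, and the package would have to be supplied when the session is started instead.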

This is my Spark version:

    sc.version
    '2.3.2.2.6.5.3026-7'
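
The version string reported here is longer than the `2.3.2` used for the package coordinate. A minimal sketch of deriving one from the other follows, assuming the first three dot-separated components are the upstream Apache Spark version and the rest is a vendor build suffix. The function name `base_spark_version` is hypothetical.

```python
def base_spark_version(full_version):
    """Return the first three dot-separated components of a version string.

    Assumes the leading 'major.minor.patch' is the upstream Spark version
    and anything after it is a distribution-specific build suffix.
    """
    return ".".join(full_version.split(".")[:3])


print(base_spark_version("2.3.2.2.6.5.3026-7"))
```

Passing `sc.version` through such a helper keeps the `--packages` coordinate aligned with the running cluster instead of relying on a hard-coded value.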
