Error connecting PySpark to MongoDB | java.lang.NoClassDefFoundError: org/bson/codecs/JsonObjectCodecProvider

Posted by evrscar2 on 2021-07-09 in Spark

I am trying to connect to MongoDB from PySpark and read some data:

    from pyspark.sql import SparkSession

    # Build a session configured with the MongoDB input/output URIs
    # and pull the connector package in via spark.jars.packages
    my_spark = SparkSession \
        .builder \
        .appName("MyApp") \
        .config("spark.mongodb.input.uri", "mongodb://127.0.0.1/my-db.my-coll") \
        .config("spark.mongodb.output.uri", "mongodb://127.0.0.1/my-db.my-coll") \
        .config("spark.jars.packages", "org.mongodb.spark:mongo-spark-connector_2.12:2.4.0") \
        .getOrCreate()

    # Load the collection as a DataFrame and print its inferred schema
    df = my_spark.read.format("mongo").load()
    df.printSchema()

But I get the following error:

    An error occurred while calling o42.load.
    : java.lang.NoClassDefFoundError: org/bson/codecs/JsonObjectCodecProvider

The Spark version is 3.1.1.
I am following these resources:
https://docs.mongodb.com/spark-connector/current/python-api/
https://www.mongodb.com/blog/post/getting-started-with-mongodb-pyspark-and-jupyter-notebook
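
One thing that may be worth checking with the setup above is whether the connector version in `spark.jars.packages` (2.4.0) lines up with the Spark version in use (3.1.1). The sketch below is only a rough sanity check, not a confirmed diagnosis of this error. It assumes that, for the 2.x and 3.x connector release lines, the connector's major version is meant to match Spark's major version; that assumption should be verified against MongoDB's own compatibility documentation, and it does not hold for later connector lines. The function name `connector_matches_spark` is hypothetical.

```python
def connector_matches_spark(package: str, spark_version: str) -> bool:
    """Rough heuristic (an assumption, not an official rule): compare the
    major version of the connector coordinate with Spark's major version."""
    # A Maven coordinate has the form "group:artifact:version"
    group, artifact, connector_version = package.split(":")
    connector_major = connector_version.split(".")[0]
    spark_major = spark_version.split(".")[0]
    return connector_major == spark_major


# The coordinate and Spark version taken from the question
package = "org.mongodb.spark:mongo-spark-connector_2.12:2.4.0"
print(connector_matches_spark(package, "3.1.1"))
```

If the check reports a mismatch, one possible next step is to try a connector release from the line that targets the installed Spark version and see whether the `NoClassDefFoundError` persists.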
