Connecting to MongoDB from Azure

wyyhbhjk  posted on 2022-12-18  in  Go

I am trying to connect Azure Databricks to MongoDB, but I get an error message that I cannot resolve.
This is the error I receive:

com.mongodb.MongoTimeoutException: Timed out after 30000 ms while waiting to connect. Client view of cluster state is {type=UNKNOWN, servers=[{address=sopt-bo-halo-01.northeurope.cloudapp.azure.com:27017, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketWriteException: Exception sending message}, caused by {javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target}, caused by {sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target}, caused by {sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target}}, {address=sopt-bo-halo-03.northeurope.cloudapp.azure.com:27017, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketWriteException: Exception sending message}, caused by {javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target}, caused by {sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target}, caused by {sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target}}, {address=sopt-bo-halo-02.northeurope.cloudapp.azure.com:27017, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketWriteException: Exception sending message}, caused by {javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target}, caused by {sun.security.validator.ValidatorException: PKIX path building 
failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target}, caused by {sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target}}]
    from pyspark.sql import SparkSession

    database = "prod-backoffice"
    collection = "AmazonRegion"
    connectionString = (
        'mongodb://<Username>:<Password>@hostN/defaultdb'
        '?ssl=true&readPreference=primary&maxIdleTimeMS=60000&connectTimeoutMS=10000'
        '&authSource=DBNAME&authMechanism=SCRAM-SHA-1'
        '&tlsAllowInvalidCertificates=true&tlsAllowInvalidHostnames=true'
        '&sslAllowConnectionsWithoutCertificates=true&sslInvalidHostNameAllowed=true'
    )
    spark = SparkSession.builder\
        .appName("MongoSparkConnectorIntro")\
        .config('spark.mongodb.input.uri', connectionString)\
        .config('spark.mongodb.output.uri', connectionString)\
        .config('spark.jars.packages', 'org.mongodb.spark:mongo-spark-connector_2.12:3.0.1')\
        .getOrCreate()
    df = spark.read\
        .format("com.mongodb.spark.sql.DefaultSource")\
        .option("uri", connectionString)\
        .option("database", database)\
        .option("collection", collection)\
        .load()

y1aodyip  #1

Connecting MongoDB Atlas with Databricks

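1. Connect to the database
Allow the Databricks cluster to reach the Atlas cluster by adding the external IP addresses of the Databricks cluster nodes to the IP whitelist in Atlas.

In MongoDB Atlas, go to Network Access and add the Databricks cluster IP addresses.

2. Configure the Databricks cluster with the MongoDB connection URI

  • Get the MongoDB connection URI: click the cluster you created in the MongoDB Atlas UI.
  • Click the Connect button.
  • Click Connect Your Application.

3. Make sure your Scala and Spark versions match the MongoDB connector you configure in Databricks. For example, the _2.12 in mongo-spark-connector_2.12:3.0.1 is the Scala version the connector was built for.

4. Copy the generated connection string. It should look like mongodb+srv://<username>:<password>@Firstdatabase-wlcof.azure.mongodb.net/test?retryWrites=true
5. Replace <password> and the Firstdatabase name with your own password and database name.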
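If the password contains characters such as @, : or /, it has to be percent-encoded before it goes into the URI. Below is a minimal sketch of that substitution, assuming you build the string in a Python notebook. The helper name build_mongo_uri and the host cluster0.example.net are made up for illustration and are not part of Atlas or the connector.

```python
from urllib.parse import quote_plus


def build_mongo_uri(user, password, host, database, options="retryWrites=true"):
    """Build a mongodb+srv URI with percent-encoded credentials.

    Hypothetical helper: the name and the default options are assumptions.
    """
    return (
        f"mongodb+srv://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}/{database}?{options}"
    )


# Example with a made-up host and a password that needs escaping.
uri = build_mongo_uri("user", "p@ss:w/rd", "cluster0.example.net", "test")
print(uri)
```

Only the user name and password are encoded here; the host, database and options are passed through unchanged.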

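Configuration in Databricks
Method 1

In your Databricks cluster, open the Configuration tab and click the Edit button, then under Advanced Options select the Spark config tab and update the Spark config with the connection string. Use the following format, one key and value per line separated by a space:

spark.mongodb.output.uri <connection-string>

spark.mongodb.input.uri <connection-string>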
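
After restarting the cluster you may want to confirm from a notebook that the values were picked up, without printing the password. A small sketch, assuming a Databricks notebook where the spark session is predefined; redact_uri is a hypothetical helper, not part of Spark or the connector.

```python
from urllib.parse import urlsplit


def redact_uri(uri):
    """Return the URI with the password part replaced by '***'."""
    parts = urlsplit(uri)
    if parts.password is None:
        return uri  # nothing to hide
    netloc = parts.netloc.replace(":" + parts.password + "@", ":***@", 1)
    return parts._replace(netloc=netloc).geturl()


# In a Databricks notebook (assumption: `spark` already exists there):
# print(redact_uri(spark.conf.get("spark.mongodb.input.uri")))
# print(redact_uri(spark.conf.get("spark.mongodb.output.uri")))
```

If spark.conf.get raises an error for either key, the cluster-level setting was probably not saved or the cluster was not restarted.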

Method 2

Configure the settings directly in a Python notebook with the code below.

    from pyspark.sql import SparkSession

    database = "<database_name>"  # your database name
    collection = "millionsongs"  # your collection name

    # Paste your own connection string here
    connectionString = 'mongodb+srv://user:<password>@cluster0.9rvsi.mongodb.net/<database>?retryWrites=true&w=majority'

    spark = SparkSession.builder\
        .config('spark.mongodb.input.uri', connectionString)\
        .config('spark.mongodb.output.uri', connectionString)\
        .config('spark.jars.packages', 'org.mongodb.spark:mongo-spark-connector_2.12:3.0.1')\
        .getOrCreate()

    # Reading from MongoDB
    df = spark.read\
        .format("com.mongodb.spark.sql.DefaultSource")\
        .option("uri", connectionString)\
        .option("database", database)\
        .option("collection", collection)\
        .load()
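
Since the output URI is configured as well, writing back goes through the same data source. This is only a sketch, assuming the 3.x connector shown above. The helper names mongo_options and append_to_collection and the target collection millionsongs_copy are hypothetical.

```python
def mongo_options(uri, database, collection):
    """Collect the connector options used for both reads and writes."""
    return {"uri": uri, "database": database, "collection": collection}


def append_to_collection(df, uri, database, collection):
    """Append the rows of a Spark DataFrame to a MongoDB collection."""
    (
        df.write
        .format("com.mongodb.spark.sql.DefaultSource")
        .mode("append")
        .options(**mongo_options(uri, database, collection))
        .save()
    )


# Usage in the notebook, after df has been loaded (hypothetical target name):
# append_to_collection(df.limit(10), connectionString, database, "millionsongs_copy")
```

Keeping the options in one dictionary means the read and the write cannot drift apart on the URI or database name.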
