Loading data from a MySQL database with PySpark in Python 3

68de4m5k  posted on 2021-06-21 in MySQL

I am trying to load a table from a MySQL database using PySpark. I wrote the following code:

from pyspark.sql import SparkSession
from pyspark.sql import SQLContext

hostname='localhost'
jdbcPort=3306
dbname='db'
username='user'
password='password'

# jdbc_url = "jdbc:mysql://{0}:{1}/{2}".format(hostname, jdbcPort, dbname)

url="jdbc:mysql://"

# For SQLServer, pass in the "driver" option

# driverClass = "com.microsoft.sqlserver.jdbc.SQLServerDriver"

# Add "driver" : driverClass

connectionProperties = {
  "user" : username,
  "password" : password
}
pushdown_query = "select * from table LIMIT 10;"
df = spark.read.jdbc(url=url, dbtable=pushdown_query, properties=connectionProperties)

# sqlContext=SQLContext(sc)

# df=sqlContext.read.jdbc(url=url, table=pushdown_query, properties=properties)

display(df)

But I get the following error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-21-70890f1cf807> in <module>()
     15 }
     16 pushdown_query = "select * from table LIMIT 10;"
---> 17 df = spark.read.jdbc(url=url, dbtable=pushdown_query, properties=connectionProperties)
     18 #sqlContext=SQLContext(sc)
     19 #df=sqlContext.read.jdbc(url=url, table=pushdown_query, properties=properties)

AttributeError: 'property' object has no attribute 'jdbc'

Can anyone help me resolve this error?
Thanks

sycxhyv7 #1

Try the code below to read data from MySQL.

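from pyspark.sql import SparkSession

# spark must be a SparkSession instance; calling .read on the class itself
# gives the "'property' object has no attribute 'jdbc'" error.
spark = SparkSession.builder.appName("mysql-load").getOrCreate()

# Connection values taken from the question; replace them with your own.
hostname = "localhost"
dbname = "db"
jdbcPort = 3306
username = "user"
password = "password"
jdbc_url = "jdbc:mysql://{0}:{1}/{2}".format(hostname, jdbcPort, dbname)

connectionProperties = {
  "user" : username,
  "password" : password
}

# jdbc() names this parameter `table`, not `dbtable`. A query has to be passed
# as an aliased subquery, without a trailing semicolon.
query = "(select * from table_name) AS t"
df = spark.read.jdbc(url=jdbc_url, table=query, properties=connectionProperties)
df.show()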
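
As for the error itself: the message `'property' object has no attribute 'jdbc'` most likely means that `spark` was bound to the `SparkSession` class rather than to an instance, so `spark.read` returned the bare property object instead of a reader. Here is a small pure-Python sketch of that behaviour. `Session` and `Reader` are hypothetical stand-ins I made up for illustration, not PySpark classes, so it runs without Spark installed.

```python
class Reader:
    """Stand-in for a reader object (hypothetical, not PySpark's DataFrameReader)."""

    def jdbc(self, url):
        return "reading from " + url


class Session:
    """Stand-in for a session class that exposes `read` as a property."""

    @property
    def read(self):
        return Reader()


def try_read(session):
    """Return the result of session.read.jdbc(...), or the error text on failure."""
    try:
        return session.read.jdbc("jdbc:mysql://localhost:3306/db")
    except AttributeError as exc:
        return str(exc)


# On the class, `read` is the property object itself, so `.jdbc` is missing.
on_class = try_read(Session)
# On an instance, the property is evaluated and yields a Reader.
on_instance = try_read(Session())

print(on_class)
print(on_instance)
```

If that matches your situation, building the session with `SparkSession.builder.getOrCreate()` before calling `spark.read` should make the AttributeError go away.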

Let me know if this helps.
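
One more note on the `dbtable` name from the question: it is an option key for the `spark.read.format("jdbc")` form of the reader, whereas the `jdbc()` method calls the same argument `table`. Below is a rough sketch of that alternative form. `mysql_jdbc_options` is a hypothetical helper of my own, not part of PySpark, and the connection values are the placeholders from the question. It also strips a trailing semicolon and wraps the query as an aliased subquery, because the value ends up inside a FROM clause.

```python
def mysql_jdbc_options(host, port, database, user, password, query=None, table=None):
    """Build the option dict for spark.read.format("jdbc") (hypothetical helper)."""
    if (query is None) == (table is None):
        raise ValueError("pass exactly one of query or table")
    if query is not None:
        # Drop a trailing semicolon and wrap the query as an aliased subquery.
        source = "({0}) AS subquery".format(query.strip().rstrip(";").strip())
    else:
        source = table
    return {
        "url": "jdbc:mysql://{0}:{1}/{2}".format(host, port, database),
        "dbtable": source,
        "user": user,
        "password": password,
    }


# Placeholder connection values from the question; replace with your own.
options = mysql_jdbc_options(
    "localhost", 3306, "db", "user", "password",
    query="select * from table_name LIMIT 10;",
)
print(options["url"])
print(options["dbtable"])

# With a SparkSession instance named `spark` and the MySQL JDBC driver available:
# df = spark.read.format("jdbc").options(**options).load()
```

The last line is left commented out because it needs a running Spark session and the MySQL JDBC driver jar on the classpath, which I am assuming you have set up.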
