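I read data multiple times with the same options. Is there a way to avoid repeating the common DataFrameReader options, and somehow initialize them separately so they can be reused on each later read?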
metrics_df = spark.read.format("jdbc") \
    .option("driver", self.driver) \
    .option("url", self.url) \
    .option("user", self.username) \
    .option("password", self.password) \
    .load()
1 Answer
zwghvu4y1#
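Define all the common options once on a DataFrameReader (that is, an object of type `<class 'pyspark.sql.readwriter.DataFrameReader'>`), then add the `dbtable` option whenever you want to reuse that reader. Example: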
```python
# Define the shared JDBC options once; nothing is read until .load() is called.
metrics_df_options = (
    spark.read.format("jdbc")
    .option("driver", self.driver)
    .option("url", self.url)
    .option("user", self.username)
    .option("password", self.password)
)

type(metrics_df_options)
# <class 'pyspark.sql.readwriter.DataFrameReader'>

# Configure dbtable and pull data from the RDBMS table
metrics_df_options.option("dbtable", "<table_name>").load().show()
```
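
A variation on the same idea, offered only as a sketch: keep the shared options in a plain dict and build a new reader for every read with `DataFrameReader.options(**...)`. As far as I know, `option()` modifies the reader it is called on and returns that same object, so a stored reader keeps the last `dbtable` you set; building a new reader per call avoids that carry-over. The names `COMMON_JDBC_OPTIONS`, `jdbc_reader` and `read_table`, and the connection values, are hypothetical and not part of the original answer.

```python
from typing import Any, Dict

# Hypothetical connection settings; replace them with your own values.
COMMON_JDBC_OPTIONS: Dict[str, str] = {
    "driver": "org.postgresql.Driver",
    "url": "jdbc:postgresql://localhost:5432/metrics",
    "user": "reporting",
    "password": "secret",
}


def jdbc_reader(spark: Any, common_options: Dict[str, str]):
    """Return a new DataFrameReader preconfigured with the shared options."""
    # spark.read creates a new DataFrameReader on every access, so options set
    # for one table do not carry over to the next call.
    return spark.read.format("jdbc").options(**common_options)


def read_table(spark: Any, common_options: Dict[str, str], table: str):
    """Load one table using the shared options plus a per-call dbtable."""
    return jdbc_reader(spark, common_options).option("dbtable", table).load()
```

With an existing SparkSession named `spark`, a read then becomes `metrics_df = read_table(spark, COMMON_JDBC_OPTIONS, "<table_name>")`.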