I am using Alibaba Cloud OSS to store the data processed by my Spark script, but I cannot upload the data to the storage. I know that for S3 this is handled by including a few jars, but I don't know how to do the equivalent for the Alibaba OSS service.
from pyspark import SparkConf
from pyspark.sql import SparkSession

# Register the filesystem implementation behind the oss:// scheme
conf = SparkConf()
conf.set("spark.hadoop.fs.oss.impl", "com.aliyun.fs.oss.nat.NativeOssFileSystem")

# Pass the SparkConf to the builder (the original code created it but never used it)
spark = SparkSession.builder.config(conf=conf) \
    .config("spark.jars", "/home/username/mysql-connector-java-5.1.38.jar") \
    .master("local").appName("PySpark_MySQL_test").getOrCreate()

# Read the source rows from MySQL over JDBC
wine_df = spark.read.format("jdbc").option("url", "jdbc:mysql://db.com:3306/service_db") \
    .option("driver", "com.mysql.jdbc.Driver").option("query", "select * from transactions limit 1000") \
    .option("user", "***").option("password", "***").load()

# Write the rows back out to OSS as plain text
outputPath = "oss://Bucket_name"
rdd = wine_df.rdd.map(list)
rdd.saveAsTextFile(outputPath)
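For reference, the OSS equivalent of the S3 jars is a Hadoop filesystem connector plus endpoint and credential properties. Below is a minimal sketch assuming the Apache hadoop-aliyun connector (rather than the EMR NativeOssFileSystem class used above); the endpoint, access keys, jar paths, and bucket name are all placeholders to substitute with your own values. Note also that OSS bucket names may only contain lowercase letters, digits, and hyphens, so a literal path like oss://Bucket_name would be rejected by the service.

from pyspark import SparkConf
from pyspark.sql import SparkSession

conf = SparkConf()
# hadoop-aliyun provides org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem
conf.set("spark.hadoop.fs.oss.impl",
         "org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem")
# Region endpoint of the bucket (placeholder value)
conf.set("spark.hadoop.fs.oss.endpoint", "oss-cn-hangzhou.aliyuncs.com")
# RAM user credentials (placeholders)
conf.set("spark.hadoop.fs.oss.accessKeyId", "<your-access-key-id>")
conf.set("spark.hadoop.fs.oss.accessKeySecret", "<your-access-key-secret>")
# The connector and the OSS SDK must be on the classpath,
# e.g. hadoop-aliyun-x.y.z.jar and aliyun-sdk-oss-x.y.z.jar (paths are placeholders)
conf.set("spark.jars", "/path/to/hadoop-aliyun.jar,/path/to/aliyun-sdk-oss.jar")

spark = SparkSession.builder.config(conf=conf) \
    .master("local").appName("oss_write_sketch").getOrCreate()

# With the filesystem registered, a DataFrame write to oss:// should work
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "val"])
df.write.mode("overwrite").csv("oss://your-bucket/output")

A DataFrame writer such as df.write.csv(...) is usually preferable to converting to an RDD and calling saveAsTextFile, since it keeps the schema and lets Spark handle serialization.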