Convert an S3 object to a PySpark DataFrame with boto3: TypeError: path can be only string, list or RDD

Posted by c90pui9n on 2021-05-29 in Spark


I am reading an S3 (MinIO) object and trying to convert it to a PySpark DataFrame. This is what I tried:

import io
import boto3
from botocore.client import Config
s3 = boto3.client('s3', endpoint_url='Endpoint URL', aws_access_key_id='key', aws_secret_access_key='secret key',
                  config=Config(signature_version='s3v4'), region_name='us-east-1')
obj = s3.get_object(Bucket='bucket_name', Key='file_name')
# csv() is handed a BytesIO object here, which is what raises the TypeError shown below
df = spark.read.option('header','true').option('inferSchema','true').csv(io.BytesIO(obj['Body'].read()))

But I get the following error:

df = spark.read.option('header','true').option('inferSchema','true').csv(io.BytesIO(obj['Body'].read()))
'TypeError: path can be only string, list or RDD\n'])

The same approach works fine when reading into a pandas DataFrame in Python 3, but it does not work for a PySpark DataFrame:

df = pd.read_csv(io.BytesIO(obj['Body'].read()))

Please help.
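
One possible workaround, sketched under assumptions rather than offered as a confirmed answer: the error message says that `csv()` takes a path string, a list of paths, or an RDD, so the bytes returned by boto3 can be decoded, split into rows, and handed over as an RDD of strings. The helper names `decode_csv_lines` and `s3_csv_to_spark_df` are hypothetical, the object is assumed to be UTF-8 text, and `spark` and `s3_client` are assumed to be an existing SparkSession and boto3 client like the ones in the question.

```python
from typing import List


def decode_csv_lines(raw: bytes, encoding: str = "utf-8") -> List[str]:
    """Decode the raw object bytes and split them into CSV rows."""
    # Assumption: no quoted field contains an embedded line break.
    return raw.decode(encoding).splitlines()


def s3_csv_to_spark_df(spark, s3_client, bucket: str, key: str):
    """Hypothetical helper: fetch one S3/MinIO object and load it through an RDD of strings."""
    obj = s3_client.get_object(Bucket=bucket, Key=key)
    lines = decode_csv_lines(obj["Body"].read())
    # The whole object passes through the driver, so this only suits small files.
    rdd = spark.sparkContext.parallelize(lines)
    # DataFrameReader.csv accepts an RDD of strings storing CSV rows.
    return spark.read.csv(rdd, header=True, inferSchema=True)
```

With the objects from the question this would be called as `df = s3_csv_to_spark_df(spark, s3, 'bucket_name', 'file_name')`. For larger files it is probably better to let Spark read the object itself through an `s3a://` path, which assumes the Hadoop S3A connector is installed and configured for the MinIO endpoint.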

python apache-spark pyspark boto3 amazon-s3

Source: https://stackoverflow.com/questions/62226629/convert-s3-object-as-pyspark-dataframe-with-boto3-typeerror-path-can-be-only
