I am trying to insert 2 million items into DynamoDB (WCU = 40000), but when I run it through a Spark map it throws an error.
```python
%livy.pyspark
import shutil
from typing import Text, List
from pyspark.sql import SparkSession, DataFrame
import boto3
from urllib.parse import urlparse
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource('dynamodb', region_name="us-east-1")
table = dynamodb.Table("<dynamboDB>")

df = spark.read.parquet("s3 path").limit(10)
df.rdd.map(lambda row: table.put_item(Item=row.asDict()))
```
The error:
```
Traceback (most recent call last):
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 205, in __repr__
    return self._jrdd.toString()
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 2532, in _jrdd
    self._jrdd_deserializer, profiler)
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 2434, in _wrap_function
    pickled_command, broadcast_vars, env, includes = _prepare_for_python_RDD(sc, command)
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 2420, in _prepare_for_python_RDD
    pickled_command = ser.dumps(command)
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 607, in dumps
    raise pickle.PicklingError(msg)
_pickle.PicklingError: Could not serialize object: TypeError: can't pickle SSLContext objects
```
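The traceback points at serialization rather than DynamoDB itself: `table` is a boto3 resource created on the driver, and boto3 clients hold an SSLContext that pickle cannot serialize, so Spark fails as soon as it tries to ship the lambda's closure to the executors (here, triggered by the notebook calling `__repr__` on the lazy RDD). A minimal sketch of one common workaround, keeping the question's placeholder table name and S3 path, is to construct the resource inside a `foreachPartition` function so it is created on each executor instead of pickled:

```python
import boto3
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

def write_partition(rows):
    # Runs on the executor: the boto3 resource is created here,
    # so it is never pickled and shipped from the driver.
    dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
    table = dynamodb.Table("<dynamboDB>")  # placeholder from the question
    # batch_writer buffers puts into BatchWriteItem calls and
    # automatically resends unprocessed items.
    with table.batch_writer() as writer:
        for row in rows:
            writer.put_item(Item=row.asDict())

df = spark.read.parquet("s3 path")  # placeholder from the question
# foreachPartition is an action, so the writes actually execute;
# rdd.map alone is lazy and never runs the lambda.
df.rdd.foreachPartition(write_partition)
```

One additional caveat: DynamoDB's `put_item` rejects Python `float` values, so numeric columns coming out of Parquet may need converting to `decimal.Decimal` before writing.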