I have a Glue job that picks up a CSV file from an S3 bucket and imports the data into a Postgres RDS table. It connects to the database through a JDBC connection. The string/varchar columns are imported correctly, but the numeric columns are not.
Here are the column types of the Postgres RDS table:
Here is the Python Glue script:
def __step_mapping_columns(self):
    # Requires: from awsglue.transforms import ApplyMapping
    # Script generated for node S3 bucket: read the CSV into a DynamicFrame
    dynamicFrame_dept_summary = self.glueContext.create_dynamic_frame.from_options(
        format_options={"quoteChar": '"', "withHeader": True, "separator": ","},
        connection_type="s3",
        format="csv",
        connection_options={
            "paths": [
                ""
            ],
            "recurse": True,
        },
        transformation_ctx="dynamicFrame_dept_summary",
    )
    # Script generated for node ApplyMapping: rename the columns and map them to target types
    mappings = [
        ("PROCESS_MAIN", "string", "process_main", "string"),
        ("PROCESS_CORE", "string", "process_core", "string"),
        ("DC", "string", "dc", "string"),
        ("BAG_SIZE", "string", "bag_size", "string"),
        ("EVENT_30_LOC", "string", "start_time_utc", "string"),
        ("VOLUME", "long", "box_volume", "long"),
        ("MINUTES", "long", "minutes", "long"),
        ("PLAN_MINUTES", "long", "plan_minutes", "long"),
        ("PLAN_RATE", "long", "plan_rate", "long"),
    ]
    applyMapping_dept_summary = ApplyMapping.apply(
        frame=dynamicFrame_dept_summary,
        mappings=mappings,
        transformation_ctx="applyMapping_dept_summary",
    )
    logger.info(mappings)
    return applyMapping_dept_summary
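The snippet above covers only the mapping step. The JDBC write side of the job is not included in the post; for context, it would look roughly like the sketch below, where the catalog connection name, database, and table are placeholder assumptions, not values from the original job.

# Sketch only: connection name, database, and table are hypothetical placeholders.
# Writes the mapped DynamicFrame to the Postgres RDS table through the Glue JDBC connection.
self.glueContext.write_dynamic_frame.from_jdbc_conf(
    frame=applyMapping_dept_summary,
    catalog_connection="my-postgres-connection",  # assumed Glue connection name
    connection_options={
        "dbtable": "dept_summary",  # assumed target table
        "database": "mydb",         # assumed database name
    },
    transformation_ctx="write_dept_summary",
)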
Does anyone know what the problem is?
1 Answer

qyzbxkaa1:
Solved it. I needed to cast these columns to long first, because the DynamicFrame was not sure of their data types.
.resolveChoice(specs=[('VOLUME', 'cast:long')])
.resolveChoice(specs=[('MINUTES', 'cast:long')])
.resolveChoice(specs=[('PLAN_MINUTES', 'cast:long')])
.resolveChoice(specs=[('PLAN_RATE', 'cast:long')])
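Placed in the method from the question, those casts go between the S3 read and ApplyMapping. A minimal sketch, assuming the variable names from the script above; the four specs can also be passed in a single resolveChoice call instead of chaining:

# Cast the ambiguous CSV columns to long before ApplyMapping so the
# DynamicFrame resolves each of them to a single numeric type.
dynamicFrame_dept_summary = dynamicFrame_dept_summary.resolveChoice(
    specs=[
        ("VOLUME", "cast:long"),
        ("MINUTES", "cast:long"),
        ("PLAN_MINUTES", "cast:long"),
        ("PLAN_RATE", "cast:long"),
    ]
)
applyMapping_dept_summary = ApplyMapping.apply(
    frame=dynamicFrame_dept_summary,
    mappings=mappings,
    transformation_ctx="applyMapping_dept_summary",
)

With the cast in place, the ("VOLUME", "long", "box_volume", "long") entries in the mapping match the actual source type, so the numeric columns make it into the Postgres table.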