我想创建一个通过亚马逊电子病历气流触发电子病历集群。emr集群出现在amazon emr的ui中,但出现一个错误:“专有网络/子网配置无效:需要子网:指定的示例类型m5.xlarge只能在专有网络中使用”
下面是中的代码段和配置详细信息 json
气流脚本中使用的此任务的格式。
我的问题是如何将专有网络和子网的信息(id码)合并到json中(如果可能的话)?没有明确的例子。
提示:已创建网络和ec2子网
JOB_FLOW_OVERRIDES = {
"Name": "sentiment_analysis",
"ReleaseLabel": "emr-5.33.0",
"Applications": [{"Name": "Hadoop"}, {"Name": "Spark"}], # We want our EMR cluster to have HDFS and Spark
"Configurations": [
{
"Classification": "spark-env",
"Configurations": [
{
"Classification": "export",
"Properties": {"PYSPARK_PYTHON": "/usr/bin/python3"}, # by default EMR uses py2, change it to py3
}
],
}
],
"Instances": {
"InstanceGroups": [
{
"Name": "Master node",
"Market": "SPOT",
"InstanceRole": "MASTER",
"InstanceType": "m5.xlarge",
"InstanceCount": 1,
},
{
"Name": "Core - 2",
"Market": "SPOT", # Spot instances are a "use as available" instances
"InstanceRole": "CORE",
"InstanceType": "m5.xlarge",
"InstanceCount": 2,
},
],
"KeepJobFlowAliveWhenNoSteps": True,
"TerminationProtected": False, # this lets us programmatically terminate the cluster
},
"JobFlowRole": "EMR_EC2_DefaultRole",
"ServiceRole": "EMR_DefaultRole",
}
create_emr_cluster = EmrCreateJobFlowOperator(
task_id="create_emr_cluster",
job_flow_overrides=JOB_FLOW_OVERRIDES,
aws_conn_id="aws_default",
emr_conn_id="emr_default",
dag=dag,
)
暂无答案!
目前还没有任何答案,快来回答吧!