python-3.x: How to get the correct information when extracting a specific key's value from nested JSON

v09wglhw  posted on 2023-02-26  in Python

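I want to extract each task's name and its corresponding config into a new variable.
The code I'm sharing doesn't give the output I want. It extracts some of the information, but not all of the required details.
Here is the JSON: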

old = {
        "tasks": [
            {
                "task_group_id": "Task_group_1",
                "branch": [
                    {
                        "task_id": "Task_Name_1",
                        "code_file_path": "tasks/base_creation/final_base_logic.hql",
                        "language": "hive",
                        "config": {
                            "k1": "v1",
                            "Q1":"W1"
                        },
                        "sequence": 1,
                        "condition": "in_start_date in range [2021-10-01 , 2023-11-04]"
                    }
                ],
                "default": {
                    "task_id": "Task_group_1_default",
                    "code_file_path": "tasks/base_creation/default_base_logic.hql",
                    "language": "hive",
                    "config": {}
                }
            },
            {
                "task_group_id": "Task_group_2",
                "branch": [
                    {
                        "task_id": "Task_Name_2",
                        "code_file_path": "tasks/variables_creation/final_cas_logic.py",
                        "language": "pyspark",
                        "config": {
                            "k2": "v2"
                        },
                        "sequence": 1,
                        "condition": "in_start_date in range [2022-02-01 , 2023-11-04]"
                    },
                    {
                        "task_id": "Task_Name_3",
                        "code_file_path": "tasks/variables_creation/final_sor_logic.py",
                        "language": "pyspark",
                        "config": {
                            "k3": "v3"
                        },
                        "sequence": 2,
                        "condition": "in_start_date in range [2021-10-01 , 2022-01-31]"
                    }
                ],
                "default": {
                    "task_id": "Task_group_2_default",
                    "code_file_path": "tasks/variables_creation/default_variables_logic.py",
                    "language": "pyspark",
                    "config": {}
                }
            }
        ],
        "dependencies": " ['task_group_id_01_Name >> task_group_id_02_Name']"
    }

Here is my code for extracting the information:

o_mod = []
for grp in range(len(old['tasks'])):
    for task_id in range(len(old['tasks'][grp]['branch'])):
        o_mod.append({})
        o_mod[grp]['task_id'] = old['tasks'][grp]['branch'][task_id]['task_id']
        o_mod[grp]['config'] = old['tasks'][grp]['branch'][task_id]['config']
            
print(o_mod)

This is the incorrect output it produces:

[{'task_id': 'Task_Name_1', 'config': {'k1': 'v1', 'Q1': 'W1'}},
 {'task_id': 'Task_Name_3', 'config': {'k3': 'v3'}},
 {}]

I would like the output to look like this (the correct output):

[{'task_id': 'Task_Name_1', 'config': {'k1': 'v1', 'Q1': 'W1'}},
 {'task_id': 'Task_Name_2', 'config': {'k2': 'v2'}},
 {'task_id': 'Task_Name_3', 'config': {'k3': 'v3'}}]

vfwfrxfs 1#

You can use a nested list comprehension over tasks and branch:

o_mod = [ { 'task_id' : b['task_id'], 'config' : b['config'] } for t in old['tasks'] for b in t['branch'] ]

Output:

[
    {
        "task_id": "Task_Name_1",
        "config": {
            "k1": "v1",
            "Q1": "W1"
        }
    },
    {
        "task_id": "Task_Name_2",
        "config": {
            "k2": "v2"
        }
    },
    {
        "task_id": "Task_Name_3",
        "config": {
            "k3": "v3"
        }
    }
]
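
As for why the loop in the question drops Task_Name_2: it appends one empty dict per branch but then writes to o_mod[grp], the group index. For the second group both branches write to o_mod[1], so Task_Name_3 overwrites Task_Name_2 and the last appended dict stays empty. Below is a minimal sketch of a fix that keeps the original loop shape. It assumes a trimmed-down copy of old that holds only the keys the loop actually reads.

```python
# Trimmed-down copy of the question's data (assumption: only the keys
# the loop reads are kept; the real `old` has more fields per branch).
old = {
    "tasks": [
        {"branch": [{"task_id": "Task_Name_1", "config": {"k1": "v1", "Q1": "W1"}}]},
        {"branch": [
            {"task_id": "Task_Name_2", "config": {"k2": "v2"}},
            {"task_id": "Task_Name_3", "config": {"k3": "v3"}},
        ]},
    ]
}

o_mod = []
for grp in range(len(old["tasks"])):
    for task_id in range(len(old["tasks"][grp]["branch"])):
        branch = old["tasks"][grp]["branch"][task_id]
        # Build the complete entry first and append it, so no entry is
        # addressed by the group index and none can be overwritten.
        o_mod.append({"task_id": branch["task_id"], "config": branch["config"]})

print(o_mod)
```

The list comprehension above does the same thing in one expression; this version only shows where the original indexing went wrong.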

sxpgvts3 2#

If you prefer, there is also a multi-line version.

new_list = []
task_list = old["tasks"]
for task in task_list:
    branch_list = task["branch"]
    for branch in branch_list:
        new_list.append({"task_id":branch["task_id"],"config":branch["config"]})


print(new_list)

Output:
[{'task_id': 'Task_Name_1', 'config': {'k1': 'v1', 'Q1': 'W1'}}, {'task_id': 'Task_Name_2', 'config': {'k2': 'v2'}}, {'task_id': 'Task_Name_3', 'config': {'k3': 'v3'}}]
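
If the same extraction is needed in several places, the loop could be wrapped in a small function. The sketch below is only one way to do it: the name extract_task_configs and the include_default flag are hypothetical, and treating a missing "branch" or "config" key as empty is an assumption rather than something the question asks for.

```python
def extract_task_configs(data, include_default=False):
    """Collect {'task_id', 'config'} for every branch of every task group.

    Hypothetical helper: the name and the include_default flag are
    illustrative and do not come from the original question.
    """
    result = []
    for group in data.get("tasks", []):
        # Assumption: a group without a "branch" key has no branches.
        entries = list(group.get("branch", []))
        if include_default and "default" in group:
            entries.append(group["default"])
        for entry in entries:
            result.append({
                "task_id": entry["task_id"],
                # Assumption: a missing "config" is treated as empty.
                "config": entry.get("config", {}),
            })
    return result


# Small sample in the same shape as the question's `old` dict.
sample = {
    "tasks": [
        {
            "task_group_id": "Task_group_1",
            "branch": [{"task_id": "Task_Name_1", "config": {"k1": "v1"}}],
            "default": {"task_id": "Task_group_1_default", "config": {}},
        }
    ]
}

branches_only = extract_task_configs(sample)
with_default = extract_task_configs(sample, include_default=True)
print(branches_only)
print(with_default)
```

Calling extract_task_configs(old) on the full dict from the question should give the same three entries as the loop above.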
