Creating a dictionary within an existing dictionary while keeping the keys (Python)

Posted by vlf7wbxs on 2023-02-28 in Python

I have this dictionary:

Dialogues = {
   "dialogue_id": "000001", 
   "dialogue_turns": [
      { "turn_number": 0,
        "interlocutor_id": "0001",
        "turn_text": "Hi, how are you?" },
      { "turn_number": 1,
        "interlocutor_id": "0002",
        "turn_text": "Hi, I'm fine thanks. And you?" },
      { "turn_number": 2,
        "interlocutor_id": "0001",
        "turn_text": "I am good too, are you coming to the class today" },
      { "turn_number": 3,
        "interlocutor_id": "0002",
        "turn_text": "Yes, see you soon.bye" },
      { "turn_number": 4,
        "interlocutor_id": "0001",
        "turn_text": "bye" }
   ]
}

I want to add a nested dictionary to it, built by computing some counts over the data, like this:

Dialogues_analyzed = {
   "dialogue_id": "000001", 
   "dialogue_analysis": [
      {"interlocutor_id": "0001",
       "total_turns":"3",
       "total_words":"number of words in all turns of id=1"},
      {"interlocutor_id": "0002",
       "total_turns":"2",
       "total_words":"number of words in all turns of id=2"}
   ]
}

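How can I get this output without nested for loops? I tried creating a new dictionary and then merging it into the main dictionary, but I lost the keys of the new dictionary.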

2guxujil #1

If you have

# [ just copied from your example ]
Dialogues = {'dialogue_id': '000001', 'dialogue_turns': [
  {'turn_number': 0, 'interlocutor_id': '0001', 'turn_text': 'Hi, how are you?'},
  {'turn_number': 1, 'interlocutor_id': '0002', 'turn_text': "Hi, I'm fine thanks. And you?"},
  {'turn_number': 2, 'interlocutor_id': '0001', 'turn_text': 'I am good too, are you coming to the class today'},
  {'turn_number': 3, 'interlocutor_id': '0002', 'turn_text': 'Yes, see you soon.bye'},
  {'turn_number': 4, 'interlocutor_id': '0001', 'turn_text': 'bye'}]}

and try

dKey, iKey = 'dialogue_turns', 'interlocutor_id' ## [ just to shorten lines ]
dt_grouped = [
    (iid, [d for d in Dialogues[dKey] if d[iKey]==iid]) 
    for iid in set(dt[iKey] for dt in Dialogues[dKey])
] ## group turns by interlocutor_id
dialogue_analysis = [{
    'interlocutor_id': iid, 'total_turns': len(dt_list),
    'total_words': sum(len(dt['turn_text'].split()) for dt in dt_list)
} for iid, dt_list in dt_grouped]

Dialogues_analyzed = {  'dialogue_id': Dialogues['dialogue_id'],
                        'dialogue_analysis': dialogue_analysis  }

then Dialogues_analyzed will look like

{'dialogue_id': '000001',
 'dialogue_analysis': [
  {'interlocutor_id': '0002', 'total_turns': 2, 'total_words': 10},
  {'interlocutor_id': '0001', 'total_turns': 3, 'total_words': 16}]}
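
Two things are worth noting about the approach above: the order of the entries depends on set iteration order, which can differ between runs for strings, and the turn list is rescanned once per interlocutor. As a minimal sketch of an alternative (not part of the original answer), the counts can be accumulated in a single pass. The name analyse_dialogue_single_pass is hypothetical; entries come out in the order each interlocutor first speaks, assuming Python 3.7+ where dicts keep insertion order.

```python
from collections import defaultdict

def analyse_dialogue_single_pass(dialogue_turns):
    """Hypothetical single-pass variant: one loop, no per-speaker rescans."""
    # Each new interlocutor_id starts with zeroed counters.
    totals = defaultdict(lambda: {'total_turns': 0, 'total_words': 0})
    for turn in dialogue_turns:
        stats = totals[turn['interlocutor_id']]
        stats['total_turns'] += 1
        # Words are whitespace-separated tokens, as in the answer above.
        stats['total_words'] += len(turn['turn_text'].split())
    # Flatten {id: counters} into the list-of-dicts shape used above.
    return [{'interlocutor_id': iid, **stats} for iid, stats in totals.items()]

turns = [
    {'turn_number': 0, 'interlocutor_id': '0001', 'turn_text': 'Hi, how are you?'},
    {'turn_number': 1, 'interlocutor_id': '0002', 'turn_text': "Hi, I'm fine thanks. And you?"},
    {'turn_number': 2, 'interlocutor_id': '0001', 'turn_text': 'I am good too, are you coming to the class today'},
    {'turn_number': 3, 'interlocutor_id': '0002', 'turn_text': 'Yes, see you soon.bye'},
    {'turn_number': 4, 'interlocutor_id': '0001', 'turn_text': 'bye'},
]
result = analyse_dialogue_single_pass(turns)
print(result)
```

This produces the same counts as the grouped version (3 turns and 16 words for '0001', 2 turns and 10 words for '0002'), with '0001' listed first because that interlocutor speaks first.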

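If you want to add dialogue_analysis to Dialogues, you can simply use Dialogues['dialogue_analysis'] = dialogue_analysis; or, if you want to keep the original Dialogues unchanged, you can unpack it (with **) into a new dictionary that also contains dialogue_analysis: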

Dialogues_with_Analysis = {**Dialogues, 'dialogue_analysis': dialogue_analysis}

and Dialogues_with_Analysis (or Dialogues, if you added the analysis to it) will look like:

{'dialogue_id': '000001', 
 'dialogue_turns': [
  {'turn_number': 0, 'interlocutor_id': '0001', 'turn_text': 'Hi, how are you?'},
  {'turn_number': 1, 'interlocutor_id': '0002', 'turn_text': "Hi, I'm fine thanks. And you?"},
  {'turn_number': 2, 'interlocutor_id': '0001', 'turn_text': 'I am good too, are you coming to the class today'},
  {'turn_number': 3, 'interlocutor_id': '0002', 'turn_text': 'Yes, see you soon.bye'},
  {'turn_number': 4, 'interlocutor_id': '0001', 'turn_text': 'bye'}],
 'dialogue_analysis': [
  {'interlocutor_id': '0002', 'total_turns': 2, 'total_words': 10},
  {'interlocutor_id': '0001', 'total_turns': 3, 'total_words': 16}]}
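
One caveat with the ** unpacking: it makes a shallow copy, so the new dictionary gets its own top-level keys but still shares the nested dialogue_turns list with the original. The sketch below illustrates this with a small made-up dictionary (the names original, merged and independent are only for illustration) and shows copy.deepcopy as one way to get a fully independent copy if that matters.

```python
import copy

# Small made-up example, not the full data from the question
original = {
    'dialogue_id': '000001',
    'dialogue_turns': [
        {'turn_number': 0, 'interlocutor_id': '0001', 'turn_text': 'bye'},
    ],
}
analysis = [{'interlocutor_id': '0001', 'total_turns': 1, 'total_words': 1}]

# Unpacking builds a new top-level dict; the original gains no new key ...
merged = {**original, 'dialogue_analysis': analysis}
has_analysis_key = 'dialogue_analysis' in original

# ... but the nested list of turns is the very same object in both dicts.
shares_turns = merged['dialogue_turns'] is original['dialogue_turns']

# A deep copy avoids sharing, at the cost of copying every nested object.
independent = {**copy.deepcopy(original), 'dialogue_analysis': analysis}

print(has_analysis_key, shares_turns)
```

For read-only analysis the shallow copy is usually enough; the deep copy only matters if the turns will be modified later.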

You can also wrap the logic in a function:

def analyse_dialogue(dialogue_turns:list):
    dt_grouped = [
        (iid, [d for d in dialogue_turns if d['interlocutor_id']==iid]) 
        for iid in set(dt['interlocutor_id'] for dt in dialogue_turns)
    ]
    return [{
        'interlocutor_id': iid, 'total_turns': len(dt_list),
        'total_words': sum(len(dt['turn_text'].split()) for dt in dt_list)
    } for iid, dt_list in dt_grouped]

# dialogue_analysis = analyse_dialogue(Dialogues['dialogue_turns'])

That way, if you have a list of dialogues like Dialogues_list = [{'dialogue_id': '000001', ...}, {'dialogue_id': '000002', ...}, ...], you can build a list of analyses with

dKey = 'dialogue_turns'
dialogues_analyses = [analyse_dialogue(d[dKey]) for d in Dialogues_list]

or build a new list of dialogues that include the analyses

dialogues_with_analyses = [{
    **d, 'dialogue_analysis': analyse_dialogue(d['dialogue_turns'])
} for d in Dialogues_list]

or add the analysis to each dialogue dictionary in Dialogues_list with

dKey, aKey = 'dialogue_turns', 'dialogue_analysis'
for d in Dialogues_list:  # d is the same dict object stored in the list
    d[aKey] = analyse_dialogue(d[dKey])
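
To make the difference between the last two options concrete, here is a small end-to-end sketch. The two dialogues in Dialogues_list are invented sample data, and analyse_dialogue repeats the function above so that the block runs on its own.

```python
def analyse_dialogue(dialogue_turns: list):
    # Same grouping logic as the function above
    dt_grouped = [
        (iid, [d for d in dialogue_turns if d['interlocutor_id'] == iid])
        for iid in set(dt['interlocutor_id'] for dt in dialogue_turns)
    ]
    return [{
        'interlocutor_id': iid, 'total_turns': len(dt_list),
        'total_words': sum(len(dt['turn_text'].split()) for dt in dt_list)
    } for iid, dt_list in dt_grouped]

# Invented sample data, for illustration only
Dialogues_list = [
    {'dialogue_id': '000001', 'dialogue_turns': [
        {'turn_number': 0, 'interlocutor_id': '0001', 'turn_text': 'Hi, how are you?'},
        {'turn_number': 1, 'interlocutor_id': '0002', 'turn_text': 'Fine, thanks.'}]},
    {'dialogue_id': '000002', 'dialogue_turns': [
        {'turn_number': 0, 'interlocutor_id': '0003', 'turn_text': 'bye'}]},
]

# Option 1: build a new list; the dicts in Dialogues_list get no new key
dialogues_with_analyses = [{
    **d, 'dialogue_analysis': analyse_dialogue(d['dialogue_turns'])
} for d in Dialogues_list]
untouched = all('dialogue_analysis' not in d for d in Dialogues_list)

# Option 2: add the analysis in place to every dict in Dialogues_list
for d in Dialogues_list:
    d['dialogue_analysis'] = analyse_dialogue(d['dialogue_turns'])

print(untouched, dialogues_with_analyses[1]['dialogue_analysis'])
```

The first option suits cases where the raw data should stay as loaded; the second avoids building a second list when the originals are not needed separately.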
