How to create a pickled dictionary from JSON data [closed]

Posted by ohtdti5x on 2023-03-13 in Other

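**Closed.** This question needs debugging details. It is not currently accepting answers.

Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed yesterday.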
Hi!
I need to create a dictionary:
{ job: [list of the 'username' values of the people who have this job title in the JSON data] }
Here is the JSON data:

{'address': '01261 Cameron Spring\nTaylorfurt, AK 97791',
 'id': 35193,
 'jobs': ['Energy engineer',
          'Engineer, site',
          'Environmental health practitioner',
          'Biomedical scientist',
          'Jewellery designer'],
 'mail': 'jsalazar@gmail.com',
 'name': 'Lindsey Nguyen',
 'sex': 'F',
 'username': 'uhebert'}
{'address': '66992 Welch Brooks\nMarshallshire, ID 56004',
 'id': 91970,
 'jobs': ['Music therapist',
          'Volunteer coordinator',
          'Designer, interior/spatial'],
 'mail': 'bhudson@gmail.com',
 'name': 'Cheryl Lewis',
 'sex': 'F',
 'username': 'vickitaylor'}
{'address': 'Unit 1632 Box 2971\nDPO AE 23297',
 'id': 1848091,
 'jobs': ['Management consultant',
          'Engineer, structural',
          'Lecturer, higher education',
          'Theatre manager',
          'Designer, textile'],
 'mail': 'darren44@yahoo.com',
 'name': 'Julia Allen',
 'sex': 'F',
 'username': 'sheilaadams'}
{'address': '9880 Michelle Bridge\nNew Kimberlybury, WY 02583',
 'id': 50969,
 'jobs': ['Mechanical engineer', 'Retail banker', 'Barrister'],
 'mail': 'stevensonsarah@hotmail.com',
 'name': 'Gina Stevens',
 'sex': 'F',
 'username': 'nicole82'}
{'address': '9080 Monica Crescent Suite 820\nNorth Deanbury, HI 28977',
 'id': 676820,
 'jobs': ['Network engineer',
          'Youth worker',
          'Primary school teacher',
          'Engineer, broadcasting (operations)'],
 'mail': 'denise42@gmail.com',
 'name': 'Nicholas Harrington',
 'sex': 'M',
 'username': 'jean67'}
{'address': '635 Kenneth Ways Suite 172\nHancockfort, AZ 50544',
 'id': 64918,
 'jobs': ['Designer, ceramics/pottery',
          'Engineer, energy',
          'Engineer, manufacturing'],
 'mail': 'monique02@hotmail.com',
 'name': 'Allison Gomez',
 'sex': 'F',
 'username': 'james67'}

I don't know how to do this, and unfortunately what I tried did not work. It has to be done with pickle and then saved as a file named "job_people.pickle".

Answer #1 by qni6mghb

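The snippet is not JSON. Most notably, JSON uses double quotes (") rather than single quotes ('). More importantly, the individual dictionaries holding the user data all need to be collected into a single sequence (such as the list below).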

people = [
  {'address': '01261 Cameron Spring\nTaylorfurt, AK 97791', 'id': 35193, 'jobs': ['Energy engineer', 'Engineer, site', 'Environmental health practitioner', 'Biomedical scientist', 'Jewellery designer'], 'mail': 'jsalazar@gmail.com', 'name': 'Lindsey Nguyen', 'sex': 'F', 'username': 'uhebert'},
  {'address': '66992 Welch Brooks\nMarshallshire, ID 56004', 'id': 91970, 'jobs': ['Music therapist', 'Volunteer coordinator', 'Designer, interior/spatial'], 'mail': 'bhudson@gmail.com', 'name': 'Cheryl Lewis', 'sex': 'F', 'username': 'vickitaylor'},
  {'address': 'Unit 1632 Box 2971\nDPO AE 23297', 'id': 1848091, 'jobs': ['Management consultant', 'Engineer, structural', 'Lecturer, higher education', 'Theatre manager', 'Designer, textile'], 'mail': 'darren44@yahoo.com', 'name': 'Julia Allen', 'sex': 'F', 'username': 'sheilaadams'},
  {'address': '9880 Michelle Bridge\nNew Kimberlybury, WY 02583', 'id': 50969, 'jobs': ['Mechanical engineer', 'Retail banker', 'Barrister'], 'mail': 'stevensonsarah@hotmail.com', 'name': 'Gina Stevens', 'sex': 'F', 'username': 'nicole82'},
  {'address': '9080 Monica Crescent Suite 820\nNorth Deanbury, HI 28977', 'id': 676820, 'jobs': ['Network engineer', 'Youth worker', 'Primary school teacher', 'Engineer, broadcasting (operations)'], 'mail': 'denise42@gmail.com', 'name': 'Nicholas Harrington', 'sex': 'M', 'username': 'jean67'},
  {'address': '635 Kenneth Ways Suite 172\nHancockfort, AZ 50544', 'id': 64918, 'jobs': ['Designer, ceramics/pottery', 'Engineer, energy', 'Engineer, manufacturing'], 'mail': 'monique02@hotmail.com', 'name': 'Allison Gomez', 'sex': 'F', 'username': 'james67'}
]

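[If you are not sure how to do that, you should edit the question to focus on it, or (preferably) post a new question. Make sure to explain what format your data is currently in.]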
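As a rough illustration of that step: if the data really is plain text made of Python dict literals printed one after another (an assumption, since the question does not say where the data comes from), it could be parsed along these lines. The names `raw_text`, `parse_printed_dicts` and `sample_people` are made up for this sketch, and the records are shortened.

```python
import ast
import json

# Hypothetical raw text: dict literals printed back to back,
# shaped like the snippet in the question (shortened here).
raw_text = """{'id': 35193,
 'jobs': ['Energy engineer', 'Engineer, site'],
 'username': 'uhebert'}
{'id': 91970,
 'jobs': ['Music therapist'],
 'username': 'vickitaylor'}"""

def parse_printed_dicts(text):
    """Turn back-to-back dict literals into a list of dicts."""
    # Put a comma between consecutive literals, then wrap them in brackets
    joined = "[" + text.strip().replace("}\n{", "},\n{") + "]"
    return ast.literal_eval(joined)

sample_people = parse_printed_dicts(raw_text)
print(len(sample_people))

# For comparison, real JSON for the first record uses double quotes
print(json.dumps(sample_people[0]))
```

`ast.literal_eval` only evaluates literals, so it is a safer choice than `eval` for text like this. Once the data is a list of dicts, `json.dumps` and `json.loads` can round-trip it as actual JSON.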
Once you have the sequence of dictionaries people, there are many ways to build your { job: [list of 'username', who has this job title in json data] } dictionary.
You can use nested for loops with .setdefault, like this:

job_people = {}
for p in people:
    for j in p['jobs']:
        job_people.setdefault(j, [])
        job_people[j].append(p['username'])
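A variation on the loop above, in case it reads more easily: `collections.defaultdict` from the standard library creates the empty list on first access, so the `.setdefault` call is not needed. This is only a sketch of an alternative, not something the question requires, and the two-person `people` list is illustrative data rather than the full sample.

```python
from collections import defaultdict

# Shortened stand-in for the people list (illustrative data only)
people = [
    {'username': 'uhebert', 'jobs': ['Energy engineer', 'Engineer, site']},
    {'username': 'nicole82', 'jobs': ['Barrister', 'Energy engineer']},
]

# Missing keys start out as an empty list, so no setdefault call is needed
job_people = defaultdict(list)
for p in people:
    for j in p['jobs']:
        job_people[j].append(p['username'])

# Convert back to a plain dict before printing or pickling
job_people = dict(job_people)
print(job_people)
```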

Or you can use a dictionary comprehension:

jobs = {j for p in people for j in p['jobs']}  # set of unique job titles
job_people = {j: [u['username'] for u in people if j in u['jobs']] for j in jobs}

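You could do it in a single statement (with job_people = {j: [u['username'] for u in people if j in u['jobs']] for p in people for j in p['jobs']}), but it is more efficient to build each list ([u['username'] for u in people if j in u['jobs']]) only once.
That makes little difference for the sample data, because each job title appears only once there [so every list contains just one person/username]. With either of the approaches above, job_people [built from the sample data] should look like this: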

{'Energy engineer': ['uhebert'],
 'Engineer, site': ['uhebert'],
 'Environmental health practitioner': ['uhebert'],
 'Biomedical scientist': ['uhebert'],
 'Jewellery designer': ['uhebert'],
 'Music therapist': ['vickitaylor'],
 'Volunteer coordinator': ['vickitaylor'],
 'Designer, interior/spatial': ['vickitaylor'],
 'Management consultant': ['sheilaadams'],
 'Engineer, structural': ['sheilaadams'],
 'Lecturer, higher education': ['sheilaadams'],
 'Theatre manager': ['sheilaadams'],
 'Designer, textile': ['sheilaadams'],
 'Mechanical engineer': ['nicole82'],
 'Retail banker': ['nicole82'],
 'Barrister': ['nicole82'],
 'Network engineer': ['jean67'],
 'Youth worker': ['jean67'],
 'Primary school teacher': ['jean67'],
 'Engineer, broadcasting (operations)': ['jean67'],
 'Designer, ceramics/pottery': ['james67'],
 'Engineer, energy': ['james67'],
 'Engineer, manufacturing': ['james67']}
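To check that the loop and the comprehension agree, both can be run on the same input and compared. The sketch below uses a shortened, made-up `people` list in which one title is shared by two users; `by_loop` and `by_comprehension` are names chosen for this example.

```python
# Shortened stand-in for the people list (illustrative data only)
people = [
    {'username': 'uhebert', 'jobs': ['Energy engineer', 'Engineer, site']},
    {'username': 'nicole82', 'jobs': ['Barrister', 'Energy engineer']},
]

# Approach 1: nested loops with setdefault
by_loop = {}
for p in people:
    for j in p['jobs']:
        by_loop.setdefault(j, []).append(p['username'])

# Approach 2: set of unique titles plus a dict comprehension
jobs = {j for p in people for j in p['jobs']}
by_comprehension = {j: [u['username'] for u in people if j in u['jobs']] for j in jobs}

# dict equality ignores key order, so this compares contents only
print(by_loop == by_comprehension)  # True
```

The key order of the comprehension result can differ from the listing above because it iterates over a set. The two approaches also diverge if one person lists the same title twice: the loop appends the username twice, while the comprehension includes it once.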

Pickling the data

  • "It has to be done with pickle and then saved as a file named 'job_people.pickle'"

Once you have the job_people dictionary, you can save it to "job_people.pickle" with pickle.dump, like this:

import pickle

# Binary mode ('wb') is required for pickle files
with open('job_people.pickle', 'wb') as fp:
    pickle.dump(job_people, fp)
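To confirm the file was written correctly, it can be read back with `pickle.load` and compared with the original. The sketch below uses a small illustrative `job_people` dict and writes into a temporary directory purely so that it leaves nothing behind. In practice you would open 'job_people.pickle' directly.

```python
import pickle
import tempfile
from pathlib import Path

# Illustrative subset of the job_people dictionary
job_people = {'Barrister': ['nicole82'], 'Youth worker': ['jean67']}

# The temporary directory is only there so this sketch leaves no file behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / 'job_people.pickle'
    with open(path, 'wb') as fp:
        pickle.dump(job_people, fp)
    # Read the file back in binary mode
    with open(path, 'rb') as fp:
        restored = pickle.load(fp)

print(restored == job_people)  # True
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.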

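By the way, if you want to reduce the list of jobs to more general titles, you can use a dictionary (such as {'Primary school teacher': 'Teacher', 'Engineer, energy': 'Engineer', ...}) together with .get(j, j), or define a function such as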

def generalize_job(job_title:str):
    titles_ref = {
        'engineer': ['engineer'], 
        'designer': ['design'],
        'educator': ['educat', 'teach']
    }
    for title, keywords in titles_ref.items():
        for kw in keywords:
            if kw.lower() in job_title.lower(): return title.title() 
    return job_title ## default: return the original string

and use it in the for loop

job_people = {}
for p in people:
    for j in p['jobs']:
        j = generalize_job(j)
        if p['username'] in job_people.setdefault(j, []): continue
        job_people[j].append(p['username'])

to get a job_people like this:

{ 'Engineer': ['uhebert', 'sheilaadams', 'nicole82', 'jean67', 'james67'],
  'Designer': ['uhebert', 'vickitaylor', 'sheilaadams', 'james67'],
  'Educator': ['sheilaadams', 'jean67'],
  'Volunteer coordinator': ['vickitaylor'],
  'Management consultant': ['sheilaadams'],
  'Environmental health practitioner': ['uhebert'],
  'Retail banker': ['nicole82'],'Youth worker': ['jean67'], 
  'Barrister': ['nicole82'], 'Biomedical scientist': ['uhebert'], 
  'Theatre manager': ['sheilaadams'], 'Music therapist': ['vickitaylor'] }
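The other option mentioned earlier, a lookup dictionary combined with .get(j, j), could be sketched as follows. The `general_titles` mapping and the shortened `people` list are made up for illustration. Any title missing from the mapping is kept unchanged.

```python
# Hypothetical lookup table; only titles listed here are generalized
general_titles = {
    'Primary school teacher': 'Teacher',
    'Engineer, energy': 'Engineer',
    'Network engineer': 'Engineer',
}

# Shortened stand-in for the people list (illustrative data only)
people = [
    {'username': 'jean67', 'jobs': ['Network engineer', 'Youth worker', 'Primary school teacher']},
    {'username': 'james67', 'jobs': ['Engineer, energy', 'Engineer, manufacturing']},
]

job_people = {}
for p in people:
    for j in p['jobs']:
        j = general_titles.get(j, j)  # fall back to the original title
        users = job_people.setdefault(j, [])
        if p['username'] not in users:  # avoid listing a user twice
            users.append(p['username'])

print(job_people)
```

Unlike the keyword-based function, an exact-match table leaves 'Engineer, manufacturing' as it is until an entry is added for it, which makes the mapping more explicit but also more work to maintain.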
