在Postgresql数据库中对JSON数据建模的选项?

ha5z0ras  于 2023-02-17  发布在  PostgreSQL
关注(0)|答案(1)|浏览(119)

我有一个JSON文件,里面有关于员工和他们技能的数据。我需要以某种方式在PostgreSQL数据库中对数据进行建模(原因与我们正在开发的应用程序有关)。
JSON文件中有很多数据我的应用程序实际上并不需要(至少现在不需要),我只需要几列:员工ID、姓名、资格。但是其余的数据应该存储在表中(只是暂时的,因为这仍然是POC)。

数据

{
  "employee": {
  "ID": 654534543,,
  "Name": "Max Mustermann",
  "Email": "max.mustermann@firma.de",
  "skills": [
    {"name": python, "level": 3},
    {"name": c, "level": 2},
    {"name": openCV, "level": 3}
    ],
  },
"employee":{
  "ID": 3213213,,
  "Name": "Alex Mustermann",
  "Email": "alex.mustermann@firma.de",
  "skills":[
    {"name": Jira, "level": 3},
    {"name": Git, "level": 2},
    {"name": Tensorflow, "level": 3}
    ],
  }
};

我想创建一个包含以下列的表:员工ID作为主键,CHAR作为名称,数组作为技能,JSONB作为关于员工的其余信息。

CREATE TABLE employee(
    id INT PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    position VARCHAR(255) NOT NULL,
    description VARCHAR (255),
        skills TEXT [],
        join_date DATE,      

);

需要记住的一些因素:数据应该定期更新(比如说一个月一次),应用程序应该使用数据库查询一个(或多个)员工ID,这些员工ID涵盖了特定的必需技能集(和技能水平)。到目前为止,我们还不确定是否要查询json字段(但在不久的将来可能会实现)
而且,数据复杂且密集(我在下面附上的只是一个简化的示例),所以我猜直接从JSONB列查询将不方便(如在其他类似问题中提到的)
我现在的问题是:1-建议的数据模型是否满足所需条件,我们有一个巨大的json数据文件(快速搜索员工技能,可扩展,轻松/快速查询和检索员工数据(例如员工ID)?
2-开发关系数据库模式时应该考虑什么?
3-将数据拆分到多个表中是否有优势?例如,一个表用于员工个人数据,其中员工ID作为主键;一个表用于技能,其中员工ID作为外键,并且一个文本字段用于技能;一个JSON表用于其余数据。
我正在Windows 10上使用PostgreSQL 15. 1。我也还在熟悉PostgreSQL数据库。
非常感谢

oxcyiej7

oxcyiej71#

我会这么做:

create table employee (
  id bigint not null primary key,
  name text not null,
  email text not null
);

create table skill (
  id bigint generated always as identity primary key,
  skill_name text not null unique
);

create table employee_skill (
  id bigint generated always as identity primary key,
  employee_id bigint not null references employee(id),
  skill_id bigint not null references skill(id),
  skill_level int not null,
  unique (employee_id, skill_id)
);

然后,填充模式(在用JSON纠正错误之后):

with indata as (
  select '[
  {
  "ID": 654534543,
  "Name": "Max Mustermann",
  "Email": "max.mustermann@firma.de",
  "skills": [
    {"name": "python", "level": 3},
    {"name": "c", "level": 2},
    {"name": "openCV", "level": 3}
    ]
  },
  {
  "ID": 3213213,
  "Name": "Alex Mustermann",
  "Email": "alex.mustermann@firma.de",
  "skills":[
    {"name": "Jira", "level": 3},
    {"name": "Git", "level": 2},
    {"name": "Tensorflow", "level": 3}
    ]
  }
]'::jsonb as j
), expand as (
  select emp, skill
    from indata
         cross join lateral jsonb_array_elements(j) as el(emp)
         cross join lateral jsonb_array_elements(emp->'skills') as sk(skill)
), insemp as (
  insert into employee (id, name, email)
  select distinct (emp->>'ID')::bigint, emp->>'Name', emp->>'Email'
    from expand
  on conflict (id) do update
    set name = excluded.name, email = excluded.email
  returning *
), insskill as (
  insert into skill (skill_name)
  select distinct skill->>'name'
    from expand
  on conflict (skill_name) do nothing
  returning *
), allemp as (
  select * from insemp union select * from employee
), allskill as (
  select * from insskill union select * from insskill
), insempskill as (
  insert into employee_skill (employee_id, skill_id, skill_level)
  select e.id as employee_id, s.id as skill_id, 
         (i.skill->>'level')::int as skill_level
    from expand i
         join allemp e on e.id = (i.emp->>'ID')::bigint
         join allskill s on s.skill_name = i.skill->>'name'
  on conflict (employee_id, skill_id) do update
    set skill_level = excluded.skill_level
  returning *
)
delete from employee_skill
 where (employee_id, skill_id) not in 
  (select employee_id, skill_id from insempskill 
    union 
   select employee_id, skill_id from employee_skill)
;

参见working fiddle

相关问题