Update nested JSON with the name of the JSON file

91zkwejq posted on 2022-12-24 in Other

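I was wondering if you could help me populate a JSON with its original file name. Below is an example of the JSON: jsv is a list of JSONs (the first, main key is the document number: document_0, document_1, ...).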

jsv =

[
   {
      {
         "document_0":{
            "id":111,
            "laboratory":"xxx",
            "document_type":"xxx",
            "language":"pl",
            "creation_date":"09-12-2022",
            "source_filename":"None",
            "version":"0.1",
            "exams_ocr_avg_confidence":0.0,
            "patient_data":{
               "first_name":"YYYY",
               "surname":"YYYY",
               "pesel":"12345678901",
               "birth_date":"1111-22-22",
               "sex":"F",
               "age":"None"
            },
            "exams":[
               {
                  "name":"xx",
                  "sampling_date":"2020-11-30",
                  "comment":"None",
                  "confidence":97,
                  "result":"222",
                  "unit":"ml",
                  "norm":"None",
                  "material":"None",
                  "icd9":"uuuuu"
               },
               {
                  "document_1":{
                     "id":111,
                     "laboratory":"xxx",
                     "document_type":"xxx",
                     "language":"pl",
                     "creation_date":"09-12-2022",
                     "source_filename":"None",
                     "version":"0.1",
                     "exams_ocr_avg_confidence":0.0,
                     "patient_data":{
                        "first_name":"YYYY",
                        "surname":"YYYY",
                        "pesel":"12345678901",
                        "birth_date":"1111-22-22",
                        "sex":"F",
                        "age":"None"
                     },
                     "exams":[
                        {
                           "name":"xx",
                           "sampling_date":"2020-11-30",
                           "comment":"None",
                           "confidence":97,
                           "result":"222",
                           "unit":"ml",
                           "norm":"None",
                           "material":"None",
                           "icd9":"uuuuu"
                        }
                     }
                  ]

Inside this JSON there is a key, source_filename, that I want to update with the real name of the JSON file.

My folder with files, as an example:

'11111.pdf.json',
 '11112.pdf.json',
 '11113.pdf.json',
 '11114.pdf.json',
 '11115.pdf.json'

What I want to achieve:

jsv =
[
   {
      {
         "document_0":{
            "id":111,
            "laboratory":"xxx",
            "document_type":"xxx",
            "language":"pl",
            "creation_date":"09-12-2022",
            "source_filename":"11111.pdf.json",
            "version":"0.1",
            "exams_ocr_avg_confidence":0.0,
            "patient_data":{
               "first_name":"YYYY",
               "surname":"YYYY",
               "pesel":"12345678901",
               "birth_date":"1111-22-22",
               "sex":"F",
               "age":"None"
            },
            "exams":[
               {
                  "name":"xx",
                  "sampling_date":"2222-22-22",
                  "comment":"None",
                  "confidence":22,
                  "result":"222",
                  "unit":"ml",
                  "norm":"None",
                  "material":"None",
                  "icd9":"uuuuu"
               },
               {
                  "document_1":{
                     "id":111,
                     "laboratory":"xxx",
                     "document_type":"xxx",
                     "language":"pl",
                     "creation_date":"22-22-2222",
                     "source_filename":"11111.pdf.json",
                     "version":"0.1",
                     "exams_ocr_avg_confidence":0.0,
                     "patient_data":{
                        "first_name":"YYYY",
                        "surname":"YYYY",
                        "pesel":"12345678901",
                        "birth_date":"1111-22-22",
                        "sex":"F",
                        "age":"None"
                     },
                     "exams":[
                        {
                           "name":"xx",
                           "sampling_date":"2222-11-22",
                           "comment":"None",
                           "confidence":22,
                           "result":"222",
                           "unit":"ml",
                           "norm":"None",
                           "material":"None",
                           "icd9":"uuuuu"
                        }
                     }
                  ]

document_0 and document_1 have the same file name.

What I have so far:

dir_name = 'path_name'

from os import listdir
from os.path import isfile, join
onlyfiles = [f for f in listdir(dir_name) if isfile(join(dir_name, f))]

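onlyfiles is the list of file names of my JSONs. Now I was thinking I could use it to update my jsv in a loop? But I am also looking for an approach that is reasonably efficient, since I have to process a large amount of data.

Edit: I have managed to do this with a for loop, but maybe there is a more efficient way:

for i in range(len(jsv)):
    if (type(jsv[i]) == dict):
        jsv[i]["document_0"].update({"source_filename": onlyfiles[i]})
    else:
        print(onlyfiles[i])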
kmbjn2e3 #1

If your jsv is:

jsv = [
    {
        "document_0": {
            "id": 111,
            "laboratory": "xxx",
            "document_type": "xxx",
            "language": "pl",
            "creation_date": "09-12-2022",
            "source_filename": "None",
            "version": "0.1",
            "exams_ocr_avg_confidence": 0.0,
            "patient_data": {
                "first_name": "YYYY",
                "surname": "YYYY",
                "pesel": "12345678901",
                "birth_date": "1111-22-22",
                "sex": "F",
                "age": "None",
            },
            "exams": [
                {
                    "name": "xx",
                    "sampling_date": "2020-11-30",
                    "comment": "None",
                    "confidence": 97,
                    "result": "222",
                    "unit": "ml",
                    "norm": "None",
                    "material": "None",
                    "icd9": "uuuuu",
                },
            ],
        }
    },
    {
        "document_1": {
            "id": 111,
            "laboratory": "xxx",
            "document_type": "xxx",
            "language": "pl",
            "creation_date": "09-12-2022",
            "source_filename": "None",
            "version": "0.1",
            "exams_ocr_avg_confidence": 0.0,
            "patient_data": {
                "first_name": "YYYY",
                "surname": "YYYY",
                "pesel": "12345678901",
                "birth_date": "1111-22-22",
                "sex": "F",
                "age": "None",
            },
            "exams": [
                {
                    "name": "xx",
                    "sampling_date": "2020-11-30",
                    "comment": "None",
                    "confidence": 97,
                    "result": "222",
                    "unit": "ml",
                    "norm": "None",
                    "material": "None",
                    "icd9": "uuuuu",
                },
            ],
        },
    },
]

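In Python, you can do this:

arq = ['11111.pdf.json', '11112.pdf.json']

# Only proceed when every jsv entry has a matching file name
if len(arq) == len(jsv):
    for filename, entry in zip(arq, jsv):
        # Set source_filename on each "document_N" dict in the entry
        for document in entry.values():
            document['source_filename'] = filename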

You need to check that the list of files has the same length as the jsv list!
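The length check above skips the update silently when the two lists differ. A minimal sketch of a stricter variant follows, assuming it is preferable to fail loudly; the helper name stamp_source_filenames is hypothetical, and sorting the file names is only correct if the jsv entries were built in that same sorted order (os.listdir returns names in arbitrary order).

```python
def stamp_source_filenames(jsv, filenames):
    """Set source_filename on every document of each jsv entry, in place."""
    if len(jsv) != len(filenames):
        raise ValueError(
            f"expected {len(jsv)} file names, got {len(filenames)}"
        )
    for entry, filename in zip(jsv, filenames):
        # An entry may hold one or several "document_N" keys;
        # all of them receive the same file name.
        for document in entry.values():
            document["source_filename"] = filename
    return jsv


# Assumption: jsv was built from the files in sorted order.
filenames = sorted(["11112.pdf.json", "11111.pdf.json"])
jsv = [
    {"document_0": {"id": 111, "source_filename": "None"}},
    {"document_1": {"id": 111, "source_filename": "None"}},
]
stamp_source_filenames(jsv, filenames)
print(jsv[0]["document_0"]["source_filename"])  # 11111.pdf.json
```

Because the inner loop visits every value of an entry, an entry that contains both document_0 and document_1 gets the same file name on both, which matches the target shown in the question.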

This results in the following jsv:

[
{
    "document_0": {
        "id": 111,
        "laboratory": "xxx",
        "document_type": "xxx",
        "language": "pl",
        "creation_date": "09-12-2022",
        "source_filename": "11111.pdf.json",
        "version": "0.1",
        "exams_ocr_avg_confidence": 0.0,
        "patient_data": {
            "first_name": "YYYY",
            "surname": "YYYY",
            "pesel": "12345678901",
            "birth_date": "1111-22-22",
            "sex": "F",
            "age": "None",
        },
        "exams": [
            {
                "name": "xx",
                "sampling_date": "2020-11-30",
                "comment": "None",
                "confidence": 97,
                "result": "222",
                "unit": "ml",
                "norm": "None",
                "material": "None",
                "icd9": "uuuuu",
            }
        ],
    }
},
{
    "document_1": {
        "id": 111,
        "laboratory": "xxx",
        "document_type": "xxx",
        "language": "pl",
        "creation_date": "09-12-2022",
        "source_filename": "11112.pdf.json",
        "version": "0.1",
        "exams_ocr_avg_confidence": 0.0,
        "patient_data": {
            "first_name": "YYYY",
            "surname": "YYYY",
            "pesel": "12345678901",
            "birth_date": "1111-22-22",
            "sex": "F",
            "age": "None",
        },
        "exams": [
            {
                "name": "xx",
                "sampling_date": "2020-11-30",
                "comment": "None",
                "confidence": 97,
                "result": "222",
                "unit": "ml",
                "norm": "None",
                "material": "None",
                "icd9": "uuuuu",
            }
        ],
    }
},

]
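If jsv is itself built by reading the files in the folder, pairing by index can be avoided by stamping each entry at load time. The following is only a sketch under that assumption: the function name load_with_source_filename is hypothetical, and it assumes each file holds one object whose top-level keys are document_0, document_1, and so on.

```python
import json
import os


def load_with_source_filename(dir_name):
    """Load every *.json file in dir_name and stamp its documents."""
    jsv = []
    # sorted() makes the order deterministic; os.listdir gives no order guarantee
    for filename in sorted(os.listdir(dir_name)):
        path = os.path.join(dir_name, filename)
        if not (os.path.isfile(path) and filename.endswith(".json")):
            continue
        with open(path, encoding="utf-8") as handle:
            entry = json.load(handle)
        # Assumed layout: {"document_0": {...}, "document_1": {...}, ...}
        for document in entry.values():
            document["source_filename"] = filename
        jsv.append(entry)
    return jsv
```

With this approach the file name and the content can never get out of step, because both come from the same loop iteration. The cost is still one pass over the data, the same as the index-based loop.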
