pandas: Combining JSON files with a continuous index

ohtdti5x, posted on 2023-01-04 in Other

Sample JSON file: LINK
I have a large number of JSON files that all share the same structure:

0:
 Title
 Location
 Price
 Net area
 Gross area
 ...

1: 
 Title
 Location
 ...

2: 
 ...

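One JSON file may run from index 0: to 56:, while another may run from 0: to 60:. I have tried endlessly to combine them with .concat, .join, and .merge, but the resulting JSON file is never what I expect. A file running 0: to 56: combined with one running 0: to 60: should give a single file with 118 entries, indexed 0: to 117:. Instead, the best I get is the following structure: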

0: 
  0:
    Title
    ...
  1:
    Title
    ...
1:
  0: 
    Title
    ...
  1: 
    Title
    ...
2:
  0:
    Title
    ...
  1:
    Title
    ...

Here is the code I am using:

import json
import pandas as pd

with open('Belem_data_p1.json') as f1:
    data1=json.load(f1)
with open('Belem_data_p2.json') as f2:
    data2=json.load(f2)

df1=pd.DataFrame([data1])
df2=pd.DataFrame([data2])

MergeJson=pd.concat([df1,df2]).reset_index()
MergeJson.to_json('NewFileName.json')
print (MergeJson)

I need the combined DataFrame to renumber its index so that my data appears as:

0: 
  Title
  Location
  ...

[...]

117:
 Title
 Location
 ...

I would really appreciate a fix, as I have been searching endlessly without finding a solution!

#1 mzaanser

Try this:

import json
import pandas as pd

# Replace this string with the contents of your first file (e.g. via json.load)
JSON1=json.loads('''[
    {"Title":"A","Location":"B","Price":"C","Net area":"D","Gross area":"E"},
    {"Title":"A","Location":"B","Price":"C","Net area":"D","Gross area":"E"},
    {"Title":"A","Location":"B","Price":"C","Net area":"D","Gross area":"E"},
    {"Title":"A","Location":"B","Price":"C","Net area":"D","Gross area":"E"},
    {"Title":"A","Location":"B","Price":"C","Net area":"D","Gross area":"E"},
    {"Title":"A","Location":"B","Price":"C","Net area":"D","Gross area":"E"}
]''')

# Replace this string with the contents of your second file (e.g. via json.load)
JSON2=json.loads('''[
    {"Title":"Z","Location":"Y","Price":"X","Net area":"W","Gross area":"V"},
    {"Title":"Z","Location":"Y","Price":"X","Net area":"W","Gross area":"V"},
    {"Title":"Z","Location":"Y","Price":"X","Net area":"W","Gross area":"V"},
    {"Title":"Z","Location":"Y","Price":"X","Net area":"W","Gross area":"V"}
]''')

# Concatenate the two lists of records; the DataFrame gets a fresh continuous index
df = pd.DataFrame(data=JSON1 + JSON2)

print(df.index)
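
To get a JSON file back in the 0:, 1:, ... layout shown in the question, the combined frame can probably be written with `orient="index"`. The following is only a minimal sketch: the two record lists are made-up sample data standing in for the real files, and the variable names are illustrative.

```python
import json
import pandas as pd

# Made-up sample records standing in for the contents of the two files
records1 = [{"Title": "A", "Location": "B"}, {"Title": "C", "Location": "D"}]
records2 = [{"Title": "Z", "Location": "Y"}]

# Joining the lists first gives the DataFrame a default index of 0..n-1
df = pd.DataFrame(records1 + records2)

# orient="index" serializes one entry per row, keyed by the row label
combined_json = df.to_json(orient="index")
combined = json.loads(combined_json)

print(list(combined.keys()))
```

Passing a file path as the first argument, e.g. `df.to_json('NewFileName.json', orient="index")`, writes the same structure to disk instead of returning a string.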

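However, if your files contain dicts rather than lists, convert each one to a list of its values before combining them; otherwise entries with identical keys in the two dicts will overwrite one another.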

import json
import pandas as pd

JSON1=json.loads('''{
  "1": {"Title": "A", "Location": "B", "Price": "C", "Net area": "D", "Gross area": "E"}, 
  "2": {"Title": "A", "Location": "B", "Price": "C", "Net area": "D", "Gross area": "E"}, 
  "3": {"Title": "A", "Location": "B", "Price": "C", "Net area": "D", "Gross area": "E"}, 
  "4": {"Title": "A", "Location": "B", "Price": "C", "Net area": "D", "Gross area": "E"},
  "5": {"Title": "A", "Location": "B", "Price": "C", "Net area": "D", "Gross area": "E"}, 
  "6": {"Title": "A", "Location": "B", "Price": "C", "Net area": "D", "Gross area": "E"}
}''')

JSON2=json.loads('''{
    "1": {"Title":"Z", "Location":"Y", "Price":"X", "Net area":"W", "Gross area":"V"},
    "2": {"Title":"Z", "Location":"Y", "Price":"X", "Net area":"W", "Gross area":"V"},
    "3": {"Title":"Z", "Location":"Y", "Price":"X", "Net area":"W", "Gross area":"V"},
    "4": {"Title":"Z", "Location":"Y", "Price":"X", "Net area":"W", "Gross area":"V"}
}''')

# Drop the original keys and keep only the records, then concatenate the lists
df = pd.DataFrame(data=list(JSON1.values()) + list(JSON2.values()))

print(df.index)
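
If you would rather stay with `pd.concat`, as in the question, one possible variant is to build each frame with `pd.DataFrame.from_dict(..., orient="index")` and pass `ignore_index=True` when concatenating. This is a sketch that assumes each file is a dict keyed by "0", "1", ...; the sample data below is made up.

```python
import pandas as pd

# Made-up stand-ins for two loaded files, each a dict keyed by "0", "1", ...
data1 = {"0": {"Title": "A", "Price": "C"}, "1": {"Title": "B", "Price": "D"}}
data2 = {"0": {"Title": "Z", "Price": "X"}}

# orient="index" turns each outer key into a row instead of a column
df1 = pd.DataFrame.from_dict(data1, orient="index")
df2 = pd.DataFrame.from_dict(data2, orient="index")

# ignore_index=True discards the old row labels and renumbers from 0
merged = pd.concat([df1, df2], ignore_index=True)

print(merged.index.tolist())
```

In the question, `pd.DataFrame([data1])` wraps the whole file in a one-element list, so the file becomes a single row whose columns are the record numbers, which would explain the nested output.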

#2 6kkfgxo0

Instead of creating a new DataFrame for each JSON file, try concatenating the data directly:

import json

# Load the data from the JSON files
with open('Belem_data_p1.json') as f1:
    data1 = json.load(f1)
with open('Belem_data_p2.json') as f2:
    data2 = json.load(f2)

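# Concatenate the records and renumber the keys. A plain data.update(data2)
# would overwrite entries of data1 that share a key such as "0" with data2.
# Assumes each file holds a dict of records, as the original dict(data1) did.
records = list(data1.values()) + list(data2.values())
data = {str(i): record for i, record in enumerate(records)}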

# Save the combined data to a new JSON file
with open('NewFileName.json', 'w') as outfile:
    json.dump(data, outfile)
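
Note that `dict.update` merges by key, so when both files number their entries from 0, entries that share a key are replaced rather than appended. The sketch below contrasts that with renumbering; `renumber_records` is a hypothetical helper written for this illustration, and the sample dicts are made up.

```python
from itertools import chain


def renumber_records(*indexed_dicts):
    """Hypothetical helper: chain the values of several dicts and rekey them 0..n-1."""
    records = chain.from_iterable(d.values() for d in indexed_dicts)
    return {str(i): record for i, record in enumerate(records)}


# Made-up stand-ins for two loaded files that both start numbering at "0"
data1 = {"0": {"Title": "A"}, "1": {"Title": "B"}}
data2 = {"0": {"Title": "Z"}}

# Plain update: the "0" entry from data1 is replaced by the one from data2
overwritten = dict(data1)
overwritten.update(data2)

# Renumbering keeps every record and assigns fresh consecutive keys
renumbered = renumber_records(data1, data2)

print(len(overwritten), len(renumbered))
```

Because the helper accepts any number of dicts, it could also be applied to a larger set of files loaded in a loop.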
