pandas: from JSON to DataFrame to SQL

Posted by enxuqcxy on 2023-06-20 in Other


I am trying to save all of my JSON data to a SQL database. I am working in Python, so I decided to use pandas.
Part of the JSON:

{
"stores": [
    {
        "ID": "123456",
        "name": "Store 1",
        "status": "Active",
        "date": "2019-03-28T15:20:00Z",
        "tagIDs": null,
        "location": {
            "cityID": 2,
            "countryID": 4,
            "geoLocation": {
                "latitude": 1.13121,
                "longitude": 103.4324231
            },
            "postcode": "123456",
            "address": ""
        },
        "new": false
    },
    {
        "ID": "223456",
        "name": "Store 2",
        "status": "Active",
        "date": "2020-03-28T15:20:00Z",
        "tagIDs": [
            12,
            35
        ],
        "location": {
            "cityID": 21,
            "countryID": 5,
            "geoLocation": {
                "latitude": 1.12512,
                "longitude": 103.23342
            },
            "postcode": "223456",
            "address": ""
        },
        "new": true
    }
]

}
Code:

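import json
import pandas as pd
import requests
import sqlalchemy

response = requests.get(.....)
result = response.text
data = json.loads(result)
df = pd.json_normalize(data["stores"])  # the top-level key in the JSON above is "stores"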
.....

db_connection = sqlalchemy.create_engine(.....)
df.to_sql(con=db_connection, name="store", if_exists="append" )

Error: _mysql_connector.MySQLInterfaceError: Python type list cannot be converted
I would like the DataFrame to actually look like this:

        ID      tagIDs                  date
0   123456          []  2020-04-23T09:32:26Z
1   223456     [12,35]  2019-05-24T03:21:39Z
2   323456  [709,1493]  2019-03-28T15:38:39Z

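So far I have tried other DataFrames and JSON objects, and they all work, so the problem lies in this JSON object. Without "tagIDs", everything else works fine.
I thought that converting the object to a string might let it be written to SQL, but that did not work either. How can I change tagIDs so that everything can be written to SQL? Is there a more efficient way to do this?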

pandas

Source: https://stackoverflow.com/questions/63521604/converting-from-json-to-dataframe-to-sql

2 answers


pn9klfpd1#

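I think the problem is that the tagIDs field is a list, which your database driver cannot convert to a column value.
I am not sure this is the best approach, but you can try converting the column from lists to strings: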

df['tagIDs'] = df['tagIDs'].apply(lambda x: str(x))
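
A variation on the same idea, as a minimal sketch: `str(x)` turns a null entry into the text "None" and stores the Python repr of each list, so serializing with `json.dumps` may be easier to read back later with `json.loads`. The two trimmed records, the `tags_to_json` helper, and the in-memory SQLite connection (standing in for the MySQL engine from the question) are assumptions made for illustration.

```python
import json
import sqlite3

import pandas as pd

# Two records shaped like the question's "stores" entries (trimmed for brevity).
stores = [
    {"ID": "123456", "name": "Store 1", "tagIDs": None},
    {"ID": "223456", "name": "Store 2", "tagIDs": [12, 35]},
]

df = pd.json_normalize(stores)


def tags_to_json(value):
    """Serialize a list of tag IDs to JSON text; null or missing becomes "[]"."""
    return json.dumps(value if isinstance(value, list) else [])


# Hypothetical helper applied to the list-valued column before writing.
df["tagIDs"] = df["tagIDs"].apply(tags_to_json)

# Assumption: in-memory SQLite replaces the MySQL engine so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
df.to_sql(name="store", con=conn, if_exists="append", index=False)

# Read the rows back and decode the JSON text into Python lists again.
rows = conn.execute("SELECT ID, tagIDs FROM store ORDER BY ID").fetchall()
restored = [json.loads(tags) for _, tags in rows]
```

The column is stored as plain text, so tags cannot be filtered with ordinary SQL comparisons; that trade-off is the price of keeping one row per store.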


vsmadaxz2#

Here is an example that does not use a DataFrame; instead the schema is evolved into SQL and the data is normalized:

import dlt

pipeline = dlt.pipeline(destination="duckdb", dataset_name="sequence")

info = pipeline.run([{'id':1}, {'id':2}, {'id':3}], table_name="three")

print(info)
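
For comparison, the idea of normalizing the list into its own table can be sketched by hand in pandas. This is only an illustration of the concept, not a description of what dlt does internally; the trimmed records, the `store_tag` table name, the `store_id` and `tag_id` column names, and the in-memory SQLite connection are all assumptions.

```python
import sqlite3

import pandas as pd

# Two records shaped like the question's "stores" entries (trimmed for brevity).
stores = [
    {"ID": "123456", "name": "Store 1", "tagIDs": None},
    {"ID": "223456", "name": "Store 2", "tagIDs": [12, 35]},
]

df = pd.json_normalize(stores)

# Child table: one row per (store, tag) pair; stores without tags contribute no rows.
tags = (
    df[["ID", "tagIDs"]]
    .explode("tagIDs")
    .dropna(subset=["tagIDs"])
    .rename(columns={"ID": "store_id", "tagIDs": "tag_id"})
    .astype({"tag_id": "int64"})
    .reset_index(drop=True)
)

# Parent table: the scalar columns only, with the list column removed.
store_df = df.drop(columns=["tagIDs"])

# Assumption: in-memory SQLite replaces the MySQL engine so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
store_df.to_sql(name="store", con=conn, if_exists="append", index=False)
tags.to_sql(name="store_tag", con=conn, if_exists="append", index=False)

tag_rows = conn.execute(
    "SELECT store_id, tag_id FROM store_tag ORDER BY tag_id"
).fetchall()
```

With this layout each tag is an ordinary integer column, so it can be joined and filtered in SQL, at the cost of a second table.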

