PyCharm: Part of my JSON data goes missing when I read it into a Pandas DataFrame?

bsxbgnwa  posted on 2023-01-30 in PyCharm

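I want to analyze data that I collected in a game study. We store a timestamp, the input type, and then metadata for each round that was played. All of this is stored as JSON, and I want to load it into a Python script so that I can produce some nice plots with matplotlib. To work with Pandas, I wanted to convert it to .CSV and load that into a DataFrame. However, when I print the DataFrame, some of the data appears to be missing.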

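import json
import pandas as pd

with open('/Users/me/Downloads/databackup.json', 'r') as f:
    data = json.load(f)

# Flatten one level only: each entry of gameList becomes one row
multiple_level_data = pd.json_normalize(data, record_path=['gameList'],
                                        meta=[], meta_prefix='config_params_',
                                        record_prefix='dbscan_')

# Write the flattened table to CSV, then read it back into a DataFrame
multiple_level_data.to_csv('GameData.csv', index=False)
df = pd.read_csv("GameData.csv")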

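This is the code I use to convert the JSON to CSV. Currently, we create a new timestamp whenever a player reaches a score of 750 within the last x rounds. When a timestamp has only one round, its data shows up, but when a timestamp has two or more rounds, the data for those rounds does not appear in my df. Did I choose the wrong record_path, or am I using the wrong method to convert it?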

{
    "gameList": [
        {
            "startingTime": "20230125204032",
            "inputType": "joyStick",
            "Rounds": [
                {
                    "durationSeconds": 128,
                    "score": 492,
                    "platformCount": {
                        "normalPlatforms": 60,
                        "movingPlatforms": 41,            #this loads in
                        "powPlatforms": 5,
                        "normalEnemies": 5,
                        "movingEnemies": 8
                    }
                },
                {
                    "durationSeconds": 62,
                    "score": 258,
                    "platformCount": {
                        "normalPlatforms": 35,
                        "movingPlatforms": 23,             #this doesn't
                        "powPlatforms": 3,
                        "normalEnemies": 2,
                        "movingEnemies": 5
                    }
                }
            ]
        },
tzcvj98z  #1

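If you want to walk down a nested path, record_path has to be a list of keys. If you also want several fields from the same level, as is the case for meta here, each field has to be given as its own path, so meta becomes a list of lists.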

df = pd.json_normalize(
    data=data,
    record_path=["gameList", "Rounds"],
    meta_prefix="config_params_",
    record_prefix="dbscan_",
    meta=[["gameList", "startingTime"], ["gameList", "inputType"]]
)

Output:

dbscan_durationSeconds  dbscan_score  dbscan_platformCount.normalPlatforms  dbscan_platformCount.movingPlatforms  dbscan_platformCount.powPlatforms  dbscan_platformCount.normalEnemies  dbscan_platformCount.movingEnemies config_params_gameList.startingTime config_params_gameList.inputType
0                     128           492                                    60                                    41                                  5                                   5                                   8                      20230125204032                         joyStick
1                      62           258                                    35                                    23                                  3                                   2                                   5                      20230125204032                         joyStick
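
To make the difference between the two record_path values concrete, here is a minimal sketch on a trimmed, inline copy of the sample from the question. The `data` dict, the shortened `platformCount` entries, and the names `per_game` and `per_round` are assumptions made for illustration, and the prefixes are left out to keep the column names short. It suggests that with `record_path=["gameList"]` the rounds are not dropped but remain packed into a single `Rounds` cell per game, which would explain why they seem to disappear after the CSV round trip.

```python
import pandas as pd

# Hypothetical inline sample mirroring the structure shown in the question
# (platformCount is shortened to two keys for brevity).
data = {
    "gameList": [
        {
            "startingTime": "20230125204032",
            "inputType": "joyStick",
            "Rounds": [
                {"durationSeconds": 128, "score": 492,
                 "platformCount": {"normalPlatforms": 60, "movingPlatforms": 41}},
                {"durationSeconds": 62, "score": 258,
                 "platformCount": {"normalPlatforms": 35, "movingPlatforms": 23}},
            ],
        }
    ]
}

# One row per game: "Rounds" stays a single cell that holds a Python list.
per_game = pd.json_normalize(data, record_path=["gameList"])

# One row per round: the game-level fields are repeated on every row via meta.
per_round = pd.json_normalize(
    data,
    record_path=["gameList", "Rounds"],
    meta=[["gameList", "startingTime"], ["gameList", "inputType"]],
)

print(per_game.columns.tolist())
print(per_round[["score", "gameList.startingTime"]])
```

If the per-round table is what you need for plotting, it can be passed to matplotlib directly; writing it to CSV and reading it back is optional.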
