JSON file to R data frame

Posted by ryoqjall on 2022-12-06 in Other

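I have a JSON file. The original file is very large, so for this question I have cut it down to a much smaller reproducible example (I get the same error regardless of size):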

{
  "relationships_followers": [
    {
      "title": "",
      "media_list_data": [
        
      ],
      "string_list_data": [
        {
          "href": "https://www.instagram.com/testaccount1",
          "value": "testaccount1",
          "timestamp": 1669418204
        }
      ]
    },
    {
      "title": "",
      "media_list_data": [
        
      ],
      "string_list_data": [
        {
          "href": "https://www.instagram.com/testaccount2",
          "value": "testaccount2",
          "timestamp": 1660426426
        }
      ]
    },
    {
      "title": "",
      "media_list_data": [
        
      ],
      "string_list_data": [
        {
          "href": "https://www.instagram.com/testaccount3",
          "value": "testaccount3",
          "timestamp": 1648230499
        }
      ]
    },
       {
      "title": "",
      "media_list_data": [
        
      ],
      "string_list_data": [
        {
          "href": "https://www.instagram.com/testaccount4",
          "value": "testaccount4",
          "timestamp": 1379513403
        }
      ]
    }
  ]
}

I am trying to convert this into a data frame in R that contains the values of the href, value, and timestamp variables.

However, when I run the code below, which I took from another SO answer about converting JSON to R:

library("rjson")

result <- fromJSON(file = "test_file.json")

json_data_frame <- as.data.frame(result)

I get an error about differing numbers of rows:

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
  arguments imply differing number of rows: 1, 0

How can I convert my data into the desired data frame format?


hsvhsicv #1

The data appears to be nested.
Try this:

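library("rjson")
library("dplyr")

result <- fromJSON(file = "test_file.json")
# Pull out each follower's "string_list_data"; because every entry holds
# exactly one record, sapply() drops the extra list level
result_list <- sapply(result$relationships_followers,
                      "[[", "string_list_data")
# Stack the href/value/timestamp records into one data frame
json_data_frame <- bind_rows(result_list)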
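
The sapply() call above relies on every string_list_data array holding exactly one entry. A minimal base R sketch that also copes with followers that have several entries is shown below. The `result` list here is a hand-built stand-in for the parsed file (an assumption, so the snippet runs without the JSON file), and the second follower's extra entry is invented purely for illustration.

```r
# Hand-built stand-in for the list returned by rjson::fromJSON() (assumption)
result <- list(relationships_followers = list(
  list(title = "", media_list_data = list(),
       string_list_data = list(
         list(href = "https://www.instagram.com/testaccount1",
              value = "testaccount1", timestamp = 1669418204))),
  list(title = "", media_list_data = list(),
       string_list_data = list(
         list(href = "https://www.instagram.com/testaccount2",
              value = "testaccount2", timestamp = 1660426426),
         # hypothetical second entry, not present in the question's data
         list(href = "https://www.instagram.com/testaccount2b",
              value = "testaccount2b", timestamp = 1660426427)))
))

# Collect every follower's string_list_data and flatten one level,
# so each element of `records` is a single href/value/timestamp record
records <- unlist(
  lapply(result$relationships_followers, `[[`, "string_list_data"),
  recursive = FALSE
)

# Turn each record into a one-row data frame and stack the rows
followers <- do.call(
  rbind,
  lapply(records, as.data.frame, stringsAsFactors = FALSE)
)
```

With the real file, `result` would come from fromJSON(file = "test_file.json") instead of being built by hand.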

l5tcr1uw #2

This happens because the data is nested.
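
To see where the "differing number of rows: 1, 0" message comes from, it helps to look at a single follower entry. The sketch below hand-builds one entry the way the parser presumably returns it (an assumption, so the snippet does not need the file): title has length 1 while the empty media_list_data has length 0, and as.data.frame() cannot line those up as columns.

```r
# One follower entry, hand-built to mirror the parsed JSON (assumption)
follower <- list(
  title = "",
  media_list_data = list(),   # empty JSON array -> zero-length list
  string_list_data = list(
    list(href = "https://www.instagram.com/testaccount1",
         value = "testaccount1", timestamp = 1669418204)
  )
)

# Length of each top-level element: these are the row counts that clash
element_lengths <- lengths(follower)
print(element_lengths)

# Capture the error instead of stopping the script
conversion <- tryCatch(as.data.frame(follower), error = function(e) e)
print(inherits(conversion, "error"))
```
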

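# Take the first (and only) record of each follower's "string_list_data",
# then bind the records together row by row
df <- as.data.frame(do.call(rbind, lapply(
  lapply(result$relationships_followers, "[[", "string_list_data"), "[[", 1)))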

df
#>  href                                     value          timestamp
#>  "https://www.instagram.com/testaccount1" "testaccount1" 1669418204
#>  "https://www.instagram.com/testaccount2" "testaccount2" 1660426426
#>  "https://www.instagram.com/testaccount3" "testaccount3" 1648230499
#>  "https://www.instagram.com/testaccount4" "testaccount4" 1379513403
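
One thing to watch for with this approach: rbind() on plain lists produces a list matrix, so the resulting columns are list columns rather than ordinary character or numeric vectors. A small sketch of flattening them follows, using two records hand-copied from the question. Reading timestamp as Unix epoch seconds is an assumption based on the magnitude of the values, and `followed_at` is a hypothetical column name.

```r
# Two records copied from the question, so no file is needed
records <- list(
  list(href = "https://www.instagram.com/testaccount1",
       value = "testaccount1", timestamp = 1669418204),
  list(href = "https://www.instagram.com/testaccount2",
       value = "testaccount2", timestamp = 1660426426)
)

df <- as.data.frame(do.call(rbind, records))
was_list_column <- is.list(df$href)  # columns start out as lists

# Replace each list column with the atomic vector it wraps
df[] <- lapply(df, unlist)

# Assumption: timestamps are seconds since 1970-01-01 UTC
df$followed_at <- as.POSIXct(df$timestamp, origin = "1970-01-01", tz = "UTC")
```
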

Note: by default, the jsonlite package does a better job of parsing JSON into a data.frame.
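
As a rough illustration of that note, here is a sketch using jsonlite on two entries copied from the question and inlined as a string (so the snippet does not depend on test_file.json). It assumes jsonlite is installed and that its default simplification turns relationships_followers into a data frame whose string_list_data column is a list of one-row data frames.

```r
library(jsonlite)

# Two entries from the question, inlined to keep the snippet self-contained
json_text <- '{
  "relationships_followers": [
    {"title": "", "media_list_data": [],
     "string_list_data": [{"href": "https://www.instagram.com/testaccount1",
                           "value": "testaccount1", "timestamp": 1669418204}]},
    {"title": "", "media_list_data": [],
     "string_list_data": [{"href": "https://www.instagram.com/testaccount2",
                           "value": "testaccount2", "timestamp": 1660426426}]}
  ]
}'

parsed <- fromJSON(json_text)

# Stack the per-follower data frames into a single data frame
followers <- do.call(rbind, parsed$relationships_followers$string_list_data)
```

For the real file, fromJSON("test_file.json") can be used in place of the inline string.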
