pandas: Loading JSON data into a DataFrame in Python

mm9b1k5b posted on 2021-07-14 in Spark
list = {'masterId': 2, 'name': 'name', 'description': 'xyz', 'signalTypeRefId': 4, 'unitOfMeasureRefId': 1, 'precision': 1, 'min': -125, 'max': 125, 'isDeprecated': False}

I need to load the JSON above into a DataFrame. I tried the following, but it did not work:

df = pd.DataFrame.from_dict(list, orient = 'index')
display(df)

Error:
TypeError: field 0: Can not merge type <class 'pyspark.sql.types.LongType'> and <class 'pyspark.sql.types.StringType'>

plicqrtu #1

You need to wrap the dictionary in a list before creating the DataFrame:

data = {'masterId': 2, 'name': 'name', 'description': 'xyz', 'signalTypeRefId': 4, 'unitOfMeasureRefId': 1, 'precision': 1, 'min': -125, 'max': 125, 'isDeprecated': False}

df = spark.createDataFrame([data])

df.show()
+-----------+------------+--------+---+----+----+---------+---------------+------------------+
|description|isDeprecated|masterId|max| min|name|precision|signalTypeRefId|unitOfMeasureRefId|
+-----------+------------+--------+---+----+----+---------+---------------+------------------+
|        xyz|       false|       2|125|-125|name|        1|              4|                 1|
+-----------+------------+--------+---+----+----+---------+---------------+------------------+

Alternatively, you can convert it to a pandas DataFrame first and create the Spark DataFrame from that. You still need to wrap the dictionary in a list:

data = {'masterId': 2, 'name': 'name', 'description': 'xyz', 'signalTypeRefId': 4, 'unitOfMeasureRefId': 1, 'precision': 1, 'min': -125, 'max': 125, 'isDeprecated': False}

df = spark.createDataFrame(pd.DataFrame([data]))

df.show()
+--------+----+-----------+---------------+------------------+---------+----+---+------------+
|masterId|name|description|signalTypeRefId|unitOfMeasureRefId|precision| min|max|isDeprecated|
+--------+----+-----------+---------------+------------------+---------+----+---+------------+
|       2|name|        xyz|              4|                 1|        1|-125|125|       false|
+--------+----+-----------+---------------+------------------+---------+----+---+------------+
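
To see why the list wrapper matters, here is a pandas-only sketch comparing the two layouts. The names `tall` and `wide` are illustrative. The idea that `display()` passes the frame to Spark for schema inference is an assumption based on the error message in the question, not something the question confirms.

```python
import pandas as pd

data = {'masterId': 2, 'name': 'name', 'description': 'xyz',
        'signalTypeRefId': 4, 'unitOfMeasureRefId': 1, 'precision': 1,
        'min': -125, 'max': 125, 'isDeprecated': False}

# orient="index": each key becomes a row label and every value lands
# in a single column named 0, so ints, strings and a bool share a column.
tall = pd.DataFrame.from_dict(data, orient="index")

# Wrapping the dict in a list: one row, one column per key,
# so each column holds a single type.
wide = pd.DataFrame([data])

print(tall.shape, tall[0].dtype)
print(wide.shape)
print(wide.dtypes)
```

The single mixed-type column in `tall` is consistent with the LongType/StringType merge error above, while `wide` gives each field its own typed column.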
oxalkeyp #2

data = pd.DataFrame([list])

For more information on converting JSON to a DataFrame, see the link below :)
https://pandas.pydata.org/docs/reference/api/pandas.read_json.html
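
If the data arrives as JSON text rather than as a Python dict, a minimal sketch could look like the following. The string `raw` is a shortened, hypothetical version of the record from the question. Using `lines=True` is one way to make `read_json` treat a single flat object as one record; it is a choice made for this sketch, not something the answer prescribes.

```python
import json
from io import StringIO

import pandas as pd

# Hypothetical JSON text holding a subset of the question's fields.
raw = ('{"masterId": 2, "name": "name", "description": "xyz", '
       '"min": -125, "max": 125, "isDeprecated": false}')

# Option 1: parse with the standard library, then wrap the dict in a list.
record = json.loads(raw)
df_from_dict = pd.DataFrame([record])

# Option 2: let pandas parse the text; lines=True reads one record per line.
df_from_text = pd.read_json(StringIO(raw), lines=True)

print(df_from_dict)
print(df_from_text)
```

Both variants should give a single-row frame with one column per key.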

knsnq2tg #3

dct = {'masterId': 2, 'name': 'name', 'description': 'xyz', 'signalTypeRefId': 4, 'unitOfMeasureRefId': 1, 'precision': 1, 'min': -125, 'max': 125, 'isDeprecated': False}

df = pd.DataFrame.from_dict(dct, orient="index")
display(df)

"""
                        0
masterId                2
name                 name
description           xyz
signalTypeRefId         4
unitOfMeasureRefId      1
precision               1
min                  -125
max                   125
isDeprecated        False
"""

To get it as a single row, use .transpose():
df.transpose()
"""
Out[15]:
masterId name description ... min max isDeprecated
0 2 name xyz ... -125 125 False
"""

zqdjd7g9 #4

data = pd.json_normalize(list)

https://pandas.pydata.org/docs/reference/api/pandas.json_normalize.html
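
`json_normalize` is mostly useful when the JSON is nested. The record in the question is flat, so the sketch below uses a hypothetical nested variant (the `range` field is invented for illustration) to show how nested keys are flattened into dotted column names.

```python
import pandas as pd

# Hypothetical nested record; "range" does not appear in the question.
nested = {
    'masterId': 2,
    'name': 'name',
    'range': {'min': -125, 'max': 125},
}

# A single dict yields a single row; nested keys are joined with ".".
df = pd.json_normalize(nested)

print(df.columns.tolist())
print(df)
```

For a flat dict like the one in the question, this should give the same one-row layout as `pd.DataFrame([data])`.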
