How to create a Spark table with a nested list

Posted by zbdgwd5y on 2021-05-29 in Spark


How can I use this answer (list to dataframe in pyspark) to create a table with Spark from a nested list?

lst = [{'sfObject': 'event',
  'objID': 'Id',
  'interimRun': 'True',
  'numAttributes_Total': 140,
  'numAttributes_Compounded': 0,
  'numAttributes_nonCompounded': 140,
  'chunks': 1,
  'compoundStatus': 'False',
  'allAttributes': ['Id',
   'RecordTypeId',
   'WhoId',
   'Advisor_Team__c', …],
  'compoundAttributes': [],
  'nonCompoundAttributes': ['Id',
   'RecordTypeId',
   'WhoId',
   'WhatId', …]},
 {'sfObject': 'fund__c',
  'objID': 'Id',
  'interimRun': 'False',
  'numAttributes_Total': 40,
  'numAttributes_Compounded': 0,
  'numAttributes_nonCompounded': 40,
  'chunks': 1,
  'compoundStatus': 'False',
  'allAttributes': ['Id',
   'IsDeleted',
   'Name', …],
  'compoundAttributes': [],
  'nonCompoundAttributes': ['Id',
   'IsDeleted',
   'Name',
   'RecordTypeId', …]}]

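I want to create a table to store this list, so it needs the following structure.
The link below is an image of the table I need to create from the lst above:
[image]
This nested list can have up to 30 different items, so the answer needs to create up to 30 rows dynamically, one per item.
Thank you!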

python apache-spark pyspark

Source: https://stackoverflow.com/questions/62294583/how-to-create-spark-table-with-nested-list

1 Answer


d8tt03nd #1

Once you have a list of dictionaries, run the following. It will infer the schema, producing one row per dictionary.

# sc is the SparkContext, e.g. spark.sparkContext
df = sc.parallelize(lst).toDF()
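Schema inference from dictionaries emits a deprecation warning in some PySpark versions, and it may fail to determine a type for a column whose list is empty in every record (compoundAttributes in the sample). A minimal sketch of an alternative, assuming the three list fields should become array<string> columns and the counts bigint: pass an explicit DDL schema to spark.createDataFrame. The names FIELDS, to_rows and records_to_df are hypothetical helpers, not part of the original answer, and the sample is a shortened version of the question's lst.

```python
# Column names come from the keys of the question's `lst`.
# The column types are an assumption made for this sketch.
FIELDS = [
    ("sfObject", "string"),
    ("objID", "string"),
    ("interimRun", "string"),
    ("numAttributes_Total", "bigint"),
    ("numAttributes_Compounded", "bigint"),
    ("numAttributes_nonCompounded", "bigint"),
    ("chunks", "bigint"),
    ("compoundStatus", "string"),
    ("allAttributes", "array<string>"),
    ("compoundAttributes", "array<string>"),
    ("nonCompoundAttributes", "array<string>"),
]

# DDL-formatted schema string, e.g. "sfObject string, objID string, ..."
SCHEMA_DDL = ", ".join(f"{name} {dtype}" for name, dtype in FIELDS)


def to_rows(records):
    """Turn each dict into a tuple ordered like FIELDS; missing keys become None."""
    return [tuple(rec.get(name) for name, _ in FIELDS) for rec in records]


def records_to_df(spark, records):
    """Build a DataFrame with one row per dict; `spark` is an existing SparkSession."""
    return spark.createDataFrame(to_rows(records), schema=SCHEMA_DDL)


# Shortened sample in the shape of the question's `lst` (attribute lists truncated).
sample = [
    {"sfObject": "event", "objID": "Id", "interimRun": "True",
     "numAttributes_Total": 140, "numAttributes_Compounded": 0,
     "numAttributes_nonCompounded": 140, "chunks": 1, "compoundStatus": "False",
     "allAttributes": ["Id", "RecordTypeId", "WhoId"],
     "compoundAttributes": [],
     "nonCompoundAttributes": ["Id", "RecordTypeId", "WhoId"]},
    {"sfObject": "fund__c", "objID": "Id", "interimRun": "False",
     "numAttributes_Total": 40, "numAttributes_Compounded": 0,
     "numAttributes_nonCompounded": 40, "chunks": 1, "compoundStatus": "False",
     "allAttributes": ["Id", "IsDeleted", "Name"],
     "compoundAttributes": [],
     "nonCompoundAttributes": ["Id", "IsDeleted", "Name"]},
]

rows = to_rows(sample)
```

With a live session this could be called as df = records_to_df(spark, lst). Because each dictionary maps to exactly one tuple, a list of up to 30 items should yield up to 30 rows without any per-item code.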

If you want to treat it as a table that you can run SQL queries against, run:

df.createOrReplaceTempView("df_table")
new_df = spark.sql("SELECT * FROM df_table")
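Once the temp view exists, the array columns can be queried with Spark SQL's built-in size and explode functions. The following is a rough sketch, assuming the view is named df_table as above and that allAttributes and nonCompoundAttributes are array columns; the helper names summary_query, exploded_query and run_queries are hypothetical.

```python
def summary_query(view_name="df_table"):
    """SQL returning one row per object with the length of each attribute list."""
    return (
        "SELECT sfObject, chunks, "
        "size(allAttributes) AS n_all, "
        "size(nonCompoundAttributes) AS n_non_compound "
        f"FROM {view_name}"
    )


def exploded_query(view_name="df_table"):
    """SQL that unnests allAttributes into one row per (object, attribute) pair."""
    return f"SELECT sfObject, explode(allAttributes) AS attribute FROM {view_name}"


def run_queries(spark, view_name="df_table"):
    """Run both queries on an existing SparkSession and return the two DataFrames."""
    return spark.sql(summary_query(view_name)), spark.sql(exploded_query(view_name))
```

For example, summary_df, exploded_df = run_queries(spark) would give a per-object summary and a long-format table; whether the long format is useful depends on how the table is consumed downstream.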

