Need to write a JSON DataFrame to Avro file format from spark-shell

c8ib6hqw  posted on 2021-05-27 in Spark

Hi, I need to read JSON data into a Spark DataFrame and then write that DataFrame out in Avro file format from spark-shell, but I get the following error:

org.apache.spark.sql.AnalysisException: Failed to find data source: avro. Avro is built-in but external data source module since Spark 2.4. Please deploy the application as per the deployment section of "Apache Avro Data Source Guide".;
  at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:647)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:245)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:229)
  ... 49 elided
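For reference, a minimal sketch of the kind of spark-shell session that hits this error; the input and output paths below are placeholders, not taken from the original question:

// read the JSON data into a DataFrame (path is a placeholder)
val df = spark.read.json("/path/to/input.json")

// writing in Avro format fails with the AnalysisException above
// when the external spark-avro module is not on the classpath
df.write.format("avro").save("/path/to/output-avro")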

quhf5bfb1#

Have you tried adding the spark-avro library?
When you start spark-shell, you can do it like this:

spark-shell --packages org.apache.spark:spark-avro_2.11:2.4.6
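Once the shell comes up with that package loaded, the same write should go through; a quick usage sketch (paths are placeholders):

val df = spark.read.json("/path/to/input.json")

// with spark-avro on the classpath, the short name "avro" resolves
df.write.format("avro").save("/path/to/output-avro")

// the Avro output can be read back the same way
val avroDf = spark.read.format("avro").load("/path/to/output-avro")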

mhd8tkvw2#

spark-avro is an external library, so the spark-avro package must be loaded into spark-shell.
See below:

spark-shell --packages org.apache.spark:spark-avro_2.11:2.4.0
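Note that in these coordinates the _2.11 suffix is the Scala build and the last number is the Spark version; both should match your installation. For example, on a Spark 2.4.x build compiled against Scala 2.12, the equivalent command would look like this (assuming a matching spark-avro_2.12 artifact is published for your exact Spark version):

spark-shell --packages org.apache.spark:spark-avro_2.12:2.4.6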
