Spark: converting an RDD to a DataFrame - enum not supported

Posted by jfewjypa on 2021-06-29 in Hive

I have a case class that contains an enumeration field, "PersonType". I want to insert this record into a Hive table.

object PersonType extends Enumeration {
  type PersonType = Value
  val BOSS = Value
  val REGULAR = Value
}

case class Person(firstname: String, lastname: String)
case class Holder(personType: PersonType.Value, person: Person)

And:

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.hive.HiveContext

// sc is an existing SparkContext
val hiveContext = new HiveContext(sc)
import hiveContext.implicits._

val item = new Holder(PersonType.REGULAR, new Person("tom", "smith"))
val content: Seq[Holder] = Seq(item)

val data: RDD[Holder] = sc.parallelize(content)
val df = data.toDF()

... When I try to convert the RDD to a DataFrame, I get the following exception:

Exception in thread "main" java.lang.UnsupportedOperationException:     
Schema for type com.test.PersonType.Value is not supported   
        ...
        at org.apache.spark.sql.catalyst.ScalaReflection$class.schemaFor(ScalaReflection.scala:691)
        at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:30)
        at org.apache.spark.sql.catalyst.ScalaReflection$class.schemaFor(ScalaReflection.scala:630)
        at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:30)
        at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:414)
        at org.apache.spark.sql.SQLImplicits.rddToDataFrameHolder(SQLImplicits.scala:94)

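I would like to convert PersonType to a String before inserting into Hive. Is it possible to extend the implicit conversions to handle PersonType? I tried something like this, but it did not work: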

object PersonTypeConversions {
  implicit def toString(personType: PersonType.Value): String = personType.toString()
}
import PersonTypeConversions._

Spark version: 1.6.0
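
For reference, one possible workaround, offered as a sketch rather than a confirmed fix: the stack trace suggests the schema is derived by reflection over the case class's field types (ScalaReflection.schemaFor), so an implicit conversion to String is probably never consulted. Mapping each Holder to a mirror case class whose enum field is already a String avoids the unsupported type. HolderRow and HolderConversions are hypothetical names introduced only for this illustration; the Spark call appears as a comment because it needs the sc and hiveContext from the question.

```scala
object PersonType extends Enumeration {
  type PersonType = Value
  val BOSS = Value
  val REGULAR = Value
}

case class Person(firstname: String, lastname: String)
case class Holder(personType: PersonType.Value, person: Person)

// Hypothetical mirror of Holder: the enum is stored as its name,
// so every field has a type that schema inference can handle.
case class HolderRow(personType: String, person: Person)

object HolderConversions {
  // Enum value -> its name, e.g. PersonType.REGULAR becomes "REGULAR".
  def toRow(h: Holder): HolderRow =
    HolderRow(h.personType.toString, h.person)

  // Name -> enum value; withName throws NoSuchElementException
  // if the stored string is not a known PersonType.
  def fromRow(r: HolderRow): Holder =
    Holder(PersonType.withName(r.personType), r.person)
}

// Assumed usage with the question's setup (import hiveContext.implicits._ in scope):
//   val df = data.map(HolderConversions.toRow).toDF()
```

The trade-off is a second case class to maintain, but the conversion stays explicit and the Hive table ends up with an ordinary string column for the person type.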
