Comparing two Spark schemas in Java: cannot convert Seq<StructField> to List<StructField>

tp5buhyn, posted on 2021-05-27 in Spark

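Question: I want to get the attributes that two schemas have in common, in DDL format.
I have the following working Scala code that computes the intersection of the schemas: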

val diff = df1.schema.intersect(df2.schema)
val sb = new StringBuilder(); 
diff.toStream.foreach(x => sb.append( x.toDDL + ", "))

But when I port this snippet to Java I run into type-conversion problems:

StructType s1 = new StructType().add("col1",StringType)
                                .add("col2",StringType)
                                .add("col3",StringType)
                                .add("col4",StringType);

StructType s2 = new StructType().add("col1",StringType)
                                .add("col4",StringType);

System.out.println("Output :" + s1.toList().intersect(s2.toList()));
Output :List(StructField(col1,StringType,true), StructField(col4,StringType,true))

I cannot convert this output to DDL. I tried reading the object above as a Seq, but it fails with a compile error:

Seq<StructField> result = s1.toList().intersect(s2.toList());

Error: java: incompatible types: java.lang.Object cannot be converted to scala.collection.Seq<org.apache.spark.sql.types.StructField>

Another attempt:

StringBuilder sb = new StringBuilder();
s1.toList().intersect(s2.toList()).foreach( (schema) -> sb.append(schema.toDDL() + ","));

Error:(81, 39) java: cannot find symbol
  symbol:   method foreach((schema)->[...] ","))
  location: class java.lang.Object

Any ideas on how to get this as a List<StructField>, so that it can be converted to DDL?

Answer #1 (zd287kbt)

The only way I know of is JavaConversions, roughly like this:

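// Java sees the result of intersect(...) as Object (see the compile error above),
// so cast it back to Seq<StructField> before converting it to a java.util.List.
Object something = s1.toList().intersect(s2.toList());
List<StructField> result = JavaConversions.seqAsJavaList((Seq<StructField>) something);
System.out.println("Output :" + result);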

…which prints

Output :[StructField(col1,StringType,true), StructField(col4,StringType,true)]
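
To get from that list to the DDL string the question asks for, one option is to do the intersection and the joining with plain Java collections, which sidesteps the Scala converters entirely (JavaConversions was deprecated in Scala 2.12 and removed in 2.13). The sketch below is only an illustration under stated assumptions: FieldStub is a hypothetical stand-in for Spark's StructField so that the code runs without Spark on the classpath, its toDDL() is a simplified rendering rather than Spark's exact output, and the names SchemaIntersection, commonFields and toDDL are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class SchemaIntersection {

    // Hypothetical stand-in for Spark's StructField (assumption, not the real class).
    static final class FieldStub {
        final String name;
        final String type;
        final boolean nullable;

        FieldStub(String name, String type, boolean nullable) {
            this.name = name;
            this.type = type;
            this.nullable = nullable;
        }

        // Simplified DDL fragment; the real StructField.toDDL() output depends on the Spark version.
        String toDDL() {
            return name + " " + type;
        }

        // Structural equality, mirroring how a Scala case class compares.
        @Override
        public boolean equals(Object o) {
            if (!(o instanceof FieldStub)) return false;
            FieldStub other = (FieldStub) o;
            return name.equals(other.name) && type.equals(other.type) && nullable == other.nullable;
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, type, nullable);
        }
    }

    // Keeps the fields of `left` that also occur in `right`, preserving the order of `left`.
    static List<FieldStub> commonFields(List<FieldStub> left, List<FieldStub> right) {
        List<FieldStub> common = new ArrayList<>(left);
        common.retainAll(right);
        return common;
    }

    // Joins the DDL fragments with ", " and leaves no trailing separator.
    static String toDDL(List<FieldStub> fields) {
        return fields.stream().map(FieldStub::toDDL).collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        List<FieldStub> s1 = Arrays.asList(
                new FieldStub("col1", "STRING", true),
                new FieldStub("col2", "STRING", true),
                new FieldStub("col3", "STRING", true),
                new FieldStub("col4", "STRING", true));
        List<FieldStub> s2 = Arrays.asList(
                new FieldStub("col1", "STRING", true),
                new FieldStub("col4", "STRING", true));

        // Prints: col1 STRING, col4 STRING
        System.out.println(toDDL(commonFields(s1, s2)));
    }
}
```

With Spark available, the same shape should carry over by building the input lists from Arrays.asList(s1.fields()) and Arrays.asList(s2.fields()) and mapping with StructField::toDDL. Collectors.joining also avoids the trailing ", " that the StringBuilder loop in the question leaves behind.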
