The otherwise clause is not working as expected. What is wrong here?

uqxowvwt  posted on 2021-05-27 in Spark

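I am using spark-sql-2.4.1v. How can I apply different lookups depending on the value of a column? For the given value columns, I need to fetch the corresponding lookup values from map_val, as shown below.
Sample data: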

val data = List(
  ("20", "score", "school", "2018-03-31", 14 , 12),
  ("21", "score", "school", "2018-03-31", 13 , 13),
  ("22", "rate", "school", "2018-03-31", 11 , 14),
  ("21", "rate", "school", "2018-03-31", 13 , 12)
 )
val df = data.toDF("id", "code", "entity", "date", "value1", "value2")

df.show

+---+-----+------+----------+------+------+
| id| code|entity|      date|value1|value2|
+---+-----+------+----------+------+------+
| 20|score|school|2018-03-31|    14|    12|
| 21|score|school|2018-03-31|    13|    13|
| 22| rate|school|2018-03-31|    11|    14|
| 21| rate|school|2018-03-31|    13|    12|
+---+-----+------+----------+------+------+

 val resultDs = df
                 .withColumn("value1",
                        when(col("code").isin("rate") , functions.callUDF("udfFunc",col("value1")))
                         .otherwise(col("value1").cast(DoubleType))
                      )

The map used by udfFunc is as follows:

11->a
12->b
13->c
14->d

Expected output:

+---+-----+------+----------+------+------+
| id| code|entity|      date|value1|value2|
+---+-----+------+----------+------+------+
| 20|score|school|2018-03-31|    14|    12|
| 21|score|school|2018-03-31|    13|    13|
| 22| rate|school|2018-03-31|    a |    14|
| 21| rate|school|2018-03-31|    c |    12|
+---+-----+------+----------+------+------+

But the actual output is:

+---+-----+------+----------+------+------+
| id| code|entity|      date|value1|value2|
+---+-----+------+----------+------+------+
| 20|score|school|2018-03-31|  null|    12|
| 21|score|school|2018-03-31|  null|    13|
| 22| rate|school|2018-03-31|    a |    14|
| 21| rate|school|2018-03-31|    c |    12|
+---+-----+------+----------+------+------+

Why does the otherwise condition not work as expected? Any idea what is wrong here?


rjjhvcjd  #1

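Both branches of the expression must produce the same data type for the column.
Note: DoubleType cannot store StringType data, so in the otherwise branch you need to cast to StringType instead of DoubleType.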

val resultDs = df
.withColumn("value1",
        when(col("code") === lit("rate") ,functions.callUDF("udfFunc",col("value1")))
        .otherwise(col("value1").cast(StringType)) // Should be StringType
    )

val resultDs = df
                 .withColumn("value1",
                        when(col("code").isin("rate") , functions.callUDF("udfFunc",col("value1")))
                         .otherwise(col("value1").cast(StringType)) // Modified to StringType
                      )
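
To try the fix end to end, here is a minimal self-contained sketch. The question does not show how udfFunc is defined, so the registration below (an Int => String lookup over the 11..14 mapping from the question) is an assumption, as are the object and helper names `OtherwiseCastExample`, `registerUdf` and `mapRateValues`. The sample data is trimmed to the columns that matter.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{callUDF, col, when}
import org.apache.spark.sql.types.StringType

object OtherwiseCastExample {
  // Lookup taken from the mapping listed in the question.
  val lookup: Map[Int, String] = Map(11 -> "a", 12 -> "b", 13 -> "c", 14 -> "d")

  // Hypothetical definition of udfFunc (not shown in the question):
  // returns the mapped string, or null when the key is missing.
  def registerUdf(spark: SparkSession): Unit = {
    spark.udf.register("udfFunc", (v: Int) => lookup.get(v).orNull)
    ()
  }

  // Both branches yield strings: the UDF returns a String,
  // and the otherwise branch casts value1 to StringType.
  def mapRateValues(df: DataFrame): DataFrame =
    df.withColumn("value1",
      when(col("code") === "rate", callUDF("udfFunc", col("value1")))
        .otherwise(col("value1").cast(StringType)))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("otherwise-cast-example")
      .getOrCreate()
    import spark.implicits._

    registerUdf(spark)

    val df = List(
      ("20", "score", 14),
      ("22", "rate", 11)
    ).toDF("id", "code", "value1")

    val result = mapRateValues(df)
    result.printSchema() // check the reported type of value1 here
    result.show()

    spark.stop()
  }
}
```

Calling `printSchema()` on the result is a quick way to confirm which type the `when`/`otherwise` expression resolved to before looking at the data.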

b91juud3  #2

I suggest changing it to:

df
                 .withColumn("value1",
                        when(col("code") === lit("rate") , functions.callUDF("udfFunc",col("value1")))
                         .otherwise(col("value1").cast(StringType))
                      )

Give it a try.
