How do I replace items in an array column with other values in Spark?

yfwxisqw  posted on 2021-05-16 in Spark
+--------------------------------+
|Subject                         |
+--------------------------------+
|[English, Math, Science, Spark] |
|[English, History, Art]         |
+--------------------------------+

How can we replace "English" with "ENGLISH" in both rows?

mzillmmw #1

Use a user-defined function (UDF) to replace the word:

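import org.apache.spark.sql.functions.udf

// Map over each array and swap "English" for "ENGLISH", leaving other items unchanged
val replace = udf { x: Seq[String] => x.map(y => if (y == "English") "ENGLISH" else y) }

// The $"..." column syntax assumes spark.implicits._ is in scope (it is by default in spark-shell)
val df2 = df.select(replace($"Subject").alias("Subject"))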

df2.show(false)
+-------------------------------+
|Subject                        |
+-------------------------------+
|[ENGLISH, Math, Science, Spark]|
|[ENGLISH, History, Art]        |
+-------------------------------+
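
As an alternative to the UDF above, the same replacement can probably be done with Spark SQL's built-in `transform` higher-order function, which is available from Spark 2.4 onward. The following is only a minimal sketch under that assumption: the local `SparkSession`, the sample `df`, and the name `df3` are made up here for illustration and are not part of the original answer.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.expr

// Local session for illustration only; in spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("replace-in-array")
  .getOrCreate()
import spark.implicits._

// Sample data mirroring the Subject column from the question.
val df = Seq(
  Seq("English", "Math", "Science", "Spark"),
  Seq("English", "History", "Art")
).toDF("Subject")

// transform applies the lambda to every element of the array column,
// so each "English" becomes "ENGLISH" and other items pass through.
val df3 = df.withColumn(
  "Subject",
  expr("transform(Subject, s -> CASE WHEN s = 'English' THEN 'ENGLISH' ELSE s END)")
)

df3.show(false)
```

Keeping the logic in a built-in expression lets Catalyst see it, whereas a UDF is opaque to the optimizer. Whether that matters depends on the size of the data.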
