F.when(~F.col('text').rlike('\bfoo\b')) does not work as expected in PySpark

lawou6xi  posted on 2021-07-09  in  Spark

Why does this part, (~F.col('text').rlike('\bfoo\b')), not work?
Update:

import pyspark.sql.functions as F

df = spark.createDataFrame(
    [('Some text with foo and more text', 'value_1'),
     ('Some text with bar and more text', None)],
    'text string, check string')

# my_udf is a UDF defined elsewhere (not shown in the question)
df_new = df.withColumn('check', F.when((F.col('text').isNotNull()) &
                                       (F.col('check').isNull()) &
                                       (~F.col('text').rlike('\bfoo\b')),
                                       my_udf(F.col('text')))
                                 .otherwise(F.col('check')))

df_new.show(truncate=False)
+--------------------------------+-------+
|text                            |check  |
+--------------------------------+-------+
|Some text with foo and more text|value_1|
|Some text with bar and more text|       |
+--------------------------------+-------+
trnvg8h3


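In a normal Python string literal, '\b' is the backspace character, so the pattern never reaches the regex engine as a word boundary. Try passing the regex to rlike as a raw string instead: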

df_new = df.withColumn('check', ~F.col('text').rlike(r'\bfoo\b'))

df_new.show(truncate=False)

# +--------------------------------+-----+
# |text                            |check|
# +--------------------------------+-----+
# |Some text with foo and more text|false|
# |Some text with bar and more text|true |
# +--------------------------------+-----+

There is no need to use when here; rlike/like already return a boolean column.
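
To see why the raw string matters, here is a minimal sketch in plain Python, with no Spark session needed. It uses Python's own re module as a stand-in for the regex engine behind rlike, which is an assumption made only for illustration; the string-escaping step it demonstrates happens in Python before the pattern is handed to Spark. The names plain, raw and text are made up for this example.

```python
import re

# Without the r prefix, Python converts each \b into the backspace
# character (0x08) before any regex engine sees the pattern.
plain = '\bfoo\b'
# With the r prefix the backslashes survive, so the regex engine
# receives \b and treats it as a word boundary.
raw = r'\bfoo\b'

text = 'Some text with foo and more text'

print(len(plain), len(raw))               # 5 7
print(plain == '\x08foo\x08')             # True
print(re.search(plain, text) is None)     # True: no backspace chars in text
print(re.search(raw, text) is not None)   # True: 'foo' matched as a whole word
```

Because the non-raw pattern looks for literal backspace characters, it presumably matches no row at all, so the negated condition ends up true for every row regardless of whether the text contains foo.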
