Python - remove rows that contain a special character such as "/" in PySpark or Spark SQL

yqkkidmi posted on 2021-05-29 in Spark

I have a dataset with a column A:

A
107/108
105
103
103/104

The expected output is:

105
103

I have tried many filter functions in PySpark and Spark SQL, but the code does not work.


xytpbqjk #1

You can use any of the rlike, like, or contains functions together with negation (~):

```
from pyspark.sql.functions import *

df = spark.createDataFrame([('107/108',), ('105',), ('103',), ('103/104',)], ['A'])
df.show()
# +-------+
# |      A|
# +-------+
# |107/108|
# |    105|
# |    103|
# |103/104|
# +-------+

# using the rlike function
df.filter(~col("A").rlike("/")).show()

# using the like function
df.filter(~col("A").like("%/%")).show()

# using the contains function
df.filter(~col("A").contains("/")).show()

# each of the three filters returns:
# +---+
# |  A|
# +---+
# |105|
# |103|
# +---+
```
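Since the question also mentions Spark SQL, the same filter can be expressed in SQL. This is a minimal sketch, not part of the original answer; the temporary view name `t` is only for illustration:

```
df.createOrReplaceTempView("t")

# keep rows whose A does not contain "/" (SQL LIKE with negation)
spark.sql("SELECT A FROM t WHERE A NOT LIKE '%/%'").show()

# the same with a regular expression (RLIKE)
spark.sql("SELECT A FROM t WHERE NOT (A RLIKE '/')").show()
```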

UPDATE: if a row contains an empty string, these filters keep it, since the empty string does not contain "/":

```
df = spark.createDataFrame([('107/108',), ('105',), ('103',), ('103/104',), ('',)], ['A'])
df.show()
# +-------+
# |      A|
# +-------+
# |107/108|
# |    105|
# |    103|
# |103/104|
# |       |
# +-------+

df.filter(~col("A").rlike("/")).show()
df.filter(~col("A").like("%/%")).show()
df.filter(~col("A").contains("/")).show()

# each filter returns:
# +---+
# |  A|
# +---+
# |105|
# |103|
# |   |
# +---+
```
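One caveat, assuming the real data may also contain NULL values (the question does not say): all three expressions evaluate to NULL for a NULL row, so filter drops such rows silently. A minimal sketch of keeping NULL rows explicitly; the data here is hypothetical:

```
from pyspark.sql.functions import col

# hypothetical data with a NULL value in column A
df_null = spark.createDataFrame([('107/108',), ('105',), (None,)], 'A string')

# isNull() keeps the NULL row; ~contains("/") keeps rows without a slash
df_null.filter(col("A").isNull() | ~col("A").contains("/")).show()
```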
