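I have data with a column A containing these values:
107/108, 105, 103, 103/104
The output should look like this:
105, 103
I have tried many filter functions in PySpark and SQL, but my code does not work.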
1 answer
xytpbqjk1#
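You can use any of the `rlike`, `like`, or `contains` column functions, combined with negation (`~`), to keep only the rows that do not contain a slash.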
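```
# Assumes an existing SparkSession named `spark` (as in the pyspark shell).
from pyspark.sql.functions import col

df = spark.createDataFrame([('107/108',), ('105',), ('103',), ('103/104',)], ['A'])
df.show()
# +-------+
# |      A|
# +-------+
# |107/108|
# |    105|
# |    103|
# |103/104|
# +-------+

# using the rlike function (regular-expression match)
df.filter(~col("A").rlike("/")).show()

# using the like function (SQL LIKE pattern)
df.filter(~col("A").like("%/%")).show()

# using the contains function (plain substring match)
df.filter(~col("A").contains("/")).show()

# Each of the three filters prints the same result:
# +---+
# |  A|
# +---+
# |105|
# |103|
# +---+
```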
UPDATE: the same filters also keep rows whose value is an empty string.

```
df = spark.createDataFrame([('107/108',), ('105',), ('103',), ('103/104',), ('',)], ['A'])
df.show()
# +-------+
# |      A|
# +-------+
# |107/108|
# |    105|
# |    103|
# |103/104|
# |       |
# +-------+

df.filter(~col("A").rlike("/")).show()
df.filter(~col("A").like("%/%")).show()
df.filter(~col("A").contains("/")).show()

# Each of the three filters prints the same result:
# +---+
# |  A|
# +---+
# |105|
# |103|
# |   |
# +---+
```