Check whether a value falls between two columns, Spark Scala

xa9qqrwz  posted on 2021-07-12 in Spark

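I have two DataFrames: one holds my data and the other holds the ranges to compare against. I want to check whether a value falls within the range defined by two different columns, for example: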

Df_player
    +--------+-------+
    | Baller | Power |
    +--------+-------+
    | John   |   1.5 |
    | Bilbo  |   3.7 |
    | Frodo  |   6   |
    +--------+-------+

Df_Check
    +--------+--------+--------+
    | First  | Second | Value  |
    +--------+--------+--------+
    |   1    |   1.5  |  Bad-  |
    |   1.5  |   3    |  Bad   |
    |   3    |   4.2  |  Good  |
    |   4.2  |   6    |  Good+ |
    +--------+--------+--------+

The expected result is:

Df_out
    +--------+-------+--------+
    | Baller | Power | Value  |
    +--------+-------+--------+
    | John   |   1.5 |  Bad-  |
    | Bilbo  |   3.7 |  Good  |
    | Frodo  |   6   |  Good+ |
    +--------+-------+--------+
5ktev3wc  #1

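You can join on a range condition, but note that .between is not suitable here: it is inclusive on both ends, whereas you want one of the two comparisons to be strict (Power > First and Power <= Second):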

val result = df_player.join(
    df_check, 
    df_player("Power") > df_check("First") && df_player("Power") <= df_check("Second"), 
    "left"
).select("Baller", "Power", "Value")

result.show
+------+-----+-----+
|Baller|Power|Value|
+------+-----+-----+
|  John|  1.5| Bad-|
| Bilbo|  3.7| Good|
| Frodo|  6.0|Good+|
+------+-----+-----+
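
To try the join end to end, the following is a minimal sketch that rebuilds both DataFrames from the tables in the question and contrasts the half-open condition with a .between join. The local SparkSession setup, the app name, and the names halfOpen and closed are assumptions made for illustration; in spark-shell the spark value already exists and the builder line can be skipped.

```scala
import org.apache.spark.sql.SparkSession

// Assumed local session for experimentation; spark-shell already provides `spark`.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("range-join-sketch")
  .getOrCreate()
import spark.implicits._

// Sample data copied from the tables in the question.
val df_player = Seq(("John", 1.5), ("Bilbo", 3.7), ("Frodo", 6.0))
  .toDF("Baller", "Power")
val df_check = Seq(
  (1.0, 1.5, "Bad-"),
  (1.5, 3.0, "Bad"),
  (3.0, 4.2, "Good"),
  (4.2, 6.0, "Good+")
).toDF("First", "Second", "Value")

// Half-open range (First, Second]: each Power matches at most one range row.
val halfOpen = df_player.join(
  df_check,
  df_player("Power") > df_check("First") && df_player("Power") <= df_check("Second"),
  "left"
).select("Baller", "Power", "Value")

// Closed range [First, Second] via .between: a boundary value such as 1.5
// matches both the (1, 1.5) row and the (1.5, 3) row, so John appears twice.
val closed = df_player.join(
  df_check,
  df_player("Power").between(df_check("First"), df_check("Second")),
  "left"
).select("Baller", "Power", "Value")

halfOpen.show()
closed.show()
```

Because the join type is "left", a player whose Power falls outside every range (for example 0.5 or 7) is still kept in the output, with a null Value.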
