比较来自不同df,spark scala的两列

9wbgstp7  于 2021-07-12  发布在  Spark
关注(0)|答案(1)|浏览(271)

我试图比较两列,每列来自不同的df。我有这两个:

df1
+----+-------+-------+
|Game|rev_1_t|rev_2_t|
+----+-------+-------+
|  CA|    AA |    AA |
|  FT|    B  |    C  |
+----+-------+-------+

df_prev
+----+-------+-------+
|Game|rev_1_t|rev_2_t|
+----+-------+-------+
|  CA|    C  |   AA  |  
|  FT|    B  |   C   |
+----+-------+-------+

我想比较df1中的rev_1_t和df_prev中的rev_1_t,如果有更改,则放入一个名为change的新列,如果没有更改,则使用“y”和“n”。同时,我想添加一个名为prev\u value的新列,在该列中存储df\u prev中rev\u 1\u t的先前值。版本2相同。输出为:

Output:
+----+-------+--------+------------+---------+----------+--------------+
|Game|rev_1_t| change | prev_value | rev_2_t | change_2 | prev_value_2 | 
+----+-------+--------+------------+---------+----------+--------------+
|  CA|    C  |   Y    |      C     |     AA  |   Y      |      C       |
|  FT|    B  |   Y    |      B     |     C   |   Y      |      B       |
+----+-------+--------+------------+---------+----------+--------------+

我试着做你在这里看到的,但我有不同的错误:

val change = df1.withColumn(
   "change", when (df1("rev_1_t") === df_prev("rev_1_t"), df1("rev_1_t")).otherwise(df_prev("rev_1_t"))
  .withColumn(
   "prev_value", when(df1("rev_1_t") === df_prev("rev_1_t"), "N").otherwise("Y"))
cigdeys3

cigdeys31#

可以执行联接,然后比较相关列:

import org.apache.spark.sql.expressions.Window

val result = df1.join(df_prev, Seq("Game"), "left")
    .select(col("Game"), 
            df1("rev_1_t"), 
            when(df1("rev_1_t") === df_prev("rev_1_t"), "N").otherwise("Y").as("change"), 
            df_prev("rev_1_t").as("prev_value"), 
            df1("rev_2_t"), 
            when(df1("rev_2_t") === df_prev("rev_2_t"), "N").otherwise("Y").as("change_2"), 
            df_prev("rev_2_t").as("prev_value_2")
    )
    .withColumn("change", max("change").over(Window.orderBy(lit(1))))
    .withColumn("change_2", max("change_2").over(Window.orderBy(lit(1))))

result.show
+----+-------+------+----------+-------+--------+------------+
|Game|rev_1_t|change|prev_value|rev_2_t|change_2|prev_value_2|
+----+-------+------+----------+-------+--------+------------+
|  CA|     AA|     Y|         C|     AA|       N|          AA|
|  FT|      B|     Y|         B|      C|       N|           C|
+----+-------+------+----------+-------+--------+------------+

相关问题