PySpark: how to add an expression in withColumn

ybzsozfc  posted on 2021-05-18  in Spark

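I want to add a new column that is the concatenation of two existing columns, and I am using the query below. What is wrong with this query? The new column shows only null.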

    df.select(df['DEST_COUNTRY_NAME'],df['ORIGIN_COUNTRY_NAME']).withColumn("COMPLETE_PATH",df['DEST_COUNTRY_NAME'] + ",").filter(df['DEST_COUNTRY_NAME']=='Egypt').show()

    +-----------------+-------------------+-------------+
    |DEST_COUNTRY_NAME|ORIGIN_COUNTRY_NAME|COMPLETE_PATH|
    +-----------------+-------------------+-------------+
    |            Egypt|      United States|         null|
    |            Egypt|      United States|         null|
    |            Egypt|      United States|         null|
    |            Egypt|      United States|         null|
    |            Egypt|      United States|         null|
    |            Egypt|      United States|         null|
    +-----------------+-------------------+-------------+

webghufk  #1

Try:

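    # PySpark import (the original snippet used the Scala import path)
    from pyspark.sql.functions import col, concat, lit
    ...
    # withColumn needs the new column's name as its first argument
    df.withColumn("COMPLETE_PATH", concat(col("DEST_COUNTRY_NAME"), lit(",")))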
