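from pyspark.sql import functions as F
from pyspark.sql.functions import add_months, array, concat_ws, date_format, lit, to_date
# Sample data; assumes an existing SparkSession named `spark`
df = spark.createDataFrame(
    [('1', 'A', '14-02-22'),
     ('1', 'B', '11-03-22')],
    ('Id', 'Status', 'Date'))
df.show()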
+---+------+--------+
| Id|Status|    Date|
+---+------+--------+
|  1|     A|14-02-22|
|  1|     B|11-03-22|
+---+------+--------+
(df.withColumn('Date', to_date('Date', 'dd-MM-yy'))  # Convert the string to a date
   .withColumn('Date1', F.struct(
       # Array of [previous month, current month], stored in the struct as Previous
       array(date_format(add_months('Date', -1), 'MMM yy'), date_format('Date', 'MMM yy')).alias('Previous'),
       # Array of [current month, next month], stored in the struct as Next
       array(date_format('Date', 'MMM yy'), date_format(add_months('Date', 1), 'MMM yy')).alias('Next')))
   .select('Id', 'Status', 'Date', 'Date1.*')  # Expand Date1 so each struct field becomes a column
).show(truncate=False)
+---+------+----------+----------------+----------------+
|Id |Status|Date      |Previous        |Next            |
+---+------+----------+----------------+----------------+
|1  |A     |2022-02-14|[Jan 22, Feb 22]|[Feb 22, Mar 22]|
|1  |B     |2022-03-11|[Feb 22, Mar 22]|[Mar 22, Apr 22]|
+---+------+----------+----------------+----------------+
Alternatively, use concat_ws, as shown below:
(df.withColumn('Date', to_date('Date', 'dd-MM-yy'))  # Convert the string to a date
   .withColumn('Date1', F.struct(
       # Previous: previous month and current month joined with '-'
       concat_ws('-', date_format(add_months('Date', -1), 'MMM yy'), date_format('Date', 'MMM yy')).alias('Previous'),
       # Next: next month and current month joined with '-' (date_format already returns strings, so no lit or cast is needed)
       concat_ws('-', date_format(add_months('Date', 1), 'MMM yy'), date_format('Date', 'MMM yy')).alias('Next')))
   .select('Id', 'Status', 'Date', 'Date1.*')  # Expand Date1 so each struct field becomes a column
).show(truncate=False)
+---+------+----------+-------------+-------------+
|Id |Status|Date      |Previous     |Next         |
+---+------+----------+-------------+-------------+
|1  |A     |2022-02-14|Jan 22-Feb 22|Mar 22-Feb 22|
|1  |B     |2022-03-11|Feb 22-Mar 22|Apr 22-Mar 22|
+---+------+----------+-------------+-------------+
2 answers
zrfyljdw1#
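Use the date functions. First, convert the string to a date with to_date. Next, derive the previous and next months with add_months. Format each value with date_format and store each pair in an array. Combine the two arrays into a struct column, then expand the struct so that each field becomes its own column. The code for each step, with comments, is shown above.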
3b6akqbq2#
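The basic functions needed for this task are date_format, which puts the date into the desired format, and add_months, which adds or subtracts months from a date.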
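The month labels that add_months and date_format are expected to produce can be cross-checked without a Spark session. The following is a minimal plain-Python sketch of the same arithmetic, not PySpark code. The helper names shift_month, month_label, and previous_and_next are hypothetical, and the hard-coded English month abbreviations are assumed to match what the "MMM yy" pattern prints in an English locale.

```python
from datetime import date, datetime

# Hard-coded abbreviations avoid depending on the system locale (assumption: English labels)
MONTH_ABBR = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def shift_month(d: date, months: int) -> date:
    """Return the first day of the month that lies `months` months away from d."""
    idx = d.year * 12 + (d.month - 1) + months  # months counted from year 0
    return date(idx // 12, idx % 12 + 1, 1)


def month_label(d: date) -> str:
    """Format a date as an abbreviated month and two-digit year, e.g. 'Feb 22'."""
    return f"{MONTH_ABBR[d.month - 1]} {d.year % 100:02d}"


def previous_and_next(raw: str):
    """Parse a 'dd-MM-yy' style string and build the Previous and Next label pairs."""
    d = datetime.strptime(raw, "%d-%m-%y").date()
    previous = [month_label(shift_month(d, -1)), month_label(d)]
    nxt = [month_label(d), month_label(shift_month(d, 1))]
    return previous, nxt


for raw in ("14-02-22", "11-03-22"):
    print(raw, previous_and_next(raw))
```

Only the month and year are kept here because the labels ignore the day. Working in whole months also handles year boundaries, so a January date yields a December label of the previous year.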