I have:
Customerkeycode
B01:B14:110083
I want:
PlanningCustomerSuperGroupCode, DPGCode, APGCode
B01, B14, 110083
lrl1mhuk1#
import pandas as pd

df = pd.DataFrame(
    {
        "Customerkeycode": [
            "B01:B14:110083",
            "B02:B15:110084"
        ]
    }
)

df['PlanningCustomerSuperGroupCode'] = df['Customerkeycode'].apply(lambda x: x.split(":")[0])
df['DPGCode'] = df['Customerkeycode'].apply(lambda x: x.split(":")[1])
df['APGCode'] = df['Customerkeycode'].apply(lambda x: x.split(":")[2])

df_rep = df.drop("Customerkeycode", axis = 1)
print(df_rep)

  PlanningCustomerSuperGroupCode DPGCode APGCode
0                            B01     B14  110083
1                            B02     B15  110084
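The three `apply` calls above each split the same string again. As an alternative sketch (not part of the original answer), pandas' `Series.str.split` with `expand=True` splits once and returns one column per piece. The variable names `new_cols` and `df_rep` are illustrative, and the sketch assumes every key has exactly three colon-separated parts.

```python
import pandas as pd

df = pd.DataFrame({"Customerkeycode": ["B01:B14:110083", "B02:B15:110084"]})

# Target column names, taken from the question.
new_cols = ["PlanningCustomerSuperGroupCode", "DPGCode", "APGCode"]

# Split each key once on ":"; expand=True yields a DataFrame with one column per piece.
# Assumption: every key contains exactly three parts, so three columns come back.
df_rep = df["Customerkeycode"].str.split(":", expand=True)
df_rep.columns = new_cols

print(df_rep)
```

All three resulting columns hold strings, including `APGCode`; convert with `astype(int)` if a numeric column is needed.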
cidc1ykv2#
Split on ":" into 3 columns, named ["PlanningCustomerSuperGroupCode", "DPGCode", "APGCode"]:
import pyspark.sql.functions as F

# df is assumed to be an existing Spark DataFrame with a 'Customerkeycode' column
df.withColumn('PlanningCustomerSuperGroupCode', F.split(F.col('Customerkeycode'), ':')[0]) \
    .withColumn('DPGCode', F.split(F.col('Customerkeycode'), ':')[1]) \
    .withColumn('APGCode', F.split(F.col('Customerkeycode'), ':')[2]) \
    .drop('Customerkeycode') \
    .show()