How to split CSV data by substring more easily with Python

von4xj4u  posted on 2022-12-06  in  Python

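In the end I want the data split clearly, as in the photo. I do not want to replace anything. I want to split, and not simply on a fixed delimiter: the split has to be driven by the substring. I have a CSV like this: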

date, time, ID1, ID2, ID3, "Action=xxx, ProdCode=XXXX, Cmd=xxx, Price=xxxxx, Qty=xxx, TradedQty=xxx, Validity=xxx, Status=xxx, AddBy=xxxxxx, TimeStamp=xxx, ClOrderId=xxx, ChannelId=xxx",x,x,ID4

date, time, ID1, ID2, ID3, "Action=xxx, RetCode=xxx, ProdCode=xxxx, Cmd=xxx, Price=xxxx, Qty=xxx, TradedQty=0, Validity=xxx, Status=xxx, ExtOrderNo=xxxxx, Ref=0, AddBy=xxxxx, Gateway=xxxxx, TimeStamp=xxx, ClOrderId=xxx",x,x,ID4

date, time, ID1, ID2, ID3, "Action=xx, RetCode=xx, ProdCode=xxx, Cmd=xx, Price=xxx, Qty=x, TradedQty=x, Status=xxx, ExtOrderNo=xxx, Ref=xxx, AddBy=xx, Gateway=xxx, TimeStamp=xxx",x,x,ID4

date,time,ID1,ID2,ID3,"Action=xxx, ProdCode=xxx, Cmd=xxx, Price=xxx, Qty=x, ExtOrderNo=xxx, TradeNo=xxx, Ref=@xxx, AddBy=xxx, Gateway=xxx",x,x,ID4

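How can I more easily split the data into separate columns based on the string in front of each "="? If a row does not contain a given key, that position should be left empty, or filled with "word=," or simply ",". The final result should look like this: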

date, time, ID1, ID2, ID3, "Action=xxx, **RetCode=,** ProdCode=XXXX, Cmd=xxx, Price=xxxxx, Qty=xxx, TradedQty=xxx, Validity=xxx, Status=xxx, **ExtOrderNo=,**  **TradeNo=,** **Ref=,** AddBy=xxxxxx, **Gateway=,** TimeStamp=xxx, ClOrderId=xxx, ChannelId=xxx",x,x,ID4

date, time, ID1, ID2, ID3, "Action=xxx, RetCode=xxx, ProdCode=xxxx, Cmd=xxx, Price=xxxx, Qty=xxx, TradedQty=0, Validity=xxx, Status=xxx,    ExtOrderNo=xxxxx, **TradeNo=,**  Ref=0, AddBy=xxxxx, Gateway=xxxxx, TimeStamp=xxx, ClOrderId=xxx **ChannelId=,**",x,x,ID4

date, time, ID1, ID2, ID3, "Action=xxx, RetCode=xx, ProdCode=xxx, Cmd=xx, Price=xxx, Qty=x, TradedQty=x, **Validity=,** Status=xxx, ExtOrderNo=xxx, **TradeNo=,** Ref=xxx, AddBy=xx, Gateway=xxx, TimeStamp=xxx **ClOrderId=,** **ChannelId=,**",x,x,ID4

date,time,ID1,ID2,ID3,"Action=xxx, **RetCode=,** ProdCode=xxx, Cmd=xxx, Price=xxx, Qty=x, **TradedQty=,** **Validity=,** **Status=,** ExtOrderNo=xxx,  TradeNo=xxx, Ref=@xxx, AddBy=xxx, Gateway=xxx **TimeStamp=,** **ClOrderId=,** **ChannelId=,**",x,x,ID4

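P.S. The rows above are only sample CSV data; there may be other key=xxx pairs as well. How can I split this more easily? I want the CSV or Excel output to show clearly which fields are present and which are missing.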

mepcadol  #1

I am not sure I understand 100%, but let me try to help. The key point is:

# import the pandas library and alias as pd
import pandas as pd

# read a csv with the example data; skipinitialspace=True lets the parser
# recognise a quoted field that follows ", ", so it stays in one column
df = pd.read_csv("data.csv", sep=",", skipinitialspace=True, header=None)

# strip every "key=" prefix, so "Action=xxx, Cmd=yyy" becomes "xxx, yyy"
df.replace(to_replace=r"\w+=", value="", regex=True, inplace=True)
# save to a new csv file:
df.to_csv("new_data.csv", sep=",", header=False, index=False)
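Since the question asks for a split rather than a replace, here is a minimal sketch of that, using only the standard csv module. It assumes the key=value pairs always sit in the sixth field (index 5), that pairs are separated by commas, and that no value contains a comma. The names parse_pairs and split_rows and the sample rows are made up for illustration; they are not from the question.

```python
import csv
import io


def parse_pairs(field):
    """Turn 'A=1, B=2' into {'A': '1', 'B': '2'}."""
    pairs = {}
    for item in field.split(","):
        key, sep, value = item.partition("=")
        if sep:  # ignore fragments that contain no '='
            pairs[key.strip()] = value.strip()
    return pairs


def split_rows(lines, pair_col=5):
    """Expand the key=value field into one column per key (hypothetical helper)."""
    parsed = []
    keys = []  # every key seen so far, in order of first appearance
    # skipinitialspace=True lets the reader honour a quote that follows ", "
    for row in csv.reader(lines, skipinitialspace=True):
        if len(row) <= pair_col:  # skip blank or short lines
            continue
        pairs = parse_pairs(row[pair_col])
        for key in pairs:
            if key not in keys:
                keys.append(key)
        parsed.append((row[:pair_col], pairs, row[pair_col + 1:]))
    # a key missing from a row becomes an empty cell
    rows = [before + [pairs.get(k, "") for k in keys] + after
            for before, pairs, after in parsed]
    # only the key columns get a label; trailing columns stay unlabelled
    header = [""] * pair_col + keys
    return header, rows


# made-up sample rows in the same shape as the question's data
sample = (
    'd1, t1, A, B, C, "Action=add, ProdCode=P1, Qty=5",x,x,D\n'
    '\n'
    'd2, t2, A, B, C, "Action=del, RetCode=0, ProdCode=P2",x,x,D\n'
)
header, rows = split_rows(io.StringIO(sample))

out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(header)
writer.writerows(rows)
csv_text = out.getvalue()
```

To work on a real file, pass open("data.csv", newline="") to split_rows and write to open("new_data.csv", "w", newline="") instead of the StringIO objects. Columns come out in the order the keys are first seen, so if a fixed order matters (Action, RetCode, ProdCode, ...), you would have to supply that key list yourself.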
