pandas: How do I split a string into multiple lists? [closed]

Posted by n53p2ov0 on 2024-01-04 in Other

Closed. This question needs to be more focused. It is not currently accepting answers.

Closed 19 days ago.
I have data for a bond, stored as a string:

"Bond(secid='SU26238RMFS4', shortname='ОФЗ 26238', amortization=[Amortization(date=datetime.date(2041, 5, 15), value=1000, initialfacevalue=1000)], coupons=[Coupon(date=2021-12-08, value=34.04), Coupon(date=2022-06-08, value=35.4), Coupon(date=2022-12-07, value=35.4), Coupon(date=2023-06-07, value=35.4), Coupon(date=2023-12-06, value=35.4), Coupon(date=2024-06-05, value=35.4), Coupon(date=2024-12-04, value=35.4), Coupon(date=2025-06-04, value=35.4), Coupon(date=2025-12-03, value=35.4), Coupon(date=2026-06-03, value=35.4), Coupon(date=2026-12-02, value=35.4), Coupon(date=2027-06-02, value=35.4), Coupon(date=2027-12-01, value=35.4), Coupon(date=2028-05-31, value=35.4), Coupon(date=2028-11-29, value=35.4), Coupon(date=2029-05-30, value=35.4), Coupon(date=2029-11-28, value=35.4), Coupon(date=2030-05-29, value=35.4), Coupon(date=2030-11-27, value=35.4), Coupon(date=2031-05-28, value=35.4), Coupon(date=2031-11-26, value=35.4), Coupon(date=2032-05-26, value=35.4), Coupon(date=2032-11-24, value=35.4), Coupon(date=2033-05-25, value=35.4), Coupon(date=2033-11-23, value=35.4), Coupon(date=2034-05-24, value=35.4), Coupon(date=2034-11-22, value=35.4), Coupon(date=2035-05-23, value=35.4), Coupon(date=2035-11-21, value=35.4), Coupon(date=2036-05-21, value=35.4), Coupon(date=2036-11-19, value=35.4), Coupon(date=2037-05-20, value=35.4), Coupon(date=2037-11-18, value=35.4), Coupon(date=2038-05-19, value=35.4), Coupon(date=2038-11-17, value=35.4), Coupon(date=2039-05-18, value=35.4), Coupon(date=2039-11-16, value=35.4), Coupon(date=2040-05-16, value=35.4), Coupon(date=2040-11-14, value=35.4), Coupon(date=2041-05-15, value=35.4)], offers=[])"

I need to get a DataFrame with all the coupons:

date        value
2021-12-08  34.04
2022-06-08  35.4
etc.

I know how to split it with split() and then merge the pieces one by one, but that takes a lot of time.
Is there a better way to do this?

htzpubme1#

You can do this by splitting the string and building a DataFrame with pandas:

import pandas as pd

bond_data = "Bond(secid='SU26238RMFS4', shortname='ОФЗ 26238', amortization=[Amortization(date=datetime.date(2041, 5, 15), value=1000, initialfacevalue=1000)], coupons=[Coupon(date=2021-12-08, value=34.04), Coupon(date=2022-06-08, value=35.4), ...]"

# Every piece after a "Coupon(date=" marker starts with "<date>, value=<value>)".
# Cut at the first ")" to drop the trailing text, then split into date and value.
# (str.strip("Coupon(date=") would strip a set of characters, not a prefix.)
coupons = [
    piece.split(")", 1)[0].split(", value=")
    for piece in bond_data.split("Coupon(date=")[1:]
]
df = pd.DataFrame(coupons, columns=["date", "value"])
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d")
df["value"] = df["value"].astype(float)  # values are parsed as strings
print(df)


jutyujz02#

You can try the regex findall method:

import re

lst = re.findall(r'Coupon\(date=(.*?), value=(.*?)\)', data)  # data: the bond string

This gives you a list of (date, value) tuples, which you can then easily convert to a pandas DataFrame.

print(lst)

# [('2021-12-08', '34.04'), ('2022-06-08', '35.4'), ('2022-12-07', '35.4'),
#  ...]
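
For completeness, the conversion to a DataFrame might look something like the sketch below. The `data` string here is a shortened stand-in for the bond string from the question (an assumption: the real string has the same `Coupon(date=..., value=...)` layout), and the name `pairs` is just illustrative.

```python
import re

import pandas as pd

# Shortened stand-in for the bond string from the question
# (assumption: the real data uses the same Coupon(...) layout).
data = (
    "Bond(secid='SU26238RMFS4', amortization=[Amortization("
    "date=datetime.date(2041, 5, 15), value=1000, initialfacevalue=1000)], "
    "coupons=[Coupon(date=2021-12-08, value=34.04), "
    "Coupon(date=2022-06-08, value=35.4), "
    "Coupon(date=2022-12-07, value=35.4)], offers=[])"
)

# Each match is a (date, value) tuple of strings; the Amortization entry
# is skipped because the pattern requires the literal "Coupon(date=".
pairs = re.findall(r"Coupon\(date=(.*?), value=(.*?)\)", data)

# Build the frame, then convert both columns from strings to proper types.
df = pd.DataFrame(pairs, columns=["date", "value"])
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d")
df["value"] = df["value"].astype(float)

print(df)
```

The type conversions are optional, but without them both columns stay as plain strings, which gets in the way of sorting by date or summing the coupon values.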
