Adding a string prefix to a pandas DataFrame column that contains multiple comma-separated values

Posted by u3r8eeie on 2023-04-19 in Other

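I am trying to create a new column in a pandas DataFrame that combines a string prefix with the values from another column. Some cells in the source column hold several comma-separated values. For example:

    MIMNumber
    102610
    114080,601079

I want the DataFrame to look like this:

    MIMNumber      OMIM_Link
    102610         https://www.omim.org/entry/102610
    114080,601079  https://www.omim.org/entry/114080,https://www.omim.org/entry/601079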

I tried this:

    df['OMIM_Link'] = df['MIMNumber'].map('https://www.omim.org/entry/{}'.format)

But in cells with several comma-separated values, only the first value gets the prefix:

    MIMNumber      OMIM_Link
    102610         https://www.omim.org/entry/102610
    114080,601079  https://www.omim.org/entry/114080,601079

I also tried this:

    url = 'https://www.omim.org/entry/'
    df['OMIM_Link'] = df['MIMNumber'].apply(url.join)

But that inserted the prefix between every character of each value:

    MIMNumber      OMIM_Link
    102610         1https://www.omim.org/entry/0https://www.omim.org/entry/2https://www.omim.org/entry/6https://www.omim.org/entry/1https://www.omim.org/entry/0
    114080,601079  1https://www.omim.org/entry/1https://www.omim.org/entry/4https://www.omim.org/entry/0https://www.omim.org/entry/8https://www.omim.org/entry/0https://www.omim.org/entry/,https://www.omim.org/entry/6https://www.omim.org/entry/0https://www.omim.org/entry/1https://www.omim.org/entry/0https://www.omim.org/entry/7https://www.omim.org/entry/9

Any suggestions?
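
For context, both attempts fail for the same underlying reason: each cell is handled as one undivided string. `format` drops the whole cell into the placeholder, and `str.join` iterates over its argument character by character. The sketch below reproduces both effects on plain strings and then shows one possible per-cell fix. The helper name `add_prefix` is hypothetical and does not come from the thread.

```python
import pandas as pd

URL = "https://www.omim.org/entry/"

# format() fills the placeholder with the whole cell, commas included,
# so the prefix appears only once.
once = "https://www.omim.org/entry/{}".format("114080,601079")

# str.join iterates over its argument, and a string iterates
# character by character, so the URL lands between characters.
per_char = URL.join("12")


def add_prefix(cell: str, prefix: str = URL) -> str:
    """Hypothetical helper: prefix each comma-separated value in one cell."""
    return ",".join(prefix + value for value in cell.split(","))


df = pd.DataFrame({"MIMNumber": ["102610", "114080,601079"]})
df["OMIM_Link"] = df["MIMNumber"].map(add_prefix)
print(df["OMIM_Link"].tolist())
```

Splitting first means the prefix is applied to whole values rather than to the entire cell or to single characters.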

oymdgrw7 #1

You can try a regex replace:

    df['out'] = df['MIMNumber'].replace(r'(\d+)', r'https://www.omim.org/entry/\1', regex=True)
    print(df)

Output:

           MIMNumber  \
    0         102610
    1  114080,601079

                                                                       out
    0                                    https://www.omim.org/entry/102610
    1  https://www.omim.org/entry/114080,https://www.omim.org/entry/601079
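
As a variation on the regex idea, the same pattern can be passed to `Series.str.replace`. The sketch below assumes the column may contain missing values, which the question does not mention. The `None` row is added purely for illustration.

```python
import pandas as pd

# The None row is an assumption added to show how missing values behave.
df = pd.DataFrame({"MIMNumber": ["102610", "114080,601079", None]})

# \d+ matches each run of digits; \1 re-inserts the match after the prefix.
df["out"] = df["MIMNumber"].str.replace(
    r"(\d+)", r"https://www.omim.org/entry/\1", regex=True
)
print(df["out"].tolist())
```

Note that `\d+` prefixes every run of digits in a cell, so this only suits columns that hold nothing but IDs and separators. Missing values stay missing instead of raising an error.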
cwtwac6a #2

Replace each comma with ",https://www.omim.org/entry/" and prepend "https://www.omim.org/entry/" to the whole string:

    df['OMIM_Link'] = 'https://www.omim.org/entry/' + df['MIMNumber'].str.replace(',', ',https://www.omim.org/entry/')
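
One edge case with this approach is that an empty cell turns into a bare prefix. The sketch below assumes empty cells can occur and should stay empty; neither point is stated in the question. It also passes `regex=False` so the comma is always treated as a literal.

```python
import pandas as pd

prefix = "https://www.omim.org/entry/"

# The empty-string row is an assumption added for illustration.
df = pd.DataFrame({"MIMNumber": ["102610", "114080,601079", ""]})

# Literal replacement: every comma becomes a comma followed by the prefix.
links = prefix + df["MIMNumber"].str.replace(",", "," + prefix, regex=False)

# Assumption: an empty cell should stay empty rather than become a bare prefix.
df["OMIM_Link"] = links.where(df["MIMNumber"] != "", "")
print(df["OMIM_Link"].tolist())
```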
5m1hhzi4 #3

If you have several different domains or paths, put them in the OMIM_Link column here:

    import pandas as pd

    # OMIM_Link starts out holding one URL prefix per comma-separated MIM number
    df = pd.DataFrame({'MIMNumber': ['102610', '114080,601079'],
                       'OMIM_Link': ['https://www.omim.org/entry/',
                                     'https://www.omim.org/entry/,https://www.omim.org/entry/']})

    for i in range(len(df)):
        # split(',') returns a one-element list when there is no comma,
        # so single values need no special case
        mims = df.loc[i, 'MIMNumber'].split(',')
        links = df.loc[i, 'OMIM_Link'].split(',')
        # Pair each prefix with its number, then re-join with commas.
        # df.loc avoids the chained assignment df['OMIM_Link'][i] = ...
        df.loc[i, 'OMIM_Link'] = ','.join(link + mim for link, mim in zip(links, mims))

    print(df)

It does what you want:

           MIMNumber                                          OMIM_Link
    0         102610                  https://www.omim.org/entry/102610
    1  114080,601079  https://www.omim.org/entry/114080,https://www....
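
Pairing prefixes with values by position assumes both lists have the same length. A minimal sketch of a stricter version follows. The helper name `join_pairs` and the decision to raise an error on a mismatch are assumptions, not part of the answer above.

```python
def join_pairs(prefixes: str, values: str) -> str:
    """Hypothetical helper: pair each comma-separated prefix with its value."""
    prefix_list = prefixes.split(",")
    value_list = values.split(",")
    # Assumption: a length mismatch is a data error and should not pass silently.
    if len(prefix_list) != len(value_list):
        raise ValueError(
            f"{len(prefix_list)} prefixes but {len(value_list)} values"
        )
    return ",".join(p + v for p, v in zip(prefix_list, value_list))


link = join_pairs(
    "https://www.omim.org/entry/,https://www.omim.org/entry/",
    "114080,601079",
)
print(link)
```

The explicit length check makes a mismatched row fail loudly instead of dropping values without notice.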