numpy Pandas-将具有多个标识符的DF处理到组合列中

aij0ehis  于 2022-11-10  发布在  其他
关注(0)|答案(1)|浏览(107)

我有下面的DF,我想要处理成像下面的样本。我有一个困难的时间,因为识别符是分享他们所在的网站的列。柱子的位置也不是固定的,因为在某些情况下,“地”有30天,而“水”只有5天
谢谢你对我的照顾
示例df:

  1. df = pd.DataFrame({'Unnamed: 0': ['Site',"A", "B","C"],
  2. 'Unnamed: 1': ['Identifier 1',"Land", "Land","Land"],
  3. 'Unnamed: 2': ['Day 1',1, 2,3],
  4. 'Unnamed: 3': ['Day 2',4, 5,6],
  5. 'Unnamed: 4': ['Day 3',7, 8,9],
  6. 'Unnamed: 5': ['Day 4',10, 11,12],
  7. 'Unnamed: 6': ['Identifier 2',"Water", "Water","Water"],
  8. 'Unnamed: 7': ['Day 1',13, 14,15],
  9. 'Unnamed: 8': ['Day 2',16, 17,18],
  10. 'Unnamed: 9': ['Day 3',19, 20,21],
  11. 'Unnamed: 10': ['Day 4',22, 23,24],
  12. })

输出文件:

  1. Site Identifier 1 Day 1 Day 2 Day 3 Day 4
  2. A Land 1 4 7 10
  3. B Land 2 5 8 11
  4. C Land 3 6 9 12
  5. A Water 13 16 19 22
  6. B Water 14 17 20 23
  7. C Water 15 18 21 24
dtcbnfnu

dtcbnfnu1#

这是你想要的吗?我首先将 Dataframe 拆分成两个 Dataframe ,每个 Dataframe 包含不同的标识符,最后将它们合并回tgt。以下是代码,希望它能有所帮助。

  1. df = pd.DataFrame({'Unnamed: 0': ['Site',"A", "B","C"],
  2. 'Unnamed: 1': ['Identifier 1',"Land", "Land","Land"],
  3. 'Unnamed: 2': ['Day 1',1, 2,3],
  4. 'Unnamed: 3': ['Day 2',4, 5,6],
  5. 'Unnamed: 4': ['Day 3',7, 8,9],
  6. 'Unnamed: 5': ['Day 4',10, 11,12],
  7. 'Unnamed: 6': ['Identifier 2',"Water", "Water","Water"],
  8. 'Unnamed: 7': ['Day 1',13, 14,15],
  9. 'Unnamed: 8': ['Day 2',16, 17,18],
  10. 'Unnamed: 9': ['Day 3',19, 20,21],
  11. 'Unnamed: 10': ['Day 4',22, 23,24],
  12. })
  13. ## Restriving Column Name
  14. header = [item for item in df.iloc[0]]
  15. ## Filter Out the first row
  16. df = df.iloc[1:]
  17. # Getting first 6 columns as df_land and rename column by header
  18. df_land = df.iloc[:, 0:6]
  19. df_land.columns = header[:6]
  20. # Getting the first column and the last 5 columns as df_water and rename column by header
  21. df_water = df.iloc[:,[0,6,7,8,9,10]]
  22. df_water.columns = header[:6]
  23. # Union two tables
  24. df_final = pd.concat( [df_land,df_water] , axis = 0 )

编辑-用于动态N列

  1. df = pd.DataFrame({'Unnamed: 0': ['Site',"A", "B","C"],
  2. 'Unnamed: 1': ['Identifier 1',"Land", "Land","Land"],
  3. 'Unnamed: 2': ['Day 1',1, 2,3],
  4. 'Unnamed: 3': ['Day 2',4, 5,6],
  5. 'Unnamed: 4': ['Day 3',7, 8,9],
  6. 'Unnamed: 5': ['Day 4',10, 11,12],
  7. 'Unnamed: 11': ['Day 5',10, 11,12],
  8. 'Unnamed: 12': ['Day 6',10, 11,12],
  9. 'Unnamed: 6': ['Identifier 2',"Water", "Water","Water"],
  10. 'Unnamed: 7': ['Day 1',13, 14,15],
  11. 'Unnamed: 8': ['Day 2',16, 17,18],
  12. 'Unnamed: 9': ['Day 3',19, 20,21],
  13. 'Unnamed: 10': ['Day 4',22, 23,24],
  14. # 'Unnamed: 13': ['Day 5',22, 23,24],
  15. })
  16. ## Restriving Column Name
  17. header = [item for item in df.iloc[0]]
  18. ## Filter Out the first row
  19. df = df.iloc[1:]
  20. n_col = len(header)
  21. n_of_column_exclude_from_land = 5
  22. # Getting first 6 columns as df_land and rename column by header
  23. df_land = df.iloc[:, 0:n_col-n_of_column_exclude_from_land]
  24. df_land.columns = header[:n_col-n_of_column_exclude_from_land]
  25. # Getting the first column and the last 5 columns as df_water and rename column by header
  26. water_col_index = [n for n in range(n_col-n_of_column_exclude_from_land,n_col)]
  27. water_col_index.append(0)
  28. water_col_index.sort()
  29. df_water = df.iloc[:,water_col_index]
  30. df_water.columns = header[:n_of_column_exclude_from_land+1]
  31. # Union two tables
  32. df_final = pd.concat( [df_land,df_water] , axis = 0 )
展开查看全部

相关问题