numpy python: sorting values into the correct array based on numeric column names

Posted by 4xy9mtcn on 2024-01-08 in Python

I have two DataFrames.
df1 holds the collected data in the following form:

    Index  0            1            2            ...
    0      (float,int)  (float,int)  (float,int)  (float,int)
    1      (float,int)  (float,int)  (float,int)  (float,int)
    ...    (float,int)  (float,int)  (float,int)  (float,int)

df2 is an empty DataFrame built as follows:

    df2 = pd.DataFrame(index=df1.index, columns = np.arange(min, max, step).tolist())

    Index  float0  float1  float2  ...
    0
    1
    ...


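My problem is that, for every entry in df1, I need to compare the listed float with the column names of df2 and sort the corresponding int into the matching df2 column, adding it to any value that is already there.
So far I have: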

    for j in range(len(df1)):  # for every row
        for i in range(len(df1.columns)):  # for every column
            # the following line is only pseudo code which I can't figure out how to phrase
            y = df2 column to which df1[i][j][0] is closest in value
            df2[y][j] = df2[y][j] + df1[i][j][1]


For example, if:
df1 =

    Index  0       1       2       ...
    0      (.2,3)  (.4,5)  (.4,4)  (.6,2)
    1      (.5,2)  (.8,8)  (.8,5)  (.2,9)
    ...    (.4,3)  (.2,7)  (.3,4)  (.7,1)


df2 =

    Index  .24  .47  .79  ...
    0
    1
    ...


df2 (filled) =

    Index  .24  .47  .79  ...
    0      3    5+4  2
    1      9    2    8+5
    ...
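
For anyone who wants to reproduce the example, here is a minimal sketch of the data above as real pandas objects. It uses only the two fully shown rows and the three column labels that are visible in the question; the helper name `nearest_label` is hypothetical and is just one possible way to phrase the "closest column" step from the pseudocode.

```python
import numpy as np
import pandas as pd

# Example data from the question: every cell is a (float, int) tuple.
df1 = pd.DataFrame({
    0: [(0.2, 3), (0.5, 2)],
    1: [(0.4, 5), (0.8, 8)],
    2: [(0.4, 4), (0.8, 5)],
    3: [(0.6, 2), (0.2, 9)],
})

# Only the three labels shown in the question; the real df2 has more columns.
df2 = pd.DataFrame(0, index=df1.index, columns=[0.24, 0.47, 0.79])


def nearest_label(value, labels):
    """Return the label whose numeric value is closest to `value` (hypothetical helper)."""
    labels = np.asarray(labels, dtype=float)
    return labels[np.abs(labels - value).argmin()]
```

With only these three labels, 0.6 is nearer to 0.47 than to 0.79, so where the `2` from row 0 ends up depends on the columns that are elided in the question.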

wrrgggsh  #1

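You can iterate over every cell of df1, unpack the tuple to get the float and int values, find the closest column in df2, and update df2 accordingly. Here is how you could do that: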

    import pandas as pd
    import numpy as np

    # Assuming df1 is already defined
    # Define df2 with the given structure
    # df2 = pd.DataFrame(index=df1.index, columns=np.arange(min_value, max_value, step).tolist())

    def find_closest_column(value, columns):
        """Find the column name in df2 that is closest to the given value."""
        return min(columns, key=lambda x: abs(x - value))

    # Initialize df2 with zeros (or any default value you prefer)
    df2 = df2.fillna(0)
    # Make sure the column labels are floats so the lookup below matches them
    df2.columns = df2.columns.astype(float)

    for row_index in df1.index:
        for col_index in df1.columns:
            float_val, int_val = df1.at[row_index, col_index]  # Unpack the tuple from df1
            closest_col = find_closest_column(float_val, df2.columns)
            df2.at[row_index, closest_col] += int_val  # Accumulate the int values in df2

    # df2 now contains the accumulated values

This script modifies df2 so that, for each tuple in df1, it finds the nearest column in df2 and accumulates the tuple's integer part there.
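
If the nested loop turns out to be slow on a large frame, the same accumulation can probably be expressed with NumPy broadcasting and `np.add.at`. The sketch below is an alternative to the loop, not part of the original answer; the function name `bin_by_nearest_column` and the sample data (taken from the question's example, restricted to the three visible labels) are assumptions for illustration.

```python
import numpy as np
import pandas as pd


def bin_by_nearest_column(df1, labels):
    """Sum the int of each (float, int) cell into the label nearest to its float."""
    labels = np.asarray(labels, dtype=float)
    # Turn the frame of tuples into one array of shape (rows, cols, 2).
    cells = np.array(df1.to_numpy().tolist(), dtype=float)
    floats = cells[..., 0]
    ints = cells[..., 1].astype(int)
    # For every cell, the position of the closest label.
    nearest = np.abs(floats[..., None] - labels).argmin(axis=-1)
    out = np.zeros((df1.shape[0], labels.size), dtype=int)
    rows = np.arange(df1.shape[0])[:, None]
    # np.add.at accumulates correctly when several cells hit the same column.
    np.add.at(out, (rows, nearest), ints)
    return pd.DataFrame(out, index=df1.index, columns=labels)


# Sample data assumed from the question's example.
df1 = pd.DataFrame({
    0: [(0.2, 3), (0.5, 2)],
    1: [(0.4, 5), (0.8, 8)],
    2: [(0.4, 4), (0.8, 5)],
    3: [(0.6, 2), (0.2, 9)],
})
result = bin_by_nearest_column(df1, [0.24, 0.47, 0.79])
```

A plain fancy-indexed `out[rows, nearest] += ints` would silently drop repeated hits on the same column, which is why `np.add.at` is used here.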

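**Note:** Make sure the column names of df2 are floats. If they are not, convert them with df2.columns.astype(float) before looking up the closest column. This solution also assumes that df1 and df2 are correctly aligned on their index.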
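
To illustrate the two caveats in the note, here is a small sketch under made-up assumptions: a df2 whose labels arrived as strings and whose index is in a different order from df1. The data and the choice to write by position with `iloc` are hypothetical; writing by position is just one way to avoid depending on exact float label matches.

```python
import numpy as np
import pandas as pd

# Hypothetical data: string labels and a shuffled index in df2.
df1 = pd.DataFrame({0: [(0.2, 3), (0.5, 2)], 1: [(0.8, 8), (0.2, 9)]}, index=["a", "b"])
df2 = pd.DataFrame(0, index=["b", "a"], columns=["0.24", "0.47", "0.79"])

# Convert the labels to floats once, then align df2's rows with df1's.
df2.columns = df2.columns.astype(float)
df2 = df2.reindex(df1.index, fill_value=0)

labels = df2.columns.to_numpy()
for row_pos, row in enumerate(df1.itertuples(index=False)):
    for float_val, int_val in row:
        # Position of the nearest label, then a positional write.
        col_pos = np.abs(labels - float_val).argmin()
        df2.iloc[row_pos, col_pos] += int_val
```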
