numpy / Python: sorting values into the correct array based on the numeric names of the columns

Posted by 4xy9mtcn on 2024-01-08 in Python

I have two DataFrames.
df1 holds collected data in the following form:

Index       0            1            2           ...
0       (float,int)  (float,int)  (float,int)  (float,int)
1       (float,int)  (float,int)  (float,int)  (float,int)
...     (float,int)  (float,int)  (float,int)  (float,int)

df2 is an empty DataFrame built as follows:

df2 = pd.DataFrame(index=df1.index, columns = np.arange(min, max, step).tolist())

Index       float0      float1       float2           ...
0
1
...


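My problem: for every entry in df1, I need to compare its float against the column names of df2, sort the corresponding int into the df2 column whose name is closest, and add it to any value already there.
So far I have: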

for j in range(len(df1)): # for every row
    for i in range(len(df1.columns)): # for every column
        # the following line is only pseudo code which I can't figure out how to phrase
        y = df2 column to which df1[i][j][0] is closest in value
        df2[y][j] = df2[y][j] + df1[i][j][1]


For example, if:
df1 =

Index      0       1       2      ...
0       (.2,3)  (.4,5)  (.4,4)  (.6,2)
1       (.5,2)  (.8,8)  (.8,5)  (.2,9)
...     (.4,3)  (.2,7)  (.3,4)  (.7,1)


df2 =

Index     .24     .47     .79     ...
0
1
...


df2 (filled) =

Index     .24     .47     .79     ...
0          3      5+4      2
1          9       2      8+5
...

Answer #1 (wrrgggsh)

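You can iterate over every cell in df1, unpack the tuple into its float and int values, find the closest column in df2, and update df2 accordingly. Here is how you could do that: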

import pandas as pd
import numpy as np

# Assuming df1 is already defined
# Define df2 with the given structure
# df2 = pd.DataFrame(index=df1.index, columns=np.arange(min_value, max_value, step).tolist())

def find_closest_column(value, columns):
    """Find the column name in df2 that is closest to the given value."""
    return min(columns, key=lambda x: abs(x - value))

# Initialize df2 with zeros (or any default value you prefer)
df2 = df2.fillna(0)

for row_index in df1.index:
    for col_index in df1.columns:
        float_val, int_val = df1.at[row_index, col_index]  # Unpack the tuple from df1
        closest_col = find_closest_column(float_val, df2.columns.astype(float))
        df2.at[row_index, closest_col] += int_val  # Accumulate the int values in df2

# df2 now contains the accumulated values

This script modifies df2 so that, for each tuple in df1, it finds the nearest column in df2 and accumulates the tuple's integer part in that column.
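To check the loop on concrete numbers, here is a minimal, self-contained sketch that runs it on the first two rows of the question's sample data. The column labels 0.24, 0.47 and 0.79 come from the question; starting df2 at integer zeros is an assumption made so that `+=` works directly. One thing to be aware of: under a strict nearest-column rule, the cell (.6, 2) lands in 0.47 rather than 0.79 as drawn in the question's table, because |0.6 - 0.47| is smaller than |0.6 - 0.79|.

```python
import pandas as pd

# First two rows of the question's sample df1; each cell is a (float, int) tuple.
df1 = pd.DataFrame({
    0: [(0.2, 3), (0.5, 2)],
    1: [(0.4, 5), (0.8, 8)],
    2: [(0.4, 4), (0.8, 5)],
    3: [(0.6, 2), (0.2, 9)],
})

# Assumption: df2 starts at integer zeros so values can be accumulated with +=.
df2 = pd.DataFrame(0, index=df1.index, columns=[0.24, 0.47, 0.79])

def find_closest_column(value, columns):
    """Return the column label closest to value (first one wins on ties)."""
    return min(columns, key=lambda c: abs(c - value))

for row in df1.index:
    for col in df1.columns:
        float_val, int_val = df1.at[row, col]  # unpack the tuple
        target = find_closest_column(float_val, df2.columns)
        df2.at[row, target] += int_val  # accumulate into the nearest column

print(df2)
```

With this data, row 0 should end up as 3, 11 (5 + 4 + 2), 0 and row 1 as 9, 2, 13 (8 + 5).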

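**Note:** Make sure the column labels in df2 are floats. If they are not, it is safer to convert them once up front with df2.columns = df2.columns.astype(float) than to convert only inside the call to find_closest_column: the converted label that comes back may no longer match the original label (for example when the labels are strings), and df2.at would then fail to find the column. This solution also assumes that df1 and df2 share the same index.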
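For larger frames the cell-by-cell Python loop can get slow. The following is one possible vectorized variant in NumPy; it is not part of the original answer. The helper name `bin_to_nearest_columns` is hypothetical, and the sketch assumes every cell of df1 is a (float, int) tuple and that the target labels are numeric. It uses `np.add.at` so that several cells in the same row that map to the same column are all summed.

```python
import numpy as np
import pandas as pd

def bin_to_nearest_columns(df1, columns):
    """Sum the int of each (float, int) cell into the nearest-labelled column."""
    cols = np.asarray(columns, dtype=float)
    cells = df1.to_numpy()  # object array of tuples, shape (n_rows, n_cells)
    floats = np.array([[c[0] for c in row] for row in cells], dtype=float)
    ints = np.array([[c[1] for c in row] for row in cells], dtype=np.int64)

    # Index of the nearest label for every cell (first label wins on ties).
    nearest = np.abs(floats[:, :, None] - cols[None, None, :]).argmin(axis=2)

    out = np.zeros((len(df1), len(cols)), dtype=np.int64)
    rows = np.broadcast_to(np.arange(len(df1))[:, None], nearest.shape)
    # Unbuffered add: repeated (row, column) targets accumulate correctly.
    np.add.at(out, (rows, nearest), ints)
    return pd.DataFrame(out, index=df1.index, columns=cols)

# Usage on the first two rows of the question's sample data.
sample = pd.DataFrame({
    0: [(0.2, 3), (0.5, 2)],
    1: [(0.4, 5), (0.8, 8)],
    2: [(0.4, 4), (0.8, 5)],
    3: [(0.6, 2), (0.2, 9)],
})
result = bin_to_nearest_columns(sample, [0.24, 0.47, 0.79])
print(result)
```

The distance array has shape (rows, cells, labels), so memory grows with the product of all three; for very many labels a sorted-label lookup with `np.searchsorted` may be a better fit.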
