pandas: Division of successive columns by one another in a DataFrame, subject to a condition

Posted by yuvru6vn on 2024-01-04 in Other


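I am a beginner and not very proficient with Python. I ran into the following problem, which I believe is easy to solve. Suppose we have the 'df_initial' structure below as input.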

print(df_initial)
     Product       type    0    1    2    3    4    5    6
0   BURGUNDY     actual  645  600  720  640  500  320  300
1   BORDEAUX     actual  730  730  710  500  500  450  450
2  CHAMPAGNE     actual  320  260  280  100  100  100    0
3   BURGUNDY  objective  800  760  720  640  600  560  520
4   BORDEAUX  objective  750  730  710  690  670  550  630
5  CHAMPAGNE  objective  500  490  480    0    0    0    0
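
For anyone who wants to reproduce the problem, the frame above can be rebuilt roughly as follows. This is a sketch that copies the printed values; it assumes the position columns are labelled with the integers 0 to 6 (they could equally be strings in the original data).

```python
import pandas as pd

# Rebuild the input frame from the printed table.
# Assumption: the position columns use integer labels 0..6.
df_initial = pd.DataFrame({
    "Product": ["BURGUNDY", "BORDEAUX", "CHAMPAGNE"] * 2,
    "type": ["actual"] * 3 + ["objective"] * 3,
    0: [645, 730, 320, 800, 750, 500],
    1: [600, 730, 260, 760, 730, 490],
    2: [720, 710, 280, 720, 710, 480],
    3: [640, 500, 100, 640, 690, 0],
    4: [500, 500, 100, 600, 670, 0],
    5: [320, 450, 100, 560, 550, 0],
    6: [300, 450, 0, 520, 630, 0],
})
print(df_initial)
```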

I created the following pivot table:

df_pivot = df_initial.pivot_table(index='Product', columns=['type'],
                                  aggfunc=np.sum)

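I would like to compute the actual/objective ratio for each product at each position (columns 0 through 6), with the additional condition that whenever the division is not possible (any row where the "objective" value is zero), the ratio should be set to 100%.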

ratio_df = pd.DataFrame(index=df_pivot.index)

I defined the following function:

def division(a, b):
    if b == 0:
        return 1
    else:
        return a/b

Then I tried to apply this function:

for col in df_pivot.columns:
    if col[1] == 'objective':
        continue
    ratio_df[col[0]] = division (df_pivot[(col[0], 'actual')] , df_pivot[(col[0], 'objective')])

This returns:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
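
The error comes from `if b == 0:` receiving a whole Series: the comparison yields one boolean per row, and `if` needs a single one. A minimal sketch of the failure and of an elementwise alternative, using made-up `actual` and `objective` Series that are not part of the question's data:

```python
import pandas as pd

# Hypothetical sample values, chosen to cover x/0, 0/0 and a normal division.
actual = pd.Series([100, 0, 50])
objective = pd.Series([0, 0, 200])

# `objective == 0` is a boolean Series, so it has no single truth value.
try:
    if objective == 0:
        pass
except ValueError as err:
    print(type(err).__name__)

# Elementwise alternative: keep the quotient where the objective is non-zero,
# and use 1 (i.e. 100%) everywhere else.
ratio = (actual / objective).where(objective != 0, 1.0)
print(ratio.tolist())
```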

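I understand why this does not work, but I do not know how to write the script correctly.
I would like a table like this:

(image of the desired table not preserved)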
I tried changing the function to:

def division(a, b):
    if b.values[0] == 0:
        return 1
    else:
        return a / b.values[0]

and then:

ratio_df[col[0]] = df_pivot[col].apply(division, b=df_pivot.loc[:, (col[0], 'objective')])

But this does not produce the correct output.
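
The second attempt goes wrong because `b.values[0]` is only the first product's objective, so every row ends up divided by that one number. One possible way to keep the original loop but make `division` elementwise is sketched below. The small frame is a stand-in with two position columns, not the full data, and iterating over the first column level is an assumption about how the pivot is laid out.

```python
import pandas as pd

# Small stand-in for the question's data (two position columns only).
df_initial = pd.DataFrame({
    "Product": ["BURGUNDY", "CHAMPAGNE", "BURGUNDY", "CHAMPAGNE"],
    "type": ["actual", "actual", "objective", "objective"],
    0: [645, 320, 800, 500],
    3: [640, 100, 640, 0],
})
df_pivot = df_initial.pivot_table(index="Product", columns="type", aggfunc="sum")

def division(a, b):
    # Elementwise: a / b where b is non-zero, otherwise 1 (i.e. 100%).
    return (a / b).where(b != 0, 1.0)

ratio_df = pd.DataFrame(index=df_pivot.index)
# Level 0 of the pivot's columns holds the position labels.
for position in df_pivot.columns.get_level_values(0).unique():
    ratio_df[position] = division(df_pivot[(position, "actual")],
                                  df_pivot[(position, "objective")])
print(ratio_df)
```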

pandas

Source: https://stackoverflow.com/questions/77731558/division-of-sucessive-columns-by-one-another-in-a-dataframe-subject-to-a-condit

2 Answers


jfgube3f1#

You can use groupby.sum, then slice with loc, divide, replace inf with 1 (and fill NaN with 1), and optionally format with applymap:

tmp = df_initial.groupby(['type', 'Product']).sum()
ratio_df = (tmp.loc['actual']
            .div(tmp.loc['objective'])
            .replace(np.inf, 1).fillna(1)
            .applymap('{:.0%}'.format) # optional (to format as %)
            )
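
Both `replace(np.inf, 1)` and `fillna(1)` are needed because pandas returns two different results for a zero denominator. A small sketch with made-up values (the index labels are only there to name the three cases):

```python
import numpy as np
import pandas as pd

# Hypothetical values covering the three cases.
actual = pd.Series([100, 0, 50], index=["x/0", "0/0", "x/y"])
objective = pd.Series([0, 0, 200], index=["x/0", "0/0", "x/y"])

raw = actual.div(objective)
print(raw)  # x/0 gives inf, 0/0 gives NaN, x/y gives 0.25

# replace handles the inf case, fillna handles the NaN case.
cleaned = raw.replace(np.inf, 1).fillna(1)
print(cleaned.tolist())
```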

Or keep the initial pivot_table and use swaplevel to make slicing easier:

df_pivot = (df_initial
            .pivot_table(index='Product', columns='type', aggfunc='sum')
            .swaplevel(axis=1)
           )
out = (df_pivot['actual']
       .div(df_pivot['objective'])
       .replace(np.inf, 1).fillna(1)
       .applymap('{:.0%}'.format)
      )
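
To see what `swaplevel(axis=1)` buys here, the sketch below builds a tiny pivot-shaped frame by hand (one product, two positions, values taken from the BURGUNDY rows) and shows that after the swap a plain `['actual']` lookup returns all positions at once.

```python
import pandas as pd

# Hand-built frame with the same column layout as the pivot: (position, type).
columns = pd.MultiIndex.from_product([[0, 1], ["actual", "objective"]])
df_pivot = pd.DataFrame([[645, 800, 600, 760]], index=["BURGUNDY"], columns=columns)

# After the swap the layout is (type, position).
swapped = df_pivot.swaplevel(axis=1)
print(swapped.columns.tolist())
print(swapped["actual"])  # one column per position
```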


Note: in recent pandas versions (2.1 and later), replace applymap with map.
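
If the same script has to run on pandas versions from both before and after that rename, one option is to pick whichever method exists at runtime. This is only a sketch; the `ratios` frame is made-up sample data and `elementwise` is a name introduced here for illustration.

```python
import pandas as pd

# Hypothetical ratios, already cleaned of inf and NaN.
ratios = pd.DataFrame({0: [0.80625, 0.64], 3: [1.0, 1.0]},
                      index=["BURGUNDY", "CHAMPAGNE"])

# DataFrame.map exists from pandas 2.1 on; older versions only have applymap.
elementwise = ratios.map if hasattr(ratios, "map") else ratios.applymap
formatted = elementwise("{:.0%}".format)
print(formatted)
```

The result holds strings, so it is best applied as the very last step, after any further arithmetic.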

Output:

             0     1     2     3     4     5     6
Product                                          
BORDEAUX   97%  100%  100%   72%   75%   82%   71%
BURGUNDY   81%   79%  100%  100%   83%   57%   58%
CHAMPAGNE  64%   53%   58%  100%  100%  100%  100%


Answered 2024-01-04

1aaf6o9v2#

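You can aggregate with GroupBy.sum, select the MultiIndex levels with DataFrame.xs, divide with DataFrame.div, then replace inf with DataFrame.replace, fill missing values with 1, and finally multiply by 100:

df1 = df_initial.groupby(['type', 'Product']).sum()
# fill NaN (0/0) with 1 before scaling, so those cells become 100 rather than 1
out = df1.xs('actual').div(df1.xs('objective')).replace(np.inf, 1).fillna(1).mul(100)
print(out)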
                   0           1           2           3           4  \
Product                                                                
BORDEAUX   97.333333  100.000000  100.000000   72.463768   74.626866   
BURGUNDY   80.625000   78.947368  100.000000  100.000000   83.333333   
CHAMPAGNE  64.000000   53.061224   58.333333  100.000000  100.000000   
                    5           6  
Product                            
BORDEAUX    81.818182   71.428571  
BURGUNDY    57.142857   57.692308  
CHAMPAGNE  100.000000  100.000000  
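
In the question's data, CHAMPAGNE at position 6 is 0/0, so the relative order of `fillna(1)` and `mul(100)` decides whether that cell reads 100 or 1. A minimal sketch with made-up values:

```python
import numpy as np
import pandas as pd

# Hypothetical raw ratios: one inf (x/0), one NaN (0/0), one ordinary value.
ratio = pd.Series([np.inf, np.nan, 0.25], index=["x/0", "0/0", "x/y"])

fill_then_scale = ratio.replace(np.inf, 1).fillna(1).mul(100)
scale_then_fill = ratio.replace(np.inf, 1).mul(100).fillna(1)

print(fill_then_scale.tolist())  # the 0/0 cell becomes 100.0
print(scale_then_fill.tolist())  # the 0/0 cell becomes 1.0
```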

For percentages, add DataFrame.applymap and drop the multiplication by 100:

out = (df1.xs('actual')
          .div(df1.xs('objective'))
          .replace(np.inf, 1)
          .fillna(1)
          .applymap('{:.0%}'.format))
print (out)
             0     1     2     3     4     5     6
Product                                           
BORDEAUX   97%  100%  100%   72%   75%   82%   71%
BURGUNDY   81%   79%  100%  100%   83%   57%   58%
CHAMPAGNE  64%   53%   58%  100%  100%  100%  100%


Answered 2024-01-04
