基于索引或列的范围进行聚合- Pandas

2q5ifsrm  于 2023-04-18  发布在  其他
关注(0)|答案(1)|浏览(126)

我有一个很大的df:

CallType    Broadcast   C1  C2  Csk Data    Netd2   Net3    OpenP1  OpenP2  OpenP8  SBD Voice
LFrame                                              
0   85.811985   0.820731    0.479020    0.550982    23.95   0.0 4.79    32.338503   23.573862   8.462412    6.696933    3.781450
22  0.000000    1.143358    0.303666    0.375464    28.74   0.0 9.58    29.013984   32.640732   8.462412    5.993194    2.666097
44  99.160516   0.918363    0.889132    0.489427    28.74   0.0 4.79    28.107297   33.849648   6.044580    4.790000    0.000000
66  0.000000    0.923886    0.983070    0.675550    28.74   0.0 9.58    27.805068   27.805068   5.440122    13.328801   4.984644

超过50,000行。我需要对一组LFrame的数据进行bin,比如binsize= 600(每个bin中有600行)。新df的值应该是这600行的平均LFrame,但所有其他列的600行组的总和。
我不知道如何使用pivot_table或groupby来实现这一点。同样,对于600的binsize,我希望LFrame col是600行的平均值,其余的是sums。所以类似于:

CallType    Broadcast   Certus1 Certus2 Certus8apsk Data    Netted2 Netted3 OpenPort1   OpenPort2   OpenPort8   ShortBurstData  Voice
LFrame                                              
300 585.811985  440.820731  0.479020    340.550982  4323.95 0.0 4.79    3332.338503 23.573862   308.462412  346.696933  7733.781450
900 1230.0000   341.143358  430.303666  430.375464  2448.74 0.0 9.58    3329.013984 32.640732   408.462412  345.993194  942.666097
1200    4099.160516 340.918363  730.889132  430.489427  4428.74 0.0 4.79    4428.107297 33.849648   406.044580  434.790000  340.0000
1500    400.0000    340.923886  0.983070    4430.675550 3428.74 0.0 9.58    2447.805068 2437.805068 4405.440122 343413.328801   334.984644
bxgwgixi

bxgwgixi1#

下面是一个方法来做你的问题问:

binsize = 2
res = ( df.set_index(df.index.to_series().floordiv(binsize))
    .groupby(level=0)
    .agg({col:'mean' if col == 'LFrame' else 'sum' for col in df.columns}) )

样品输入:

LFrame   CallType  Broadcast        C1        C2  Csk Data  Netd2  Net3     OpenP1     OpenP2    OpenP8        SBD     Voice
0     0.0  85.811985   0.820731  0.479020  0.550982     23.95    0.0  4.79  32.338503  23.573862  8.462412   6.696933  3.781450
1    22.0  85.811985   0.820731  0.479020  0.550982     23.95    0.0  4.79  32.338503  23.573862  8.462412   6.696933  3.781450
2    44.0   0.000000   1.143358  0.303666  0.375464     28.74    0.0  9.58  29.013984  32.640732  8.462412   5.993194  2.666097
3    66.0   0.000000   1.143358  0.303666  0.375464     28.74    0.0  9.58  29.013984  32.640732  8.462412   5.993194  2.666097
4    88.0  99.160516   0.918363  0.889132  0.489427     28.74    0.0  4.79  28.107297  33.849648  6.044580   4.790000  0.000000
5   110.0  99.160516   0.918363  0.889132  0.489427     28.74    0.0  4.79  28.107297  33.849648  6.044580   4.790000  0.000000
6   132.0   0.000000   0.923886  0.983070  0.675550     28.74    0.0  9.58  27.805068  27.805068  5.440122  13.328801  4.984644
7   154.0   0.000000   0.923886  0.983070  0.675550     28.74    0.0  9.58  27.805068  27.805068  5.440122  13.328801  4.984644

输出:

LFrame    CallType  Broadcast        C1        C2  Csk Data  Netd2   Net3     OpenP1     OpenP2     OpenP8        SBD     Voice
0    11.0  171.623970   1.641462  0.958040  1.101964     47.90    0.0   9.58  64.677006  47.147724  16.924824  13.393866  7.562900
1    55.0    0.000000   2.286716  0.607332  0.750928     57.48    0.0  19.16  58.027968  65.281464  16.924824  11.986388  5.332194
2    99.0  198.321032   1.836726  1.778264  0.978854     57.48    0.0   9.58  56.214594  67.699296  12.089160   9.580000  0.000000
3   143.0    0.000000   1.847772  1.966140  1.351100     57.48    0.0  19.16  55.610136  55.610136  10.880244  26.657602  9.969288

相关问题