pandas DataFrame: how to add a diff row showing the difference between two consecutive rows

3qpi33ja  posted on 2023-09-29 in Other

I have a DataFrame with two rows, and I want to add one more row that shows the difference between the two.

import pandas as pd

data = [
    (10, 20, 30, 40, 50, 60, 70),
    (10, 30, 30, 40, 50, 60, 100),
]
df = pd.DataFrame(data, columns=["a", "b", "c", "d", "d", "f", "g"])

The following works, but it adds an extra row of NaN:

pd.concat([df, df.diff()])
      a     b     c     d     d     f      g
0  10.0  20.0  30.0  40.0  50.0  60.0   70.0
1  10.0  30.0  30.0  40.0  50.0  60.0  100.0
0   NaN   NaN   NaN   NaN   NaN   NaN    NaN
1   0.0  10.0   0.0   0.0   0.0   0.0   30.0

Answer #1 (vaj7vani)

diff always produces NaN in the first row, because there is no previous row to subtract from it. Just drop that row:

out = pd.concat([df, df.diff().iloc[1:]])

Output:

      a     b     c     d     d     f      g
0  10.0  20.0  30.0  40.0  50.0  60.0   70.0
1  10.0  30.0  30.0  40.0  50.0  60.0  100.0
1   0.0  10.0   0.0   0.0   0.0   0.0   30.0
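
The same idea carries over to frames with more than two rows. The following is a minimal sketch, not part of the original answer: the third data row, the unique column name "e", and the "diff_" label prefix are assumptions made for illustration. It drops the leading NaN row and relabels the diff rows so they do not repeat the original index labels.

```python
import pandas as pd

data = [
    (10, 20, 30, 40, 50, 60, 70),
    (10, 30, 30, 40, 50, 60, 100),
    (15, 30, 35, 40, 55, 60, 90),  # hypothetical third row, added for illustration
]
# Assumption: unique column names ("e" instead of a second "d").
df = pd.DataFrame(data, columns=["a", "b", "c", "d", "e", "f", "g"])

# diff() subtracts the previous row from each row, so the first row is all NaN.
# iloc[1:] skips that first row by position.
diffs = df.diff().iloc[1:]

# Relabel the diff rows (1 -> "diff_1", 2 -> "diff_2") to avoid duplicate index labels.
diffs = diffs.rename(index=lambda i: f"diff_{i}")

out = pd.concat([df, diffs])
print(out)
```

As in the output above, the result is float because diff() returns floats to make room for NaN.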

If you really only have two rows, you can hard-code the subtraction:

out = df.copy() # optional, you can also assign to df
out.loc['diff'] = df.iloc[1]-df.iloc[0]

Output:

       a   b   c   d   d   f    g
0     10  20  30  40  50  60   70
1     10  30  30  40  50  60  100
diff   0  10   0   0   0   0   30
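
The hard-coded subtraction can also be wrapped in a small helper. This is only a sketch under stated assumptions: the function name append_diff_row and its label parameter are hypothetical, and it assumes an all-numeric frame. It subtracts by position on the underlying NumPy values, so duplicate column labels such as the two "d" columns here play no role in the arithmetic.

```python
import numpy as np
import pandas as pd


def append_diff_row(df: pd.DataFrame, label="diff") -> pd.DataFrame:
    """Return a new frame with one extra row: second row minus first row.

    Hypothetical helper for illustration; assumes exactly two numeric rows.
    """
    if len(df) != 2:
        raise ValueError("expected a DataFrame with exactly two rows")
    values = df.to_numpy()
    delta = values[1] - values[0]  # positional subtraction, no label alignment
    return pd.DataFrame(
        np.vstack([values, delta]),
        columns=df.columns,
        index=[*df.index, label],
    )


data = [
    (10, 20, 30, 40, 50, 60, 70),
    (10, 30, 30, 40, 50, 60, 100),
]
df = pd.DataFrame(data, columns=["a", "b", "c", "d", "d", "f", "g"])

out = append_diff_row(df)
print(out)
```

Because no NaN is ever introduced, the integer dtype is kept, matching the output shown above.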
