pandas: groupby and count values in the same column

Posted by x7yiwoj4 on 2023-01-15 in Other


I have a DataFrame, which you can reproduce by running the following:

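import pandas as pd
from io import StringIO

# Sample data; columns are separated by two or more whitespace characters
data = """
case_id    scheduled_date    status_code
1213       2021-08           success
3444       2021-06           fail
4566       2021-07           unknown
12213      2021-08           unknown
34344      2021-06           fail
44566      2021-07           unknown
1213       2021-08           fail
"""
# A raw string keeps the regex separator free of invalid-escape warnings
df = pd.read_csv(StringIO(data.strip()), sep=r'\s\s+', engine='python')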

This outputs:

    case_id  scheduled_date  status_code
0   1213     2021-08         success
1   3444     2021-06         fail
2   4566     2021-07         unknown
3   12213    2021-08         unknown
4   34344    2021-06         fail
5   44566    2021-07         unknown
6   1213     2021-08         fail

How can I count the number of success, fail, and unknown statuses for each month?
The output should look like this:

scheduled_date  num of success  num of fail  num of unknown

2021-08           1               1           1
2021-06           0               2           0
2021-07           0               0           2

pandas

Source: https://stackoverflow.com/questions/75102100/pandas-groupby-and-count-same-column-value

2 Answers


jm2pwxwz #1

Here is one approach using pandas.crosstab:

out = (
        pd.crosstab(df["scheduled_date"], df["status_code"])
            .rename_axis(None, axis=1)
            .add_prefix("num of ")
            .sort_index(ascending=False)
            .reset_index()
        )

Output:

print(out)

  scheduled_date  num of fail  num of success  num of unknown
0        2021-08            1               1               1
1        2021-07            0               0               2
2        2021-06            2               0               0
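
pandas.crosstab orders the status columns alphabetically, while the question lists them as success, fail, unknown. If that order matters, one option is to reindex the columns before adding the prefix. This is a minimal sketch that rebuilds the question's sample data inline; the list name `wanted` is a hypothetical helper, not part of the original answer.

```python
import pandas as pd

# Sample data from the question, rebuilt inline so the sketch is self-contained
df = pd.DataFrame({
    "case_id": [1213, 3444, 4566, 12213, 34344, 44566, 1213],
    "scheduled_date": ["2021-08", "2021-06", "2021-07", "2021-08",
                       "2021-06", "2021-07", "2021-08"],
    "status_code": ["success", "fail", "unknown", "unknown",
                    "fail", "unknown", "fail"],
})

# Hypothetical helper: the column order requested in the question
wanted = ["success", "fail", "unknown"]

out = (
    pd.crosstab(df["scheduled_date"], df["status_code"])
    .reindex(columns=wanted, fill_value=0)  # fix the order; absent statuses become 0
    .rename_axis(None, axis=1)              # drop the "status_code" columns label
    .add_prefix("num of ")
    .reset_index()
)
print(out)
```

Passing `fill_value=0` to `reindex` also keeps a column for any status in `wanted` that never occurs in the data.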

Posted 2023-01-15

lhcgjxsq #2

You can use .pivot_table() to create a count for each (month, status code) pair, then use .fillna to replace the NaNs with zero counts:

df.pivot_table(index="scheduled_date", columns="status_code", aggfunc=len).fillna(0)

This outputs:

               case_id
status_code       fail success unknown
scheduled_date
2021-06            2.0     0.0     0.0
2021-07            0.0     0.0     2.0
2021-08            1.0     1.0     1.0
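
The counts above come out as floats because the missing combinations are NaN before .fillna runs. Since the title asks about groupby, here is an alternative sketch (not from either answer) that uses groupby with size and unstack, which keeps the counts as integers. It rebuilds the question's sample data inline; the variable name `counts` is an assumption.

```python
import pandas as pd

# Sample data from the question, rebuilt inline so the sketch is self-contained
df = pd.DataFrame({
    "case_id": [1213, 3444, 4566, 12213, 34344, 44566, 1213],
    "scheduled_date": ["2021-08", "2021-06", "2021-07", "2021-08",
                       "2021-06", "2021-07", "2021-08"],
    "status_code": ["success", "fail", "unknown", "unknown",
                    "fail", "unknown", "fail"],
})

counts = (
    df.groupby(["scheduled_date", "status_code"])
    .size()                 # number of rows per (month, status) pair
    .unstack(fill_value=0)  # statuses become columns; missing pairs become 0
)
print(counts)
```

Because `unstack` fills the missing pairs directly with 0, no NaN is introduced and the integer dtype is preserved.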

Posted 2023-01-15
