python-3.x: Get the count of a single value across an entire dataset

0yg35tkg  posted on 2022-12-15  in Python

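I'm new to pandas. While learning, I searched online and used count() and value_counts() to count values by column, but now I've run into a problem. I have a car accident report dataset whose null values were replaced with "Not Reported", so I want to count the cells holding that value across the entire dataset and show the count per column. Is there a way to get that result?
The dataset looks like this: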

| Location     | Severity     | Time         | Outcome      | Substance Used | Traffic Signal |
| --------     | --------     | ----------   | -----------  | -------------- | -------------- |
| New York     | Level 1      | Not Reported | Casualty     | Alcohol        | Red            |
| Texas        | Not Reported |  7:00:00     | Minor Injury | Not Reported   | Green          |
| Not Reported | Level 4      | Not Reported | Not Reported | Smoking        | Yellow         |

The desired output is as follows.

| Column         | Value        | Count |
| -------------- | ------------ | ----- |
| Location       | Not Reported | 1     |
| Severity       | Not Reported | 1     |
| Time           | Not Reported | 2     |
| Outcome        | Not Reported | 1     |
| Substance Used | Not Reported | 1     |
| Traffic Signal | Not Reported | 0     |

Answer 1 (yzckvree)

You can use:

(df.where(df.eq('Not Reported')).stack(dropna=False)
   .groupby(level=1).agg(Value='first', Count='count')
   .reset_index()
)

Output:

            index         Value  Count
0        Location  Not Reported      1
1         Outcome  Not Reported      1
2        Severity  Not Reported      1
3  Substance Used  Not Reported      1
4            Time  Not Reported      2
5  Traffic Signal          None      0
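
Note that the output above lists the columns alphabetically and leaves Value as None where nothing matched, which differs slightly from the table in the question. As a minimal sketch of one possible variant (not the answerer's method), the frame can be reshaped to long format with `melt` and the matches summed per original column. The sample data is rebuilt from the question's table, and the names `target`, `long_df` and `summary` are chosen here purely for illustration.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Location": ["New York", "Texas", "Not Reported"],
    "Severity": ["Level 1", "Not Reported", "Level 4"],
    "Time": ["Not Reported", "7:00:00", "Not Reported"],
    "Outcome": ["Casualty", "Minor Injury", "Not Reported"],
    "Substance Used": ["Alcohol", "Not Reported", "Smoking"],
    "Traffic Signal": ["Red", "Green", "Yellow"],
})

target = "Not Reported"  # illustrative name for the placeholder value

# One row per cell: "Column" holds the original column name, "Cell" its value.
long_df = df.melt(var_name="Column", value_name="Cell")

# Sum the boolean matches per column; sort=False keeps the original column order.
summary = (
    long_df["Cell"].eq(target)
    .groupby(long_df["Column"], sort=False)
    .sum()
    .reset_index(name="Count")
)

# Add the constant Value column between Column and Count.
summary.insert(1, "Value", target)
print(summary)
```

Because the counts come from summing booleans, a column with no matches still appears with a count of 0 and the Value column is filled for every row.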

Answer 2 (xlpyo6sf)

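You can count the Not Reported cells by comparing every value against 'Not Reported' and summing the resulting booleans per column; no groupby is needed: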

s = df.eq('Not Reported').sum()
print (s)
Location          1
Severity          1
Time              2
Outcome           1
Substance Used    1
Traffic Signal    0
dtype: int64

Your expected output can then be built with the DataFrame constructor:

df1 = pd.DataFrame({'Column': s.index, 'Value':'Not Reported', 'Count': s.to_numpy()})
print (df1)
           Column         Value  Count
0        Location  Not Reported      1
1        Severity  Not Reported      1
2            Time  Not Reported      2
3         Outcome  Not Reported      1
4  Substance Used  Not Reported      1
5  Traffic Signal  Not Reported      0
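
For anyone who wants to run this end to end, here is a self-contained sketch that rebuilds the sample data from the question and wraps the same `eq(...).sum()` idea in a small function. The helper name `count_value_per_column` is hypothetical (it is not a pandas function), and the data is limited to the three rows shown in the question.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Location": ["New York", "Texas", "Not Reported"],
    "Severity": ["Level 1", "Not Reported", "Level 4"],
    "Time": ["Not Reported", "7:00:00", "Not Reported"],
    "Outcome": ["Casualty", "Minor Injury", "Not Reported"],
    "Substance Used": ["Alcohol", "Not Reported", "Smoking"],
    "Traffic Signal": ["Red", "Green", "Yellow"],
})


def count_value_per_column(frame, value):
    """Hypothetical helper: count the cells equal to `value` in each column."""
    # eq() gives a boolean frame; sum() counts the True cells per column.
    counts = frame.eq(value).sum()
    return pd.DataFrame({
        "Column": counts.index,
        "Value": value,  # scalar is broadcast to every row
        "Count": counts.to_numpy(),
    })


result = count_value_per_column(df, "Not Reported")
print(result)
```

The same helper could be called with any other placeholder string, for example `count_value_per_column(df, "Unknown")`, if the dataset uses a different marker for missing data.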
