我想使用pandas
Dataframe 计算MTD
和YTD
。为此,我编写了一个代码,并得到了以下错误。
代码:
import pandas as pd
data = {'date' : ['2017/01/01', '2017/01/02', '2017/01/03', '2017/01/04', '2017/01/15', '2017/01/20', '2017/01/23', '2017/01/30','2017/01/01', '2017/01/02', '2017/01/03', '2017/01/04', '2017/01/15', '2017/01/20', '2017/01/23', '2017/01/30', '2017/04/01', '2017/04/02', '2017/04/03', '2017/04/04', '2017/04/15', '2017/04/20', '2017/04/23', '2017/04/30','2017/04/01', '2017/04/02', '2017/04/03', '2017/04/04', '2017/04/15', '2017/04/20', '2017/04/23', '2017/04/30', '2017/05/01', '2017/05/02', '2017/05/03', '2017/05/04', '2017/05/15', '2017/05/20', '2017/05/23', '2017/05/30','2017/05/01', '2017/05/02', '2017/05/03', '2017/05/04', '2017/05/15', '2017/05/20', '2017/05/23', '2017/05/30'],
'product': ['Apple', 'Apple', 'Apple','Apple', 'Apple', 'Apple','Apple', 'Apple', 'Orange', 'Orange', 'Orange','Orange', 'Orange', 'Orange','Orange', 'Orange', 'Apple', 'Apple', 'Apple','Apple', 'Apple', 'Apple','Apple', 'Apple', 'Orange', 'Orange', 'Orange','Orange', 'Orange', 'Orange','Orange', 'Orange', 'Apple', 'Apple', 'Apple','Apple', 'Apple', 'Apple','Apple', 'Apple', 'Orange', 'Orange', 'Orange','Orange', 'Orange', 'Orange','Orange', 'Orange'],
'price': [10, 20, 10, 50, 10, 5, 10, 10, 20, 10, 5, 5, 10, 10, 20, 50, 10, 5, 20, 10, 10, 20, 50, 20, 5, 5, 10, 10, 20, 50, 30, 10, 20, 5, 5, 10, 20, 10, 20, 10, 40, 20, 10, 10, 20, 20, 10, 5]}
df = pd.DataFrame(data)
print("Dataframe-----------------------------------")
print(df)
print("Dataframe Ends------------------------------")
df.date = pd.to_datetime(df.date)
df = df.groupby('date', 'product').price.sum()
df = df.groupby(df.index.to_period('m')).cumsum().reset_index()
print("MTD Dataframe")
print(df)
错误:
Traceback (most recent call last):
File "/home/ab/PycharmProjects/parry-analytics/lib/python3.9/site-packages/pandas/core/generic.py", line 550, in _get_axis_number
return cls._AXIS_TO_AXIS_NUMBER[axis]
KeyError: 'product'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ab/parry-data_processing/parry-analytics/poc.py", line 15, in <module>
df = df.groupby('date', 'product').price.sum()
File "/home/ab/PycharmProjects/parry-analytics/lib/python3.9/site-packages/pandas/core/frame.py", line 7713, in groupby
axis = self._get_axis_number(axis)
File "/home/ab/PycharmProjects/parry-analytics/lib/python3.9/site-packages/pandas/core/generic.py", line 552, in _get_axis_number
raise ValueError(f"No axis named {axis} for object type {cls.__name__}")
ValueError: No axis named product for object type DataFrame
有谁能提出解决这个问题的方案吗?
预期MTD输出:
date product price
0 2017/01/01 Apple 10
1 2017/01/02 Apple 30
2 2017/01/03 Apple 40
3 2017/01/04 Apple 90
4 2017/01/15 Apple 100
5 2017/01/20 Apple 105
6 2017/01/23 Apple 115
7 2017/01/30 Apple 125
8 2017/01/01 Orange 20
9 2017/01/02 Orange 30
10 2017/01/03 Orange 35
11 2017/01/04 Orange 40
12 2017/01/15 Orange 50
13 2017/01/20 Orange 60
14 2017/01/23 Orange 80
15 2017/01/30 Orange 130
16 2017/04/01 Apple 10
17 2017/04/02 Apple 15
18 2017/04/03 Apple 35
19 2017/04/04 Apple 45
20 2017/04/15 Apple 55
21 2017/04/20 Apple 75
22 2017/04/23 Apple 125
23 2017/04/30 Apple 145
24 2017/04/01 Orange 5
25 2017/04/02 Orange 10
26 2017/04/03 Orange 20
27 2017/04/04 Orange 30
28 2017/04/15 Orange 50
29 2017/04/20 Orange 100
30 2017/04/23 Orange 130
31 2017/04/30 Orange 140
32 2017/05/01 Apple 20
33 2017/05/02 Apple 25
34 2017/05/03 Apple 30
35 2017/05/04 Apple 40
36 2017/05/15 Apple 60
37 2017/05/20 Apple 70
38 2017/05/23 Apple 90
39 2017/05/30 Apple 100
40 2017/05/01 Orange 40
41 2017/05/02 Orange 60
42 2017/05/03 Orange 70
43 2017/05/04 Orange 80
44 2017/05/15 Orange 100
45 2017/05/20 Orange 120
46 2017/05/23 Orange 130
47 2017/05/30 Orange 135
预计本年迄今产量:
同上。但应从财政年度开始(4月份)和产品开始计算。
2条答案
按热度按时间xzlaal3s1#
用途:
7bsow1i62#
您看到的错误可能是由于groupby函数的语法问题造成的。
在行
df = df.groupby('date', 'product').price.sum()
中,groupby函数有两个参数:'date'和'product'。但是,这些参数应作为字符串列表传递,如下所示:同时更改行