How do I select all rows of groups whose row count is >= 2?
I have the following pandas DataFrame.
import pandas as pd

df = pd.DataFrame({"date": ["2000-01-03", "2000-01-04", "2000-01-04", "2000-01-04", "2000-01-04",
                            "2000-01-03", "2000-01-04", "2000-01-05", "2000-01-05",
                            "2000-01-03", "2000-01-05", "2000-01-05",
                            "2000-01-04", "2000-01-05"],
                   "sym": ["A", "A", "A", "A", "A", "B", "B", "B", "B", "C", "C", "C", "D", "E"],
                   "val1": [1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 2, 2],
                   "val2": [2, 2, 2, 2, 2, 2, 3, 3, 3, 1, 1, 2, 2, 2]
                   })
which looks like this:
date sym val1 val2
0 2000-01-03 A 1 2
1 2000-01-04 A 1 2
2 2000-01-04 A 1 2
3 2000-01-04 A 1 2
4 2000-01-04 A 1 2
5 2000-01-03 B 2 2
6 2000-01-04 B 2 3
7 2000-01-05 B 2 3
8 2000-01-05 B 2 3
9 2000-01-03 C 3 1
10 2000-01-05 C 3 1
11 2000-01-05 C 3 2
12 2000-01-04 D 2 2
13 2000-01-05 E 2 2
I applied
df.groupby(['date', 'sym'], as_index=False).mean().sort_values(['sym','date'])
to average val1 and val2 for each sym on each date:
date sym val1 val2
0 2000-01-03 A 1.0 2.0
3 2000-01-04 A 1.0 2.0
1 2000-01-03 B 2.0 2.0
4 2000-01-04 B 2.0 3.0
6 2000-01-05 B 2.0 3.0
2 2000-01-03 C 3.0 1.0
7 2000-01-05 C 3.0 1.5
5 2000-01-04 D 2.0 2.0
8 2000-01-05 E 2.0 2.0
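Next, I need to select all rows for every "sym" that has a row count >= 2 in the aggregated result. In this example, the resulting df would contain all rows where sym is A, B, or C (D and E each have only one row).
Expected output: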
date sym val1 val2
0 2000-01-03 A 1.0 2.0
3 2000-01-04 A 1.0 2.0
1 2000-01-03 B 2.0 2.0
4 2000-01-04 B 2.0 3.0
6 2000-01-05 B 2.0 3.0
2 2000-01-03 C 3.0 1.0
7 2000-01-05 C 3.0 1.5
I tried combinations of groupby, pivot, and count, but had no luck.
1 answer
See: How to filter a DataFrame based on value counts?
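Applied to the data in the question, the value-count filtering idea could look roughly like the sketch below. This is one possible approach rather than the only one. The names `means`, `counts`, `result`, `vc`, and `result_alt` are made up for illustration and do not appear in the original post.

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["2000-01-03", "2000-01-04", "2000-01-04", "2000-01-04", "2000-01-04",
             "2000-01-03", "2000-01-04", "2000-01-05", "2000-01-05",
             "2000-01-03", "2000-01-05", "2000-01-05",
             "2000-01-04", "2000-01-05"],
    "sym": ["A", "A", "A", "A", "A", "B", "B", "B", "B", "C", "C", "C", "D", "E"],
    "val1": [1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 2, 2],
    "val2": [2, 2, 2, 2, 2, 2, 3, 3, 3, 1, 1, 2, 2, 2],
})

# Average val1/val2 per (date, sym), exactly as in the question.
means = (df.groupby(["date", "sym"], as_index=False)
           .mean()
           .sort_values(["sym", "date"]))

# Option 1: attach each row's group size with transform, then mask.
counts = means.groupby("sym")["sym"].transform("size")
result = means[counts >= 2]

# Option 2: count values per sym, keep the syms that occur at least twice.
vc = means["sym"].value_counts()
result_alt = means[means["sym"].isin(vc[vc >= 2].index)]

print(result)
```

Both options keep the original index and row order of the aggregated frame. `means.groupby("sym").filter(lambda g: len(g) >= 2)` expresses the same thing more directly, though it tends to be slower on frames with many groups.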
Output: