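I have the dataset below, and I want to count how often the values under name occur at several different grouping levels.
Have: (county is a string)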
name state county
apple MD 1
apple DC 1
pear VA 1
pear VA 2
pear CA 5
peach CO 3
peach CO 3
peach CO 2
peach CO 2
Want:
name state county freq_name freq_state freq_county
apple MD 1 2 1 2
apple DC 1 2 1 2
pear VA 1 3 2 3
pear VA 2 3 2 3
pear CA 5 3 1 3
peach CO 3 4 4 2
peach CO 2 4 4 2
I believe that in SQL, over (partition by ...) allows counting at different levels, for example:
count(name) over (partition by name) as freq_name,
count(name) over (partition by state) as freq_state,
count(name) as freq_county
from have
group by name,state, county;
For some reason, this code does not give the correct count for freq_name. I would also like to check whether my code for freq_state and freq_county is correct. Thanks!
2 answers
8hhllhi2 #1
You seem to want:
bgibtngc #2
For freq_name, use count(*) instead of count(name):
```
count(*) over (partition by name) as freq_name,
count(name) over (partition by state) as freq_state,
count(name) as freq_county
from have
group by name, state, county;
```
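One thing that may explain the unexpected freq_name: in engines that follow standard SQL evaluation order, window functions run after group by, so count(*) over (partition by name) counts grouped rows per name rather than original rows. The sketch below is only an illustration, not either answerer's code. It assumes SQLite (3.25 or later, through Python's sqlite3 module) as a stand-in engine, and the column names groups_per_name, rows_per_name, rows_per_state and rows_per_group are hypothetical labels chosen here. It wraps the per-group count in sum(...) over (...) to recover counts of the original rows.

```python
import sqlite3

# Sample rows copied from the question; county is stored as text.
rows = [
    ("apple", "MD", "1"), ("apple", "DC", "1"),
    ("pear", "VA", "1"), ("pear", "VA", "2"), ("pear", "CA", "5"),
    ("peach", "CO", "3"), ("peach", "CO", "3"),
    ("peach", "CO", "2"), ("peach", "CO", "2"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table have (name text, state text, county text)")
conn.executemany("insert into have values (?, ?, ?)", rows)

# Window functions are evaluated on the grouped result, so:
#   count(*) over (...)       counts groups within the partition
#   sum(count(*)) over (...)  adds up the original row counts of those groups
query = """
select name, state, county,
       count(*)      over (partition by name)  as groups_per_name,
       sum(count(*)) over (partition by name)  as rows_per_name,
       sum(count(*)) over (partition by state) as rows_per_state,
       count(*)                                as rows_per_group
from have
group by name, state, county
order by name, state, county
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

On the sample data, rows_per_name and rows_per_state line up with the freq_name and freq_state columns of the Want table, and groups_per_name happens to line up with freq_county. Whether that is the intended definition of freq_county is an assumption, and other engines may handle this query differently.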