matplotlib: plotting a line chart from a column with repeated values

Posted by t1rydlwq on 2023-03-13 in Other

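I have a CSV file in which the values of one column repeat, and each row carries a count for that value.
How can I draw a line chart of these values?
I tried the following, but it did not work.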

import matplotlib.pyplot as plt
import pandas as pd

data = {'location': ['Afghanistan'] * 5 + ['Africa'] * 4, 'new_cases': [3, 0, 0, 3, 6, 0, 1, 0, 0]}
newData = pd.DataFrame(data)

fig, ax = plt.subplots(figsize=(15,7))
byLoc = newData.groupby('location').count()['new_cases'].unstack().plot(ax=ax)

Traceback

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In [141], line 2
      1 fig, ax = plt.subplots(figsize=(15,7))
----> 2 byLoc = newData.groupby('location').count()['new_cases'].unstack().plot(ax=ax)

File ~\anaconda3\envs\py11\Lib\site-packages\pandas\core\series.py:4455, in Series.unstack(self, level, fill_value)
   4412 """
   4413 Unstack, also known as pivot, Series with MultiIndex to produce DataFrame.
   4414 
   (...)
   4451 b    2    4
   4452 """
   4453 from pandas.core.reshape.reshape import unstack
-> 4455 return unstack(self, level, fill_value)

File ~\anaconda3\envs\py11\Lib\site-packages\pandas\core\reshape\reshape.py:483, in unstack(obj, level, fill_value)
    478         return obj.T.stack(dropna=False)
    479 elif not isinstance(obj.index, MultiIndex):
    480     # GH 36113
    481     # Give nicer error messages when unstack a Series whose
    482     # Index is not a MultiIndex.
--> 483     raise ValueError(
    484         f"index must be a MultiIndex to unstack, {type(obj.index)} was passed"
    485     )
    486 else:
    487     if is_1d_only_ea_dtype(obj.dtype):

ValueError: index must be a MultiIndex to unstack, <class 'pandas.core.indexes.base.Index'> was passed

xienkqul #1

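  • Pivot the DataFrame, then align the indices by dropping the NaN values and "compressing" the pivoted columns, as shown in the linked answer.
    * Tested in python 3.11, pandas 1.5.3, matplotlib 3.7.0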

Imports and DataFrame

import pandas as pd

df = pd.DataFrame({'location': ['Afghanistan'] * 5 + ['Africa'] * 4, 'new_cases': [3, 0, 0, 3, 6, 0, 1, 0, 0]})

Plot new cases

# pivot and drop nan
dfp = df.pivot(columns='location', values='new_cases').apply(lambda x: pd.Series(x.dropna().values))

# plot
ax = dfp.plot(figsize=(8, 6), title='New Cases', xticks=dfp.index)
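
The same alignment can also be sketched without the dropna step. This is a minimal alternative, not the answer's own method: it numbers the rows within each location using groupby().cumcount() and pivots on that counter. The column name 'obs' and the variable name dfp2 are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'location': ['Afghanistan'] * 5 + ['Africa'] * 4, 'new_cases': [3, 0, 0, 3, 6, 0, 1, 0, 0]})

# number the rows within each location: 0, 1, 2, ... restarting for every group
# ('obs' is a hypothetical helper column; assign() leaves df itself unchanged)
tmp = df.assign(obs=df.groupby('location').cumcount())

# one column per location, indexed by the per-location row number;
# the shorter group (Africa) is padded with NaN at the end
dfp2 = tmp.pivot(index='obs', columns='location', values='new_cases')

# plot one line per location
ax = dfp2.plot(figsize=(8, 6), title='New Cases', xticks=dfp2.index)
```

Because both groups now share the index 0, 1, 2, ..., the lines start at the same x position, which is what the pivot-and-dropna step above achieves as well. When running this as a script rather than in a notebook, call matplotlib.pyplot.show() to display the figure.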

Plot cumulative new cases

# add a cumulative column
df['cumulative'] = df.groupby('location').new_cases.transform('cumsum')

# pivot and drop nan
dfp = df.pivot(columns='location', values='cumulative').apply(lambda x: pd.Series(x.dropna().values))

# plot
ax = dfp.plot(figsize=(8, 6), title='Cumulative New Cases', xticks=dfp.index)

ssm49v7z #2

To draw a line chart you need a time attribute. I assume your data has one, so that the line shows new cases over time.

import matplotlib.pyplot as plt

# assumes df has a 'time' column (the sample data in the question does not include one)
plt.plot(df['time'], df['new_cases'])
plt.title('New Cases over Time')
plt.xlabel('Time')
plt.ylabel('New Cases')
plt.show()
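
With more than one location in the same DataFrame, a single plt.plot call joins the rows of all locations into one line. A rough sketch of drawing one line per location is shown here. The sample data has no time column, so the 'time' values are invented purely for illustration (one row per day, starting from an arbitrary date for each location); with real data, use the actual date column instead.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'location': ['Afghanistan'] * 5 + ['Africa'] * 4, 'new_cases': [3, 0, 0, 3, 6, 0, 1, 0, 0]})

# ASSUMPTION: made-up daily timestamps, restarting at an arbitrary date per location
day = df.groupby('location').cumcount()
df['time'] = pd.Timestamp('2020-01-01') + pd.to_timedelta(day, unit='D')

fig, ax = plt.subplots(figsize=(8, 6))

# draw a separate line for each location so the groups are not joined together
for name, group in df.groupby('location'):
    ax.plot(group['time'], group['new_cases'], marker='o', label=name)

ax.set_title('New Cases over Time')
ax.set_xlabel('Time')
ax.set_ylabel('New Cases')
ax.legend()
```

Call plt.show() afterwards when running this outside a notebook.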

To show how new cases relate to location, a bar chart is the more appropriate choice.
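
A bar chart along those lines might look like the sketch below. It assumes that the total number of new cases per location is the quantity of interest, which the answer does not specify; the variable name totals is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'location': ['Afghanistan'] * 5 + ['Africa'] * 4, 'new_cases': [3, 0, 0, 3, 6, 0, 1, 0, 0]})

# ASSUMPTION: compare locations by their summed new cases
totals = df.groupby('location')['new_cases'].sum()

# one bar per location; rot=0 keeps the location labels horizontal
ax = totals.plot.bar(figsize=(8, 6), title='Total New Cases by Location', rot=0)
ax.set_xlabel('Location')
ax.set_ylabel('New Cases')
```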
