matplotlib: Pandas overlay on top of a plot

Posted by xxb16uws on 2022-11-15 in Other


I have some time series data in a Pandas DataFrame, which I can visualize as follows:

import datetime

import pandas as pd

d = {'end_time': [datetime.datetime(2020, 3, 17, 0, 0), datetime.datetime(2020, 3, 17, 0, 5), datetime.datetime(2020, 3, 17, 0, 10), datetime.datetime(2020, 3, 17, 0, 15), datetime.datetime(2020, 3, 17, 0, 20), datetime.datetime(2020, 3, 17, 0, 25), datetime.datetime(2020, 3, 17, 0, 30), datetime.datetime(2020, 3, 17, 0, 35), datetime.datetime(2020, 3, 17, 0, 40), datetime.datetime(2020, 3, 17, 0, 45), datetime.datetime(2020, 3, 17, 0, 50), datetime.datetime(2020, 3, 17, 0, 55)], "measurement": [2000, 1500, 800, 900, 400, 4000, 300, 900, 1000, 1250, 1100, 1300], "reliability": [99, 81, 84, 85, 99, 86, 96, 97, 98, 99, 98, 97]}

# build the DataFrame from the dictionary
subset_df = pd.DataFrame.from_dict(d)

# plot measurements over time
subset_df.plot('end_time', 'measurement')

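The reliability column is a number between 0 and 100. What I would like to do is highlight the regions where the reliability score is below 95, for example by overlaying a transparent box on those regions to visually flag where the measurements may be less reliable.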

matplotlib

Source: https://stackoverflow.com/questions/74192039/pandas-overlay-on-top-of-a-plot

1 Answer


ctrmrzij1#

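Consider this random DataFrame, where:

end_time: timestamps from 2020-03-17 00:00:00 to 2020-03-17 00:55:00, at 5-minute intervals
measurement: a random integer from 300 up to (but not including) 4000
reliability: a random integer from 0 up to (but not including) 100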

import pandas as pd
import numpy as np

df = pd.DataFrame({'end_time': pd.date_range(start='2020-03-17 00:00:00', end='2020-03-17 00:55:00', freq='5min'),
                   'measurement': np.random.randint(300, 4000, size=12),
                   'reliability': np.random.randint(0, 100, size=12)})

[Out]:

              end_time  measurement  reliability
0  2020-03-17 00:00:00         3905            7
1  2020-03-17 00:05:00         1143           93
2  2020-03-17 00:10:00         2672           55
3  2020-03-17 00:15:00          416           29
4  2020-03-17 00:20:00         1246           21
5  2020-03-17 00:25:00         2743           32
6  2020-03-17 00:30:00         2798           49
7  2020-03-17 00:35:00         1012           21
8  2020-03-17 00:40:00         3894           64
9  2020-03-17 00:45:00         1218           18
10 2020-03-17 00:50:00         1600           97
11 2020-03-17 00:55:00          729           76

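If the goal is to plot all measurements with reliability below 95 in red and the rest in blue, let's first create some helper variables:

The measurement values where reliability is below 95: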

measures = df[df.reliability < 95].measurement

The end_time of the measurements where reliability is below 95:

dates = df[df.reliability < 95].end_time

The measurement values where reliability is 95 or above:

measures2 = df[df.reliability >= 95].measurement

The end_time of the measurements where reliability is 95 or above:

dates2 = df[df.reliability >= 95].end_time
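
The four selections above repeat the same comparison; they can also be derived from a single boolean mask. A minimal sketch, assuming the same column names as in this answer. The seeded generator and the names `rng` and `low` are illustrative choices, not part of the original answer:

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is repeatable (the seed value is arbitrary).
rng = np.random.default_rng(0)

df = pd.DataFrame({
    'end_time': pd.date_range(start='2020-03-17 00:00:00',
                              end='2020-03-17 00:55:00', freq='5min'),
    'measurement': rng.integers(300, 4000, size=12),
    'reliability': rng.integers(0, 100, size=12),
})

# True where the reliability score is below the threshold of 95.
low = df['reliability'] < 95

# Rows below the threshold, and the complementary rows (95 or above).
measures, dates = df.loc[low, 'measurement'], df.loc[low, 'end_time']
measures2, dates2 = df.loc[~low, 'measurement'], df.loc[~low, 'end_time']
```

Using `~low` for the second group guarantees the two groups partition the rows, so no measurement is dropped or counted twice.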

Now let's create the plot:

import matplotlib.pyplot as plt

# Create the plot:
plt.plot(dates, measures, 'ro', dates2, measures2, 'bo')
# Set the title:
plt.title('Measures over time')
# Set the x label:
plt.xlabel('Date')
# Set the y label:
plt.ylabel('Measure')
# Set the x ticks:
plt.xticks(rotation=45)
# Show the plot:
plt.show()

Now, as requested (use fill_between so that I can paint a transparent box from the x-axes to the top of y-axes), the following can be added before plt.show():

plt.fill_between(dates, 0, measures, color='red', alpha=0.2)
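
Note that the call above shades the area under the low-reliability points only up to their measurement values. If the aim is a band spanning the full height of the axes, one option is to combine the `where=` argument of `fill_between` with `transform=ax.get_xaxis_transform()`, so that the y values 0 and 1 are read as the bottom and top of the axes. A minimal sketch using the data from the question; the threshold of 95 comes from the question, while the variable names and the `step='mid'` setting are choices made here for illustration:

```python
import datetime

import matplotlib
matplotlib.use('Agg')  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import pandas as pd

# Same values as the question's dictionary, built with a comprehension.
start = datetime.datetime(2020, 3, 17, 0, 0)
df = pd.DataFrame({
    'end_time': [start + datetime.timedelta(minutes=5 * i) for i in range(12)],
    'measurement': [2000, 1500, 800, 900, 400, 4000,
                    300, 900, 1000, 1250, 1100, 1300],
    'reliability': [99, 81, 84, 85, 99, 86, 96, 97, 98, 99, 98, 97],
})

# True where the reliability score is below the threshold.
low = df['reliability'] < 95

fig, ax = plt.subplots()
ax.plot(df['end_time'], df['measurement'])

# x is in data coordinates, y in axes coordinates (0 = bottom, 1 = top),
# so each band covers the full height regardless of the y limits.
band = ax.fill_between(
    df['end_time'], 0, 1,
    where=low,
    step='mid',  # centre each band on its sample instead of interpolating
    transform=ax.get_xaxis_transform(),
    color='red', alpha=0.2,
)

ax.set_title('Measures over time')
ax.set_xlabel('Date')
ax.set_ylabel('Measure')
fig.autofmt_xdate(rotation=45)
```

With this data, the samples at 00:05 to 00:15 and at 00:25 fall below 95, so two separate bands are drawn. `ax.axvspan` would be another way to draw each band individually, at the cost of first grouping the consecutive low-reliability samples.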
