matplotlib: How to add summary statistics to a plot in Python

zhte4eai  posted on 2023-10-24  in Python

I am plotting a histogram with matplotlib.pyplot:

plt.hist(var)

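I'm curious whether I can attach to the plot some of the statistics produced by

var.describe()

where var is a pandas Series.
The result looks something like this (image not preserved in this copy).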


#1 8dtrkrch

Use figtext():

plt.hist(var)
plt.figtext(1.0, 0.2, var.describe())

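Because x = 1.0 puts the text at the right edge of the figure (figtext uses figure coordinates), pass bbox_inches='tight' to savefig so the text is also included in the saved image: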

plt.savefig('fig1.png', bbox_inches='tight')
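
The two snippets above can be combined into a self-contained sketch. The sample data, the random seed, the headless Agg backend, the use of to_string() and the monospace font are illustrative assumptions, not part of the original answer; the monospace font is only there so the two columns of describe() line up.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Illustrative data; in the question, `var` is the asker's own Series.
rng = np.random.default_rng(0)
var = pd.Series(rng.normal(size=200), name="var")

fig, ax = plt.subplots()
ax.hist(var)

# to_string() renders the describe() table without the trailing
# "Name: ..., dtype: ..." line that str() would append.
stats_text = var.describe().to_string()

# x=1.0 is the right edge of the figure in figure coordinates, so the text
# starts just outside the plot; a monospace font keeps the columns aligned.
fig.text(1.0, 0.2, stats_text, family="monospace")

# bbox_inches="tight" grows the saved area to include the outside text.
fig.savefig("fig1.png", bbox_inches="tight")
```

Without bbox_inches="tight", the saved image is cropped to the figure itself and the text to the right of it would be cut off.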

#2 daolsyd0

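As the solution above shows, the text formatting is a bit messy: with a proportional font, the values do not line up. To fix this, here is a workaround that splits the describe() output into two text blocks, one with the statistic names and one with the values, and aligns each block on its own.
Helper: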

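def describe_helper(series):
    """Split series.describe() into two text columns: labels and values."""
    stats = series.describe()
    # Read the index/value pairs directly instead of parsing the printed repr,
    # which would also pick up the trailing "dtype" line as a bogus row.
    keys = "\n".join(str(key) for key in stats.index)
    values = "\n".join(
        f"{value:.6g}" if isinstance(value, float) else str(value)
        for value in stats
    )
    return keys, values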

Now plot the graph:

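import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

demo = np.random.uniform(0, 10, 100)
plt.hist(demo, bins=10)
keys, values = describe_helper(pd.Series(demo))
# names are left-aligned, values right-aligned, both to the right of the plot
plt.figtext(.95, .49, keys, multialignment='left')
plt.figtext(1.05, .49, values, multialignment='right')
plt.show()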

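If you also want the figtext to be included when saving the image, pass bbox_inches='tight' to savefig, as described in answer #1.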
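
As a possible alternative to two separate text blocks, the alignment can also be handled by building fixed-width lines and drawing them once in a monospace font. This is only a sketch under assumptions: the column widths (6 and 8), the two-decimal formatting, the Agg backend and the output file name fig2.png are arbitrary choices made here, not something the answer above prescribes.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
demo = pd.Series(rng.uniform(0, 10, 100))

# One line per statistic: name left-aligned in 6 characters,
# value right-aligned in 8 characters with two decimals.
stats = demo.describe()
lines = [f"{name:<6}{value:>8.2f}" for name, value in stats.items()]
stats_text = "\n".join(lines)

fig, ax = plt.subplots()
ax.hist(demo, bins=10)

# transform=ax.transAxes means (1.02, 0.5) is just right of the axes,
# vertically centred; monospace makes the fixed-width padding line up.
ax.text(1.02, 0.5, stats_text, transform=ax.transAxes,
        family="monospace", va="center")

fig.savefig("fig2.png", bbox_inches="tight")
```

Because the padding is done in the string itself, this only lines up with a monospace font; with a proportional font the two-block approach above remains the more robust option.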


#3 of1yzvn4

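For scatter and line plots, the easiest approach is usually to put the statistics in the legend. That way they are also positioned automatically at the "best" location on the plot: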

plt.plot(
    df["x"].to_numpy(),
    df["y"].to_numpy(),
    'b-o', # blue line with circles
    label=f"{df['y'].describe()}"  # <==== summary stats in the legend
)
plt.legend(fontsize="x-small")  # <=== set their font size in the legend

Note that, according to the official pandas documentation for .describe(), the 50th percentile is the same as the median.
So if you are wondering where the median is, it is the 50% row.
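
The statement about the 50% row can be checked directly. The snippet below is a small sanity check on made-up random data (the seed and sample size are arbitrary assumptions); it compares the "50%" entry of describe() with Series.median().

```python
import numpy as np
import pandas as pd

# Arbitrary illustrative data.
rng = np.random.default_rng(0)
s = pd.Series(rng.normal(size=101))

stats = s.describe()

# The "50%" row of describe() and the median should agree.
print(stats["50%"], s.median())
assert np.isclose(stats["50%"], s.median())
```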
Other legend fontsize options are any integer (size in points) or one of the following keywords: xx-small, x-small, small, medium, large, x-large, xx-large. See: https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.legend.html
Here is a complete, runnable example:

**pandas_plot_scatter_add_summary_describe_statistics_to_legend.py** from my eRCaGuy_hello_world repo, with extensive explanatory comments:

#!/usr/bin/env python3

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# generate sinusoidal data with some random Gaussian (normal distribution, or
# "bell curve") noise added to it

NUM_POINTS = 200
ONE_PERIOD = 2*np.pi  # 2*pi radians = 360 degrees = 1 full period
x = np.linspace(0, 2*ONE_PERIOD, NUM_POINTS)

# the standard deviation, or "width", of the noise; see:
# https://numpy.org/doc/stable/reference/random/generated/numpy.random.normal.html
CENTER = 0  # mean (center) of the normal distribution
SCALE = 0.3  # standard deviation (scale) of the normal distribution
NOISE_SIZE = NUM_POINTS  # number of points in the noise
noise_array = np.random.normal(CENTER, SCALE, NOISE_SIZE)

y = np.sin(x) + noise_array

# create a dataframe from the numpy arrays above
df = pd.DataFrame({'x': x, 'y': y})
print(df)

# plot the data

# fig = plt.figure(figsize=(18, 10.8))  # default is `(6.4, 4.8)` inches
fig = plt.figure()
plt.title("Sine wave with noise")
plt.plot(
    df["x"].to_numpy(),
    df["y"].to_numpy(),
    # 'bo', # blue circles, no line
    'b-o', # blue line with circles
    # add summary statistics to the legend; NB: "The 50 percentile is the same
    # as the median."; see:
    # https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.describe.html
    label=f"{df['y'].describe()}"
)
# fontsize: int or 'xx-small', 'x-small', 'small', 'medium', 'large', 'x-large',
# 'xx-large'; see:
# https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.legend.html
plt.legend(fontsize="x-small")
plt.xlabel("radians (rad)")
plt.ylabel("amplitude (-)")

plt.show()
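
The legend approach is described above for scatter and line plots, but it appears to carry over to the histogram from the original question as well, since hist() also accepts a label. The following is a sketch under assumptions: the data, the seed, the Agg backend, the monospace legend font and the output file name are illustrative choices, not taken from the answer.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Illustrative stand-in for the asker's data.
rng = np.random.default_rng(0)
var = pd.Series(rng.normal(size=200))

# describe() table as plain text, without the trailing dtype line.
label_text = var.describe().to_string()

fig, ax = plt.subplots()
ax.hist(var, label=label_text)  # the label becomes the legend entry

# `prop` sets the legend font; monospace keeps the two columns aligned.
legend = ax.legend(prop={"family": "monospace", "size": "x-small"})

fig.savefig("hist_with_legend.png")
```

Since the legend is drawn inside the axes, no bbox_inches='tight' is needed here, at the cost of the legend possibly covering part of the bars.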
