matplotlib: Is there something wrong with the Python plt.hist() method?

Posted by pprl5pva on 2023-05-18 in Python

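I have two DataFrames, named merged and initial. The second is a subset of the first. I plotted a histogram of every column of both datasets to compare them, and the second DataFrame appears to contain values that should not be there, given that it is a subset of the first. To check the column values, I printed them for both DataFrames. For the column fragC I get [13.01 46.03 12.05 64.08 14.04] and [13.01 64.08]. As you can see, the second is a subset of the first. When I plot the histogram, I get this: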

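OPERA is the second DataFrame. This is strange, because the plot makes the second DataFrame look as if it has values that are not present in the first, which is not true. I am using the code below to plot: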

for column in common_columns:
    # Exclude the excluded_columns from the comparison
    if column not in excluded_columns:
        print("")
        our_values = df1[column].values
        opera_values = df2[column].values
        print(column)
        print(our_values)
        print(opera_values)
        # Plot the distribution for df1 and df2
        plt.figure(figsize=(10, 6))
        plt.hist(df1[column], bins=20, alpha=0.5, label='our dataset')
        plt.hist(df2[column], bins=20, alpha=0.5, label='OPERA')
        plt.xlabel('Values')
        plt.ylabel('Frequency')
        plt.title(f'Distribution Comparison for Column: {column}')
        plt.legend()
        plt.tight_layout()
        plt.show()

The DataFrames are very large, so below I provide only the values of this specific column:

{0: 13.01, 1: 46.03, 2: 12.05, 3: 64.08, 4: 14.04}
{0: 13.01, 1: 64.08}

matplotlib

Source: https://stackoverflow.com/questions/76258737/is-there-something-wrong-with-the-python-plt-hist-method

1 Answer

laximzn51#

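The reason is that the bins differ between the two calls. By default, plt.hist() spans its bins from the minimum to the maximum of the data it is given, so the first dataset gets 20 bins running from 12.05 to 64.08, while the second gets 20 bins running from 13.01 to 64.08. The bars therefore have different edges and widths, and an OPERA bar can extend over x-values where the first dataset shows no bar, even though no such data point exists.
If you want the bins to start at 0, or to be identical for both datasets, you need to specify range or bins explicitly.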
https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.hist.html
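
The effect can be checked numerically with the fragC values from the question. The following is only a minimal sketch: it assumes NumPy is available, uses np.histogram_bin_edges to reproduce the equal-width edges that a bins=20 call derives from the data, and introduces a helper named plot_comparison that is hypothetical and not part of the original code. Computing one shared set of edges is one possible fix; passing the same range to both calls would work as well.

```python
import numpy as np

# fragC values copied from the question
our_values = np.array([13.01, 46.03, 12.05, 64.08, 14.04])
opera_values = np.array([13.01, 64.08])

# With bins=20 and no range, the edges are derived from each array's own min and max
our_edges = np.histogram_bin_edges(our_values, bins=20)
opera_edges = np.histogram_bin_edges(opera_values, bins=20)
print("our dataset: first edge", our_edges[0], "bin width", our_edges[1] - our_edges[0])
print("OPERA:       first edge", opera_edges[0], "bin width", opera_edges[1] - opera_edges[0])

# One possible fix: compute the edges once over both datasets and reuse them
shared_edges = np.histogram_bin_edges(
    np.concatenate([our_values, opera_values]), bins=20
)
our_counts, _ = np.histogram(our_values, bins=shared_edges)
opera_counts, _ = np.histogram(opera_values, bins=shared_edges)


def plot_comparison(first, second, edges):
    """Overlay two histograms drawn on the same bin edges (hypothetical helper)."""
    # Imported inside the function so the edge check above does not need matplotlib
    import matplotlib.pyplot as plt

    plt.figure(figsize=(10, 6))
    plt.hist(first, bins=edges, alpha=0.5, label='our dataset')
    plt.hist(second, bins=edges, alpha=0.5, label='OPERA')
    plt.xlabel('Values')
    plt.ylabel('Frequency')
    plt.legend()
    plt.tight_layout()
    plt.show()
```

Calling plot_comparison(our_values, opera_values, shared_edges) draws both histograms on identical bins, so every OPERA bar sits inside a bar of the full dataset. In the loop from the question, the same idea amounts to passing one shared bins array (or the same range) to both plt.hist() calls.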

Answered 2023-05-18
