Plotting histograms in Python using Matplotlib or Pandas

Posted by 3j86kqsm on 2023-10-24 in Python


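I have read several posts on this forum, but I could not find an answer that explains the behaviour I am seeing.
I have a CSV file whose header has many entries, each with 300 points (the columns of the CSV file). I want to plot a histogram for each column. The x-axis should span the values in that column, and the y-axis should show the number of samples in each bin. Since I have 300 points, the counts over all bins should add up to 300, so the y-axis should run from 0 to about 50 (just as an example). However, the values are huge (400e8), which makes no sense.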

Example of the table (columns: point, mydata)

point | mydata
1     | 250.23e-9
2     | 250.123e-9
...   | ...
300   | 251.34e-9

Please check my code below. I use pandas to open the CSV and Matplotlib for the rest.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import norm

df=pd.read_csv("/home/pcardoso/raw_data/myData.csv")

# Figure parameters
figPath='/home/pcardoso/scripts/python/matplotlib/figures/'
figPrefix='hist_'           # Prefix to the name of the file.
figSuffix='_something'      # Suffix to the name of the file.
figString=''    # Full string passed as the figure name to be saved

precision=3
num_bins = 50

columns=list(df)

for fieldName in columns:

    vectorData=df[fieldName]

    # statistical data
    mu = np.mean(vectorData)  # mean of distribution
    sigma = np.std(vectorData)  # standard deviation of distribution

    # Create plot instance
    fig, ax = plt.subplots()

    # Histogram (density is switched on here; the string 'True' is truthy)
    n, bins, patches = ax.hist(vectorData, num_bins, density='True',alpha=0.75,rwidth=0.9, label=fieldName)
    ax.legend()

    # Best-fit curve (mlab.normpdf is gone from recent Matplotlib releases;
    # scipy.stats.norm.pdf computes the same normal density)
    y=norm.pdf(bins, mu, sigma)
    ax.plot(bins, y, '--')

    # Setting axis names, grid and title
    # eng_notation() is the author's own formatting helper; its definition is not shown
    ax.set_xlabel(fieldName)
    ax.set_ylabel('Number of points')
    ax.set_title(fieldName + r': $\mu=$' + eng_notation(mu,precision) + r', $\sigma=$' + eng_notation(sigma,precision))
    ax.grid(True, alpha=0.2)

    fig.tight_layout()      # Tweak spacing to prevent clipping of ylabel

    # Saving figure
    figString=figPrefix + fieldName +figSuffix
    fig.savefig(figPath + figString)

plt.show()

plt.close(fig)

In short, I would like to know how to get the correct y-axis values.
Edit: 6 July 2020

I would like the density estimator to follow the histogram, as in this plot:

[image: matplotlib]

Source: https://stackoverflow.com/questions/62734245/plotting-histograms-in-python-using-matplotlib-or-pandas

1 Answer


egdjgwm81#

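Do not use density='True'. With that option, the value shown for each bin is its count divided by the total number of samples and by the bin width, so that the area under the histogram is 1. If the bin width is very small (as it is here, with x-values around 250e-9), the displayed values become very large.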
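To make the effect concrete, here is a minimal sketch using numpy.histogram, which Matplotlib's hist relies on for the binning. The data is synthetic and only assumed to resemble the question's column (300 values near 250e-9); the spread of 0.5e-9 is a made-up figure for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for one CSV column: 300 values near 250e-9
data = rng.normal(loc=250e-9, scale=0.5e-9, size=300)

# Raw counts per bin: these are what the question expects on the y-axis
counts, edges = np.histogram(data, bins=50)
# Density-normalised heights: what density=True puts on the y-axis
density, _ = np.histogram(data, bins=50, density=True)
widths = np.diff(edges)

print(counts.sum())  # 300: every sample lands in exactly one bin
# The density heights are the counts divided by (N * bin width)
print(np.allclose(density, counts / (counts.sum() * widths)))
# The area under the density histogram is 1 (up to rounding)
print((density * widths).sum())
# With bin widths around 1e-10, the heights themselves are enormous
print(density.max())
```

The counts stay in the range the question expects, while the density heights scale with the inverse of the bin width, which is why they look nonsensical on a "Number of points" axis.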

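**Edit:** OK, to un-normalise the normalised curve, you need to multiply it by the number of points and by the width of one bin. I made a more reduced example: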

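from numpy.random import normal
from scipy.stats import norm
import pylab

N = 300        # number of samples
sigma = 10.0   # standard deviation of the test data
B = 30         # number of bins

def main():
    x = normal(0, sigma, N)

    # Raw counts (no density option), so the bar heights sum to N
    h, bins, _ = pylab.hist(x, bins=B, rwidth=0.8)
    bin_width = bins[1] - bins[0]

    # Scale the normalised PDF by N * bin_width to match the counts
    h_n = norm.pdf(bins[:-1], 0, sigma) * N * bin_width
    pylab.plot(bins[:-1], h_n)
    pylab.show()

if __name__ == "__main__":
    main()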
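The same scaling can be applied to the question's per-column loop. The following is only a sketch under assumptions: the helper names normal_pdf, expected_counts and plot_column are hypothetical, the mean and standard deviation are estimated from the data as in the question, and the curve is evaluated at bin centres rather than at the left edges used above.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Normal probability density, written out so only NumPy is required."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def expected_counts(data, num_bins):
    """Return raw bin counts and a normal curve scaled to the same units."""
    data = np.asarray(data, dtype=float)
    counts, edges = np.histogram(data, bins=num_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    bin_width = edges[1] - edges[0]
    mu, sigma = data.mean(), data.std()
    # Un-normalise the PDF: multiply by number of points and bin width
    curve = normal_pdf(centers, mu, sigma) * data.size * bin_width
    return counts, edges, centers, curve

def plot_column(data, num_bins, label):
    """Plot raw counts (no density option) with the scaled curve on top."""
    import matplotlib.pyplot as plt  # imported here so the numeric part runs without Matplotlib
    counts, edges, centers, curve = expected_counts(data, num_bins)
    fig, ax = plt.subplots()
    ax.hist(data, bins=edges, alpha=0.75, rwidth=0.9, label=label)
    ax.plot(centers, curve, "--")
    ax.set_xlabel(label)
    ax.set_ylabel("Number of points")
    ax.legend()
    return fig, ax

# Synthetic stand-in for one CSV column (assumed values, not the real data)
rng = np.random.default_rng(1)
sample = rng.normal(250e-9, 0.5e-9, 300)
counts, edges, centers, curve = expected_counts(sample, 50)
print(counts.sum())  # 300
print(curve.sum())   # close to 300, slightly less because the tails are cut off
```

In the question's loop, plot_column(df[fieldName], num_bins, fieldName) would take the place of the ax.hist and best-fit lines, leaving the y-axis in plain sample counts.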
