pandas: How to plot a percentile chart from interval data

Posted by bzzcjhmw on 2023-11-15 in Other

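How can I plot a percentile chart from interval data?
The code below computes the percentage of the data that falls into each of a set of given intervals.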

import pandas as pd
import matplotlib.pyplot as plt

idx = pd.IntervalIndex.from_breaks([39.9, 42.9, 45.9, 48.9, 51.9, 54.9, 57.9])
df = pd.DataFrame({"Bin": idx, "Frequency": [2, 2, 5, 5, 12, 3]})
n = df["Frequency"].sum()
df['cumulativeSumFreq'] = df["Frequency"].cumsum()
# Despite its name, this column holds each bin's share of the total, in percent
df['cumulativePercent'] = (df["Frequency"]/n)*100
df

bins = [39.9, 42.9, 45.9, 48.9, 51.9, 54.9, 57.9]
df.hist(column='cumulativePercent', bins=bins)
plt.show()

For some reason, df.hist() does not accept bins=idx.
The plot I get does not follow the intended binning. How can I achieve this?




Answer #1 (frebpwbc)

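Use pyplot.stairs or Axes.stairs, which draw one step per bin from precomputed values and bin edges:

import matplotlib.pyplot as plt

# df is the DataFrame built in the question
edges = [39.9, 42.9, 45.9, 48.9, 51.9, 54.9, 57.9]
plt.stairs(df['cumulativePercent'], edges, fill=True)
plt.show()

Or, with the object-oriented interface:

edges = [39.9, 42.9, 45.9, 48.9, 51.9, 54.9, 57.9]
fig, ax = plt.subplots()
ax.stairs(df['cumulativePercent'], edges, fill=True)
plt.show()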


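The column plotted above is each bin's share of the total, not a running total. If a percentile-style curve is what is wanted, one possible sketch is to accumulate the frequencies first and derive the edges from the IntervalIndex instead of retyping them. The helper names (`build_table`, `edges_from_intervals`, `plot_cumulative`) and the `cumulative_percent` column are hypothetical and not part of the original answer; the sketch assumes the intervals are contiguous.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

BREAKS = [39.9, 42.9, 45.9, 48.9, 51.9, 54.9, 57.9]
FREQUENCIES = [2, 2, 5, 5, 12, 3]


def build_table(breaks, frequencies):
    """Return the interval index and a table with a running percentage."""
    idx = pd.IntervalIndex.from_breaks(breaks)
    df = pd.DataFrame({"Bin": idx, "Frequency": frequencies})
    total = df["Frequency"].sum()
    # Running total of the frequencies, expressed as a percentage
    df["cumulative_percent"] = df["Frequency"].cumsum() / total * 100
    return idx, df


def edges_from_intervals(idx):
    """Recover the N + 1 edges from N contiguous intervals."""
    return np.append(idx.left.to_numpy(), idx.right[-1])


def plot_cumulative(df, edges, ax=None):
    """Draw one filled step per bin at the cumulative percentage."""
    if ax is None:
        _, ax = plt.subplots()
    ax.stairs(df["cumulative_percent"].to_numpy(), edges, fill=True)
    ax.set_xlabel("Bin edges")
    ax.set_ylabel("Cumulative percent")
    return ax


if __name__ == "__main__":
    idx, df = build_table(BREAKS, FREQUENCIES)
    plot_cumulative(df, edges_from_intervals(idx))
    plt.show()
```

With this version the last step reaches 100, so a percentile can be read off by finding the bin where the curve crosses the desired level.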

Answer #2 (sxissh06)

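The problem arises because hist treats the column as raw observations and counts how many of those values fall into each bin, rather than using them as precomputed bar heights for the bins you supply. For pre-binned data, use bar instead of hist to draw the histogram. Here is the complete updated code: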

import pandas as pd
import matplotlib.pyplot as plt

# Create the dataframe
idx = pd.IntervalIndex.from_breaks([39.9, 42.9,45.9,48.9,51.9,54.9,57.9])
df = pd.DataFrame({"Bin": idx, "Frequency": [2,2,5,5,12,3]})
n = df["Frequency"].sum()

# Calculate the cumulative sum and each bin's percent of the total
# (the second column keeps the question's name but is not a running total)
df['cumulativeSumFreq'] = df["Frequency"].cumsum()
df['cumulativePercent'] = (df["Frequency"]/n)*100

# Plotting the histogram using bar
left_edges = [interval.left for interval in df["Bin"]]
right_edges = [interval.right for interval in df["Bin"]]

# The bar heights will be the 'cumulativePercent' values
heights = df['cumulativePercent']

# Plot the bars
plt.bar(left_edges, heights, width=[right-left for left, right in zip(left_edges, right_edges)], align='edge', edgecolor='black')

# Label the x-axis at the bin centers, one tick per interval
centers = [(left + right) / 2 for left, right in zip(left_edges, right_edges)]
plt.xticks(centers, labels=[str(interval) for interval in df["Bin"]], rotation=45, ha="right")

plt.show()
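To see why hist gives an unexpected plot here, it can help to reproduce the counting step by hand. The sketch below uses numpy.histogram on the same per-bin percentages, on the assumption that DataFrame.hist hands the values and bins to Matplotlib's hist, which bins values in the same way. The variable names are illustrative only.

```python
import numpy as np

edges = [39.9, 42.9, 45.9, 48.9, 51.9, 54.9, 57.9]
frequency = np.array([2, 2, 5, 5, 12, 3])

# Same values as the question's 'cumulativePercent' column
percent = frequency / frequency.sum() * 100

# A histogram counts how many of these values land in each bin;
# it does not use them as bar heights.
counts, _ = np.histogram(percent, bins=edges)

print(percent.round(2))  # [ 6.9   6.9  17.24 17.24 41.38 10.34]
print(counts)            # [1 0 0 0 0 0]
```

Only one percentage (about 41.38) happens to lie inside the 39.9 to 57.9 range, so the histogram shows a single bar of height 1 in the first bin. Drawing the percentages directly with bar or stairs avoids this second round of binning.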
