matplotlib: How can I plot a histogram with seaborn to visualize counts of strings over a time span in a large DataFrame? [duplicate]

Posted by sirbozc5 on 2023-05-23 in Other

This question already has answers here:

How to plot value_counts from groupby (1 answer)
pandas extract year from datetime: df['year'] = df['date'].year is not working (5 answers)
countplot of year from datetime (1 answer)
Stacked Bar Chart with Centered Labels (2 answers)
Closed 4 days ago.
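I want to create a histogram with seaborn. I have a large dataframe: one column contains date-time values in the form "YYYY-MM-DD hh:mm:ss", and another column contains string values such as "A", "B", or "C".
The histogram should show dates on the x-axis (by year only, or in 6-month bins) and, on the y-axis, the value_counts of the strings within each time span. I would like a stacked plot like the one the multiple="stack" option produces.

However, I don't know how to sum up and plot the string counts per time span. Can anyone help?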

k5ifujac #1

import datetime as dt
import random

import numpy as np
import pandas as pd

# Generate random data

start_date = dt.datetime(2013, 1, 1)
end_date = dt.datetime(2023, 5, 18)

delta = end_date - start_date

random_days = np.sort(np.random.randint(0, delta.days, 1000))

random_dates = start_date + np.array(
    [dt.timedelta(days=int(day)) for day in random_days]
)

random_values = random.choices(["A", "B", "C"], k=1000)

df = pd.DataFrame({"date": random_dates, "value": random_values})

# Get year and half-year periods

df["year"] = df["date"].dt.year

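# January 1 of each date's year, and July 1 of the same year.
# (Casting to "datetime64[Y]" is rejected by pandas 2.x, so periods are used instead.)
first_half = df["date"].dt.to_period("Y").dt.to_timestamp()
second_half = first_half + pd.DateOffset(months=6)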

df["half-year"] = np.where(df["date"].dt.month <= 6, first_half, second_half)

# Count rows per period and value; "year" is used here, swap in "half-year" for six-month bins
counts = df.groupby(["year", "value"])["date"].count().reset_index()
# Pivot so that each value gets its own column, with one row per period
pvt = counts.pivot(index="year", columns="value", values="date")

# Stacked bar plot
pvt.plot(kind="bar", stacked=True)
