matplotlib: How to create custom heatmap annotations using a mask

vxqlmq5t  posted on 2023-10-24 in Other

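I want to annotate only the highest value on each heatmap, but only the first digit of the number is displayed, and I don't know why. Shrinking the font doesn't seem to help. While writing this, it occurred to me that dropping the annot argument and adding the text manually might work, but I can't figure out how to do that with subplots :cryingface:
Here is what I get: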

Toy data generation

    import numpy as np
    import pandas as pd

    np.random.seed(42)
    n_rows = 10**6
    n_ids = 1000
    n_groups = 3
    # Hour of day (roughly normal around noon) plus a day offset -> hour of the week
    times = np.random.normal(12, 2.5, n_rows).round().astype(int) + np.random.choice([0,24,48,72,96,120,144], size=n_rows, p=[0.2,0.2,0.2,0.2,0.15,0.04,0.01])
    timeslots = np.arange(168)
    id_list = np.random.randint(low=1000, high=5000, size=n_ids)
    # Random per-ID probabilities; the last one is set so that they sum to 1
    ID_probabilities = np.random.normal(10, 1, n_ids-1)
    ID_probabilities = ID_probabilities/ID_probabilities.sum()
    final = 1 - ID_probabilities.sum()
    ID_probabilities = np.append(ID_probabilities, final)
    id_col = np.random.choice(id_list, size=n_rows, p=ID_probabilities)
    # One-hot encode the hour of the week, then count trips per ID and timeslot
    data = pd.DataFrame(times[:,None]==timeslots, index=id_col)
    n_ids = data.index.nunique()
    data = data.groupby(id_col).sum()
    data['grp'] = np.random.choice(range(n_groups), n_ids)
    data

Copy-paste sample of the toy data:

            0    1    2    3    4    5    6    7    8    9  ...  159  160  161  162  163  164  165  166  167  grp
    1011    0    0    0    0    0    0    2    3   15   21  ...    1    1    0    0    0    0    0    0    0    1
    1016    0    0    0    0    0    0    4    3   18   41  ...    2    0    0    0    0    0    0    0    0    2
    1020    0    0    0    0    0    1    1    2    6   16  ...    1    1    0    0    0    0    0    0    0    0
    1024    0    0    0    0    0    0    2    3    7   13  ...    0    1    1    0    0    0    0    0    0    0
    1029    0    0    0    0    0    0    1    5    3   14  ...    1    0    1    0    0    0    0    0    0    1
    ...   ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...
    4965    0    0    0    0    0    2    4    2   10    9  ...    0    1    0    0    0    0    0    0    0    1
    4984    0    0    0    0    0    1    0    6   10   12  ...    0    0    0    0    0    0    0    0    0    2
    4989    0    0    0    0    0    1    3    4    7   16  ...    1    1    0    0    0    0    0    0    0    0
    4995    0    0    0    0    2    0    2    2    2   23  ...    0    1    0    0    0    0    0    0    0    0
    4999    0    0    0    0    0    1    1    7    9   11  ...    0    0    0    0    0    0    0    0    0    2

The code I used to generate the plot

    import seaborn as sns
    import matplotlib.pyplot as plt

    rows = 1
    cols = n_groups
    # profiles['grp'] = results
    grpr = data.groupby('grp')
    actual_values = []
    fig, axs = plt.subplots(rows, cols, figsize=(cols*3, rows*3), sharey=True, sharex=True)
    for grp, df in grpr:
        plt.subplot(rows, cols, grp+1)
        # One label per hour of the week; only the peak slot gets text
        annot_labels = np.empty_like(df[range(168)].sum(), dtype=str)
        annot_mask = df[range(168)].sum() == df[range(168)].sum().max()
        actual_values.append(df[range(168)].max().max())
        annot_labels[annot_mask] = str(df[range(168)].max().max())
        # Reshape the 168 hourly totals into 7 days x 24 hours
        sns.heatmap(df[range(168)].sum().values.reshape(7,-1), cbar=False, annot=annot_labels.reshape(7,-1), annot_kws={'rotation':90, 'fontsize':'x-small'}, fmt='')
        ppl = df.shape[0]
        journs = int(df.sum().sum()/1000)
        plt.title(f'{grp}: {ppl:,} people, {journs:,}k trips')
    for ax in axs.flat:
        ax.set(xlabel='Hour', ylabel='Day')
        ax.set_yticklabels(['M','T','W','T','F','S','S'], rotation=90)
    # Hide x labels and tick labels for top plots and y ticks for right plots.
    for ax in axs.flat:
        ax.label_outer()
    # ordered_scores and p are not defined in this snippet, so these lines are commented out
    # score_ch = ordered_scores['calinski_harbasz'][p]
    # score_si = ordered_scores['silhouette'][p]
    plt.suptitle(f"Why don't these labels work? Actual values = {actual_values}")
    plt.tight_layout()
    plt.show()

olhwl3o2  #1

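Thanks to a comment from @TrentonMcKinney and this post on numpy array fixed length strings, I have a simple solution. Creating the empty array like this produces an array of strings that are each only 1 character long:
    annot_labels = np.empty_like(df[range(168)].sum(), dtype=str)
Any longer string assigned into it is silently truncated to its first character, which is why only the first digit shows up. Changing the dtype fixes this: np.empty_like(a, dtype='U5') creates an array whose elements can hold up to 5 unicode characters.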
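The truncation can be reproduced without any plotting. The following is a minimal sketch on made-up numbers (totals, mask, peak and labels are illustrative names, not part of the question's code). It also shows one possible variation: building the label array with np.full and a width derived from the label itself, since np.empty_like does not promise that the untouched elements start out as empty strings.

```python
import numpy as np

# Hypothetical stand-in for df[range(168)].sum(): four totals, one clear peak
totals = np.array([3, 17, 12345, 8])
mask = totals == totals.max()
peak = str(totals.max())

# dtype=str gives a 1-character string array, so the label is cut to its first character
short = np.empty_like(totals, dtype=str)
short[mask] = peak

# A wider fixed-length dtype keeps the whole label
wide = np.empty_like(totals, dtype='U5')
wide[mask] = peak

# Variation (an assumption, not from the answer): size the dtype from the label
# and start from guaranteed-empty strings instead of uninitialized memory
labels = np.full(totals.shape, '', dtype=f'U{len(peak)}')
labels[mask] = peak

print(short.dtype, short[mask])
print(wide.dtype, wide[mask])
print(labels.reshape(2, 2))
```

An array like labels, reshaped to match the heatmap data, can then be passed to sns.heatmap via annot= together with fmt='', as in the question's code.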
