SciPy: How to plot a dendrogram from a NetworkX minimum spanning tree?

bvjveswy, posted 2022-11-09 in Other

I've computed a minimum spanning tree (MST) from a distance matrix using NetworkX. I now want to build a dendrogram from it.
My MST: [image not included]

I tried using the adjacency matrix, obtained with NetworkX's to_pandas_adjacency (T is my MST):

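import matplotlib.pyplot as plt
import networkx as nx
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import squareform

# T is the MST; non-edges show up as 0 in the adjacency matrix
df = nx.to_pandas_adjacency(T)

# https://stackoverflow.com/questions/18952587/use-distance-matrix-in-scipy-cluster-hierarchy-linkage
dist_array = squareform(df)

plt.figure(figsize=(10, 10))
mergings = linkage(dist_array, method='complete', metric='euclidean')
# distances is the original distance matrix (a DataFrame)
dendrogram(mergings, labels=distances.index, leaf_rotation=90, leaf_font_size=14)
plt.show()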

Since the adjacency matrix is filled with 0s for non-edges, I guess linkage computes Euclidean distances and ends up with a 3-cluster dendrogram (where all of a cluster's points are at distance 0), whereas I expected to get the same linkage as in my original MST!

I tried passing inf or a large value as the nonedge default to to_pandas_adjacency, but ended up with an invalid matrix...
Can anyone help? My best guess is that I'm not understanding or using linkage as I should...

Edit: I know that doing it the other way around (DT first, then building the MST) would probably be easier, but I need to reproduce the order of operations in order to recreate the results of an original study...
Edit 2: Since SciPy's linkage function computes the Euclidean distance between each pair of points (or nodes here), I guess (without any certainty) that I need to find a way to convert my adjacency matrix into an array similar to what linkage outputs, i.e. a weighted edge list, but with the sub-cluster sizes as a fourth column.
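
The conversion described in Edit 2 could be sketched roughly as follows. This is only an illustration under assumptions, not a method confirmed anywhere in this thread: it assumes T is a connected NetworkX tree whose edges carry a "weight" attribute, and mst_to_linkage is a hypothetical helper name. It walks the MST edges in increasing weight order and uses a union-find structure to emit rows in the four-column layout that scipy.cluster.hierarchy.linkage returns (cluster id, cluster id, merge distance, merged cluster size). The small graph G is made-up example data.

```python
import networkx as nx
import numpy as np


def mst_to_linkage(T, weight="weight"):
    """Hypothetical helper: build a SciPy-style linkage matrix from a weighted tree T."""
    nodes = list(T.nodes())
    n = len(nodes)
    index = {node: i for i, node in enumerate(nodes)}

    # Union-find over the n leaves plus the n - 1 clusters created by merges.
    parent = list(range(2 * n - 1))
    size = [1] * n + [0] * (n - 1)

    def find(i):
        # Follow parent pointers to the current cluster id, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    Z = np.zeros((n - 1, 4))
    # Process tree edges from lightest to heaviest, as single linkage would merge them.
    edges = sorted(T.edges(data=weight), key=lambda e: e[2])
    for k, (u, v, w) in enumerate(edges):
        a, b = find(index[u]), find(index[v])
        new = n + k  # SciPy numbers the cluster formed at step k as n + k
        parent[a] = parent[b] = new
        size[new] = size[a] + size[b]
        Z[k] = [min(a, b), max(a, b), w, size[new]]
    return Z, nodes


# Made-up example graph standing in for the real distance data.
G = nx.Graph()
G.add_weighted_edges_from(
    [("a", "b", 1.0), ("b", "c", 2.0), ("a", "c", 2.5), ("c", "d", 4.0), ("b", "d", 5.0)]
)
T = nx.minimum_spanning_tree(G)
Z, labels = mst_to_linkage(T)
```

The resulting Z can be passed to scipy.cluster.hierarchy.dendrogram together with labels=labels. Because every merge height is an MST edge weight, the dendrogram follows the tree itself rather than the zero-filled adjacency matrix.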

l2osamch

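I had a similar problem and tried to find a solution. This is my first time posting an answer, so please let me know if anything is wrong.
I suggest using linkage and dendrogram directly from scipy.cluster.hierarchy rather than the networkx package.
First, convert the distance matrix to a condensed distance matrix with scipy.spatial.distance.squareform, then use scipy.cluster.hierarchy.linkage to obtain the clustering.
Different distance functions can be used in linkage.
Finally, use dendrogram to plot the clustering.
The result should be consistent with the minimum spanning tree from networkx.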
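
A minimal sketch of these steps, under stated assumptions: the 4x4 distances DataFrame is made-up example data, and method="single" is a choice made for this illustration rather than something the answer specifies. Single linkage is used because its merge heights are the MST edge weights in increasing order, which is what makes the dendrogram agree with the minimum spanning tree. Other methods such as "complete" or "average" would generally not.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import squareform

# Made-up symmetric distance matrix with a zero diagonal.
distances = pd.DataFrame(
    [[0.0, 1.0, 2.5, 6.0],
     [1.0, 0.0, 2.0, 5.0],
     [2.5, 2.0, 0.0, 4.0],
     [6.0, 5.0, 4.0, 0.0]],
    index=list("abcd"),
    columns=list("abcd"),
)

# Step 1: square matrix -> condensed (upper-triangle) vector.
condensed = squareform(distances.to_numpy())

# Step 2: hierarchical clustering; "single" is an assumption made for this sketch.
mergings = linkage(condensed, method="single")

# Step 3: draw the dendrogram onto a Matplotlib axes.
fig, ax = plt.subplots(figsize=(6, 4))
dendrogram(mergings, labels=distances.index.tolist(), leaf_rotation=90, ax=ax)
ax.set_ylabel("merge distance")
```

Call plt.show() or fig.savefig(...) afterwards to view the figure. When linkage receives a condensed distance vector, its metric argument is ignored, so there is no need to pass metric='euclidean' here.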
