python: Does anyone know how to see the data in a cluster after doing k-means clustering?

Posted by iibxawm4 on 2022-12-17 in Python

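After running k-means clustering in Python, is there code to view the data inside each cluster, so that I can tell which kind of data was assigned to which cluster, and why?
Can anyone help?
The cluster file has the extension ., so I cannot open it.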

Source: https://stackoverflow.com/questions/74821265/can-anyone-know-how-to-see-the-data-in-a-cluster-after-doing-k-means-clustering

2 Answers

62lalag4 #1

It depends on how you ran KMeans, but the attribute that holds the cluster assignment (the "label") of each sample is:
KMeans(n_clusters=k).fit(X).labels_

Code:

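import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
# In a Jupyter notebook, also run: %matplotlib inline

# 100 random 2-D points: rows 0-49 fall between -2 and 0, rows 50-99 between 1 and 3
X = -2 * np.random.rand(100, 2)
X1 = 1 + 2 * np.random.rand(50, 2)
X[50:100, :] = X1
plt.scatter(X[:, 0], X[:, 1], s=50)
plt.show()

# Fit k-means with two clusters; labels_ holds one cluster id per row of X
Kmean = KMeans(n_clusters=2).fit(X)
print(Kmean.labels_)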

Output:

Kmean.labels_

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
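To see the actual rows that landed in each cluster, rather than only the label array, the labels can be used as a boolean mask over the data. The following is a minimal sketch under a few assumptions: the data is synthetic and built like the X above, the seed is fixed only for repeatability, and `points_by_cluster` is a hypothetical helper name, not something from scikit-learn or the original answer.

```python
import numpy as np
from sklearn.cluster import KMeans


def points_by_cluster(X, labels):
    """Return {cluster_id: rows of X assigned to that cluster} (hypothetical helper)."""
    return {int(k): X[labels == k] for k in np.unique(labels)}


# Synthetic data in the same spirit as the answer: two well-separated blobs.
rng = np.random.default_rng(0)  # fixed seed is an assumption, for repeatability
X = np.vstack([-2 * rng.random((50, 2)), 1 + 2 * rng.random((50, 2))])

model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
groups = points_by_cluster(X, model.labels_)

# Print the size of each cluster and a few of its member rows.
for cluster_id, rows in groups.items():
    print(f"cluster {cluster_id}: {len(rows)} points, first rows:\n{rows[:3]}")
```

Which blob receives id 0 and which receives id 1 is arbitrary and can change between runs, so the ids should be read as names, not as an ordering.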

If you put X, X1 and labels_ into a DataFrame, it looks like this:

           X        X1  Labels
0  -1.918458 -1.918458       0
1  -1.378906 -1.378906       0
2  -0.888738 -0.888738       0
3  -1.924301 -1.924301       0
4  -0.619357 -0.619357       0
..       ...       ...     ...
95  1.893219  1.893219       1
96  2.820921  2.820921       1
97  2.454180  2.454180       1
98  1.599229  1.599229       1
99  2.270729  2.270729       1

[100 rows x 3 columns]
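A DataFrame like this also helps with the "why" part of the question. K-means assigns every point to its nearest centroid, so per-cluster summaries and the fitted `cluster_centers_` describe what separates the groups. The sketch below assumes synthetic data similar to the example above; the column names `x0`, `x1` and `cluster` are made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Two well-separated blobs, as in the answer (the seed is an assumption).
rng = np.random.default_rng(0)
X = np.vstack([-2 * rng.random((50, 2)), 1 + 2 * rng.random((50, 2))])
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# One row per sample, with its cluster id alongside the features.
df = pd.DataFrame(X, columns=["x0", "x1"])
df["cluster"] = model.labels_

# Rows that belong to a given cluster.
print(df[df["cluster"] == 0].head())

# Per-cluster summary: how many points, and the typical feature ranges.
summary = df.groupby("cluster").agg(["count", "mean", "min", "max"])
print(summary)

# The centroids k-means converged to; row i is the centre of cluster i.
centers = pd.DataFrame(model.cluster_centers_, columns=["x0", "x1"])
print(centers)
```

Comparing the per-cluster means in `summary` with `centers` shows which features pull the clusters apart; on data as cleanly separated as this, the two should agree closely.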

Posted 2022-12-17

jljoyd4f #2

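Any plotting function that accepts per-point colors can do this: index a list of colors (your color theme) with the labels and pass the result as the point colors, as shown below. The relabeler is only there to remap the predicted cluster ids so that their colors match those of the ground-truth classes.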

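import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.cluster import KMeans

# Assumed setup, not shown in the original answer: iris data and a fitted model (KMeans chosen to match the question)
MyData = datasets.load_iris()
MyDataFrame = pd.DataFrame(MyData.data, columns=["Sepal Length", "Sepal Width", "Petal Length", "Petal Width"])
MyCluster = KMeans(n_clusters=3).fit(MyDataFrame)

MyColorTheme = np.array(["darkgrey", "lightsalmon", "powderblue"])
# Map cluster ids to class ids; [2, 0, 1] matched the answerer's run and may need changing for yours
MyRelabeler = np.choose(MyCluster.labels_, [2, 0, 1]).astype(np.int64)

plt.subplot(1, 2, 1)
plt.title("My Ground Truth Classification Module")
plt.scatter(x=MyDataFrame["Petal Length"], y=MyDataFrame["Petal Width"], c=MyColorTheme[MyData.target], s=50)

plt.subplot(1, 2, 2)
plt.title("K clustering Classification Module")
plt.scatter(x=MyDataFrame["Petal Length"], y=MyDataFrame["Petal Width"], c=MyColorTheme[MyRelabeler], s=50)
plt.show()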

The result looks like this:

[Figure: iris data clustered with sklearn's SpectralClustering]
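The hard-coded `[2, 0, 1]` passed to `np.choose` only fits one particular run, because cluster ids are arbitrary. When ground-truth classes are available, one way to derive the mapping is to pair each cluster with the class it overlaps most, one-to-one. The sketch below is an assumption-laden illustration, not the answerer's method: it uses KMeans on iris, a confusion matrix, and SciPy's `linear_sum_assignment`; the names `mapping` and `relabeled` are made up.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn import datasets
from sklearn.cluster import KMeans
from sklearn.metrics import confusion_matrix

iris = datasets.load_iris()
model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(iris.data)

# Rows are true classes, columns are cluster ids; entries count shared samples.
cm = confusion_matrix(iris.target, model.labels_)

# One-to-one pairing of classes and clusters that maximises total overlap.
class_ids, cluster_ids = linear_sum_assignment(cm, maximize=True)

# mapping[cluster_id] gives the class id that cluster is paired with.
mapping = np.empty(3, dtype=np.int64)
mapping[cluster_ids] = class_ids

# Same role as MyRelabeler: cluster ids translated into class ids.
relabeled = mapping[model.labels_]
print("cluster -> class:", dict(enumerate(mapping.tolist())))
print("agreement with ground truth:", (relabeled == iris.target).mean())
```

`relabeled` can then index the color list in place of the manually relabeled array, so both scatter plots use the same color for corresponding groups.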

Posted 2022-12-17
