How do I split IDs into two groups (Python pandas DataFrame)?

Posted by gab6jxml on 2023-03-11 in Python

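I have a dataset like this that records the installation (ON) and removal (OFF) of devices. A, B, C, D are the IDs of the individual devices. I want to split these IDs into two groups according to some rules.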

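As you can see, when I remove B, I install A. After removing A, I install C. After removing C, I install T. Likewise, when I remove D, I install F, and after removing F, I install H.
My assumption is that there are two groups of devices. For example, we could say:
Group 1: B-A-C-T
Group 2: D-F-H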

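import pandas as pd

# Each row pairs the device that was removed (OFF) with the one installed in its place (ON)
ON = ['A', 'C', 'F', 'T', 'H']
OFF = ['B', 'A', 'D', 'C', 'F']
df = pd.DataFrame({'ON': ON, 'OFF': OFF})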

Maybe I could try a dictionary, but I'm not sure how.
I'd like to get two lists as the result:

Group 1 = ['B','A','C','T']
Group 2 = ['D','F','H']

Answer #1 (gdx19jrr)

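A network library such as networkx simplifies this problem. Treat each row as a directed edge from the removed device (OFF) to the installed device (ON); what you want is then every path from a root node to a leaf node.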

# pip install networkx
import networkx as nx
import itertools

# Create a directed graph from Pandas edges list
G = nx.from_pandas_edgelist(df, source='OFF', target='ON', create_using=nx.DiGraph)

# Find all roots and leaves
roots = [node for node, degree in G.in_degree if degree == 0]
leaves = [node for node, degree in G.out_degree if degree == 0]

# Get all possible paths between roots and leaves
paths = []
for root, leaf in itertools.product(roots, leaves):
    for path in nx.all_simple_paths(G, root, leaf):
        paths.append(path)

Output:

>>> paths
[['B', 'A', 'C', 'T'], ['D', 'F', 'H']]
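
The path enumeration above returns one list per root-to-leaf path, so if a device were ever replaced by two different devices, the resulting lists would overlap. As a hedged alternative (not part of the original answer), the sketch below groups devices by weakly connected component and orders each group with a topological sort. It assumes the replacement graph contains no cycles, and it builds the graph directly from the two lists in the question rather than from `df`.

```python
import networkx as nx

ON = ['A', 'C', 'F', 'T', 'H']
OFF = ['B', 'A', 'D', 'C', 'F']

# One directed edge per replacement: removed device -> installed device
G = nx.DiGraph()
G.add_edges_from(zip(OFF, ON))

groups = []
for component in nx.weakly_connected_components(G):
    # Order the devices of this component from first removed to last installed.
    # topological_sort raises an error if the component contains a cycle.
    ordered = list(nx.topological_sort(G.subgraph(component)))
    groups.append(ordered)

print(groups)
```

For the sample data this yields the same two groups as the path-based approach, since each component is a single chain.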

Visualization:

import matplotlib.pyplot as plt

nx.draw_networkx(G)
plt.show()

Output: (image not preserved; the plot shows the directed graph G)
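
The question mentions trying a dictionary. A minimal sketch of that idea, without networkx, might look like the following. It assumes each device is replaced by at most one other device and that the replacements never form a cycle; the names `successor`, `installed`, and `starts` are illustrative only. With the DataFrame from the question, `dict(zip(df['OFF'], df['ON']))` would build the same mapping.

```python
ON = ['A', 'C', 'F', 'T', 'H']
OFF = ['B', 'A', 'D', 'C', 'F']

# Map each removed device to the device installed in its place
successor = dict(zip(OFF, ON))

# A chain starts at a device that was removed but never installed
installed = set(ON)
starts = [device for device in OFF if device not in installed]

groups = []
for start in starts:
    chain = [start]
    # Follow the replacements until a device that was never removed
    while chain[-1] in successor:
        chain.append(successor[chain[-1]])
    groups.append(chain)

print(groups)  # [['B', 'A', 'C', 'T'], ['D', 'F', 'H']]
```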
