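Is there a way to extract an induced subgraph around a given center node from a GraphFrame graph in PySpark? I tried to build the induced subgraph with motif finding, but it did not work.
I also tried networkx's ego graph, which works, but for a large graph (12M+ edges) loading the whole graph takes a very long time.
Here is an example with "a" as the center node: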
from graphframes import GraphFrame
import pyspark.sql.functions as f

# Vertex DataFrame (sqlc is assumed to be an existing SQLContext or SparkSession)
v = sqlc.createDataFrame([
    ("a", "Alice", 34),
    ("b", "Bob", 36),
    ("c", "Charlie", 30),
    ("d", "David", 29),
    ("e", "Esther", 32),
    ("f", "Fanny", 36),
    ("g", "Gabby", 60)
], ["id", "name", "age"])

# Edge DataFrame
e = sqlc.createDataFrame([
    ("a", "b", "friend"),
    ("b", "c", "friend"),
    ("c", "b", "friend"),
    ("f", "c", "friend"),
    ("e", "f", "friend"),
    ("e", "d", "friend"),
    ("d", "a", "friend"),
    ("a", "e", "friend"),
    ("b", "d", "friend")
], ["src", "dst", "relationship"])

# Create a GraphFrame
g = GraphFrame(v, e)
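
def create_motif(length: int) -> str:
    """Create a motif string for a directed path.

    Args:
        length (int): number of edges (hops) in the path.
    """
    motif_path = "(start)-[edge0]->"
    for i in range(1, length):
        motif_path += "(n%s);(n%s)-[edge%s]->" % (i - 1, i - 1, i)
    motif_path += "(end)"
    return motif_path

def get_community(G, depth):
    motif_path = create_motif(depth)
    # The line continued after the backslash is missing from the post. A filter on
    # the start node is assumed here, because the output below only has rows for "a".
    current_motif = G.find(motif_path).filter("start.id = 'a'")
    current_motif.select(f.col("start.*"), "*").show()

# Call the function only after it has been defined
get_community(g, 1)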
This returns:
+---+-----+---+--------------+--------------+---------------+
| id| name|age| start| edge0| end|
+---+-----+---+--------------+--------------+---------------+
| a|Alice| 34|[a, Alice, 34]|[a, e, friend]|[e, Esther, 32]|
| a|Alice| 34|[a, Alice, 34]|[a, b, friend]| [b, Bob, 36]|
+---+-----+---+--------------+--------------+---------------+
It should return:
+---+-----+---+--------------+--------------+---------------+
| id| name|age| start| edge0| end|
+---+-----+---+--------------+--------------+---------------+
| a|Alice| 34|[a, Alice, 34]|[a, e, friend]|[e, Esther, 32]|
| a|Alice| 34|[a, Alice, 34]|[a, b, friend]| [b, Bob, 36]|
| a|Alice| 34|[a, Alice, 34]|[a, d, friend]| [d, David, 29]|
| b| Bob| 36|[b, Bob, 36]|[b, d, friend]| [d, David, 29]|
+---+-----+---+--------------+--------------+---------------+
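One reason the motif approach may fall short: a motif such as `(start)-[edge0]->(end)` only follows edges in their stored direction, so the edge d→a is never reached from "a", and edges between neighbors (b→d) are not part of any path starting at "a". A minimal sketch of the ego-subgraph logic in plain Python, assuming edges are treated as undirected when collecting neighbors, is shown below. The names `ego_subgraph` and `EDGES` are hypothetical and not part of GraphFrames or networkx. It only needs the edge list and a set of node ids, not a full graph object.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str, str]  # (src, dst, relationship)

# Sample edges copied from the question.
EDGES: List[Edge] = [
    ("a", "b", "friend"), ("b", "c", "friend"), ("c", "b", "friend"),
    ("f", "c", "friend"), ("e", "f", "friend"), ("e", "d", "friend"),
    ("d", "a", "friend"), ("a", "e", "friend"), ("b", "d", "friend"),
]


def ego_subgraph(edges: List[Edge], center: str, depth: int = 1) -> Tuple[Set[str], List[Edge]]:
    """Return (nodes, induced_edges) within `depth` hops of `center`.

    Assumption: direction is ignored when expanding the neighborhood,
    but the returned edges keep their stored direction.
    """
    nodes = {center}
    frontier = {center}
    for _ in range(depth):
        reached = set()
        for src, dst, _rel in edges:
            if src in frontier:
                reached.add(dst)
            if dst in frontier:
                reached.add(src)
        frontier = reached - nodes  # keep only newly discovered nodes
        if not frontier:
            break
        nodes |= frontier
    # Induced subgraph: every edge whose two endpoints are both kept.
    induced = [edge for edge in edges if edge[0] in nodes and edge[1] in nodes]
    return nodes, induced


nodes, induced = ego_subgraph(EDGES, "a", depth=1)
print(sorted(nodes))
for edge in induced:
    print(edge)
```

On the sample data this keeps the nodes a, b, d, e and five edges: a→b, e→d, d→a, a→e and b→d. Compared with the table above, d→a keeps its stored direction and e→d is also included because both endpoints are neighbors of "a". If that is not the intended result, the final filter would need a different rule. In PySpark the same two steps could presumably be written as filters or joins on `g.edges` against the collected node ids, which is not shown here.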