SQL Server Given a directed graph with a set of connected vertices, find all reachable nodes from any vertex

ih99xse1  于 2023-02-18  发布在  其他
关注(0)|答案(1)|浏览(136)

Having a sample graph ( yEd live ):

These two basic tables defined on SQL Server 2019:

create table nodes (id varchar(100), name varchar(15)) as NODE;
create table connections (id varchar(100), from_id varchar(100), to_id varchar(100)) as EDGE;

Trying to query all reachable nodes, I created this table value function:

CREATE FUNCTION calculateNetworkForNode(
    @DiscoverScenarioFor VARCHAR(15)
)
returns @network TABLE (base varchar(15), pathToNode varchar(max), networkMember varchar(15))
begin

    INSERT @network
    SELECT base, pathToNode, networkMember
    FROM (
        SELECT
            nodes.name as base,
            STRING_AGG(nodes2.name, '->') WITHIN GROUP (GRAPH PATH) AS pathToNode,
            LAST_VALUE(nodes2.name) WITHIN GROUP (GRAPH PATH) AS networkMember
        FROM
            nodes,
            connections FOR PATH,
            nodes FOR PATH AS nodes2
        WHERE MATCH(SHORTEST_PATH(nodes(-(connections)->nodes2)+))
        AND nodes.name = @DiscoverScenarioFor
    ) AS Q
    WHERE Q.networkMember != @DiscoverScenarioFor

    insert into @network (base, pathToNode, networkMember) values (@DiscoverScenarioFor, null, @DiscoverScenarioFor)

    RETURN
end

Running this function with the A node, I can get all reachable nodes downstream:

+------+------------+---------------+
| base | pathToNode | networkMember |
+------+------------+---------------+
| A    | B          | B             |
| A    | E          | E             |
| A    | E->F       | F             |
| A    | B->C       | C             |
| A    | B->C->D    | D             |
| A    |            | A             |
+------+------------+---------------+

(Known issue: if F had any other inbound links, this would already not work.)

Running the same function above with node F results in a rather boring table of:

+------+------------+---------------+
| base | pathToNode | networkMember |
+------+------------+---------------+
| F    |            | F             |
+------+------------+---------------+

The goal is to modify this function or create a new one to generate the complete network including all connected nodes. So searching with A or F or any other member of these connected nodes, should always results in the same set of nodes: A,B,C,D,E,F

I had another helper function:

SELECT @csv = (SELECT STRING_AGG(n.networkMember, ',') as csv
                   FROM (select top 1000 * from calculateNetworkForNode(@DiscoverScenarioFor) c order by c.networkMember asc) as [n])

Which turns the above result to a csv string, but the initial issue is still present: the query only sees the downstream nodes and not all of the connected members.

issues

A query results: A, B, C, D, E, F
B query results: B,C,D
C query results: C,D
D query results: D
E query results: E,F
F query results: F

Any pointers are welcome, where to look/what to read up on to find a solution how all of these queries with the different members could return the same A,B,C,D,E,F network.

yiytaume

yiytaume1#

It looks like you want to interpret the graph as an undirected graph, while your current solution interprets the graph as directed.

A pragmatic solution could be to create a view v_connections which includes all the connections and their reverse.

So something like:

create view c_connections as
select id, from_id, to_id
from connections
union
select concat(id, '<'), to_id, from_id
from connections;

And then replace connections in your current procedure with v_connections .

相关问题