HiveBrain v1.2.0
Get Started
← Back to all entries
snippetsqlMajor

How to write a query which finds all circular references when a table references itself?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
itselfallquerywritecircularfindshowwhichreferenceswhen

Problem

I have the following schema (names changed), which I cannot change:

CREATE TABLE MyTable (
    Id INT NOT NULL PRIMARY KEY,
    ParentId INT NOT NULL
);

ALTER TABLE MyTable ADD FOREIGN KEY (ParentId) REFERENCES MyTable(Id);


That is, each record is a child of another record. If a record’s ParentId is equal to its Id, then the record is considered a root node.

I want to run query which will find all circular references. For example, with the data

INSERT INTO MyTable (Id, ParentId) VALUES
    (0, 0),
    (1, 0),
    (2, 4),
    (3, 2),
    (4, 3);


the query should return

Id | Cycle
2  | 2 < 4 < 3 < 2
3  | 3 < 2 < 4 < 3
4  | 4 < 3 < 2 < 4


I wrote the following query for SQL Server 2008 R2, and I am wondering if this query can be improved:

```
IF OBJECT_ID(N'tempdb..#Results') IS NOT NULL DROP TABLE #Results;
CREATE TABLE #Results (Id INT, HasParentalCycle BIT, Cycle VARCHAR(MAX));

DECLARE @i INT,
@j INT,
@flag BIT,
@isRoot BIT,
@ids VARCHAR(MAX);

DECLARE MyCursor CURSOR FAST_FORWARD FOR
SELECT Id
FROM MyTable;

OPEN MyCursor;
FETCH NEXT FROM MyCursor INTO @i;
WHILE @@FETCH_STATUS = 0
BEGIN
IF OBJECT_ID(N'tempdb..#Parents') IS NOT NULL DROP TABLE #Parents;
CREATE TABLE #Parents (Id INT);

SET @ids = NULL;
SET @isRoot = 0;
SET @flag = 0;
SET @j = @i;
INSERT INTO #Parents (Id) VALUES (@j);

WHILE (1=1)
BEGIN
SELECT
@j = ParentId,
@isRoot = CASE WHEN ParentId = Id THEN 1 ELSE 0 END
FROM MyTable
WHERE Id = @j;

IF (@isRoot = 1)
BEGIN
SET @flag = 0;
BREAK;
END

IF EXISTS (SELECT 1 FROM #Parents WHERE Id = @j)
BEGIN
INSERT INTO #Parents (Id) VALUES (@j);
SET @flag = 1;
SELECT @ids = COALESCE(@ids + ' < ', '') + CAST(Id AS VARCHAR) FROM #Parents;
BREAK;
END
ELSE
BEGIN
INSERT INTO

Solution

This calls for a recursive CTE:

WITH FindRoot AS
(
    SELECT Id,ParentId, CAST(Id AS NVARCHAR(MAX)) Path
    FROM dbo.MyTable

    UNION ALL

    SELECT C.Id, P.ParentId, C.Path + N' > ' + CAST(P.Id AS NVARCHAR(MAX))
    FROM dbo.MyTable P
    JOIN FindRoot C
    ON C.ParentId = P.Id AND P.ParentId <> P.Id AND C.ParentId <> C.Id
 )
SELECT *
FROM FindRoot R
WHERE R.Id = R.ParentId 
  AND R.ParentId <> 0;


See it in action here: SQL Fiddle

Update:

Added distance to be able to exclude all self cycles (see ypercube's comment):

WITH FindRoot AS
(
    SELECT Id,ParentId, CAST(Id AS NVARCHAR(MAX)) Path, 0 Distance
    FROM dbo.MyTable

    UNION ALL

    SELECT C.Id, P.ParentId, C.Path + N' > ' + CAST(P.Id AS NVARCHAR(MAX)), C.Distance + 1
    FROM dbo.MyTable P
    JOIN FindRoot C
    ON C.ParentId = P.Id AND P.ParentId <> P.Id AND C.ParentId <> C.Id
 )
SELECT *
FROM FindRoot R
WHERE R.Id = R.ParentId 
  AND R.ParentId <> 0
  AND R.Distance > 0;


SQL Fiddle

Which one you should use depends on your requirement.

Code Snippets

WITH FindRoot AS
(
    SELECT Id,ParentId, CAST(Id AS NVARCHAR(MAX)) Path
    FROM dbo.MyTable

    UNION ALL

    SELECT C.Id, P.ParentId, C.Path + N' > ' + CAST(P.Id AS NVARCHAR(MAX))
    FROM dbo.MyTable P
    JOIN FindRoot C
    ON C.ParentId = P.Id AND P.ParentId <> P.Id AND C.ParentId <> C.Id
 )
SELECT *
FROM FindRoot R
WHERE R.Id = R.ParentId 
  AND R.ParentId <> 0;
WITH FindRoot AS
(
    SELECT Id,ParentId, CAST(Id AS NVARCHAR(MAX)) Path, 0 Distance
    FROM dbo.MyTable

    UNION ALL

    SELECT C.Id, P.ParentId, C.Path + N' > ' + CAST(P.Id AS NVARCHAR(MAX)), C.Distance + 1
    FROM dbo.MyTable P
    JOIN FindRoot C
    ON C.ParentId = P.Id AND P.ParentId <> P.Id AND C.ParentId <> C.Id
 )
SELECT *
FROM FindRoot R
WHERE R.Id = R.ParentId 
  AND R.ParentId <> 0
  AND R.Distance > 0;

Context

StackExchange Database Administrators Q#45158, answer score: 35

Revisions (0)

No revisions yet.