snippetsqlMinor
Create hierarchy of multiple levels where each node has a random number of children
Viewed 0 times
randomchildrennumbereachcreatehierarchynodewherehaslevels
Problem
I need to create some test data that involves a hierarchy. I could make it easy and do a couple of
CROSS JOINs, but that would give me a structure that is completely uniform / without any variation. That not only seems dull, but lack of variation in test data sometimes masks problems that would otherwise be found. So, I am wanting to generate a non-uniform hierarchy that follows these rules:- 3 levels deep
- Level 1 is randomly 5 - 20 nodes
- Level 2 is 1 - 10 nodes, random per each node of Level 1
- Level 3 is 1 - 5 nodes, random per each node of Level 2
- All branches will be 3 levels deep. Uniformity in depth is ok at this point.
- There can be overlap in names of child nodes on any given level (i.e. names of child nodes do not need to be unique across all nodes on the same level).
- The term "random" is defined here as being pseudo-random, not uniquely random. This needs to be mentioned since the term "random" is often used to mean "random ordering of a given set that does not produce duplicates". I accept that random = random and if the number of children per each node of Level 1 is only 4, 7, and 8, even across 20 nodes on Level 1 that has a potential spread of 1 - 10 children per each of those nodes, then that is fine, because that is what random is.
- Even though this can be done quite easily with nested
WHILEloops, the preference is to find a set-based approach. Generally speaking, generating test data does not have the requirements for efficiency that Production code would have, but shooting for a set-based approach will likely be more educational and help in the future with finding set-based approaches to problems. SoWHILEloops are not ruled-out, but can only be used if no set-based approach is possible.
- Set-based = ideally a single query, regardless of CTEs, APPLYs, etc. So using an existing or inline numbers table is fine. Using a WHILE / CURSOR / procedural approach will not work. I suppose staging portions of the data into temp ta
Solution
(O.P.'s note: preferred solution is the 4th / final code block)
XML seems to me to be the obvious choice of data structure to use here.
The trick to make SQL Server use different values for
Flatteing the XML:
And a version totally void of XML.
Correlation
A version using a table with 20 names to be used instead of just integers.
XML seems to me to be the obvious choice of data structure to use here.
with N as
(
select T.N
from (values(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),
(12),(13),(14),(15),(16),(17),(18),(19),(20)) as T(N)
)
select top(5 + abs(checksum(newid())) % 15)
N1.N as '@Value',
(
select top(1 + abs(checksum(newid())) % 10)
N2.N as '@Value',
(
select top(1 + abs(checksum(newid())) % 5)
N3.N as '@Value'
from N as N3
where N2.N > 0
for xml path('Level3'), type
)
from N as N2
where N1.N > 0
for xml path('Level2'), type
)
from N as N1
for xml path('Level1'), root('Root');The trick to make SQL Server use different values for
top() for each node is to make the sub-queries correlated. N1.N > 0 and N2.N > 0.Flatteing the XML:
declare @X xml;
with N as
(
select T.N
from (values(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),
(12),(13),(14),(15),(16),(17),(18),(19),(20)) as T(N)
)
select @X = (
select top(5 + abs(checksum(newid())) % 15)
N1.N as '@Value',
(
select top(1 + abs(checksum(newid())) % 10)
N2.N as '@Value',
(
select top(1 + abs(checksum(newid())) % 5)
N3.N as '@Value'
from N as N3
where N2.N > 0
for xml path('Level3'), type
)
from N as N2
where N1.N > 0
for xml path('Level2'), type
)
from N as N1
for xml path('Level1')
);
select L1.X.value('@Value', 'varchar(10)')+'\'+
L2.X.value('@Value', 'varchar(10)')+'\'+
L3.X.value('@Value', 'varchar(10)')
from @X.nodes('/Level1') as L1(X)
cross apply L1.X.nodes('Level2') as L2(X)
cross apply L2.X.nodes('Level3') as L3(X);And a version totally void of XML.
with N as
(
select T.N
from (values(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),
(12),(13),(14),(15),(16),(17),(18),(19),(20)) as T(N)
)
select cast(N1.N as varchar(10))+'\'+
cast(N2.N as varchar(10))+'\'+
cast(N3.N as varchar(10))
from (
select top(5 + abs(checksum(newid())) % 15)
N.N
from N
) as N1
cross apply
(
select top(1 + abs(checksum(newid())) % 10)
N.N
from N
where N1.N > 0
) as N2
cross apply
(
select top(1 + abs(checksum(newid())) % 5)
N.N
from N
where N2.N > 0
) as N3;Correlation
N1.N > 0 and N2.N > 0 is still important. A version using a table with 20 names to be used instead of just integers.
declare @Elements table
(
Name nvarchar(50) not null
);
insert into @Elements(Name)
select top(20) C.name
from sys.columns as C
group by C.name;
select N1.Name + N'\' + N2.Name + N'\' + N3.Name
from (
select top(5 + abs(checksum(newid())) % 15)
E.Name
from @Elements as E
) as N1
cross apply
(
select top(1 + abs(checksum(newid())) % 10)
E.Name
from @Elements as E
where N1.Name > ''
) as N2
cross apply
(
select top(1 + abs(checksum(newid())) % 5)
E.Name
from @Elements as E
where N2.Name > ''
) as N3;
Code Snippets
with N as
(
select T.N
from (values(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),
(12),(13),(14),(15),(16),(17),(18),(19),(20)) as T(N)
)
select top(5 + abs(checksum(newid())) % 15)
N1.N as '@Value',
(
select top(1 + abs(checksum(newid())) % 10)
N2.N as '@Value',
(
select top(1 + abs(checksum(newid())) % 5)
N3.N as '@Value'
from N as N3
where N2.N > 0
for xml path('Level3'), type
)
from N as N2
where N1.N > 0
for xml path('Level2'), type
)
from N as N1
for xml path('Level1'), root('Root');declare @X xml;
with N as
(
select T.N
from (values(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),
(12),(13),(14),(15),(16),(17),(18),(19),(20)) as T(N)
)
select @X = (
select top(5 + abs(checksum(newid())) % 15)
N1.N as '@Value',
(
select top(1 + abs(checksum(newid())) % 10)
N2.N as '@Value',
(
select top(1 + abs(checksum(newid())) % 5)
N3.N as '@Value'
from N as N3
where N2.N > 0
for xml path('Level3'), type
)
from N as N2
where N1.N > 0
for xml path('Level2'), type
)
from N as N1
for xml path('Level1')
);
select L1.X.value('@Value', 'varchar(10)')+'\'+
L2.X.value('@Value', 'varchar(10)')+'\'+
L3.X.value('@Value', 'varchar(10)')
from @X.nodes('/Level1') as L1(X)
cross apply L1.X.nodes('Level2') as L2(X)
cross apply L2.X.nodes('Level3') as L3(X);with N as
(
select T.N
from (values(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),
(12),(13),(14),(15),(16),(17),(18),(19),(20)) as T(N)
)
select cast(N1.N as varchar(10))+'\'+
cast(N2.N as varchar(10))+'\'+
cast(N3.N as varchar(10))
from (
select top(5 + abs(checksum(newid())) % 15)
N.N
from N
) as N1
cross apply
(
select top(1 + abs(checksum(newid())) % 10)
N.N
from N
where N1.N > 0
) as N2
cross apply
(
select top(1 + abs(checksum(newid())) % 5)
N.N
from N
where N2.N > 0
) as N3;Context
StackExchange Database Administrators Q#117469, answer score: 9
Revisions (0)
No revisions yet.