snippetsqlMinor
How to avoid duplicating information in a series of nested SELECT statements?
Viewed 0 times
statementsselectavoidduplicatingnestedhowseriesinformation
Problem
I'm working on an SQL query that puts together customer history based on past purchases (the industry is venue ticketing). The report will be LARGE, something like 80-90 columns, with one column per event or category of event per record in the output CSV.
The database structure requires that I use nested select statements to get the necessary data into columns for each record--each column containing a quantity of tickets for that type of event. I know how to do this in theory, but each embedded select statement is so large that the full report would blow through the 8,000 character limit on my DB interface. One of these statements looks like this:
That's about 400 characters, which only leaves enough room for <20 iterations in a single query without going over the character limit.
Is there a way to reduce this or generalize it so that I don't need to waste all those characters on duplicated information in each SELECT?
I've put together an example at the following link which contains a simplified version of my DB structure and some sample data. Objective is to get the same result data without using so many characters per nested SELECT.
http://sqlfiddle.com/#!3/fc316/2
The database structure requires that I use nested select statements to get the necessary data into columns for each record--each column containing a quantity of tickets for that type of event. I know how to do this in theory, but each embedded select statement is so large that the full report would blow through the 8,000 character limit on my DB interface. One of these statements looks like this:
(select count (*)
FROM
guest ig, event2 e, eventseat es, "order" o
WHERE ig.guestid = g.guestid
and ig.guestid = o.guestid
and o.orderid = es.orderid
and e.eventid = es.eventid
and e.incometype = 'T'
and e.eventtype in ('SS', 'BMF')
and es.status in ('2','4')
and es.price <> '0.00'
and es.price <> '5.00'
and e.year = '2010' <-- THESE LINES ARE THE ONLY ONES THAT
and e.run in ('SS-DANCE') <-- CHANGE PER SELECT STATEMENT
) as 'SS-Dance',That's about 400 characters, which only leaves enough room for <20 iterations in a single query without going over the character limit.
Is there a way to reduce this or generalize it so that I don't need to waste all those characters on duplicated information in each SELECT?
I've put together an example at the following link which contains a simplified version of my DB structure and some sample data. Objective is to get the same result data without using so many characters per nested SELECT.
http://sqlfiddle.com/#!3/fc316/2
Solution
;WITH x AS
(
SELECT g.guestid, e.[year], e.run, c = COUNT(*)
FROM dbo.guest AS g
INNER JOIN dbo.[order] AS o
ON g.guestid = o.guestid
INNER JOIN dbo.eventseat AS es
ON o.orderid = es.orderid
INNER JOIN dbo.event2 AS e
ON e.eventid = es.eventid
WHERE e.incometype = 'T'
AND e.eventtype in ('SS', 'BMF')
AND es.status in ('2','4')
AND es.price NOT IN ( '0.00', '5.00')
AND e.[year] = '2010'
AND e.run IN ('SS-DANCE', 'SS-FILM', 'SS-OPERA')
GROUP BY g.guestid, e.[Year], e.run
),
y AS
(
SELECT * FROM x PIVOT (MAX(c) FOR run IN
([SS-DANCE],[SS-FILM],[SS-OPERA])) AS z
)
SELECT y.guestid, y.[year], g.[first], g.[last],
[SS-DANCE] = COALESCE(y.[SS-DANCE], 0),
[SS-FILM] = COALESCE(y.[SS-FILM], 0),
[SS-OPERA] = COALESCE(y.[SS-OPERA], 0)
FROM y
INNER JOIN dbo.guest AS g
ON y.guestid = g.guestid;I'm not sure if my where clause is going to work for you as written. This currently assumes you only care about these three types (dance, film and opera) and you only care about the year 2010. If you need to cross years you can remove that
WHERE clause. If you only care about dance in 2010 and film in any year you'll need to write a more complex set of where clauses.Some comments:
- try to avoid reserved words like
order,year,firstandlast. These only complicate matters because they need to be enclosed in square brackets (which are preferable over"double quotes"mostly for readability and unambiguity).
- always use
dbo.(or whateverschemaprefix is appropriate).
- as I suggested above, stick to explicit
INNER JOINsyntax.FROM x, y, zis for the birds.
Code Snippets
;WITH x AS
(
SELECT g.guestid, e.[year], e.run, c = COUNT(*)
FROM dbo.guest AS g
INNER JOIN dbo.[order] AS o
ON g.guestid = o.guestid
INNER JOIN dbo.eventseat AS es
ON o.orderid = es.orderid
INNER JOIN dbo.event2 AS e
ON e.eventid = es.eventid
WHERE e.incometype = 'T'
AND e.eventtype in ('SS', 'BMF')
AND es.status in ('2','4')
AND es.price NOT IN ( '0.00', '5.00')
AND e.[year] = '2010'
AND e.run IN ('SS-DANCE', 'SS-FILM', 'SS-OPERA')
GROUP BY g.guestid, e.[Year], e.run
),
y AS
(
SELECT * FROM x PIVOT (MAX(c) FOR run IN
([SS-DANCE],[SS-FILM],[SS-OPERA])) AS z
)
SELECT y.guestid, y.[year], g.[first], g.[last],
[SS-DANCE] = COALESCE(y.[SS-DANCE], 0),
[SS-FILM] = COALESCE(y.[SS-FILM], 0),
[SS-OPERA] = COALESCE(y.[SS-OPERA], 0)
FROM y
INNER JOIN dbo.guest AS g
ON y.guestid = g.guestid;Context
StackExchange Database Administrators Q#19769, answer score: 7
Revisions (0)
No revisions yet.