HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Best way to get two columns from a table

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
columnswaytwogetfromtablebest

Problem

Basically I have taken over a project and found a table that has the following video and video title fields in it (why they didn't create a separate linked table is beyond me):

[VIDEOURL]
[VIDEOTITLE]
[COUKVIDEO1URL]
[COUKVIDEO2URL]
[COUKVIDEO3URL]
[COUKVIDEO4URL]
[COUKVIDEO1TITLE]
[COUKVIDEO2TITLE]
[COUKVIDEO3TITLE]
[COUKVIDEO4TITLE]


Can this query be improved to get a single list of titles and urls from the above fields:

SELECT 
    t.VIDEOURL, 
    t.VIDEOTITLE 
FROM
(
    SELECT VIDEOURL, VIDEOTITLE FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
    UNION
    SELECT [COUKVIDEO1URL] AS VIDEOURL, [COUKVIDEO1TITLE] AS VIDEOTITLE FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
    UNION
    SELECT [COUKVIDEO2URL] AS VIDEOURL, [COUKVIDEO2TITLE] AS VIDEOTITLE FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
    UNION
    SELECT [COUKVIDEO3URL] AS VIDEOURL, [COUKVIDEO3TITLE] AS VIDEOTITLE FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
    UNION
    SELECT [COUKVIDEO4URL] AS VIDEOURL, [COUKVIDEO4TITLE] AS VIDEOTITLE FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
) t
WHERE t.VIDEOURL != ''
AND t.VIDEOTITLE != ''

Solution

There is no telling how the database will optimize the query plan for this, but I would expect that the best query plan would involve applying the predicates to the tables before the union is done. Doing a UNION of multiple small result sets would be more efficient than doing a union on large sets and then filtering the results.

The UNION operation requires a lot of scanning because UNION is also a DISTINCT process (you only have one record of each value in the result).

So, two things.

If you are sure that the values in in a column-pair will not also be in any of the other column-pairs, then you can replace the UNION with the faster UNION ALL.

If you are sure that a column-pair is unique, or empty, then you can avoid the DISTINCT process completely.

I would write the query as:

SELECT VIDEOURL,
           VIDEOTITLE
    FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
    WHERE [VIDEOURL] != ''
      AND [VIDEOTITLE] != ''
  UNION ALL
    SELECT [COUKVIDEO1URL] AS VIDEOURL,
           [COUKVIDEO1TITLE] AS VIDEOTITLE
    FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
    WHERE [COUKVIDEO1URL] != ''
      AND [COUKVIDEO1TITLE] != ''
  UNION ALL
    SELECT [COUKVIDEO2URL] AS VIDEOURL,
           [COUKVIDEO2TITLE] AS VIDEOTITLE
    FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
    WHERE [COUKVIDEO2URL] != ''
      AND [COUKVIDEO2TITLE] != ''
  UNION ALL
    SELECT [COUKVIDEO3URL] AS VIDEOURL,
           [COUKVIDEO3TITLE] AS VIDEOTITLE
    FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
    WHERE [COUKVIDEO3URL] != ''
      AND [COUKVIDEO3TITLE] != ''
  UNION ALL
    SELECT [COUKVIDEO4URL] AS VIDEOURL,
           [COUKVIDEO4TITLE] AS VIDEOTITLE
    FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
    WHERE [COUKVIDEO4URL] != ''
      AND [COUKVIDEO4TITLE] != ''


If values in one column-pair can appear in other column pairs, then change the UNION to UNION ALL.

If the values can be duplicated in a column-pair, then add DISTINCT to each SELECT.

Do both DISTINCT and UNION ALL if you need to.

The idea is to make the data as small as possible before you start doing the big merging processes.

Code Snippets

SELECT VIDEOURL,
           VIDEOTITLE
    FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
    WHERE [VIDEOURL] != ''
      AND [VIDEOTITLE] != ''
  UNION ALL
    SELECT [COUKVIDEO1URL] AS VIDEOURL,
           [COUKVIDEO1TITLE] AS VIDEOTITLE
    FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
    WHERE [COUKVIDEO1URL] != ''
      AND [COUKVIDEO1TITLE] != ''
  UNION ALL
    SELECT [COUKVIDEO2URL] AS VIDEOURL,
           [COUKVIDEO2TITLE] AS VIDEOTITLE
    FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
    WHERE [COUKVIDEO2URL] != ''
      AND [COUKVIDEO2TITLE] != ''
  UNION ALL
    SELECT [COUKVIDEO3URL] AS VIDEOURL,
           [COUKVIDEO3TITLE] AS VIDEOTITLE
    FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
    WHERE [COUKVIDEO3URL] != ''
      AND [COUKVIDEO3TITLE] != ''
  UNION ALL
    SELECT [COUKVIDEO4URL] AS VIDEOURL,
           [COUKVIDEO4TITLE] AS VIDEOTITLE
    FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
    WHERE [COUKVIDEO4URL] != ''
      AND [COUKVIDEO4TITLE] != ''

Context

StackExchange Code Review Q#42967, answer score: 3

Revisions (0)

No revisions yet.