HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Finding customer searches and graphing their items

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
graphingitemstheirfindingandsearchescustomer

Problem

One of the reports I am working on requires me to locate searches a customer makes, and graph them in certain manners.

I have one query I'm particularly concerned about. This one takes the most amount of time (relative to the batch) and seems quite ugly to me.

Essentially, this should graph the search results by the day of the week and give me: the total number of searches, the average searches per week day, the number of searches that had no returned results, and the percent of searches that have no results.

SELECT
    DATENAME(WEEKDAY, [AtDateTime]) AS [Day of Week]
    ,COUNT(*) AS [Number of Searches]
    ,CAST(CAST(COUNT(*) AS DECIMAL(10, 2)) / COUNT(DISTINCT CONVERT(DATE, [AtDateTime])) AS DECIMAL(10, 2)) AS [Average Searches per Day]
    ,SUM(CASE WHEN [NumFound] = 0 THEN 1 ELSE 0 END) AS [Number of Searches with no Results]
    ,CAST(CAST(SUM(CASE WHEN [NumFound] = 0 THEN 1 ELSE 0 END) AS DECIMAL(10, 2)) / COUNT(*) AS DECIMAL(10, 4)) AS [Percent of Searches with no Results]
FROM [DB].[dbo].[SearchHistory] 
WHERE NOT EXISTS 
(
    SELECT 1 FROM [DB].[dbo].[CustomerExclusions]
    WHERE [CustomerNumber] = [SearchHistory].[CustomerNumber]
    AND [ExcludeFromSearch] = 1
)
GROUP BY DATENAME(WEEKDAY, [AtDateTime]), DATEPART(WEEKDAY, [AtDateTime])
ORDER BY DATEPART(WEEKDAY, [AtDateTime])


Any advice/critique is welcome.

DDL for the SearchHistory table:

CREATE TABLE [dbo].[SearchHistory](
    [CustomerNumber] [char](8) NOT NULL,
    [Username] [varchar](16) NOT NULL,
    [AtDateTime] [datetime2](7) NOT NULL,
    [Terms] [varchar](128) NOT NULL,
    [NumFound] [int] NOT NULL,
    [NumInStock] [int] NOT NULL
) ON [PRIMARY]
ALTER TABLE [dbo].[SearchHistory] ADD  CONSTRAINT [DF_SearchHistory_AtDateTime]  DEFAULT (sysdatetime()) FOR [AtDateTime]
GO


DDL for the CustomerExclusions table:

```
CREATE TABLE [dbo].CustomerExclusions NOT NULL,
[ExcludeFromSearch] [bit] NOT NULL,
[ExcludeFromErrors] [bit] NOT

Solution

Regarding performance the code looks fine. Possibly you are missing useful indexes.

As far as the code itself is concerned I have the following observations.

Remove three part names.

[DB].[dbo].[SearchHistory] and [DB].[dbo].[CustomerExclusions]

It is common to want to run queries against different versions of the database. Hardcoding the database name in to SQL makes this harder.

If this SQL lives inside a stored proc in a copy of the database DB_copy on the same instance as DB you may be unaware that this is, in fact, referencing a different database for some time. Similarly if the SQL is sent by an application it is likely that most people will assume that changing the "Initial Catalogue" in the connection string will change the tables being queried.

Add table aliases and use them for all columns in the correlated sub query

/*Original Code*/
FROM [DB].[dbo].[SearchHistory] 
WHERE NOT EXISTS 
(
    SELECT 1 FROM [DB].[dbo].[CustomerExclusions]
    WHERE [CustomerNumber] = [SearchHistory].[CustomerNumber]
    AND [ExcludeFromSearch] = 1
)


/*Rewritten Code*/
FROM   dbo.SearchHistory sh
WHERE  NOT EXISTS (SELECT *
                   FROM   dbo.CustomerExclusions ce
                   WHERE  ce.CustomerNumber = sh.CustomerNumber
                          AND ce.ExcludeFromSearch = 1)


If the table CustomerExclusions does not have any columns called CustomerNumber or ExcludeFromSearch but these do exist in SearchHistory then the original correlated sub query will resolve them from the outer scope giving you incorrect results rather than an error message. Using two part names as in the second example would give an error in that case as well as making the code more explicitly self documenting. I have also removed the square brackets as they are not required and I find them distracting. For reasons of personal preference I have replaced the 1 with a *. It makes no difference to the execution plan.

Other changes.

  • I used a CTE to remove some of the repeated formulas scattered around the code.



  • I replaced your use of SUM()/COUNT(*) with AVG



  • It is only necessary to cast one of the operands to avoid integer division - not both.



  • I changed the column names returned by the query so they meet the rules for standard identifiers and don't need to be quoted.



  • I changed to the non standard column_alias = expression syntax as I find it makes the query easier to read when the select list contains long expressions.



WITH CTE
     AS (SELECT AtDateTime,
                /*Use CASE if < SQL Server 2012*/
                IIF(NumFound = 0, 1, 0)       AS NoResults,
                DATENAME(WEEKDAY, AtDateTime) AS WeekDayName,
                DATEPART(WEEKDAY, AtDateTime) AS WeekDayNumber
         FROM   dbo.SearchHistory sh
         WHERE  NOT EXISTS (SELECT *
                            FROM   dbo.CustomerExclusions ce
                            WHERE  ce.CustomerNumber = sh.CustomerNumber
                                   AND ce.ExcludeFromSearch = 1))
SELECT WeekDayName,
       NumberOfSearches 
             = COUNT(*),
       AverageSearchesPerDay
             = COUNT(*) / CAST(COUNT(DISTINCT CONVERT(DATE, AtDateTime)) AS DECIMAL(10, 2)),
       NumberOfSearchesWithNoResults 
             = SUM(NoResults),
       PercentOfSearchesWithNoResults
             = CAST(AVG(CAST(NoResults AS DECIMAL(10, 2))) AS DECIMAL(10, 4))
FROM   CTE
GROUP  BY WeekDayNumber,
          WeekDayName
ORDER  BY WeekDayNumber

Code Snippets

/*Original Code*/
FROM [DB].[dbo].[SearchHistory] 
WHERE NOT EXISTS 
(
    SELECT 1 FROM [DB].[dbo].[CustomerExclusions]
    WHERE [CustomerNumber] = [SearchHistory].[CustomerNumber]
    AND [ExcludeFromSearch] = 1
)
/*Rewritten Code*/
FROM   dbo.SearchHistory sh
WHERE  NOT EXISTS (SELECT *
                   FROM   dbo.CustomerExclusions ce
                   WHERE  ce.CustomerNumber = sh.CustomerNumber
                          AND ce.ExcludeFromSearch = 1)
WITH CTE
     AS (SELECT AtDateTime,
                /*Use CASE if < SQL Server 2012*/
                IIF(NumFound = 0, 1, 0)       AS NoResults,
                DATENAME(WEEKDAY, AtDateTime) AS WeekDayName,
                DATEPART(WEEKDAY, AtDateTime) AS WeekDayNumber
         FROM   dbo.SearchHistory sh
         WHERE  NOT EXISTS (SELECT *
                            FROM   dbo.CustomerExclusions ce
                            WHERE  ce.CustomerNumber = sh.CustomerNumber
                                   AND ce.ExcludeFromSearch = 1))
SELECT WeekDayName,
       NumberOfSearches 
             = COUNT(*),
       AverageSearchesPerDay
             = COUNT(*) / CAST(COUNT(DISTINCT CONVERT(DATE, AtDateTime)) AS DECIMAL(10, 2)),
       NumberOfSearchesWithNoResults 
             = SUM(NoResults),
       PercentOfSearchesWithNoResults
             = CAST(AVG(CAST(NoResults AS DECIMAL(10, 2))) AS DECIMAL(10, 4))
FROM   CTE
GROUP  BY WeekDayNumber,
          WeekDayName
ORDER  BY WeekDayNumber

Context

StackExchange Code Review Q#115468, answer score: 5

Revisions (0)

No revisions yet.