HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Performance issues summing a count column over a date range

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
columnissuessummingrangedateperformancecountover

Problem

In our application, we have a query which on an API endpoint to return users who have 'activity' to a named client or matter (one or many - specified by ID). This query takes about 15 seconds to return when the activity table has 30 million rows (note there is also ~600k rows in the 'Asset' table and 2700 'users').

The rough schema for the tables can be found at the bottom of my question. Below is our query to find 'users who have activity over a threshold'. For brevity I have placed the view definition below also.

As this endpoint returns a page of data, a second, similar, query is also ran to get the total count of elements to populate paging response - essentially then the performance of the endpoint is 2x the query performance.

My question is essentially, what techniques should I be applying to get improve the performance of this query? A 'benchmark' we try to stick to is 'subsecond' on our endpoint responses.

Query plan can be found here.

SELECT DISTINCT t.type, t.sid, t.name, t.emailAddress, t.jobTitle
FROM sec.Trustee t
    INNER JOIN (
        SELECT data.sid, SUM(data.hoursBilled) as hoursBilled, SUM(data.docsAccessed) as docsAccessed,     data.asset_type as asset_type, data.displayId as displayId, data.displayName as displayName
        FROM (
            SELECT billing.trustee_sid as sid, 0 as hoursBilled, billing.recordedValue as docsAccessed, a.type as asset_type, a.displayId, a.displayName
            FROM sec._DocumentsBilling billing
            INNER JOIN sec.SessionSid s
            ON s.sid = billing.client_sid
            AND s.setID = @P0
            INNER JOIN sec.Asset a
            ON a.sid = billing.client_sid
            AND billing.recordedDate > @P1
            AND billing.client_sid IN (@P2)
        ) data
        GROUP BY data.sid, data.asset_type, data.displayId, data.displayName 
        HAVING SUM(data.docsAccessed) > @P4
) trusteeData ON trusteeData.sid = t.sid
ORDER BY sid
OFFSET @P6 ROWS
FETCH NEXT @P7 ROWS ONLY


The

Solution

I can't speak for the query accessing the view but the samples you provided give a quick insight into the performance of the view.

You are getting a filter operator at the end of all of your joins (INNER & LEFT OUTER) to be able to satisfy the where clause:

WHERE (matter.sid IS NULL AND data.matterId = '') OR matter.sid IS NOT NULL


Splitting the view up further with the use of UNION puts the filter operator a bit earlier in the execution plan, but not (n)early enough. (heh)

Splitting the OR was a good beginning for improving the performance of the view , but there are additional parts to improve.

We could change the first part of the filter:

WHERE (matter.sid IS NULL AND data.matterId = '')


By removing AND data.matterId = '')

and adding it to each part of the UNION subquery:

SELECT N'DOCUMENTS' as billingType, t.sid as trustee_sid, recordedValue, recordedDate, clientId, matterId
FROM [tenant].userActivityDocuments d
INNER JOIN [tenant].Trustee t ON t.emailAddress = d.userString AND d.userType = N'EMAIL'
WHERE  matterId = ''
UNION ALL
SELECT N'DOCUMENTS' as billingType, t.sid as trustee_sid, recordedValue, recordedDate, clientId, matterId
FROM [tenant].userActivityDocuments d
INNER JOIN [tenant].Trustee t ON t.foreignId = d.userString AND d.userType = N'FOREIGNID'
WHERE  matterId = ''
UNION ALL
SELECT N'DOCUMENTS' as billingType, t.sid as trustee_sid, recordedValue, recordedDate, clientId, matterId
FROM [tenant].userActivityDocuments d
INNER JOIN [tenant].Trustee t ON t.domainUser = d.userString AND d.userType = N'DOMAIN'
WHERE  matterId = ''
UNION ALL
SELECT N'DOCUMENTS' as billingType, t.sid as trustee_sid, recordedValue, recordedDate, clientId, matterId
FROM [tenant].userActivityDocuments d
INNER JOIN [tenant].Trustee t ON t.unqualifiedDomainUser = d.userString AND d.userType = N'UNQUALIFIED_DOMAIN'
WHERE  matterId = ''


However, this still leaves us with a filter operator due to the LEFT JOIN + IS NULL filtering:

LEFT JOIN [tenant].Asset matter ON matter.client_sid = client.sid AND
 matter.displayId = data.matterId
     WHERE (matter.sid IS NULL)


We could improve this type of filtering by changing it to a NOT EXISTS statement.

This leaves us with:

```
CREATE VIEW [tenant]._DocumentsBilling AS
SELECT data.billingType as billingType, data.trustee_sid as trustee_sid, data.recordedValue, client.sid as client_sid, client.clientGroup_sid as clientGroup_sid,NULL as matter_sid, NULL as matterGroup_sid, data.recordedDate
FROM (
SELECT N'DOCUMENTS' as billingType, t.sid as trustee_sid, recordedValue, recordedDate, clientId, matterId
FROM [tenant].userActivityDocuments d
INNER JOIN [tenant].Trustee t ON t.emailAddress = d.userString AND d.userType = N'EMAIL'
WHERE matterId = ''
UNION ALL
SELECT N'DOCUMENTS' as billingType, t.sid as trustee_sid, recordedValue, recordedDate, clientId, matterId
FROM [tenant].userActivityDocuments d
INNER JOIN [tenant].Trustee t ON t.foreignId = d.userString AND d.userType = N'FOREIGNID'
WHERE matterId = ''
UNION ALL
SELECT N'DOCUMENTS' as billingType, t.sid as trustee_sid, recordedValue, recordedDate, clientId, matterId
FROM [tenant].userActivityDocuments d
INNER JOIN [tenant].Trustee t ON t.domainUser = d.userString AND d.userType = N'DOMAIN'
WHERE matterId = ''
UNION ALL
SELECT N'DOCUMENTS' as billingType, t.sid as trustee_sid, recordedValue, recordedDate, clientId, matterId
FROM [tenant].userActivityDocuments d
INNER JOIN [tenant].Trustee t ON t.unqualifiedDomainUser = d.userString AND d.userType = N'UNQUALIFIED_DOMAIN'
WHERE matterId = ''
) data
INNER JOIN [tenant].Asset client ON client.displayId = data.clientId

WHERE NOT EXISTS (SELECT * FROM [tenant].Asset matter WHERE matter.client_sid = client.sid AND matter.displayId = data.matterId)

UNION
SELECT data.billingType as billingType, data.trustee_sid as trustee_sid, data.recordedValue, client.sid as client_sid, client.clientGroup_sid as clientGroup_sid, matter.sid as matter_sid, matter.matterGroup_sid as matterGroup_sid, data.recordedDate
FROM (
SELECT N'DOCUMENTS' as billingType, t.sid as trustee_sid, recordedValue, recordedDate, clientId, matterId
FROM [tenant].userActivityDocuments d
INNER JOIN [tenant].Trustee t ON t.emailAddress = d.userString AND d.userType = N'EMAIL'
UNION ALL
SELECT N'DOCUMENTS' as billingType, t.sid as trustee_sid, recordedValue, recordedDate, clientId, matterId
FROM [tenant].userActivityDocuments d
INNER JOIN [tenant].Trustee t ON t.foreignId = d.userString AND d.userType = N'FOREIGNID'
UNION ALL
SELECT N'DOCUMENTS' as billingType, t.sid as trustee_sid, recordedValue, recordedDate, clientId, matterId
FROM [tenant].userActivityDocuments d
INNER JOIN [te

Code Snippets

WHERE (matter.sid IS NULL AND data.matterId = '') OR matter.sid IS NOT NULL
WHERE (matter.sid IS NULL AND data.matterId = '')
SELECT N'DOCUMENTS' as billingType, t.sid as trustee_sid, recordedValue, recordedDate, clientId, matterId
FROM [tenant].userActivityDocuments d
INNER JOIN [tenant].Trustee t ON t.emailAddress = d.userString AND d.userType = N'EMAIL'
WHERE  matterId = ''
UNION ALL
SELECT N'DOCUMENTS' as billingType, t.sid as trustee_sid, recordedValue, recordedDate, clientId, matterId
FROM [tenant].userActivityDocuments d
INNER JOIN [tenant].Trustee t ON t.foreignId = d.userString AND d.userType = N'FOREIGNID'
WHERE  matterId = ''
UNION ALL
SELECT N'DOCUMENTS' as billingType, t.sid as trustee_sid, recordedValue, recordedDate, clientId, matterId
FROM [tenant].userActivityDocuments d
INNER JOIN [tenant].Trustee t ON t.domainUser = d.userString AND d.userType = N'DOMAIN'
WHERE  matterId = ''
UNION ALL
SELECT N'DOCUMENTS' as billingType, t.sid as trustee_sid, recordedValue, recordedDate, clientId, matterId
FROM [tenant].userActivityDocuments d
INNER JOIN [tenant].Trustee t ON t.unqualifiedDomainUser = d.userString AND d.userType = N'UNQUALIFIED_DOMAIN'
WHERE  matterId = ''
LEFT JOIN [tenant].Asset matter ON matter.client_sid = client.sid AND
 matter.displayId = data.matterId
     WHERE (matter.sid IS NULL)
CREATE VIEW [tenant]._DocumentsBilling  AS
      SELECT data.billingType as billingType, data.trustee_sid as trustee_sid, data.recordedValue, client.sid as client_sid, client.clientGroup_sid as clientGroup_sid,NULL  as matter_sid, NULL as matterGroup_sid, data.recordedDate
    FROM (
        SELECT N'DOCUMENTS' as billingType, t.sid as trustee_sid, recordedValue, recordedDate, clientId, matterId
        FROM [tenant].userActivityDocuments d
        INNER JOIN [tenant].Trustee t ON t.emailAddress = d.userString AND d.userType = N'EMAIL'
        WHERE  matterId = ''
        UNION ALL
        SELECT N'DOCUMENTS' as billingType, t.sid as trustee_sid, recordedValue, recordedDate, clientId, matterId
        FROM [tenant].userActivityDocuments d
        INNER JOIN [tenant].Trustee t ON t.foreignId = d.userString AND d.userType = N'FOREIGNID'
        WHERE  matterId = ''
        UNION ALL
        SELECT N'DOCUMENTS' as billingType, t.sid as trustee_sid, recordedValue, recordedDate, clientId, matterId
        FROM [tenant].userActivityDocuments d
        INNER JOIN [tenant].Trustee t ON t.domainUser = d.userString AND d.userType = N'DOMAIN'
        WHERE  matterId = ''
        UNION ALL
        SELECT N'DOCUMENTS' as billingType, t.sid as trustee_sid, recordedValue, recordedDate, clientId, matterId
        FROM [tenant].userActivityDocuments d
        INNER JOIN [tenant].Trustee t ON t.unqualifiedDomainUser = d.userString AND d.userType = N'UNQUALIFIED_DOMAIN'
        WHERE  matterId = ''
    ) data
    INNER JOIN [tenant].Asset client ON client.displayId = data.clientId

    WHERE NOT EXISTS (SELECT * FROM  [tenant].Asset matter WHERE matter.client_sid = client.sid AND matter.displayId = data.matterId)
    
    UNION
            SELECT data.billingType as billingType, data.trustee_sid as trustee_sid, data.recordedValue, client.sid as client_sid, client.clientGroup_sid as clientGroup_sid, matter.sid as matter_sid, matter.matterGroup_sid as matterGroup_sid, data.recordedDate
    FROM (
            SELECT N'DOCUMENTS' as billingType, t.sid as trustee_sid, recordedValue, recordedDate, clientId, matterId
        FROM [tenant].userActivityDocuments d
        INNER JOIN [tenant].Trustee t ON t.emailAddress = d.userString AND d.userType = N'EMAIL'
        UNION ALL
        SELECT N'DOCUMENTS' as billingType, t.sid as trustee_sid, recordedValue, recordedDate, clientId, matterId
        FROM [tenant].userActivityDocuments d
        INNER JOIN [tenant].Trustee t ON t.foreignId = d.userString AND d.userType = N'FOREIGNID'
        UNION ALL
        SELECT N'DOCUMENTS' as billingType, t.sid as trustee_sid, recordedValue, recordedDate, clientId, matterId
        FROM [tenant].userActivityDocuments d
        INNER JOIN [tenant].Trustee t ON t.domainUser = d.userString AND d.userType = N'DOMAIN'
        UNION ALL
        SELECT N'DOCUMENTS' as billingType, t.sid as trustee_sid, recordedValue, recordedDate, clientId, matterId
        FROM [tenant].userActivityDocuments d

Context

StackExchange Database Administrators Q#281131, answer score: 5

Revisions (0)

No revisions yet.