HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Reducing impact of SUM(DATALENGTH()) over linked server

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
impactreducingdatalengthsumserverlinkedover

Problem

I have a query that reports back to me how much time elapsed for records to be added. I also in this query track the size of an image that was added to the database. From what I can see it looks like it is returning the whole image from the linked server and then calculating the size. Is there an easy way I can get the calculation of size to occur on the other side of the link so that the return can be faster?

SELECT
    [Date] = CAST(DATEADD(DAY,-@DaysBack,GETDATE()) AS DATE)
    ,NOTES = N.RPMID
    ,TimeTaken = DATEDIFF("SECOND",SN.PickedUpDate,SN.ProcessedDate)
    ,PictureSize = SUM(CAST(DATALENGTH(SI.[Image])AS BIGINT))
    ,RecordsAdded = CAST(COUNT(*) AS BIGINT)
FROM
    DBServer1.DB1.dbo.Notes AS N WITH (NOLOCK)
    INNER JOIN DBServer1.DB1.dbo.NoteDistribution AS ND WITH (NOLOCK) ON N.Notes=ND.Notes
    INNER JOIN dbo.Submission_Notes AS SN WITH (NOLOCK) ON SN.RetailPlanGuid=N.RPMID
    LEFT JOIN DBServer1.DB1.dbo.SavedImage SI WITH (NOLOCK) ON SI.Notes=N.Notes
WHERE
    1=1
    AND N.RecordStampWhenAdd > CAST(GETDATE()-@DaysBack AS DATE)
    AND N.RecordStampWhenAdd  0
GROUP BY
    N.RPMID
    ,DATEDIFF("SECOND",SN.PickedUpDate,SN.ProcessedDate)


The part in question is the picturesize that is returned. I want the calculations in there to be done on the server where the data is stored and just have the number return to me. Any help on this would be great. Thanks.

Solution

I am not sure exactly how you have determined that all of the data is going across the wire (or why that should be the case) but you could always have a computed column:

ALTER TABLE dbo.SavedImage ADD PictureSize 
  AS CAST(DATALENGTH([Image]) AS BIGINT);


Or just add a column and calculate that yourself whenever you insert/update a row, either by forcing inserts/updates through a stored procedure, or using a trigger.

And persist and/or index that as necessary, and then just use SUM(PictureSize) in the query.

Of course the big impact is still going to be on the fact that you have to scan the entire table or index in order to generate the aggregate.

This method could also be used to maintain the total size of all columns in some other table, but I'm largely against materializing redundant data unless there is a demonstrated advantage to doing so.

In any case, I really don't think this query is slow because it is moving the whole varbinary data over the wire and then calculating the length at the wrong end; if you have evidence to suggest that is the case, please update the question and we'll try to address it.

Code Snippets

ALTER TABLE dbo.SavedImage ADD PictureSize 
  AS CAST(DATALENGTH([Image]) AS BIGINT);

Context

StackExchange Database Administrators Q#111325, answer score: 6

Revisions (0)

No revisions yet.