HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Improving query response speed when joining a non-spatial table with a spatial table on a non-spatial column

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
nontableresponsequeryspatialwithcolumnwhenspeedimproving

Problem

I have a table of frequently updating events that affect certain geometries. The geometry areas are fixed, but there may be an event affecting multiple areas at once, so there's a many-to-many relationship between the two. A single area may also be affected by multiple events at the same time.

For example:

Event ID
Area ID

1
15

1
31

2
46

3
46

3
55

4
15

The areas are represented in my table as ID values. I need to represent these spatially, so I need to join the event_id and area_id values to a lookup table that contains the area_ids and the geometry. Unfortunately, the tables are in two separate databases on the same SQL Server 2014 instance.

The problem is that this join process is quite slow; it takes anywhere from 90 seconds to 3 minutes to return 18k-24k rows, depending on how many events are active at the moment. The geometry lookup table contains 84k rows.

The events table structure (some additional columns removed for simplicity's sake):

USE Events_DB;
CREATE TABLE [dbo].[Events](
    [EVENTS_ID] [bigint] NOT NULL,
    [AREA_ID] [bigint] NOT NULL,
    [START_DATE_TIME] [smalldatetime] NULL,
    [END_DATE_TIME] [smalldatetime] NULL,
 CONSTRAINT [PK_Events] PRIMARY KEY CLUSTERED 
(
    [EVENTS_ID] ASC,
    [AREA_ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]


The geometry lookup table structure:

USE Geom_DB;
CREATE TABLE [dbo].[Area_Geom](
    [AREA_ID] [int] NOT NULL,
    [Geom] [geometry] NOT NULL,
 CONSTRAINT [Area_Geom] PRIMARY KEY CLUSTERED 
(
    [AREA_ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]


My query is below. I'm using a CTE for the event and area IDs because they're being used in elsewhere in the full query, but I've narrowed down the bottleneck to this particular portion.

Solution

⛈️⛈️⛈️

Good news! Your query finishes rather quickly!

Bad news! It takes a long time to get do other stuff.

Given the long elapsed time here when compared to CPU, you can bet that the query was either blocked, or receiving the results took a long time.

If you run the same query and dump it into a #temp table so that you're not returning results to SSMS, that should help determine if the bottleneck is server-side or client-side.

WITH 
    active_ids AS 
(
    SELECT DISTINCT 
        e.AREA_ID
    FROM [EVENTS] AS e
    WHERE END_DATE_TIME > GETUTCDATE()
)
SELECT
    a.*
INTO #active_ids
FROM active_ids AS a;

SELECT 
    a.geom 
INTO #Area_Geom
FROM Area_Geom AS a
WHERE area_id IN (SELECT ai.area_id FROM #active_ids AS ai);

SELECT
    ag.*
FROM #Area_Geom AS ag;

Code Snippets

WITH 
    active_ids AS 
(
    SELECT DISTINCT 
        e.AREA_ID
    FROM [EVENTS] AS e
    WHERE END_DATE_TIME > GETUTCDATE()
)
SELECT
    a.*
INTO #active_ids
FROM active_ids AS a;

SELECT 
    a.geom 
INTO #Area_Geom
FROM Area_Geom AS a
WHERE area_id IN (SELECT ai.area_id FROM #active_ids AS ai);

SELECT
    ag.*
FROM #Area_Geom AS ag;

Context

StackExchange Database Administrators Q#323331, answer score: 6

Revisions (0)

No revisions yet.