HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

SQL Server 2019: Memory performance with graph queries (possible memory leak)

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
graphsqlwith2019possibleperformancememoryleakserverqueries

Problem

I'm currently working on implementing an upgrade to SQL Server 2019 in order to make use of the graph features that are available in it. Our database stores records of files and their children, and the graph features allow us to quickly find all of a file's relations in either direction. Our current dev environment is using SQL Server 2019 Standard (15.0.4023.6) on a Linux server.

I'm noticing a concerning problem when I run my graph queries. The server's 'internal' resource pool appears to not free up all resources after a graph query. Left unchecked, this fills up the resource pool. Larger queries will fail until the SQL Server process to be restarted. Depending on server load, this could happen in as little as 1-2 hours. This can also fill up the tempdb and threaten to fill the storage drive. The files for the tempdb also cannot be shrunk/truncated significantly until the server is restarted. In configuration, 'memory.memorylimitmb' is not set, so this problem happens when the resource pool starts to have used the better part of the default 80% of system memory (12.8 GB, with 16GB of system memory)

To set up the tables within a demo database:

CREATE TABLE FileNode (ID BIGINT NOT NULL CONSTRAINT PK_FileNode PRIMARY KEY) AS NODE

GO

CREATE TABLE FileNodeArchiveEdge AS EDGE

GO

CREATE INDEX [IX_FileNodeArchiveEdge_ChildFile] ON [dbo].[FileNodeArchiveEdge] ($from_id)

GO

CREATE INDEX [IX_FileNodeArchiveEdge_ParentFile] ON [dbo].[FileNodeArchiveEdge] ($to_id)

GO


To populate the demo database tables:

`INSERT INTO [FileNode] (ID) VALUES
(1),(2),(3),(4),(5),
(6),(7),(8),(9),(10),
(11),(12),(13),(14),(15)

-- Convenient intermediate table
DECLARE @bridge TABLE (f BIGINT, t BIGINT)
INSERT INTO @bridge (f, t) VALUES
(1,4),
(4,9),
(4,10),
(1,5),
(5,11),
(11,12),
(2,5),
(2,6),
(6,13),
(6,14),
(13,15),
(14,15),
(15,12),
(7,14),
(3,7),
(3,8)

INSERT INTO FileNodeArchi

Solution

I am a Program Manager with the SQL team at Microsoft, overseeing the SQL Graph functionality. Thank you for the details, and sincere apologies for the issue. This issue has been escalated to our engineering team and is being investigated on a priority basis. We will keep you posted on the outcome of the investigation.

UPDATED August 4th, 2021: This issue has been fixed in SQL Server 2019 CU12.

P.S. in the meantime,As a reference, here are some tips on how you can also do some further self-investigation (if you'd like) on what's happening under the hood. Firstly, any memory allocation in SQL Server can generally be traced in detail down to a call stack level, using the Extended Events (XE) mechanism. The first step is to define an XE session, filtering on the specific memory clerk name, which in this case is USERSTORE_SCHEMAMGR:
CREATE EVENT SESSION [TraceMemObj] ON SERVER
ADD EVENT sqlos.page_allocated(
ACTION(package0.callstack)
WHERE (memory_clerk_name = 'USERSTORE_SCHEMAMGR')
)
,
ADD EVENT sqlos.page_freed(
ACTION(package0.callstack)
WHERE (memory_clerk_name = 'USERSTORE_SCHEMAMGR')
)
ADD TARGET package0.histogram
(SET source_type=1,
source=N'package0.callstack')
WITH (MAX_MEMORY=32768 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
MAX_DISPATCH_LATENCY=5 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=PER_CPU,TRACK_CAUSALITY=OFF,STARTUP_STATE=OFF)


Start the Extended Events session:
ALTER EVENT SESSION [TraceMemObj] on server state = start


Then, wait for problem to repro, and then run the below query. Click on the XML output from the query.
SELECT event_session_address,
target_name,
execution_count,
CAST (target_data AS XML) AS MemObjData
FROM sys.dm_xe_session_targets AS xst
INNER JOIN
sys.dm_xe_sessions AS xs
ON (xst.event_session_address = xs.address)
WHERE xs.name = 'TraceMemObj';


Using SQLCallStackResolver the above raw call stacks can be resolved to a more readable version. The readable call stacks will give you an idea where the majority of the allocations are originating from.

Be sure to stop the XE session after a minute or two of the problem reproducing:
ALTER EVENT SESSION [TraceMemObj] ON SERVER
STATE = STOP;


In this case, for the reproduction scenario you have shared, you will notice some functions on the top allocation related call stacks, which seem to indicate a correlation to SQL Graph query execution.

Context

StackExchange Database Administrators Q#262219, answer score: 6

Revisions (0)

No revisions yet.