HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlModerate

AppDomain 2 SSISDB is Marked for Unload due to Memory Pressure

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
duessisdbpressureappdomainunloadmarkedformemory

Problem

I have a stored procedure that joins three large tables together (about 20 million records each) and loads records into a temp table. Data in the temp table is then merged into an existing table of about 60 million records.

The server went offline with an error message of:

AppDomain 2 (SSISDB.dbo.[runtime].1] is marked for unload due to memory pressure.

Once I got the server back online, I restarted SQL services to clear out any processes that may have been lingering. Kicked off the job again and it completed without issue.

I’m running SQL Server 2019 with 128GB of RAM. Maximum server memory is 117964MB on a 64-bit virtual server. Someone told me in Task Manager the memory usage was at 94% and that could be the issue. But doesn’t SQL take all of the memory available and hold on to it? So that seems like it’s operating as intended.

sp_WhoIsActive revealed some queries with a status of 'Suspended' and 'Awaiting_Command', but I don’t believe those had much impact. 128GB of ram seems adequate, but I guess it’s relative to the job it’s being asked to do. Any idea how to troubleshoot or prevent from happening again?

The data drive for the server is about 1.6TB. The two larger databases in the join are 10GB with 19 million rows and 13GB with 20 million rows. Those go into a temp table and then MERGE into a 26GB table with 53 million rows.

The requested memory grant is 45GB and the actual is 30GB. The job ran outside of regular business hours so there should have been no competing queries, but I can't 100% confirm that someone wasn't working late.

I did notice that it is having to do a CONVERT_IMPLICIT as well. Does that have a significant impact on required memory?

Link to the query plan:
https://www.brentozar.com/pastetheplan/?id=SyXaty7xK

Solution

About your server

This query spills badly in two places, despite a 29 GB memory grant. That's about the high end of what a query can ask for on your system with max server memory set to 115 GB.

The cause for a memory grant of that size is the optimizer estimating it will need to sort 46 GB of data:

You can see the memory grant percentage details by looking at Resource Governor -- a single query can ask for ~25% of your Max Server Memory setting -- and up to three queries can ask for the full grant concurrently.

That you get a grant of that size and still see spills of this magnitude and time consumption likely means that your server is far under provisioned as far as memory goes.

Since both operators execute in Batch Mode, the timing you see there is per operator.

  • Operator Times



I'm sure if you looked at wait stats for this server (given the 1.6 TB of data), PAGEIOLATCH_ would be up there, along with potentially SLEEP_TASK or IO_COMPLETION if spills of this size are common. Though both waits can be associated with other things as well, I often see them pile up in situations like yours.

I'd first suggest adding a more realistic amount of memory to your server. I don't know what that number is (and I can't tell you that here), but I'd probably aim for 512 GB or higher based on size of data and nothing else. You should also turn on lock pages in memory, if it's not already.
About your query

I can't see the full text, since all that appears in the query plan is the insert and part of the select list, but given what that looks like I can make some reasonable assumptions about overall code quality.

For example, several of the joins are on expressions, which probably indicates that you're wrapping join columns in functions like rtrim or isnull or something.

But your main problem is at the end of the query and the spills pictured above.

Let's talk about those!
Hash Join

The Hash Join is there for the DISTINCT you've thrown on your 143 column query. That gives me the creeps, and whoever wrote that should strongly consider some training if they're going to continue working with SQL Server.

I'd recommend finding a smaller combination of columns that produce unique rows, and using row_number to mark them like I show in this video:

  • DISTINCTly Bad



A pseudo-code example looks like so:

WITH 
   cte AS
(
   SELECT
       *,
        n = 
            ROW_NUMBER..
    FROM ...
)
SELECT
    c.*
FROM cte AS c
WHERE c.n = 1;


Just make sure your indexes are appropriate to support the windowing function.

  • Common Query Plan Patterns For Windowing Functions: Row vs Batch Mode



  • Common Query Plan Patterns For Windowing Functions: Row Store vs Column Store Indexes



  • When Index Sort Direction Matters



Sort

The sort is there because your #temp table has a clustered index on it.

If you create the table as a heap instead, and add the index later, you can avoid that unpleasantness when the query runs, though adding the index may not be fun depending on various local factors.

  • When Should You Index Temp Tables?



I'd also suggest adding a tablock hint to your insert to encourage a fully parallel insert if possible.

  • Making The Most Of Temp Tables Part 1: Parallel Inserts



Right now, your query goes single threaded prior to the insert, which can certainly hurt performance at this row count.

Given the overall state of the query, it may be worth exploring rewrites to break the query into smaller pieces that identify the unique set of keys you're interested in, and then getting the full set of columns you're interested in:

  • Informational vs. Relational

Code Snippets

WITH 
   cte AS
(
   SELECT
       *,
        n = 
            ROW_NUMBER..
    FROM ...
)
SELECT
    c.*
FROM cte AS c
WHERE c.n = 1;

Context

StackExchange Database Administrators Q#297929, answer score: 11

Revisions (0)

No revisions yet.