Performance of an In-Memory Table is worse than a disk-based table
Problem
I have a table in SQL Server 2014 that looks like the following:
CREATE TABLE dbo.MyTable
(
[id1] [bigint] NOT NULL,
[id2] [bigint] NOT NULL,
[col1] [int] NOT NULL default(0),
[col2] [int] NOT NULL default(0)
)

with (id1, id2) being the PK. Basically, id1 is an identifier that groups a set of result rows (id2, col1, col2), each identified by id2.
I'm trying to use an in-memory table to get rid of an existing disk-based table which is my bottleneck.
- The data in the table are written -> read -> deleted once.
- Each id1 value has several (tens/hundreds of) thousands of id2.
- Data are stored in the table for a very short amount of time, e.g. 20 seconds.
The queries performed on this table are the following:
-- INSERT (can vary from 10s to 10,000s of records):
INSERT INTO MyTable
SELECT @fixedValue, id2, col1, col2 FROM AnotherTable
-- READ:
SELECT id2, col1
FROM MyTable INNER JOIN OtherTbl ON MyTable.id2 = OtherTbl.pk
WHERE id1 = @value
ORDER BY col1
-- DELETE:
DELETE FROM MyTable WHERE id1 = @value

Here's the current definition that I used for the table:
CREATE TABLE dbo.SearchItems
(
[id1] [bigint] NOT NULL,
[id2] [bigint] NOT NULL,
[col1] [int] NOT NULL default(0),
[col2] [int] NOT NULL default(0),
CONSTRAINT PK_Mem PRIMARY KEY NONCLUSTERED (id1,id2),
INDEX idx_Mem HASH (id1,id2) WITH (BUCKET_COUNT = 131072)
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY)

Unfortunately, this definition results in a degradation of performance compared to the previous disk-based table: execution times are roughly 10% higher, and in some cases up to 100% higher (double the time).
Most of all, I was expecting a big advantage in high-concurrency scenarios, given the lock-free architecture advertised by Microsoft. Instead, performance is worst exactly when several concurrent users run several queries on the table.
Questions:
- what is the correct BUCKET_COUNT to set?
- what kind of index should I use?
- why is the performance worse than with the disk-based table?
Solution
While this post won't be a complete answer due to the lacking information, it should point you in the proper direction or provide insight that you can later share with the community.
Unfortunately, this definition results in a degradation of performance compared to the previous disk-based table: execution times are roughly 10% higher, and in some cases up to 100% higher (double the time).

Most of all, I was expecting a big advantage in high-concurrency scenarios, given the lock-free architecture advertised by Microsoft. Instead, performance is worst exactly when several concurrent users run several queries on the table.
This is troubling, as it should definitely not be the case. Certain workloads are not suited to in-memory tables (SQL 2014) while other workloads lend themselves to them. In most situations there is at least a modest bump in performance just from migrating and choosing the proper indexes.
Originally I was thinking very narrowly about your questions regarding this:
Questions:
- what is the correct BUCKET_COUNT to set?
- what kind of index should I use?
- why is the performance worse than with the disk-based table?
Initially I believed there to be an issue with the actual in-memory table and its indexes not being optimal. While there are some issues with the memory-optimized hash index definition, I believe the real issue to be with the queries used.
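On the BUCKET_COUNT question specifically: the usual guidance is to size it at one to two times the number of distinct index key values (SQL Server rounds whatever you supply up to the next power of two). With tens or hundreds of thousands of id2 values per id1, the 131,072 buckets in your definition can easily be exceeded, which produces long hash chains. A rough sizing query might look like this (a sketch only; run it against a representative data load):

```sql
-- Sketch: suggest a BUCKET_COUNT of roughly 2x the distinct key count,
-- rounded up to the next power of two. Assumes the table holds a
-- representative load when this runs.
DECLARE @distinct_keys bigint;

SELECT @distinct_keys = COUNT(*)
FROM (SELECT DISTINCT id1, id2 FROM dbo.SearchItems) AS k;

SELECT POWER(CAST(2 AS bigint),
             CEILING(LOG(2.0 * @distinct_keys) / LOG(2.0))) AS suggested_bucket_count;
```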
-- INSERT (can vary from 10s to 10,000s of records):
INSERT INTO MyTable
SELECT @fixedValue, id2, col1, col2 FROM AnotherTable

This insert should be extremely fast if it only involved the in-memory table. However, it also involves a disk-based table and is subject to all of the locking and blocking associated with that. Thus, the real time waste here is on the disk-based table.
When I did a quick test with a 100,000-row insert from the disk-based table after loading the data into memory, I saw sub-second response times. However, most of your data is only kept for a very short amount of time, less than 20 seconds, which doesn't give it much time to really live in cache. Additionally, I'm unsure how large AnotherTable really is and don't know whether its values are being read off disk or not. We have to rely on you for these answers.

With the Select query:
SELECT id2, col1
FROM MyTable INNER JOIN OtherTbl ON MyTable.id2 = OtherTbl.pk
WHERE id1 = @value
ORDER BY col1

Again, we're at the mercy of the interop plus disk-based table performance. Additionally, sorts are not cheap on HASH indexes, so a nonclustered (range) index should be used. This is called out in the index guide I linked in the comments.
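As a sketch of that suggestion (index name illustrative, bucket count carried over from the question, not tested against this workload), a memory-optimized NONCLUSTERED range index keyed on (id1, col1) matches the WHERE id1 = @value ... ORDER BY col1 pattern and can return rows already sorted:

```sql
-- Sketch: same table with a range index added for the select's sort.
-- In SQL Server 2014, indexes on memory-optimized tables must be declared
-- at CREATE TABLE time, so this means recreating the table.
CREATE TABLE dbo.SearchItems
(
    [id1]  [bigint] NOT NULL,
    [id2]  [bigint] NOT NULL,
    [col1] [int]    NOT NULL DEFAULT (0),
    [col2] [int]    NOT NULL DEFAULT (0),
    CONSTRAINT PK_Mem PRIMARY KEY NONCLUSTERED HASH (id1, id2)
        WITH (BUCKET_COUNT = 131072),
    -- Range index: seeks on id1 and returns rows ordered by col1,
    -- avoiding a sort for WHERE id1 = @value ... ORDER BY col1.
    INDEX idx_id1_col1 NONCLUSTERED (id1, col1)
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY)
```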
To give some actual research-based facts, I loaded the SearchItems in-memory table with 10 million rows and AnotherTable with 100,000, as I did not know its actual size or statistics. I then executed the select query above. Additionally, I created an extended events session on wait_completed and put it into a ring buffer, which was cleaned after each run. I also ran DBCC DROPCLEANBUFFERS to simulate an environment where all of the data may not be memory resident.

The results weren't anything spectacular when looked at in a vacuum. Since the laptop I'm testing this on uses a higher-grade SSD, I artificially turned down the disk performance for the VM I am using.
The results came back with no wait info after 5 runs of the query against just the in-memory table (join removed, no sub-queries). This is pretty much as expected.
When using the original query, however, I did have waits. In this case it was PAGEIOLATCH_SH, which makes sense as the data is being read off disk. Since I am the only user on this system and didn't spend time creating a massive test environment for inserts, updates, and deletes against the joined table, I didn't expect any locking or blocking to come into effect.
In this case, once again, the significant portion of time was spent on the disk based table.
Finally, the delete query. Finding rows based on just id1 is not very efficient with a hash index. While it is true that equality predicates are what hash indexes are suited for, the bucket that a row falls into is based on all of the hashed columns. Thus, for a hash index on (id1, id2), the rows (id1 = 1, id2 = 2) and (id1 = 1, id2 = 3) will hash into different buckets, as the hash is computed across (1, 2) and (1, 3). This won't be a simple B-tree range scan, as hash indexes are not structured that way. I would therefore expect this index not to be ideal for the operation; however, I wouldn't expect it to take orders of magnitude longer, as experienced. I would be interested in seeing the wait_info on this.
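One way to see the effect of the hash index definition empirically is to inspect bucket usage and chain lengths via the memory-optimized index DMV (a sketch; long average or maximum chains indicate an undersized or poorly keyed hash index):

```sql
-- Sketch: bucket usage and chain lengths for hash indexes on dbo.SearchItems.
SELECT OBJECT_NAME(hs.object_id)  AS table_name,
       i.name                     AS index_name,
       hs.total_bucket_count,
       hs.empty_bucket_count,
       hs.avg_chain_length,
       hs.max_chain_length
FROM sys.dm_db_xtp_hash_index_stats AS hs
JOIN sys.indexes AS i
  ON i.object_id = hs.object_id AND i.index_id = hs.index_id
WHERE hs.object_id = OBJECT_ID(N'dbo.SearchItems');
```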
Context
StackExchange Database Administrators Q#132735, answer score: 7