HiveBrain v1.2.0
Get Started
← Back to all entries
snippetsqlMinor

How to select records from two tables transactionally

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
tablesrecordstransactionallytwohowselectfrom

Problem

I have table Messages that has fields, and another table called SentMessages, which has a one-to-one relationship to the Messages table.

Messages stores information about the message, including its text, and subject, and the recipient, etc.

create table Messages
(
    Id bigint not null identity(1, 1) primary key,
    Text nvarchar(max),
    Subject nvarchar(400)
    -- other fields
)


SentMessages stores information related to the transportation of the message to the clients.

create table SentMessages
(
    Id bigint not null primary key,
    SentOn datetime not null,
    DeliveredOn datetime null,
    -- other fields
)
go
alter table SentMessages with check add constrait [FK_SentMessages_Messages] 
foreign key([Id])
references Messages ([Id])


They are designed to be separate, so please no advice on merging them.

Now the logic is to find all messages that are not sent yet, and send them. Basically a simple not in clause in a single-threaded scenario would do the trick:

select *
from Messages
where Id not in 
(
    select Id 
    from SentMessages
)


But this database is used by many concurrent threads, therefore it's possible for a message to be taken by more than one thread, hence be sent twice or even more.

If they were a single table, I could use update select statement to select unsent messages in a single transaction.

What options do I have though to make selection transactional, so that a message won't be sent twice?

Solution

You can't have the cake and leave it whole; on one hand, you want to allow maximal concurrency, and on the other hand, you want each process to be 'isolated' and not visible to other processes.
SQL Server offers Service Broker exactly for these queuing / asynchronous processing challenges. Check it out, it is an awesome infrastructure that will most likely work better than any custom user application solution.

Before we had Service Broker, what we used to do for similar challenges was create a 3rd table, call it "Email Send Queue" or something like that.
Let it hold just the IDs of the messages that need to be sent, so every time you enter it into 'messages', also enter the ID to the queue.

When a process needs to pick a message to send, you open an explicit transaction in the REPEATABLE READ isolation level, and hold it open until the sending process completes.
Every new process that needs to pick a job, uses the

SELECT TOP 1 ID FROM EmailSendQueue WITH (READPAST) ORDER BY ID ASC


This way, the reading process will simply skip locked rows, and take the next one that is not yet locked, i.e. - not yet being processed.
Even if you were able (and you can BTW...) hold explicit locks on the original tables, without the READPAST trick, it would be a recipe for blocking hell.
Here is a sample, simplified, untested code to do the trick.

CREATE TABLE [EmailSendQueue]
(
[MessageID] BIGINT NOT NULL 
-- Better use a heap here to prevent page level resrouce contention 
PRIMARY KEY NONCLUSTERED
REFERENCES [Messages] (ID)
-- In case someone deletes a message that has not been sent yet, 
-- and to prevent deleting a message which is in process
ON DELETE CASCADE 
ON UPDATE NO ACTION
);

-- New messages into queue when ready to send - allows delayed send logic as well...
INSERT INTO [EmailSendQueue]([MessageID]) 
VALUES (1), (2), (3);

-- First Process picks up #1 
-- Second process will pick up #2, if #1 is still locked, or is completed...
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;

DECLARE @MessageID BIGINT;

SELECT TOP (1) @MessageID = [MessageID] 
FROM [EmailSendQueue] WITH (READPAST, XLOCK)
ORDER BY [MessageID] ASC;

DECLARE @StartTime DATETIME = GETDATE();

-- Process and send message
EXECUTE [SendMessageProcedure] @MessageID; 

-- Send successful
IF @@Error = 0
BEGIN
-- Now the message has been sent, and can go into SentMessages
    INSERT INTO [SentMessages] ([ID], [SendOn], [DeliveredOn])
    VALUES (@MessageID, @StartTime, GETDATE());
    DELETE FROM [EmailSendQueue] WHERE [MessageID] = @MessageID
END;

-- If send unsuccesful, you can do nothing, or introduce some kind of retry logic
-- You can add a 'Retry' column to the queue, and increment it on every attempt 
-- A scheduled backgrond process can delete all messages that  exceed a retry threshold
-- This will prevent 'poisoning' the queue with error messages

COMMIT TRANSACTION;

Code Snippets

SELECT TOP 1 ID FROM EmailSendQueue WITH (READPAST) ORDER BY ID ASC
CREATE TABLE [EmailSendQueue]
(
[MessageID] BIGINT NOT NULL 
-- Better use a heap here to prevent page level resrouce contention 
PRIMARY KEY NONCLUSTERED
REFERENCES [Messages] (ID)
-- In case someone deletes a message that has not been sent yet, 
-- and to prevent deleting a message which is in process
ON DELETE CASCADE 
ON UPDATE NO ACTION
);

-- New messages into queue when ready to send - allows delayed send logic as well...
INSERT INTO [EmailSendQueue]([MessageID]) 
VALUES (1), (2), (3);

-- First Process picks up #1 
-- Second process will pick up #2, if #1 is still locked, or is completed...
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;

DECLARE @MessageID BIGINT;

SELECT TOP (1) @MessageID = [MessageID] 
FROM [EmailSendQueue] WITH (READPAST, XLOCK)
ORDER BY [MessageID] ASC;

DECLARE @StartTime DATETIME = GETDATE();

-- Process and send message
EXECUTE [SendMessageProcedure] @MessageID; 

-- Send successful
IF @@Error = 0
BEGIN
-- Now the message has been sent, and can go into SentMessages
    INSERT INTO [SentMessages] ([ID], [SendOn], [DeliveredOn])
    VALUES (@MessageID, @StartTime, GETDATE());
    DELETE FROM [EmailSendQueue] WHERE [MessageID] = @MessageID
END;

-- If send unsuccesful, you can do nothing, or introduce some kind of retry logic
-- You can add a 'Retry' column to the queue, and increment it on every attempt 
-- A scheduled backgrond process can delete all messages that  exceed a retry threshold
-- This will prevent 'poisoning' the queue with error messages

COMMIT TRANSACTION;

Context

StackExchange Database Administrators Q#207064, answer score: 4

Revisions (0)

No revisions yet.