snippetsqlMinor
How to select records from two tables transactionally
Viewed 0 times
tablesrecordstransactionallytwohowselectfrom
Problem
I have table
They are designed to be separate, so please no advice on merging them.
Now the logic is to find all messages that are not sent yet, and send them. Basically a simple
But this database is used by many concurrent threads, therefore it's possible for a message to be taken by more than one thread, hence be sent twice or even more.
If they were a single table, I could use
What options do I have though to make selection transactional, so that a message won't be sent twice?
Messages that has fields, and another table called SentMessages, which has a one-to-one relationship to the Messages table.Messages stores information about the message, including its text, and subject, and the recipient, etc.create table Messages
(
Id bigint not null identity(1, 1) primary key,
Text nvarchar(max),
Subject nvarchar(400)
-- other fields
)SentMessages stores information related to the transportation of the message to the clients.create table SentMessages
(
Id bigint not null primary key,
SentOn datetime not null,
DeliveredOn datetime null,
-- other fields
)
go
alter table SentMessages with check add constrait [FK_SentMessages_Messages]
foreign key([Id])
references Messages ([Id])They are designed to be separate, so please no advice on merging them.
Now the logic is to find all messages that are not sent yet, and send them. Basically a simple
not in clause in a single-threaded scenario would do the trick:select *
from Messages
where Id not in
(
select Id
from SentMessages
)But this database is used by many concurrent threads, therefore it's possible for a message to be taken by more than one thread, hence be sent twice or even more.
If they were a single table, I could use
update select statement to select unsent messages in a single transaction.What options do I have though to make selection transactional, so that a message won't be sent twice?
Solution
You can't have the cake and leave it whole; on one hand, you want to allow maximal concurrency, and on the other hand, you want each process to be 'isolated' and not visible to other processes.
SQL Server offers Service Broker exactly for these queuing / asynchronous processing challenges. Check it out, it is an awesome infrastructure that will most likely work better than any custom user application solution.
Before we had Service Broker, what we used to do for similar challenges was create a 3rd table, call it "Email Send Queue" or something like that.
Let it hold just the IDs of the messages that need to be sent, so every time you enter it into 'messages', also enter the ID to the queue.
When a process needs to pick a message to send, you open an explicit transaction in the REPEATABLE READ isolation level, and hold it open until the sending process completes.
Every new process that needs to pick a job, uses the
This way, the reading process will simply skip locked rows, and take the next one that is not yet locked, i.e. - not yet being processed.
Even if you were able (and you can BTW...) hold explicit locks on the original tables, without the READPAST trick, it would be a recipe for blocking hell.
Here is a sample, simplified, untested code to do the trick.
SQL Server offers Service Broker exactly for these queuing / asynchronous processing challenges. Check it out, it is an awesome infrastructure that will most likely work better than any custom user application solution.
Before we had Service Broker, what we used to do for similar challenges was create a 3rd table, call it "Email Send Queue" or something like that.
Let it hold just the IDs of the messages that need to be sent, so every time you enter it into 'messages', also enter the ID to the queue.
When a process needs to pick a message to send, you open an explicit transaction in the REPEATABLE READ isolation level, and hold it open until the sending process completes.
Every new process that needs to pick a job, uses the
SELECT TOP 1 ID FROM EmailSendQueue WITH (READPAST) ORDER BY ID ASCThis way, the reading process will simply skip locked rows, and take the next one that is not yet locked, i.e. - not yet being processed.
Even if you were able (and you can BTW...) hold explicit locks on the original tables, without the READPAST trick, it would be a recipe for blocking hell.
Here is a sample, simplified, untested code to do the trick.
CREATE TABLE [EmailSendQueue]
(
[MessageID] BIGINT NOT NULL
-- Better use a heap here to prevent page level resrouce contention
PRIMARY KEY NONCLUSTERED
REFERENCES [Messages] (ID)
-- In case someone deletes a message that has not been sent yet,
-- and to prevent deleting a message which is in process
ON DELETE CASCADE
ON UPDATE NO ACTION
);
-- New messages into queue when ready to send - allows delayed send logic as well...
INSERT INTO [EmailSendQueue]([MessageID])
VALUES (1), (2), (3);
-- First Process picks up #1
-- Second process will pick up #2, if #1 is still locked, or is completed...
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
DECLARE @MessageID BIGINT;
SELECT TOP (1) @MessageID = [MessageID]
FROM [EmailSendQueue] WITH (READPAST, XLOCK)
ORDER BY [MessageID] ASC;
DECLARE @StartTime DATETIME = GETDATE();
-- Process and send message
EXECUTE [SendMessageProcedure] @MessageID;
-- Send successful
IF @@Error = 0
BEGIN
-- Now the message has been sent, and can go into SentMessages
INSERT INTO [SentMessages] ([ID], [SendOn], [DeliveredOn])
VALUES (@MessageID, @StartTime, GETDATE());
DELETE FROM [EmailSendQueue] WHERE [MessageID] = @MessageID
END;
-- If send unsuccesful, you can do nothing, or introduce some kind of retry logic
-- You can add a 'Retry' column to the queue, and increment it on every attempt
-- A scheduled backgrond process can delete all messages that exceed a retry threshold
-- This will prevent 'poisoning' the queue with error messages
COMMIT TRANSACTION;Code Snippets
SELECT TOP 1 ID FROM EmailSendQueue WITH (READPAST) ORDER BY ID ASCCREATE TABLE [EmailSendQueue]
(
[MessageID] BIGINT NOT NULL
-- Better use a heap here to prevent page level resrouce contention
PRIMARY KEY NONCLUSTERED
REFERENCES [Messages] (ID)
-- In case someone deletes a message that has not been sent yet,
-- and to prevent deleting a message which is in process
ON DELETE CASCADE
ON UPDATE NO ACTION
);
-- New messages into queue when ready to send - allows delayed send logic as well...
INSERT INTO [EmailSendQueue]([MessageID])
VALUES (1), (2), (3);
-- First Process picks up #1
-- Second process will pick up #2, if #1 is still locked, or is completed...
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
DECLARE @MessageID BIGINT;
SELECT TOP (1) @MessageID = [MessageID]
FROM [EmailSendQueue] WITH (READPAST, XLOCK)
ORDER BY [MessageID] ASC;
DECLARE @StartTime DATETIME = GETDATE();
-- Process and send message
EXECUTE [SendMessageProcedure] @MessageID;
-- Send successful
IF @@Error = 0
BEGIN
-- Now the message has been sent, and can go into SentMessages
INSERT INTO [SentMessages] ([ID], [SendOn], [DeliveredOn])
VALUES (@MessageID, @StartTime, GETDATE());
DELETE FROM [EmailSendQueue] WHERE [MessageID] = @MessageID
END;
-- If send unsuccesful, you can do nothing, or introduce some kind of retry logic
-- You can add a 'Retry' column to the queue, and increment it on every attempt
-- A scheduled backgrond process can delete all messages that exceed a retry threshold
-- This will prevent 'poisoning' the queue with error messages
COMMIT TRANSACTION;Context
StackExchange Database Administrators Q#207064, answer score: 4
Revisions (0)
No revisions yet.