Is Service Broker the best choice for auditing data changes on SQL Server Express?
Problem
My project is to audit 5 to 10 existing tables in our system without prolonging transactions. Whatever method is used, it has to work on SQL Server Express 2005 to (eventually) 2016.
I've done research on Change Data Capture (CDC) and Change Tracking. Change Tracking doesn't capture the specific changes, and CDC is only available in Enterprise Edition.
Then I stumbled upon Service Broker. I was intrigued, so I started building a prototype. Service Broker is working fine, but answers received in two of my other posts lead me to believe that this might not be the right way to go: it seems too complicated for what it accomplishes. I'm still in the analysis stage and trying different things as part of that analysis.
Right now the results with Service Broker are not convincing: a bulk update of 105,000 items in a price table takes 38 seconds, while processing the queue (the auditing part) takes 17 seconds. However, the 38 seconds includes the double processing of inserting into two #temp tables that are then used to insert into TMPins and TMPdel, so I figure I can cut that roughly in half. I'm now questioning the use of Service Broker; logically, a trigger would probably take about the same amount of time just by inserting the information straight into the audit table.
To clarify, when I say bulk insert, I don't mean the BULK INSERT function; I'm talking about a big chunk of data that is inserted or updated at once. When updating 105,000 items in the price table, I want to audit the changes that happened. By "changes" I mean inserting the new values into the audit table (for an insert or update), or inserting the primary key with all other fields NULL (for records that have been deleted). So yes, the auditing can happen after the data is loaded, but I don't want to lose any audit records, and I don't want transactions to be recorded out of order.
The two other posts will help get the context of what I'm trying to do and what I've tried:
- Trigger to create a variable Tab
Solution
I am guessing that the following post is the basis of what you are currently using: Centralized Asynchronous Auditing with Service Broker.
While I really like Service Broker, I don't feel that it is the best fit to address this particular situation. The main concerns that I have with Service Broker, at least in this particular scenario, are:
- It is probably over-complicated if used mainly to separate the DML event from the audit event. If not moving the data to another system, it doesn't seem to offer much benefit since you will still need to persist the INSERTED and DELETED tables in the DML trigger.
- It is event-based, and events vary greatly in size and frequency. You could have a million 1-record operations, or one or two 1-million-record operations. And since each event is an individual DML operation, you don't have any opportunity to remove duplicate and/or negating entries across several DML operations.
My preference is for dumping the changes into a queue table, and then in a separate process that is scheduled to run every X minutes, read Y number of rows and process them.
- Create a queue table to hold the audit data for a particular table (not for each individual DML operation). The table should have:
  - the LOCK_ESCALATION option set to DISABLE (via ALTER TABLE {name} SET (LOCK_ESCALATION = DISABLE)) to avoid conflict between the logging of new data by the trigger and the removal of data by the audit processing. This option was introduced in SQL Server 2008, so it cannot be used on 2005 instances, but there is no reason not to use it on the 2008 and newer instances since it doesn't alter the functionality in either case.
  - an AuditID PK that is one of the following:
    - an INT IDENTITY starting at -2147483648
    - a BIGINT IDENTITY
    - an INT with its value coming from a SEQUENCE that is set to CYCLE
  - only the fields that you are tracking changes for, if not all of them
  - fields for both "old" and "new" values if you need to keep track of the before and after values each time you process. Depending on how you are auditing changes, if you have an archive of prior values, then you should only need the "new" values from the INSERTED table since the values in the DELETED table should already be in your prior audit data.
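  For illustration only, here is a minimal sketch of such a queue table for the price-table example. The names (dbo.PriceAudit_Queue, ItemID, PriceOld / PriceNew, AuditAction, AuditDateUTC) and column types are assumptions, not definitions from the question:

  CREATE TABLE dbo.PriceAudit_Queue
  (
      AuditID      INT IDENTITY(-2147483648, 1) NOT NULL
                       CONSTRAINT PK_PriceAudit_Queue PRIMARY KEY CLUSTERED,
      ItemID       INT            NOT NULL, -- PK of the audited row
      PriceOld     DECIMAL(19, 4) NULL,     -- "old" value (from DELETED)
      PriceNew     DECIMAL(19, 4) NULL,     -- "new" value (from INSERTED)
      AuditAction  CHAR(1)        NOT NULL, -- 'I', 'U', or 'D' (assumed convention)
      AuditDateUTC DATETIME       NOT NULL
                       CONSTRAINT DF_PriceAudit_Queue_Date DEFAULT (GETUTCDATE())
  );

  -- SQL Server 2008 and newer only; skip on the 2005 instances
  ALTER TABLE dbo.PriceAudit_Queue SET (LOCK_ESCALATION = DISABLE);

  The negative IDENTITY seed doubles the usable range of an INT, which matters for a table that constantly has rows added and then deleted.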
- Create a trigger that simply inserts into the queue table. If you need to pass in both "old" and "new" values, JOIN the INSERTED and DELETED tables here rather than trying to maintain two separate queue tables, one for each of those pseudo tables. JOINing them at this point and inserting both the "old" and "new" values into a single row is a slight performance hit, but it guarantees that each operation stays together and in chronological order (via the incrementing PK).

  If you are not tracking changes on all fields, then use either UPDATE() or COLUMNS_UPDATED() to determine if the columns being audited have indeed been updated (for UPDATE operations; these functions return true for all columns in INSERT operations). Please keep in mind that the UPDATE() and COLUMNS_UPDATED() functions do not determine if a change has been made to the value of the column(s)! They only report whether the columns were present in the SET clause of the UPDATE statement. The only way to determine if the value(s) actually changed is to JOIN the INSERTED and DELETED tables. But if you are not tracking all columns, those functions are great for exiting the trigger without doing any work when no tracked columns have changed. Meaning, at the beginning of the trigger, you would do:

  IF (NOT UPDATE(TrackedColumn1) AND NOT UPDATE(TrackedColumn2))
  BEGIN
      RETURN; -- no changes so just exit
  END;

  If you are not capturing both "old" and "new" values into the queue table, then this is the only opportunity to eliminate "updated" records where no columns actually changed. When inserting into the queue table, you just filter on WHERE IntColOld <> IntColNew OR StringFieldOld <> StringFieldNew COLLATE Latin1_General_BIN2 OR ISNULL(DateFieldOld, '1900-01-01') <> ISNULL(DateFieldNew, '1900-01-01') OR .... It is important to use a _BIN2 collation on string fields to ensure that the two values are in fact identical, and you need to use ISNULL for just the nullable fields so that they can equate when both are NULL.
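  Putting those pieces together, a rough sketch of such a trigger against a hypothetical dbo.Price table and the queue table sketched above might look like this (the early exit, the JOIN of INSERTED and DELETED, and the no-op filter are the parts described in this step; the table and column names are assumptions):

  CREATE TRIGGER dbo.trPrice_AuditQueue
  ON dbo.Price
  AFTER INSERT, UPDATE, DELETE
  AS
  BEGIN
      SET NOCOUNT ON;

      -- Exit early if this is an UPDATE that did not touch the tracked column.
      -- (UPDATE() is TRUE for all columns on INSERT and FALSE on DELETE.)
      IF (EXISTS (SELECT * FROM INSERTED) AND NOT UPDATE(Price))
      BEGIN
          RETURN;
      END;

      INSERT INTO dbo.PriceAudit_Queue (ItemID, PriceOld, PriceNew, AuditAction)
      SELECT  COALESCE(ins.ItemID, del.ItemID),
              del.Price,                               -- NULL for INSERTs
              ins.Price,                               -- NULL for DELETEs
              CASE WHEN del.ItemID IS NULL THEN 'I'
                   WHEN ins.ItemID IS NULL THEN 'D'
                   ELSE 'U'
              END
      FROM    INSERTED ins
      FULL JOIN DELETED del
              ON del.ItemID = ins.ItemID
      WHERE   ins.ItemID IS NULL   -- always log deletes
          OR  del.ItemID IS NULL   -- always log inserts
          OR  ISNULL(ins.Price, -1) <> ISNULL(del.Price, -1); -- skip no-op updates (assumes -1 never occurs)

  A string column would additionally need the COLLATE Latin1_General_BIN2 comparison mentioned above; it is omitted here only because this sketch tracks a single numeric column.
  END;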
- Create a Stored Procedure that will run for X seconds, and in that time will process sets of TOP(@BatchSize) rows, using ORDER BY AuditID ASC. You do this with a WHILE loop that checks GETDATE() against @StartTime, which was set at the beginning of the stored procedure. Depending on what processing you need to do, it is sometimes easier to create a local temporary table and either INSERT INTO #TempTable SELECT ... the working set (then you will have to DELETE those rows at the end of each loop) or DELETE FROM using OUTPUT INTO #TempTable.

  Here you can remove duplicate modifications. If you are tracking both "old" and "new" values per each row, you shou…
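  A bare-bones sketch of that processing procedure, reusing the assumed names from the sketches above (dbo.PriceAudit is a made-up destination audit table, and the batch size and time limit are arbitrary defaults):

  CREATE PROCEDURE dbo.PriceAudit_ProcessQueue
      @BatchSize  INT = 1000,
      @MaxSeconds INT = 10
  AS
  BEGIN
      SET NOCOUNT ON;

      DECLARE @StartTime DATETIME;
      SET @StartTime = GETDATE();

      CREATE TABLE #TempTable
      (
          AuditID      INT            NOT NULL PRIMARY KEY,
          ItemID       INT            NOT NULL,
          PriceOld     DECIMAL(19, 4) NULL,
          PriceNew     DECIMAL(19, 4) NULL,
          AuditAction  CHAR(1)        NOT NULL,
          AuditDateUTC DATETIME       NOT NULL
      );

      WHILE (DATEDIFF(SECOND, @StartTime, GETDATE()) < @MaxSeconds)
      BEGIN
          -- Grab the oldest @BatchSize rows and remove them from the queue
          -- in one statement via DELETE ... OUTPUT INTO.
          ;WITH batch AS
          (
              SELECT TOP (@BatchSize) *
              FROM   dbo.PriceAudit_Queue
              ORDER BY AuditID ASC
          )
          DELETE FROM batch
          OUTPUT  DELETED.AuditID, DELETED.ItemID, DELETED.PriceOld,
                  DELETED.PriceNew, DELETED.AuditAction, DELETED.AuditDateUTC
          INTO    #TempTable;

          IF (@@ROWCOUNT = 0)
          BEGIN
              BREAK; -- queue is empty, nothing left to process
          END;

          -- This is the spot to collapse duplicate / negating modifications
          -- before persisting into the real audit table.
          INSERT INTO dbo.PriceAudit (ItemID, PriceOld, PriceNew, AuditAction, AuditDateUTC)
          SELECT  ItemID, PriceOld, PriceNew, AuditAction, AuditDateUTC
          FROM    #TempTable;

          TRUNCATE TABLE #TempTable;
      END;
  END;

  The procedure would then be invoked by whatever job you schedule to run every X minutes, as described above.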
Context
StackExchange Database Administrators Q#120248, answer score: 7