patternMinor
Building a branched versioning model for relational databases
Viewed 0 times
brancheddatabasesbuildingrelationalforversioningmodel
Problem
I am database designer and at my current project I'm implementing versioning capabilities required to concurrently edit rows of data in RDBMS. The project requirements says, that data editing sessions can go on for several hours or days until performing commit. Also, conflicts are arising during simultaneous modifying of the same data by different users should be handled with possibility of manual and semi-automatic resolution. In other words, desired editing workflow is similar to one used in document-oriented version control systems, such as SVN or Git. Therefore, traditional OLTP approaches and conflict resolution strategies (MVCC, optimistic/pessimistic locks) doesn't satisfy my constraints. I have done some observation of existing tools, that offer possibilities for branched version history and multiversion workflow:
SQL:2011 doesn't solve my problem, as it offers support for "linear" history of edits, not branched I'm looking for. Solutions from ESRI and Oracle are good candidates, but I'm disappointed that both have vendor-specific interfaces for manipulating versions. It seems that at this moment nobody can offer industry standard solution for branched versioning of relational data (as SQL:2011 does for temporal tables and linear version history). As a newcoming database researcher, i want to understand:
- ArcSDE - ESRI's ArcGIS supports versioning for geodatabases through ArcSDE data layer;
- Oracle Workspace Manager - feature of Oracle Database, providing high degree of version isolation and data history management;
- SQL:2011 temporal features, including valid time and transactional time support.
SQL:2011 doesn't solve my problem, as it offers support for "linear" history of edits, not branched I'm looking for. Solutions from ESRI and Oracle are good candidates, but I'm disappointed that both have vendor-specific interfaces for manipulating versions. It seems that at this moment nobody can offer industry standard solution for branched versioning of relational data (as SQL:2011 does for temporal tables and linear version history). As a newcoming database researcher, i want to understand:
- is relational database community interested in developing standard models of branched data versioning and will any contribution or research in this area be valuable? (for example, standartization as it was done for temporal features in SQL2011 in the form of language improvements)
- do developers and database designers lack for database-independent open-s
Solution
IMO this problem lacks sufficient generality to build standards or reusable solutions. It's really just a problem you solve with data modeling,
Something along the lines of:
If the content being versioned is not stored as a blob, but is in separate tables, the pattern still holds. All the tables that store the versioned data need the VERSION_ID.
It would certainly be an interesting research project to explore all the different ways this can be done, and discuss options for conflict resolution, merging changes and sketch out what a general solution might look like.
Something along the lines of:
CREATE TABLE DOC
(
DOC_ID INT PRIMARY KEY,
CURRENT_VERSION_ID INT,
MODIFIED_AT DATETIME,
MODIFIED_BY INT,
CONSTRAINT FK_CURRENT_VERSION
FOREIGN KEY (DOC_ID, CURRENT_VERSION)
REFERENCES DOC_VERSION(DOC_ID,VERSION_ID)
)
CREATE TABLE DOC_VERSION
(
DOC_ID INT REFERENCES DOC,
VERSION_ID INT,
VERSION_OWNER INT,
SOURCE_VERSION_ID INT,
DOC_CONTENT NVARCHAR(MAX),
CONSTRAINT PK_DOC_VERSION PRIMARY KEY (DOC_ID,VERSION_ID)
)If the content being versioned is not stored as a blob, but is in separate tables, the pattern still holds. All the tables that store the versioned data need the VERSION_ID.
It would certainly be an interesting research project to explore all the different ways this can be done, and discuss options for conflict resolution, merging changes and sketch out what a general solution might look like.
Code Snippets
CREATE TABLE DOC
(
DOC_ID INT PRIMARY KEY,
CURRENT_VERSION_ID INT,
MODIFIED_AT DATETIME,
MODIFIED_BY INT,
CONSTRAINT FK_CURRENT_VERSION
FOREIGN KEY (DOC_ID, CURRENT_VERSION)
REFERENCES DOC_VERSION(DOC_ID,VERSION_ID)
)
CREATE TABLE DOC_VERSION
(
DOC_ID INT REFERENCES DOC,
VERSION_ID INT,
VERSION_OWNER INT,
SOURCE_VERSION_ID INT,
DOC_CONTENT NVARCHAR(MAX),
CONSTRAINT PK_DOC_VERSION PRIMARY KEY (DOC_ID,VERSION_ID)
)Context
StackExchange Database Administrators Q#74210, answer score: 3
Revisions (0)
No revisions yet.