HiveBrain v1.2.0
Get Started
← Back to all entries
principleModerate

Best approach for populating date dimension table

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
dimensiondatepopulatingforapproachtablebest

Problem

I am looking to populate a date dimension table in a SQL Server 2008 database.
The fields in the table are as follows:

[DateId]                    INT IDENTITY(1,1) PRIMARY KEY
[DateTime]                  DATETIME
[Date]                      DATE
[DayOfWeek_Number]          TINYINT
[DayOfWeek_Name]            VARCHAR(9)
[DayOfWeek_ShortName]       VARCHAR(3)
[Week_Number]               TINYINT
[Fiscal_DayOfMonth]         TINYINT
[Fiscal_Month_Number]       TINYINT
[Fiscal_Month_Name]         VARCHAR(12)
[Fiscal_Month_ShortName]    VARCHAR(3)
[Fiscal_Quarter]            TINYINT     
[Fiscal_Year]               INT
[Calendar_DayOfMonth]       TINYINT
[Calendar_Month Number]     TINYINT     
[Calendar_Month_Name]       VARCHAR(9)
[Calendar_Month_ShortName]  VARCHAR(3)
[Calendar_Quarter]          TINYINT
[Calendar_Year]             INT
[IsLeapYear]                BIT
[IsWeekDay]                 BIT
[IsWeekend]                 BIT
[IsWorkday]                 BIT
[IsHoliday]                 BIT
[HolidayName]               VARCHAR(255)


I have written a function DateListInRange(D1,D2) that returns all the dates between two parameter dates D1 and D2 inclusive.

ie. parameters '2014-01-01' and '2014-01-03' would return:

2014-01-01
2014-01-02
2014-01-03


I want to populate the DATE_DIM table for all dates within a range, i.e. 2010-01-01 to 2020-01-01. Most of the fields can be populated with the SQL 2008 DATEPART, DATENAME, and YEAR functions.

The fiscal data contains slightly more logic, some of which is dependant on each other. For example:
Fiscal quarter 1 -> Fiscal month must be 1, 2 or 3
Fiscal quarter 2 -> Fiscal month must be 4, 5 or 6

I can easily write a table valued function that accepts a specific date, and then outputs all of the fiscal data, or ALL of the fields even. Then I would just need this function to run on each row of the DateListInRange function.

I am not highly concerned with speed as this will only need to be populated a few time

Solution

UPDATE: for a more generic example of creating and populating a calendar or dimension table, see this tip:

  • Creating a date dimension or calendar table in SQL Server



For the specific question at hand, here's my attempt. I will update this with the magic you use to determine things like Fiscal_MonthNumber and Fiscal_MonthName, because right now they're the only non-intuitive part of your question, and it's the only tangible information you actually didn't include.

The "best" (read:most efficient) way to populate a calendar table, IMHO, is to use a set, rather than a loop. And you can generate this set without burying logic into user-defined functions, which really don't gain you anything but encapsulation - otherwise it's just another object to maintain. I talk about this in a lot more detail in this blog series:

  • http://sqlperformance.com/2013/01/t-sql-queries/generate-a-set-1



  • http://sqlperformance.com/2013/01/t-sql-queries/generate-a-set-2



  • http://sqlperformance.com/2013/01/t-sql-queries/generate-a-set-3



If you want to keep using your function, make sure it's not a multi-statement table-valued function; that's not going to be efficient at all. You want to make sure that it is inline (e.g. has a single RETURN statement and no explicit @table declaration), has WITH SCHEMABINDING, and doesn't use recursive CTEs. Outside of a function, here is how I would do it:

CREATE TABLE dbo.DateDimension
(
  [Date]                      DATE PRIMARY KEY,
  [DayOfWeek_Number]          TINYINT,
  [DayOfWeek_Name]            VARCHAR(9),
  [DayOfWeek_ShortName]       VARCHAR(3),
  [Week_Number]               TINYINT,
  [Fiscal_DayOfMonth]         TINYINT,
  [Fiscal_Month_Number]       TINYINT,
  [Fiscal_Month_Name]         VARCHAR(12),
  [Fiscal_Month_ShortName]    VARCHAR(3),
  [Fiscal_Quarter]            TINYINT,     
  [Fiscal_Year]               SMALLINT,
  [Calendar_DayOfMonth]       TINYINT,
  [Calendar_Month Number]     TINYINT,     
  [Calendar_Month_Name]       VARCHAR(9),
  [Calendar_Month_ShortName]  VARCHAR(3),
  [Calendar_Quarter]          TINYINT,
  [Calendar_Year]             SMALLINT, 
  [IsLeapYear]                BIT,
  [IsWeekDay]                 BIT,
  [IsWeekend]                 BIT,
  [IsWorkday]                 BIT,
  [IsHoliday]                 BIT,
  [HolidayName]               VARCHAR(255)
);
-- add indexes, constraints, etc.


With the table in place, you can perform a single, set-based insert of as many years of data as you want from whatever start date you choose. Just specify the start date and the number of years. I use a "stacked CTE" technique to avoid redundancy and only perform a whole slew of calculations once; the output columns from the earlier CTEs are then subsequently used in further calculations later on.

-- these are important:
SET LANGUAGE US_ENGLISH;
SET DATEFIRST 7;

DECLARE @start DATE = '20100101', @years TINYINT = 20;

;WITH src AS
(
  -- you don't need a function for this...
  SELECT TOP (DATEDIFF(DAY, @start, DATEADD(YEAR, @years, @start)))
    d = DATEADD(DAY, ROW_NUMBER() OVER (ORDER BY s1.number)-1, @start)
   FROM master.dbo.spt_values AS s1
   CROSS JOIN master.dbo.spt_values AS s2
   -- your own numbers table works much better here, but this'll do
),
w AS 
(
  SELECT d, 
    wd      = DATEPART(WEEKDAY,d), 
    wdname  = DATENAME(WEEKDAY,d), 
    wnum    = DATEPART(ISO_WEEK,d),
    qnum    = DATEPART(QUARTER, d),
    y       = YEAR(d),
    m       = MONTH(d),
    mname   = DATENAME(MONTH,d),
    md      = DAY(d)
  FROM src
),
q AS
(
  SELECT *, 
    wdsname   = LEFT(wdname,3),
    msname    = LEFT(mname,3),
    IsWeekday = CASE WHEN wd IN (1,7) THEN 0 ELSE 1 END,
    fq1 = DATEADD(DAY,25,DATEADD(MONTH,2,DATEADD(YEAR,YEAR(d)-1900,0)))
  FROM w
),
q1 AS
(
  SELECT *, 
    -- useless, just inverse of IsWeekday, but okay:
    IsWeekend = CASE WHEN IsWeekday = 1 THEN 0 ELSE 1 END,
    fq = COALESCE(NULLIF(DATEDIFF(QUARTER,DATEADD(DAY,6,fq1),d) 
         + CASE WHEN md >= 26 AND m%3 = 0 THEN 2 ELSE 1 END,0),4)
    FROM q
)
--INSERT dbo.DimWithDateAllPersisted(Date)
SELECT 
  DateKey = d,
  DayOfWeek_Number = wd,
  DayOfWeek_Name = wdname,
  DayOfWeek_ShortName = wdsname,
  Week_Number = wnum,
  -- I'll update these four lines when I have usable info
  Fiscal_DayOfMonth      = 0,--'?magic?',
  Fiscal_Month_Number    = 0,--'?magic?',
  Fiscal_Month_Name      = 0,--'?magic?',
  Fiscal_Month_ShortName = 0,--'?magic?',
  Fiscal_Quarter = fq,
  Fiscal_Year = CASE WHEN fq = 4 AND m < 3 THEN y-1 ELSE y END,
  Calendar_DayOfMonth = md,
  Calendar_Month_Number = m,
  Calendar_Month_Name = mname,
  Calendar_Month_ShortName = msname,
  Calendar_Quarter = qnum,
  Calendar_Year = y,
  IsLeapYear = CASE 
    WHEN (y%4 = 0 AND y%100 != 0) OR (y%400 = 0) THEN 1 ELSE 0 END,
  IsWeekday,
  IsWeekend,
  IsWorkday = CASE WHEN IsWeekday = 1 THEN 1 ELSE 0 END,
  IsHoliday = 0,
  HolidayName = ''
FROM q1;


Now, you still have these "holiday" and "workday" columns lef

Code Snippets

CREATE TABLE dbo.DateDimension
(
  [Date]                      DATE PRIMARY KEY,
  [DayOfWeek_Number]          TINYINT,
  [DayOfWeek_Name]            VARCHAR(9),
  [DayOfWeek_ShortName]       VARCHAR(3),
  [Week_Number]               TINYINT,
  [Fiscal_DayOfMonth]         TINYINT,
  [Fiscal_Month_Number]       TINYINT,
  [Fiscal_Month_Name]         VARCHAR(12),
  [Fiscal_Month_ShortName]    VARCHAR(3),
  [Fiscal_Quarter]            TINYINT,     
  [Fiscal_Year]               SMALLINT,
  [Calendar_DayOfMonth]       TINYINT,
  [Calendar_Month Number]     TINYINT,     
  [Calendar_Month_Name]       VARCHAR(9),
  [Calendar_Month_ShortName]  VARCHAR(3),
  [Calendar_Quarter]          TINYINT,
  [Calendar_Year]             SMALLINT, 
  [IsLeapYear]                BIT,
  [IsWeekDay]                 BIT,
  [IsWeekend]                 BIT,
  [IsWorkday]                 BIT,
  [IsHoliday]                 BIT,
  [HolidayName]               VARCHAR(255)
);
-- add indexes, constraints, etc.
-- these are important:
SET LANGUAGE US_ENGLISH;
SET DATEFIRST 7;

DECLARE @start DATE = '20100101', @years TINYINT = 20;

;WITH src AS
(
  -- you don't need a function for this...
  SELECT TOP (DATEDIFF(DAY, @start, DATEADD(YEAR, @years, @start)))
    d = DATEADD(DAY, ROW_NUMBER() OVER (ORDER BY s1.number)-1, @start)
   FROM master.dbo.spt_values AS s1
   CROSS JOIN master.dbo.spt_values AS s2
   -- your own numbers table works much better here, but this'll do
),
w AS 
(
  SELECT d, 
    wd      = DATEPART(WEEKDAY,d), 
    wdname  = DATENAME(WEEKDAY,d), 
    wnum    = DATEPART(ISO_WEEK,d),
    qnum    = DATEPART(QUARTER, d),
    y       = YEAR(d),
    m       = MONTH(d),
    mname   = DATENAME(MONTH,d),
    md      = DAY(d)
  FROM src
),
q AS
(
  SELECT *, 
    wdsname   = LEFT(wdname,3),
    msname    = LEFT(mname,3),
    IsWeekday = CASE WHEN wd IN (1,7) THEN 0 ELSE 1 END,
    fq1 = DATEADD(DAY,25,DATEADD(MONTH,2,DATEADD(YEAR,YEAR(d)-1900,0)))
  FROM w
),
q1 AS
(
  SELECT *, 
    -- useless, just inverse of IsWeekday, but okay:
    IsWeekend = CASE WHEN IsWeekday = 1 THEN 0 ELSE 1 END,
    fq = COALESCE(NULLIF(DATEDIFF(QUARTER,DATEADD(DAY,6,fq1),d) 
         + CASE WHEN md >= 26 AND m%3 = 0 THEN 2 ELSE 1 END,0),4)
    FROM q
)
--INSERT dbo.DimWithDateAllPersisted(Date)
SELECT 
  DateKey = d,
  DayOfWeek_Number = wd,
  DayOfWeek_Name = wdname,
  DayOfWeek_ShortName = wdsname,
  Week_Number = wnum,
  -- I'll update these four lines when I have usable info
  Fiscal_DayOfMonth      = 0,--'?magic?',
  Fiscal_Month_Number    = 0,--'?magic?',
  Fiscal_Month_Name      = 0,--'?magic?',
  Fiscal_Month_ShortName = 0,--'?magic?',
  Fiscal_Quarter = fq,
  Fiscal_Year = CASE WHEN fq = 4 AND m < 3 THEN y-1 ELSE y END,
  Calendar_DayOfMonth = md,
  Calendar_Month_Number = m,
  Calendar_Month_Name = mname,
  Calendar_Month_ShortName = msname,
  Calendar_Quarter = qnum,
  Calendar_Year = y,
  IsLeapYear = CASE 
    WHEN (y%4 = 0 AND y%100 != 0) OR (y%400 = 0) THEN 1 ELSE 0 END,
  IsWeekday,
  IsWeekend,
  IsWorkday = CASE WHEN IsWeekday = 1 THEN 1 ELSE 0 END,
  IsHoliday = 0,
  HolidayName = ''
FROM q1;
UPDATE dbo.DateDimension
  SET IsWorkday = 0, IsHoliday = 1, HolidayName = 'Christmas'
  WHERE Calendar_Month_Number = 12 AND Calendar_DayOfMonth = 25;
CREATE TABLE dbo.Test
(
  [date] DATE PRIMARY KEY,
  DayOfWeek_Number AS DATEPART(WEEKDAY, [date]) PERSISTED
);
CREATE TABLE dbo.Test
(
  [date] DATE PRIMARY KEY,
  DayOfWeek_Number AS 
    COALESCE(NULLIF(DATEDIFF(DAY,'20000101',[date])%7,0),7) PERSISTED
);

Context

StackExchange Database Administrators Q#74957, answer score: 13

Revisions (0)

No revisions yet.