HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Conversion of varchar data to datetime failing

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
conversionfailingvarchardatadatetime

Problem

We are in the process of moving data from a legacy table (all varchar fields) into a strongly typed counterpart cue cheering

As part of this effort, we are taking data from the base Entity table and dumping it into Entity_New if all the data can be correctly converted to the appropriate type. Otherwise, it goes into a copy of the existing table named Entity_Bad.

We have a rules engine that validated the data and types as it when in so in theory, the data should be clean even though it's stored in character fields. The reality is my posting here because some thing's off and I'll can't find it. The CompletionDate field in Entity is varchar(46) NULL

Environment is

productversion  productlevel   edition
10.0.4064.0     SP2            Enterprise Edition (64-bit)


My scripts demonstrating what I was doing and what is but isn't working

```
SET NOCOUNT ON

DECLARE
@startid int = 0
, @stopid int = 796833

-------------------------------------------------------------------------------
-- Check doesn't find anything wrong with CompletionDate
-------------------------------------------------------------------------------
SELECT
1
FROM
[dbo].[Entity] E
INNER JOIN
dbo.EntityBatch_New PB
ON E.FiscalYear = PB.FiscalYear
AND E.HashCode = PB.HashCode
AND E.AAKey = PB.AAKey
AND PB.ProcessResultCode IN ('A','W','M')
WHERE
PB.EntityBatchId BETWEEN @StartId AND @StopId
AND
(
-- check
(isDate(E.[CompletionDate]) = 0 AND E.[CompletionDate] IS NOT NULL)
AND (isDate(E.[CompletionDate]) = 1 AND CAST(E.[CompletionDate] AS datetime) BETWEEN '1753-01-01T00:00:00.000' AND '9999-12-31T23:59:59.997')
)

-------------------------------------------------------------------------------
-- Only row that shows as non-date is the NULL one, which is expected
-------------------------------------------------------------------------------
SELECT DISTINCT
(E.[CompletionDate] )
, is

Solution

What happens when you do this?

DECLARE @REAL_DATES TABLE
(
    CompletionDate VARCHAR(46)
);

INSERT INTO
    @REAL_DATES
    SELECT CompletionDate
    FROM dbo.Entity;

SELECT 
    CAST(RD.CompletionDate AS datetime) AS casts_fine
    FROM @REAL_DATES RD;


What I was getting at on twitter was that the conversion can be attempted by the optimizer before rows are eliminated, so you can't only consider the CompletionDate values that are returned by the join.

My first suggestion would be to use the right data type. Why are you using VARCHAR(46) to store a date? This is why you have bad data in the table and why you have to explicitly convert when you want rich data that is not a string (and should never have been a string in the first place IMHO).

My next suggestion would be to correct ALL data in that column, and put measures in place so that it won't become invalid again. For example, a check constraint that validates ISDATE(columnname) = 1.

Failing those two, next on my list would be to return the data to the client and let it convert to datetime or for display or what have you. No matter where you filter out the rows that are causing the problem, the optimizer could push that evaluation around so that the convert is attempted before the bad rows are weeded out.

And finally, you could dump the result of the query into a temp table / table variable, and perform the convert as a second step when querying that intermediate object (since you should be confident that the dates here are valid - in fact you can check first and raise an error if your join happened to return some rows with invalid dates).

Bottom line: (a) you can't make any assumptions about where in the stack a conversion attempt will be made and (b) these workarounds and hacks would not be necessary if you used the right data type.

Code Snippets

DECLARE @REAL_DATES TABLE
(
    CompletionDate VARCHAR(46)
);

INSERT INTO
    @REAL_DATES
    SELECT CompletionDate
    FROM dbo.Entity;

SELECT 
    CAST(RD.CompletionDate AS datetime) AS casts_fine
    FROM @REAL_DATES RD;

Context

StackExchange Database Administrators Q#5312, answer score: 8

Revisions (0)

No revisions yet.