HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlModerate

9 digit zip codes are reporting LEN of 12

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
reportingarezipcodeslendigit

Problem

I am attempting to resolve some difficulties in getting zip codes to display properly.

The original spreadsheet has zip codes of mixed 5 and 9 digit formats. After the import process, these 9 digit zip codes are reporting a length of 12 digits. Now when I try to add a hyphen '-' to the 9 digit zip codes I'm getting abnormal results and errors because of the wrong length and various data type conversion issues.

The import was performed using an openrowset method for importing the data from a spreadsheet.

When I query the newly imported data, I see the zip codes showing the same as they were in the spreadsheet, but the length is wrong.

SELECT ZIP,
    LEN(LTRIM(RTRIM(Zip))) AS ZIPLENGTH
  FROM XLS_IMPORT


ZIP ZIPLENGTH
45750 5
432013256 12
441153221 12
44120 5
441351362 12


If I select the left 9 characters of the data, everything gets converted to float and the zip codes are now unreadable.

SELECT LEFT(ZIP,9) FROM XLS_IMPORT
WHERE LEN(LTRIM(RTRIM(ZIP))) = 12


ZIP
4.32013e+
4.42034e+
4.56637e+
4.41153e+
4.36045e+
4.41133e+


How can I get these zip codes back to the correct 9 digits? Or how can I add a hyphen to those 9 digit zip codes which are reporting length of 12? My end goal is to simply get the 9 digit zip code to have a hyphen in the middle.

The datatype of the Zip column is float.

I just discovered that some of my spreadsheets like NJ and NY have an apostrophe before the leading 0 in the zip code. I will need to investigate how to handle the '0xxxx zip codes to get this to work on some of my spreadsheet imports.

Solution

Zip codes are strings, not numbers. Some of them have 1 or even 2 (but not more than 2) leading zeroes. The datatype in the import table should be VARCHAR(10) so that it can hold 5 digit and 9 digit + hyphen zip codes. Even if you never have to store postal codes of other countries, and even if the values will only ever have numeric digits (i.e. 0 - 9), this data is still string data, just like phone numbers.

Based on the import query shown in your other question (automate import and export of process EXCEL -> SQL SERVER - > EXCEL without using SSIS ):

SELECT * INTO XLS_IMPORT
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
'Excel 12.0; Database=C:\RSG_ETL_Tool\Ohio\OH.xls; HDR=YES; IMEX=1',
'SELECT * FROM [OH$]');


I would suggest not relying upon the SELECT INTO construct to create the XLS_IMPORT table, but instead create the import table(s) manually, and then use the INSERT INTO ... SELECT FROM OPENROWSET() construct. Doing this would allow you to do the following things to improve this situation:

  • Create the ZipCode field as VARCHAR(10)



  • Convert the values on the way in using the STR function which, for an initial FLOAT value of 432013256, will return 432013256 instead of 4.32013e+008 (which is what you get when you convert to VARCHAR).



-
Fix any missing leading zeroes using something like the following:

CASE
   WHEN LEN(LTRIM(STR(@ZipColumn))) BETWEEN 3 AND 4
          THEN RIGHT('0000' + LTRIM(STR(@ZipColumn)), 5)
   WHEN LEN(LTRIM(STR(@ZipColumn))) BETWEEN 7 AND 8
          THEN RIGHT('0000' + LTRIM(STR(@ZipColumn)), 9)
   WHEN LEN(LTRIM(STR(@ZipColumn))) IN (5, 9) THEN LTRIM(STR(@ZipColumn))
   ELSE 'BadZipCode'
END


Example:

DECLARE @ZipColumn FLOAT = 032013256.000000;

SELECT CASE
          WHEN LEN(LTRIM(STR(@ZipColumn))) BETWEEN 3 AND 4
                 THEN RIGHT('0000' + LTRIM(STR(@ZipColumn)), 5)
          WHEN LEN(LTRIM(STR(@ZipColumn))) BETWEEN 7 AND 8
                 THEN RIGHT('0000' + LTRIM(STR(@ZipColumn)), 9)
          WHEN LEN(LTRIM(STR(@ZipColumn))) IN (5, 9) THEN LTRIM(STR(@ZipColumn))
          ELSE 'BadZipCode'
       END;


Returns:

032013256


Ideally, you would fix the column definition in the spreadsheet to be a string. But even if you do that, it might still be a good idea to keep this code around.


My end goal is to simply get the 9 digit zip code to have a hyphen in the middle.

With that goal in mind, the following inline TVF can be used to both convert the FLOAT value to VARCHAR, and to add in the hyphen for ZIP + 4 values.

Code for iTVF:

CREATE FUNCTION dbo.FormatZIPCode(@NumericZIPCode FLOAT)
RETURNS TABLE
WITH SCHEMABINDING
AS RETURN

WITH string AS
(
    SELECT  LTRIM(STR(@NumericZIPCode)) AS [Value],
            LEN(LTRIM(STR(@NumericZIPCode))) AS [Size]
), converted AS
(
SELECT  CASE
                WHEN st.[Value] IS NULL THEN NULL
                WHEN st.[Size] BETWEEN 3 AND 4
                    THEN RIGHT('0000' + st.[Value], 5)
                WHEN st.[Size] BETWEEN 7 AND 8
                    THEN RIGHT('0000' + st.[Value], 9)
                WHEN st.[Size] IN (5, 9)
                    THEN st.[Value]
                ELSE 'BadZipCode'
            END AS [ZIP],
            st.[Size] AS [OriginalSize]
    FROM        string st
)
SELECT  IIF(cnv.[OriginalSize] >= 7, STUFF(cnv.[ZIP], 6, 0, '-'), cnv.[ZIP])
               AS [FormattedZIPCode]
FROM        converted cnv;


Test:

SELECT  *
FROM    (VALUES (CONVERT(FLOAT, NULL)), (1), (12), (123), (1234), (12345),
                (123456), (1234567), (12345678), (123456789)) src(val)
CROSS APPLY dbo.FormatZIPCode(src.[val]) frmt;


Returns:

val FormattedZIPCode
--------- ----------------
NULL NULL
1 BadZipCode
12 BadZipCode
123 00123
1234 01234
12345 12345
123456 BadZipCode
1234567 00123-4567
12345678 01234-5678
123456789 12345-6789


In order to be clearer about what is being suggested, the following shows all of the suggestions noted above put together:

CREATE TABLE dbo.XLS_IMPORT
(
  Col1     DataTypeForCol1,
  Col2     DataTypeForCol2,
  ZIPCode  VARCHAR(10),
  ...
);

INSERT INTO dbo.XLS_IMPORT (Col1, Col2, ZIPCode, ...)
  SELECT xls.Col1, xls.Col2, zip.[FormattedZIPCode], ...
  FROM   OPENROWSET('Microsoft.ACE.OLEDB.12.0',
         'Excel 12.0; Database=C:\RSG_ETL_Tool\Ohio\OH.xls; HDR=YES; IMEX=1',
         'SELECT * FROM [OH$]')
  CROSS APPLY dbo.FormatZIPCode(xls.[ZIP]) zip;


Some of the spreadsheets are "aware" of there being leading zeroes, and so prefixed the field in Excel with a single apostrophe so that Excel would treat the value as a string instead of as numeric (e.g. '01234). In that case, you can use the REPLACE function to strip out that apostrophe.

```
-- Test incoming string data (potentially prefixed with a single apostrophe)
SELECT src.[val], frmt.[FormattedZIPCode],CHARINDEX(N'''', src.[val])
FROM (VALUES (NULL), (N'''01234

Code Snippets

SELECT * INTO XLS_IMPORT
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
'Excel 12.0; Database=C:\RSG_ETL_Tool\Ohio\OH.xls; HDR=YES; IMEX=1',
'SELECT * FROM [OH$]');
CASE
   WHEN LEN(LTRIM(STR(@ZipColumn))) BETWEEN 3 AND 4
          THEN RIGHT('0000' + LTRIM(STR(@ZipColumn)), 5)
   WHEN LEN(LTRIM(STR(@ZipColumn))) BETWEEN 7 AND 8
          THEN RIGHT('0000' + LTRIM(STR(@ZipColumn)), 9)
   WHEN LEN(LTRIM(STR(@ZipColumn))) IN (5, 9) THEN LTRIM(STR(@ZipColumn))
   ELSE 'BadZipCode'
END
DECLARE @ZipColumn FLOAT = 032013256.000000;

SELECT CASE
          WHEN LEN(LTRIM(STR(@ZipColumn))) BETWEEN 3 AND 4
                 THEN RIGHT('0000' + LTRIM(STR(@ZipColumn)), 5)
          WHEN LEN(LTRIM(STR(@ZipColumn))) BETWEEN 7 AND 8
                 THEN RIGHT('0000' + LTRIM(STR(@ZipColumn)), 9)
          WHEN LEN(LTRIM(STR(@ZipColumn))) IN (5, 9) THEN LTRIM(STR(@ZipColumn))
          ELSE 'BadZipCode'
       END;
CREATE FUNCTION dbo.FormatZIPCode(@NumericZIPCode FLOAT)
RETURNS TABLE
WITH SCHEMABINDING
AS RETURN

WITH string AS
(
    SELECT  LTRIM(STR(@NumericZIPCode)) AS [Value],
            LEN(LTRIM(STR(@NumericZIPCode))) AS [Size]
), converted AS
(
SELECT  CASE
                WHEN st.[Value] IS NULL THEN NULL
                WHEN st.[Size] BETWEEN 3 AND 4
                    THEN RIGHT('0000' + st.[Value], 5)
                WHEN st.[Size] BETWEEN 7 AND 8
                    THEN RIGHT('0000' + st.[Value], 9)
                WHEN st.[Size] IN (5, 9)
                    THEN st.[Value]
                ELSE 'BadZipCode'
            END AS [ZIP],
            st.[Size] AS [OriginalSize]
    FROM        string st
)
SELECT  IIF(cnv.[OriginalSize] >= 7, STUFF(cnv.[ZIP], 6, 0, '-'), cnv.[ZIP])
               AS [FormattedZIPCode]
FROM        converted cnv;
SELECT  *
FROM    (VALUES (CONVERT(FLOAT, NULL)), (1), (12), (123), (1234), (12345),
                (123456), (1234567), (12345678), (123456789)) src(val)
CROSS APPLY dbo.FormatZIPCode(src.[val]) frmt;

Context

StackExchange Database Administrators Q#138357, answer score: 10

Revisions (0)

No revisions yet.