patternsqlModerate
9 digit zip codes are reporting LEN of 12
Viewed 0 times
reportingarezipcodeslendigit
Problem
I am attempting to resolve some difficulties in getting zip codes to display properly.
The original spreadsheet has zip codes of mixed 5 and 9 digit formats. After the import process, these 9 digit zip codes are reporting a length of 12 digits. Now when I try to add a hyphen '-' to the 9 digit zip codes I'm getting abnormal results and errors because of the wrong length and various data type conversion issues.
The import was performed using an openrowset method for importing the data from a spreadsheet.
When I query the newly imported data, I see the zip codes showing the same as they were in the spreadsheet, but the length is wrong.
If I select the left 9 characters of the data, everything gets converted to float and the zip codes are now unreadable.
How can I get these zip codes back to the correct 9 digits? Or how can I add a hyphen to those 9 digit zip codes which are reporting length of 12? My end goal is to simply get the 9 digit zip code to have a hyphen in the middle.
The datatype of the
I just discovered that some of my spreadsheets like NJ and NY have an apostrophe before the leading 0 in the zip code. I will need to investigate how to handle the '0xxxx zip codes to get this to work on some of my spreadsheet imports.
The original spreadsheet has zip codes of mixed 5 and 9 digit formats. After the import process, these 9 digit zip codes are reporting a length of 12 digits. Now when I try to add a hyphen '-' to the 9 digit zip codes I'm getting abnormal results and errors because of the wrong length and various data type conversion issues.
The import was performed using an openrowset method for importing the data from a spreadsheet.
When I query the newly imported data, I see the zip codes showing the same as they were in the spreadsheet, but the length is wrong.
SELECT ZIP,
LEN(LTRIM(RTRIM(Zip))) AS ZIPLENGTH
FROM XLS_IMPORTZIP ZIPLENGTH
45750 5
432013256 12
441153221 12
44120 5
441351362 12
If I select the left 9 characters of the data, everything gets converted to float and the zip codes are now unreadable.
SELECT LEFT(ZIP,9) FROM XLS_IMPORT
WHERE LEN(LTRIM(RTRIM(ZIP))) = 12ZIP
4.32013e+
4.42034e+
4.56637e+
4.41153e+
4.36045e+
4.41133e+
How can I get these zip codes back to the correct 9 digits? Or how can I add a hyphen to those 9 digit zip codes which are reporting length of 12? My end goal is to simply get the 9 digit zip code to have a hyphen in the middle.
The datatype of the
Zip column is float.I just discovered that some of my spreadsheets like NJ and NY have an apostrophe before the leading 0 in the zip code. I will need to investigate how to handle the '0xxxx zip codes to get this to work on some of my spreadsheet imports.
Solution
Zip codes are strings, not numbers. Some of them have 1 or even 2 (but not more than 2) leading zeroes. The datatype in the import table should be
Based on the import query shown in your other question (automate import and export of process EXCEL -> SQL SERVER - > EXCEL without using SSIS ):
I would suggest not relying upon the
-
Fix any missing leading zeroes using something like the following:
Example:
Returns:
Ideally, you would fix the column definition in the spreadsheet to be a string. But even if you do that, it might still be a good idea to keep this code around.
My end goal is to simply get the 9 digit zip code to have a hyphen in the middle.
With that goal in mind, the following inline TVF can be used to both convert the
Code for iTVF:
Test:
Returns:
In order to be clearer about what is being suggested, the following shows all of the suggestions noted above put together:
Some of the spreadsheets are "aware" of there being leading zeroes, and so prefixed the field in Excel with a single apostrophe so that Excel would treat the value as a string instead of as numeric (e.g.
```
-- Test incoming string data (potentially prefixed with a single apostrophe)
SELECT src.[val], frmt.[FormattedZIPCode],CHARINDEX(N'''', src.[val])
FROM (VALUES (NULL), (N'''01234
VARCHAR(10) so that it can hold 5 digit and 9 digit + hyphen zip codes. Even if you never have to store postal codes of other countries, and even if the values will only ever have numeric digits (i.e. 0 - 9), this data is still string data, just like phone numbers.Based on the import query shown in your other question (automate import and export of process EXCEL -> SQL SERVER - > EXCEL without using SSIS ):
SELECT * INTO XLS_IMPORT
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
'Excel 12.0; Database=C:\RSG_ETL_Tool\Ohio\OH.xls; HDR=YES; IMEX=1',
'SELECT * FROM [OH$]');I would suggest not relying upon the
SELECT INTO construct to create the XLS_IMPORT table, but instead create the import table(s) manually, and then use the INSERT INTO ... SELECT FROM OPENROWSET() construct. Doing this would allow you to do the following things to improve this situation:- Create the
ZipCodefield asVARCHAR(10)
- Convert the values on the way in using the STR function which, for an initial
FLOATvalue of432013256, will return432013256instead of4.32013e+008(which is what you get when you convert toVARCHAR).
-
Fix any missing leading zeroes using something like the following:
CASE
WHEN LEN(LTRIM(STR(@ZipColumn))) BETWEEN 3 AND 4
THEN RIGHT('0000' + LTRIM(STR(@ZipColumn)), 5)
WHEN LEN(LTRIM(STR(@ZipColumn))) BETWEEN 7 AND 8
THEN RIGHT('0000' + LTRIM(STR(@ZipColumn)), 9)
WHEN LEN(LTRIM(STR(@ZipColumn))) IN (5, 9) THEN LTRIM(STR(@ZipColumn))
ELSE 'BadZipCode'
ENDExample:
DECLARE @ZipColumn FLOAT = 032013256.000000;
SELECT CASE
WHEN LEN(LTRIM(STR(@ZipColumn))) BETWEEN 3 AND 4
THEN RIGHT('0000' + LTRIM(STR(@ZipColumn)), 5)
WHEN LEN(LTRIM(STR(@ZipColumn))) BETWEEN 7 AND 8
THEN RIGHT('0000' + LTRIM(STR(@ZipColumn)), 9)
WHEN LEN(LTRIM(STR(@ZipColumn))) IN (5, 9) THEN LTRIM(STR(@ZipColumn))
ELSE 'BadZipCode'
END;Returns:
032013256
Ideally, you would fix the column definition in the spreadsheet to be a string. But even if you do that, it might still be a good idea to keep this code around.
My end goal is to simply get the 9 digit zip code to have a hyphen in the middle.
With that goal in mind, the following inline TVF can be used to both convert the
FLOAT value to VARCHAR, and to add in the hyphen for ZIP + 4 values.Code for iTVF:
CREATE FUNCTION dbo.FormatZIPCode(@NumericZIPCode FLOAT)
RETURNS TABLE
WITH SCHEMABINDING
AS RETURN
WITH string AS
(
SELECT LTRIM(STR(@NumericZIPCode)) AS [Value],
LEN(LTRIM(STR(@NumericZIPCode))) AS [Size]
), converted AS
(
SELECT CASE
WHEN st.[Value] IS NULL THEN NULL
WHEN st.[Size] BETWEEN 3 AND 4
THEN RIGHT('0000' + st.[Value], 5)
WHEN st.[Size] BETWEEN 7 AND 8
THEN RIGHT('0000' + st.[Value], 9)
WHEN st.[Size] IN (5, 9)
THEN st.[Value]
ELSE 'BadZipCode'
END AS [ZIP],
st.[Size] AS [OriginalSize]
FROM string st
)
SELECT IIF(cnv.[OriginalSize] >= 7, STUFF(cnv.[ZIP], 6, 0, '-'), cnv.[ZIP])
AS [FormattedZIPCode]
FROM converted cnv;Test:
SELECT *
FROM (VALUES (CONVERT(FLOAT, NULL)), (1), (12), (123), (1234), (12345),
(123456), (1234567), (12345678), (123456789)) src(val)
CROSS APPLY dbo.FormatZIPCode(src.[val]) frmt;Returns:
val FormattedZIPCode
--------- ----------------
NULL NULL
1 BadZipCode
12 BadZipCode
123 00123
1234 01234
12345 12345
123456 BadZipCode
1234567 00123-4567
12345678 01234-5678
123456789 12345-6789
In order to be clearer about what is being suggested, the following shows all of the suggestions noted above put together:
CREATE TABLE dbo.XLS_IMPORT
(
Col1 DataTypeForCol1,
Col2 DataTypeForCol2,
ZIPCode VARCHAR(10),
...
);
INSERT INTO dbo.XLS_IMPORT (Col1, Col2, ZIPCode, ...)
SELECT xls.Col1, xls.Col2, zip.[FormattedZIPCode], ...
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
'Excel 12.0; Database=C:\RSG_ETL_Tool\Ohio\OH.xls; HDR=YES; IMEX=1',
'SELECT * FROM [OH$]')
CROSS APPLY dbo.FormatZIPCode(xls.[ZIP]) zip;Some of the spreadsheets are "aware" of there being leading zeroes, and so prefixed the field in Excel with a single apostrophe so that Excel would treat the value as a string instead of as numeric (e.g.
'01234). In that case, you can use the REPLACE function to strip out that apostrophe.```
-- Test incoming string data (potentially prefixed with a single apostrophe)
SELECT src.[val], frmt.[FormattedZIPCode],CHARINDEX(N'''', src.[val])
FROM (VALUES (NULL), (N'''01234
Code Snippets
SELECT * INTO XLS_IMPORT
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
'Excel 12.0; Database=C:\RSG_ETL_Tool\Ohio\OH.xls; HDR=YES; IMEX=1',
'SELECT * FROM [OH$]');CASE
WHEN LEN(LTRIM(STR(@ZipColumn))) BETWEEN 3 AND 4
THEN RIGHT('0000' + LTRIM(STR(@ZipColumn)), 5)
WHEN LEN(LTRIM(STR(@ZipColumn))) BETWEEN 7 AND 8
THEN RIGHT('0000' + LTRIM(STR(@ZipColumn)), 9)
WHEN LEN(LTRIM(STR(@ZipColumn))) IN (5, 9) THEN LTRIM(STR(@ZipColumn))
ELSE 'BadZipCode'
ENDDECLARE @ZipColumn FLOAT = 032013256.000000;
SELECT CASE
WHEN LEN(LTRIM(STR(@ZipColumn))) BETWEEN 3 AND 4
THEN RIGHT('0000' + LTRIM(STR(@ZipColumn)), 5)
WHEN LEN(LTRIM(STR(@ZipColumn))) BETWEEN 7 AND 8
THEN RIGHT('0000' + LTRIM(STR(@ZipColumn)), 9)
WHEN LEN(LTRIM(STR(@ZipColumn))) IN (5, 9) THEN LTRIM(STR(@ZipColumn))
ELSE 'BadZipCode'
END;CREATE FUNCTION dbo.FormatZIPCode(@NumericZIPCode FLOAT)
RETURNS TABLE
WITH SCHEMABINDING
AS RETURN
WITH string AS
(
SELECT LTRIM(STR(@NumericZIPCode)) AS [Value],
LEN(LTRIM(STR(@NumericZIPCode))) AS [Size]
), converted AS
(
SELECT CASE
WHEN st.[Value] IS NULL THEN NULL
WHEN st.[Size] BETWEEN 3 AND 4
THEN RIGHT('0000' + st.[Value], 5)
WHEN st.[Size] BETWEEN 7 AND 8
THEN RIGHT('0000' + st.[Value], 9)
WHEN st.[Size] IN (5, 9)
THEN st.[Value]
ELSE 'BadZipCode'
END AS [ZIP],
st.[Size] AS [OriginalSize]
FROM string st
)
SELECT IIF(cnv.[OriginalSize] >= 7, STUFF(cnv.[ZIP], 6, 0, '-'), cnv.[ZIP])
AS [FormattedZIPCode]
FROM converted cnv;SELECT *
FROM (VALUES (CONVERT(FLOAT, NULL)), (1), (12), (123), (1234), (12345),
(123456), (1234567), (12345678), (123456789)) src(val)
CROSS APPLY dbo.FormatZIPCode(src.[val]) frmt;Context
StackExchange Database Administrators Q#138357, answer score: 10
Revisions (0)
No revisions yet.