patternsqlMinor
patindex parser
Viewed 0 times
parserpatindexstackoverflow
Problem
I am trying to parse a string column matching it against several patterns. I have code that does not produce the desired results.
This is a sample table:
And this is my query:
It is looking for an
This is a sample table:
-- demo data
IF OBJECT_ID('dbo.Input', 'U') IS NOT NULL DROP TABLE dbo.Input
GO
CREATE TABLE [dbo].[Input]
(
[PromotionCode] [char](10) NOT NULL,
[Description] [char](30) NULL
CONSTRAINT [PK_Input] PRIMARY KEY CLUSTERED (PromotionCode)
)
GO
INSERT INTO dbo.Input (PromotionCode, Description) Values ('1709106-01', 'Good Gums 10% /M100/'), ('1708162-01', 'PURE Santevia 6% /M33/E33/R33/'), ( '1709164-01', 'Sovereign 10% /M50/E50/'),( '6031709-04', 'VS JF Powder 240g /M7/R3/E10/')And this is my query:
-- parse description using patindex
select PromotionCode, [Description],
--patindex('%/M[0-9][0-9]/%', [Description]) as 'WWW',
--substring([Description], patindex( '%/M[0-9][0-9]/%', [Description]) + 2, 2) as 'XXX',
--parse(substring([Description], patindex( '%/M[0-9][0-9]/%', [Description]) + 2, 2) as float using 'en-US') as 'ZZZ',
CASE
WHEN patindex('%/M[0-9][0-9][0-9]/%', [Description]) = 0 then 0
WHEN patindex('%/M[0-9][0-9][0-9]/%', [Description]) > 0 then parse(substring([Description], patindex( '%/M[0-9][0-9][0-9]/%', [Description]) + 2, 3) as float using 'en-US')
ELSE (
CASE
WHEN patindex('%/M[0-9][0-9]/%', [Description]) = 0 then 0
WHEN patindex('%/M[0-9][0-9]/%', [Description]) > 0 then parse(substring([Description], patindex( '%/M[0-9][0-9]/%', [Description]) + 2, 2) as float using 'en-US')
ELSE (
CASE
WHEN patindex('%/M[0-9]/%', [Description]) = 0 then 0
WHEN patindex('%/M[0-9]/%', [Description]) > 0 then parse(substring([Description], patindex( '%/M[0-9]/%', [Description]) + 2, 1) as float using 'en-US')
ELSE NULL
END)
END)
END [MCB]
from dbo.InputIt is looking for an
Mxxx token where xxx is a number, and is supposed to extract and returSolution
Your code does not work because of how the conditions are arranged in your nested CASEs. Take, for example, the first
and, consequently produce the result of 0, even though it also matches this condition further down:
The easiest fix is to remove all the branches that check for equality to 0, because they are just superfluous:
The nesting, however, is superfluous too. If one pattern is not matched, SQL Server will continue testing the next one, until it reaches the ELSE clause. Therefore, your CASE expression could look simply like this:
It is also possible to avoid having the nearly duplicate code in the CASE's multiple branches and have a single expression instead. This is one way of doing that:
The values
In case the description does not have an
Description value in your output, PURE Santevia 6% /M33/E33/R33/. When passed through your set of conditions, it will match the very first one:WHEN patindex('%/M[0-9][0-9][0-9]/%', [Description]) = 0and, consequently produce the result of 0, even though it also matches this condition further down:
WHEN patindex('%/M[0-9][0-9]/%', [Description]) > 0The easiest fix is to remove all the branches that check for equality to 0, because they are just superfluous:
CASE
WHEN patindex('%/M[0-9][0-9][0-9]/%', [Description]) > 0 then parse(substring([Description], patindex( '%/M[0-9][0-9][0-9]/%', [Description]) + 2, 3) as float using 'en-US')
ELSE (
CASE
WHEN patindex('%/M[0-9][0-9]/%', [Description]) > 0 then parse(substring([Description], patindex( '%/M[0-9][0-9]/%', [Description]) + 2, 2) as float using 'en-US')
ELSE (
CASE
WHEN patindex('%/M[0-9]/%', [Description]) > 0 then parse(substring([Description], patindex( '%/M[0-9]/%', [Description]) + 2, 1) as float using 'en-US')
ELSE NULL
END)
END)
END [MCB]The nesting, however, is superfluous too. If one pattern is not matched, SQL Server will continue testing the next one, until it reaches the ELSE clause. Therefore, your CASE expression could look simply like this:
CASE
WHEN patindex('%/M[0-9][0-9][0-9]/%', [Description]) > 0 then parse(substring([Description], patindex( '%/M[0-9][0-9][0-9]/%', [Description]) + 2, 3) as float using 'en-US')
WHEN patindex('%/M[0-9][0-9]/%', [Description]) > 0 then parse(substring([Description], patindex( '%/M[0-9][0-9]/%', [Description]) + 2, 2) as float using 'en-US')
WHEN patindex('%/M[0-9]/%', [Description]) > 0 then parse(substring([Description], patindex( '%/M[0-9]/%', [Description]) + 2, 1) as float using 'en-US')
ELSE NULL -- or ELSE 0, depending on the desired result when there is no match
END [MCB]It is also possible to avoid having the nearly duplicate code in the CASE's multiple branches and have a single expression instead. This is one way of doing that:
SELECT
PromotionCode,
Description,
MCB = CASE
WHEN Description LIKE '%/M[0-9][0-9][0-9]/%'
OR Description LIKE '%/M[0-9][0-9]/%'
OR Description LIKE '%/M[0-9]/%'
THEN SUBSTRING(Description, p1.Pos, p2.Pos - p1.Pos)
ELSE NULL -- or ELSE 0
END
FROM
dbo.Input
CROSS APPLY (SELECT PATINDEX('%/M[0-9]%/%', Description) + 2) AS p1 (Pos)
CROSS APPLY (SELECT CHARINDEX('/', Description, p1.Pos)) AS p2 (Pos)
;The values
p1.Pos and p2.Pos are calculated for every row but used only when the description matches any of the patterns specified. The p1.Pos is the position two characters after the position of the M in the Mxxx token, and p2.Pos is the first slash encountered starting from p1.Pos – in other words, the slash immediately after the token.In case the description does not have an
Mxxx token, the values of p1.Pos and p2.Pos will make no sense. However, as already explained, they will not be used in that case but the ELSE clause will work instead.Code Snippets
WHEN patindex('%/M[0-9][0-9][0-9]/%', [Description]) = 0WHEN patindex('%/M[0-9][0-9]/%', [Description]) > 0CASE
WHEN patindex('%/M[0-9][0-9][0-9]/%', [Description]) > 0 then parse(substring([Description], patindex( '%/M[0-9][0-9][0-9]/%', [Description]) + 2, 3) as float using 'en-US')
ELSE (
CASE
WHEN patindex('%/M[0-9][0-9]/%', [Description]) > 0 then parse(substring([Description], patindex( '%/M[0-9][0-9]/%', [Description]) + 2, 2) as float using 'en-US')
ELSE (
CASE
WHEN patindex('%/M[0-9]/%', [Description]) > 0 then parse(substring([Description], patindex( '%/M[0-9]/%', [Description]) + 2, 1) as float using 'en-US')
ELSE NULL
END)
END)
END [MCB]CASE
WHEN patindex('%/M[0-9][0-9][0-9]/%', [Description]) > 0 then parse(substring([Description], patindex( '%/M[0-9][0-9][0-9]/%', [Description]) + 2, 3) as float using 'en-US')
WHEN patindex('%/M[0-9][0-9]/%', [Description]) > 0 then parse(substring([Description], patindex( '%/M[0-9][0-9]/%', [Description]) + 2, 2) as float using 'en-US')
WHEN patindex('%/M[0-9]/%', [Description]) > 0 then parse(substring([Description], patindex( '%/M[0-9]/%', [Description]) + 2, 1) as float using 'en-US')
ELSE NULL -- or ELSE 0, depending on the desired result when there is no match
END [MCB]SELECT
PromotionCode,
Description,
MCB = CASE
WHEN Description LIKE '%/M[0-9][0-9][0-9]/%'
OR Description LIKE '%/M[0-9][0-9]/%'
OR Description LIKE '%/M[0-9]/%'
THEN SUBSTRING(Description, p1.Pos, p2.Pos - p1.Pos)
ELSE NULL -- or ELSE 0
END
FROM
dbo.Input
CROSS APPLY (SELECT PATINDEX('%/M[0-9]%/%', Description) + 2) AS p1 (Pos)
CROSS APPLY (SELECT CHARINDEX('/', Description, p1.Pos)) AS p2 (Pos)
;Context
StackExchange Database Administrators Q#188271, answer score: 4
Revisions (0)
No revisions yet.