patternsqlMinor
TSQL Query to match different lengths of strings with each other
Viewed 0 times
tsqleachwithquerymatchdifferentlengthsotherstrings
Problem
I'm writing a TVF to query a large table (tens of millions of rows) where a string (Postcode) in that table matches a string (a partial postcode (outcode/sector portion)) from another table.
I'm hitting an edge case that I can't resolve.
For those unfamiliar with UK Postcodes
Here's some sample data.
And my current query
These are my results.
The data on rows 1 and 4 shouldn't be in the results.
Row 1, The Coverage or outcode "E1" isn't the same as the outcode portion ("E12") of the Postcode "E12 5HH".
The same for Row 4, The Coverage or outcode "SW1" isn't the same as the outcode ("SW1X") portion of the Postcode "SW1X 6AA".
I'm hitting an edge case that I can't resolve.
For those unfamiliar with UK Postcodes
Here's some sample data.
DECLARE @tab1 TABLE (Sk INT, Postcode VARCHAR(8))
DECLARE @tab2 TABLE (Sk INT, Coverage VARCHAR(8))
INSERT INTO @tab1 (Sk, Postcode) VALUES (1, 'E12 5HH'), (6, 'SW1X 6AA')
INSERT INTO @tab2 (Sk, Coverage) VALUES (1, 'E12'), (1, 'E12 5'),
(2, 'E1'), (2, 'E11'), (2, 'E13'),
(3, 'E12 6'),
(4, 'E12 5') ,
(5, 'E12') ,
(7, 'SW1') ,
(8, 'SW1X')And my current query
SELECT S.Sk,
S.Postcode,
CoverageSk = X.Sk,
X.Coverage
FROM @tab1 S
OUTER APPLY (
SELECT Sk ,
Coverage ,
[Length] = LEN(Coverage)
FROM @tab2
) X
WHERE S.Sk <> X.Sk
AND LEFT(S.Postcode,X.[Length] ) = X.CoverageThese are my results.
The data on rows 1 and 4 shouldn't be in the results.
Row 1, The Coverage or outcode "E1" isn't the same as the outcode portion ("E12") of the Postcode "E12 5HH".
The same for Row 4, The Coverage or outcode "SW1" isn't the same as the outcode ("SW1X") portion of the Postcode "SW1X 6AA".
Solution
Ideally, the source tables would have the component parts broken out so you could match them directly.
If that's not possible, this works for the sample data:
Sk
Postcode
CoverageSk
Coverage
1
E12 5HH
4
E12 5
1
E12 5HH
5
E12
6
SW1X 6AA
8
SW1X
SQL Server can use an index on Postcode via a dynamic seek:
If that's not possible, this works for the sample data:
SELECT
T1.Sk,
T1.Postcode,
CoverageSk = T2.Sk,
T2.Coverage
FROM @tab1 AS T1
JOIN @tab2 AS T2
ON T2.Sk <> T1.Sk
AND T1.Postcode LIKE
CASE
WHEN CHARINDEX(SPACE(1), T2.Coverage) > 0
THEN T2.Coverage
ELSE T2.Coverage + SPACE(1)
END + '%';Sk
Postcode
CoverageSk
Coverage
1
E12 5HH
4
E12 5
1
E12 5HH
5
E12
6
SW1X 6AA
8
SW1X
SQL Server can use an index on Postcode via a dynamic seek:
Code Snippets
SELECT
T1.Sk,
T1.Postcode,
CoverageSk = T2.Sk,
T2.Coverage
FROM @tab1 AS T1
JOIN @tab2 AS T2
ON T2.Sk <> T1.Sk
AND T1.Postcode LIKE
CASE
WHEN CHARINDEX(SPACE(1), T2.Coverage) > 0
THEN T2.Coverage
ELSE T2.Coverage + SPACE(1)
END + '%';Context
StackExchange Database Administrators Q#306934, answer score: 5
Revisions (0)
No revisions yet.