HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Finding longest matching prefix

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
prefixlongestmatchingfinding

Problem

I have two tables:

Sources:

(SName)  (SId)
Apple1   100
Apple2   200
Banks    300
BanksB   400
Bankerly 500


Prefixes:

(PName) (PId)
App     1
Bank    2
Banker  3


My goal: To find the longest prefix that matches each of sources. The results would look like this:

(SName)  (SId) (PName) (PId)
Apple1   100   App     1
Apple2   200   App     1
Banks    300   Bank    2
BanksB   400   Bank    2
Bankerly 500   Banker  3


Constraints: I am using SQL Server 2000 and cannot upgrade. I know how to solve this problem using analytic functions, but they are not available in SQL Server 2000, as far as I know.

Solution

One way (SQL Fiddle)

SELECT SName,
       SId,
       LEFT(SName, MAX(LEN(PName)))                                          AS PName,
       CAST(SUBSTRING(MAX(STR(LEN(PName), 10) + LTRIM(PId)), 11, 10) AS INT) AS PId
FROM   Sources S
       JOIN Prefixes P
         ON S.SName LIKE P.PName + '%'
GROUP  BY SId,
          SName 
ORDER BY SId


I am assuming that none of the prefixes are allowed to contain characters such as % that would mess up a LIKE search.

It uses a variant of the method in listing 5 in this article on TOP N Per Group Queries to bundle the PId along with the MAX(length). Once the MAX(length) is known it can be used to re-calculate what the prefix must have been so there is no need to add PName in to the concatenation.

Code Snippets

SELECT SName,
       SId,
       LEFT(SName, MAX(LEN(PName)))                                          AS PName,
       CAST(SUBSTRING(MAX(STR(LEN(PName), 10) + LTRIM(PId)), 11, 10) AS INT) AS PId
FROM   Sources S
       JOIN Prefixes P
         ON S.SName LIKE P.PName + '%'
GROUP  BY SId,
          SName 
ORDER BY SId

Context

StackExchange Database Administrators Q#30788, answer score: 5

Revisions (0)

No revisions yet.