HiveBrain v1.2.0
Get Started
← Back to all entries
snippetsqlModerate

How to determine if a hyphen (-) exists inside a column

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
howhyphencolumndetermineexistsinside

Problem

In a CASE expression, I'm trying to search inside a text column to identify a hyphen (-):

CASE 
     WHEN SUBSTRING(al.ALT_ADDRESS,1,1) IN('1','5','7') 
      AND al.NEW_ADDRESS CONTAINS '-' 
     THEN CONCAT(al.ALT_ADDRESS,al.NEW_ADDRESS)


The hyphen can be located anywhere in the column, and so I just need to know if it exists, regardless of where it actually is in the column.

I'm currently using the exact same code that @Josh provided (LIKE '%-%'), but it doesn't work, because it doesn't return the correct data for several specific instances where I know that it "should" be. The exact text in ALT_ADDRESS is: "2754 Churchill Circle". The exact text in NEW_ADDRESS is: "O-89421". However, the results that are returned, does not include the NEW_ADDRESS (O-89421).

I have confirmed that dash in NEW_ADDRESS really matches the dash I'm using the search (ASCII 45).

Solution

You didn't mention why the code you provided doesn't work. CONTAINS is for use with SQL Server's full text search feature. If you're not using this, then you need to use a LIKE clause with wildcards:

CASE WHEN SUBSTRING(al.ALT_ADDRESS,1,1) IN('1','5','7') AND al.NEW_ADDRESS LIKE '%-%' 
THEN CONCAT(al.ALT_ADDRESS,al.NEW_ADDRESS)


Even if you are using full text search, the matching behavior with dashes can be unexpected and Microsoft recommends using LIKE instead.

You might double check that the character is really a hyphen, something like this should work:

SELECT 
    ASCII('-') as RealHypen,
    ASCII(SUBSTRING(NEW_ADDRESS, 1, 1))
FROM YourTable
WHERE ALT_ADDRESS = '2754 Churchill Circle';


As an aside, the LIKE expression above is not able to make use of the seeking abilities of a b-tree index in SQL Server, so performance may suffer on large tables if a large scanning operation is required. The best way to mitigate that (should it apply to you or others in a similar situation) depends heavily on context, but in general the main alternatives are:

  • Indexing a computed column that evaluates the search condition



  • Using triggers to persist the search result in advance



  • Use an external tool that's more suited to searching text (Elasticsearch is popular)



  • Use n-grams (typically for substrings of 3 characters or more)



  • Use Full Text Search (this won't help with wildcards per se, or this specific case, but can work for word-based searching).



If you are interested in exploring the performance aspect, please ask a follow up question.

Code Snippets

CASE WHEN SUBSTRING(al.ALT_ADDRESS,1,1) IN('1','5','7') AND al.NEW_ADDRESS LIKE '%-%' 
THEN CONCAT(al.ALT_ADDRESS,al.NEW_ADDRESS)
SELECT 
    ASCII('-') as RealHypen,
    ASCII(SUBSTRING(NEW_ADDRESS, 1, 1))
FROM YourTable
WHERE ALT_ADDRESS = '2754 Churchill Circle';

Context

StackExchange Database Administrators Q#238714, answer score: 14

Revisions (0)

No revisions yet.