HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Matching left and right single-quotes used as apostophes

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
leftusedapostophesquotessingleandmatchingright

Problem

I have four columns containing names and want to search these using a LIKE in a Microsoft SQL Server environment.

The complication comes that names may include left and right single-quotes / angled apostrophes (i.e. and , char(145) and char(146) respectively), which should match a straight apostrophe (i.e. ', char(39))

Doing the following is very slow:

SELECT person_id
FROM person
WHERE REPLACE(
          REPLACE(
              person_name, 
              CHAR(145), 
              CHAR(39)
          ), 
          CHAR(146), 
          CHAR(39)
      ) LIKE '{USER_INPUT}'


As explained in SQL replace statement too slow on Stack Overflow, this is because the use of REPLACE makes the statement unsargable.

Is there a way that SQL Server can handle situations like this in a better way?

One solution which has been proposed is to have the application generate a 'searchable' value which concatenates all of the fields (person_name, person_surname, person_nickname, etc.) and converts the problematic characters at the point of editing. This could be effectively indexed and searched. Storing this data in a separate SQL table/column would require less application rewrite than implementing a full NoSQL solution like Lucene.

The example above is a simplification: the query doesn't literally get built as I explained above and we do implement SQL injection (and other) protections.

The question is how to replace the angled-apostrophes with straight ones in the table data. To clarify:

  • User supplies O‘Malley - this should match both O‘Malley or O'Malley



  • User supplies O'Malley - this should match both O‘Malley or O'Malley



We need to replace the SQL data, not the user's input. We can convert the user input on the way through the application so that if they input angled apostrophes we change them to simple apostrophes before passing in to SQL. It's the data in SQL we need to standardise.

Unfortunately the data must stay in

Solution

The best way to handle your issue (and avoid SQL injection) is to pass in your user input as a variable. Since you are using a LIKE you can do something like this:

CREATE TABLE #person (person_name nvarchar(50))
INSERT INTO #person VALUES (N'Bob'),(N'Bo''b'),(N'Bo‘b'),(N'Bo’b'),(N'Bo#b'),(N'Bo^b')

DECLARE @user_input nvarchar(50) = 'Bo’b'

SET @user_input = REPLACE(
                        REPLACE(
                                REPLACE(@user_input, N'‘', N''''), 
                                N'’', N''''), 
                                N'''', N'[‘’'']')

-- @user_input now == Bo[‘’']b

SELECT person_name
FROM #person
WHERE person_name LIKE @user_input


Basically this replaces all of the different 's with a single type (the ') and then puts []'s around all three so that they get used in the LIKE.

Code Snippets

CREATE TABLE #person (person_name nvarchar(50))
INSERT INTO #person VALUES (N'Bob'),(N'Bo''b'),(N'Bo‘b'),(N'Bo’b'),(N'Bo#b'),(N'Bo^b')

DECLARE @user_input nvarchar(50) = 'Bo’b'

SET @user_input = REPLACE(
                        REPLACE(
                                REPLACE(@user_input, N'‘', N''''), 
                                N'’', N''''), 
                                N'''', N'[‘’'']')

-- @user_input now == Bo[‘’']b

SELECT person_name
FROM #person
WHERE person_name LIKE @user_input

Context

StackExchange Database Administrators Q#150899, answer score: 8

Revisions (0)

No revisions yet.