HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Matching a ] (closing square bracket) with PATINDEX using the "[ ]" wildcard

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
thewildcardwithpatindexusingsquarebracketclosingmatching

Problem

I am writing a custom JSON parser in T-SQL†.

For the purpose of my parser, I am using the PATINDEX function that calculates the position of a token from a list of tokens. The tokens in my case are all single characters and they include these:
{ } [ ] : ,

Usually, when I need to find the (first) position of any of several given characters, I use the PATINDEX function like this:

PATINDEX('%[abc]%', SourceString)


The function will then give me the first position of a or b or c – whichever happens to be found first – in SourceString.

Now the problem in my case seems to be connected with the ] character. As soon as I specify it in the character list, e.g. like this:

PATINDEX('%[[]{}:,]%', SourceString)


my intended pattern apparently becomes broken, because the function never finds a match. It looks like I need a way to escape the first ] so that PATINDEX treats it as one of the lookup characters rather than a special symbol.

I have found this question asking about a similar problem:

  • Need help with LIKE operator and square brackets



However, in that case the ] simply does not need to be specified in brackets, because it is just one character and it can be specified without brackets around them. The alternative solution, which does use escaping, works only for LIKE and not for PATINDEX, because it uses an ESCAPE subclause, supported by the former and not by the latter.

So, my question is, is there any way to look for a ] with PATINDEX using the [ ] wildcard? Or is there a way to emulate that functionality using other Transact-SQL tools?
Additional Information

Here is an example of a query where I need to use PATINDEX with the […] pattern as above. The pattern here works (albeit somewhat) because it does not include the ] character. I need it to work with ] as well:

```
WITH
data AS (SELECT CAST('{"f1":["v1","v2"],"f2":"v3"}' AS varchar(max)) AS ResponseJSON),
parser AS
(
SELECT
Level

Solution

My own solution, which is more of a workaround, consisted in specifying a character range that included the ] and using that range along with the other characters in the [ ] wildcard. I used a range based on the ASCII table. According to that table, the ] character is located in the following neighbourhood:

Hex Dec Char
--- --- ----

5A 90 Z
5B 91 [
5C 92 \
5D 93 ]
5E 94 ^
5F 95 _


My range, therefore, took the form of [-^, i.e. it included four characters: [, \, ], ^. I also specified that the pattern use a Binary collation, to match the ASCII range exactly. The resulting PATINDEX expression ended up looking like this:

PATINDEX('%[[-^{}:,]%' COLLATE Latin1_General_BIN2, MyJSONString)


The obvious problem with this approach is that the range at the beginning of the pattern includes two unwanted characters, \ and ^. The solution worked for me simply because the extra characters could never occur in the specific JSON strings I needed to parse. Naturally, this cannot be true in general, so I am still interested in other methods, hopefully more universal than mine.

Code Snippets

PATINDEX('%[[-^{}:,]%' COLLATE Latin1_General_BIN2, MyJSONString)

Context

StackExchange Database Administrators Q#206481, answer score: 8

Revisions (0)

No revisions yet.