HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMajor

STRING_SPLIT with a multiple-character separator?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
withstring_splitcharacterseparatormultiple

Problem

SQL Server 2016 introduced STRING_SPLIT which is very fast and an excellent replacement for any homemade implementation people would roll before 2016.

Unfortunately, STRING_SPLIT only supports a single-character separator, which isn't always enough. Does anyone know of a good implementation that allows for using multiple characters in the separator?

Solution

Well, you can always use REPLACE to add a single-character delimiter to the argument before passing it in. You just need to choose a character that is unlikely/impossible to appear in the actual data. In this example, let's say your original data uses three pipes as a delimiter; I chose a Unicode character at random to substitute:

DECLARE 
  @olddelim nvarchar(32) = N'|||', 
  @newdelim nchar(1)     = NCHAR(9999); -- pencil (✏)

DECLARE @x nvarchar(max) = N'foo|||bar|||blat|||splunge';

SELECT * FROM STRING_SPLIT(REPLACE(@x, @olddelim, @newdelim), @newdelim);


I blogged about this in more detail here:

  • Dealing with the single-character delimiter in STRING_SPLIT



Addressing a comment:


bad solution. what if original string is like 'abc||pqr|||rst||123' (dynamic and can contain anything). desired o/p is 'abc||pqr' and 'rst||123' but your solution will give 'abc' 'pqr' 'rst' '123'

Okay, let's take your input and see if my solution gets the wrong output.

DECLARE 
  @olddelim nvarchar(32) = N'|||', 
  @newdelim nchar(1)     = NCHAR(9999); -- pencil (✏)

DECLARE @x nvarchar(max) = N'abc||pqr|||rst||123';

SELECT * FROM STRING_SPLIT(REPLACE(@x, @olddelim, @newdelim), @newdelim);


Result is:

abc||pqr
rst||123


And not what you must have assumed (but didn't test) this:

abc
pqr
rst
123


If your data is in a table, you could create a view so that you don't have to factor that expression into all of your queries.

If that doesn't work, because you might have a pencil () in your data, and you can't find a single character in the 1,111,998 available Unicode characters that won't be in your data set, you'll have to skip STRING_SPLIT(), since it is hard-coded to accept a single character delimiter (separator
Is a single character expression
).

Alternatives have been answered here dozens of times before, many before STRING_SPLIT() existed. Those methods still work.

I go over many alternatives, and also discuss the limitations in STRING_SPLIT(), in this series (I also discuss why you might consider not doing this in T-SQL using any method at all):

  • Split strings the right way – or the next best way



  • Splitting Strings : A Follow-Up



  • Splitting Strings : Now with less T-SQL



  • Comparing string splitting / concatenation methods



  • Performance Surprises and Assumptions : STRING_SPLIT()



  • STRING_SPLIT() in SQL Server 2016 : Follow-Up #1



  • STRING_SPLIT() in SQL Server 2016 : Follow-Up #2

Code Snippets

DECLARE 
  @olddelim nvarchar(32) = N'|||', 
  @newdelim nchar(1)     = NCHAR(9999); -- pencil (✏)

DECLARE @x nvarchar(max) = N'foo|||bar|||blat|||splunge';

SELECT * FROM STRING_SPLIT(REPLACE(@x, @olddelim, @newdelim), @newdelim);
DECLARE 
  @olddelim nvarchar(32) = N'|||', 
  @newdelim nchar(1)     = NCHAR(9999); -- pencil (✏)

DECLARE @x nvarchar(max) = N'abc||pqr|||rst||123';

SELECT * FROM STRING_SPLIT(REPLACE(@x, @olddelim, @newdelim), @newdelim);
abc||pqr
rst||123
abc
pqr
rst
123

Context

StackExchange Database Administrators Q#175764, answer score: 30

Revisions (0)

No revisions yet.