HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMajor

T SQL Table Valued Function to Split a Column on commas

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
commascolumnsqlvaluedfunctionsplittable

Problem

I wrote a Table Valued Function in Microsoft SQL Server 2008 to take a comma delimited column in a database to spit out separate rows for each value.

Ex: "one,two,three,four" would return a new table with only one column containing the following values:

one
two
three
four


Does this code look error prone to you guys? When I test it with

SELECT * FROM utvf_Split('one,two,three,four',',')


it just runs forever and never returns anything. This is getting really disheartening especially since there are no built in split functions on MSSQL server (WHY WHY WHY?!) and all the similar functions I've found on the web are absolute trash or simply irrelevant to what I'm trying to do.

Here is the function:

USE *myDBname*
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER FUNCTION [dbo].[utvf_SPlit] (@String VARCHAR(MAX), @delimiter CHAR)

RETURNS @SplitValues TABLE
(
    Asset_ID VARCHAR(MAX) NOT NULL
)

AS
BEGIN
            DECLARE @FoundIndex INT
            DECLARE @ReturnValue VARCHAR(MAX)

            SET @FoundIndex = CHARINDEX(@delimiter, @String)

            WHILE (@FoundIndex <> 0)
            BEGIN
                  DECLARE @NextFoundIndex INT
                  SET @NextFoundIndex = CHARINDEX(@delimiter, @String, @FoundIndex+1)
                  SET @ReturnValue = SUBSTRING(@String, @FoundIndex,@NextFoundIndex-@FoundIndex)
                  SET @FoundIndex = CHARINDEX(@delimiter, @String)
                  INSERT @SplitValues (Asset_ID) VALUES (@ReturnValue)
            END

            RETURN
END

Solution

I wouldn't do this with a loop; there are much better alternatives. By far the best, when you have to split, is CLR, and Adam Machanic's approach is the fastest I've tested.

Next best approach IMHO, if you can't implement CLR, is a numbers table:

SET NOCOUNT ON;

DECLARE @UpperLimit INT = 1000000;

WITH n AS
(
    SELECT
        x = ROW_NUMBER() OVER (ORDER BY s1.[object_id])
    FROM       sys.all_objects AS s1
    CROSS JOIN sys.all_objects AS s2
    CROSS JOIN sys.all_objects AS s3
)
SELECT Number = x
  INTO dbo.Numbers
  FROM n
  WHERE x BETWEEN 1 AND @UpperLimit
  OPTION (MAXDOP 1); -- protecting from Paul White's observation

GO
CREATE UNIQUE CLUSTERED INDEX n ON dbo.Numbers(Number) 
    --WITH (DATA_COMPRESSION = PAGE);
GO


... which allows this function:

CREATE FUNCTION dbo.SplitStrings_Numbers
(
   @List       NVARCHAR(MAX),
   @Delimiter  NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
   RETURN
   (
       SELECT Item = SUBSTRING(@List, Number, 
         CHARINDEX(@Delimiter, @List + @Delimiter, Number) - Number)
       FROM dbo.Numbers
       WHERE Number <= CONVERT(INT, LEN(@List))
         AND SUBSTRING(@Delimiter + @List, Number, 1) = @Delimiter
   );
GO


I believe all of these will perform better than the function you have, when you get it working, especially since they are inline instead of multi-statement. I haven't investigated why yours isn't working, because I don't think it's worth it to get that function working.

But that all said...

Since you are using SQL Server 2008, is there a reason you need to split in the first place? I would rather use a TVP for this:

CREATE TYPE dbo.strings AS TABLE
(
  string NVARCHAR(4000)
);


Now you can accept this as a parameter to your stored procedures, and use the contents just like you would use a TVF:

CREATE PROCEDURE dbo.foo
  @strings dbo.strings READONLY
AS
BEGIN
  SET NOCOUNT ON;

  SELECT Asset_ID = string FROM @strings;
  -- SELECT Asset_ID FROM dbo.utvf_split(@other_param, ',');
END


And you can pass a TVP directly from C# etc. as a DataTable. This will almost certainly outperform any of the solutions above, especially if you are building a comma-separated string in your app specifically so that your stored procedure can call a TVP to split it apart again. For a lot more info on TVPs see Erland Sommarskog's great article.

More recently, I've written a series on splitting strings:

  • http://sqlperformance.com/2012/07/t-sql-queries/split-strings



  • http://sqlperformance.com/2012/08/t-sql-queries/splitting-strings-follow-up



  • http://sqlperformance.com/2012/08/t-sql-queries/splitting-strings-now-with-less-t-sql



And if you are using SQL Server 2016 or newer (or Azure SQL Database), there is a new STRING_SPLIT function, which I blogged about here:

  • Performance Surprises and Assumptions : STRING_SPLIT()



  • STRING_SPLIT() in SQL Server 2016 : Follow-Up #1



  • STRING_SPLIT() in SQL Server 2016 : Follow-Up #2

Code Snippets

SET NOCOUNT ON;

DECLARE @UpperLimit INT = 1000000;

WITH n AS
(
    SELECT
        x = ROW_NUMBER() OVER (ORDER BY s1.[object_id])
    FROM       sys.all_objects AS s1
    CROSS JOIN sys.all_objects AS s2
    CROSS JOIN sys.all_objects AS s3
)
SELECT Number = x
  INTO dbo.Numbers
  FROM n
  WHERE x BETWEEN 1 AND @UpperLimit
  OPTION (MAXDOP 1); -- protecting from Paul White's observation

GO
CREATE UNIQUE CLUSTERED INDEX n ON dbo.Numbers(Number) 
    --WITH (DATA_COMPRESSION = PAGE);
GO
CREATE FUNCTION dbo.SplitStrings_Numbers
(
   @List       NVARCHAR(MAX),
   @Delimiter  NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
   RETURN
   (
       SELECT Item = SUBSTRING(@List, Number, 
         CHARINDEX(@Delimiter, @List + @Delimiter, Number) - Number)
       FROM dbo.Numbers
       WHERE Number <= CONVERT(INT, LEN(@List))
         AND SUBSTRING(@Delimiter + @List, Number, 1) = @Delimiter
   );
GO
CREATE TYPE dbo.strings AS TABLE
(
  string NVARCHAR(4000)
);
CREATE PROCEDURE dbo.foo
  @strings dbo.strings READONLY
AS
BEGIN
  SET NOCOUNT ON;

  SELECT Asset_ID = string FROM @strings;
  -- SELECT Asset_ID FROM dbo.utvf_split(@other_param, ',');
END

Context

StackExchange Database Administrators Q#21078, answer score: 20

Revisions (0)

No revisions yet.