T-SQL Table-Valued Function to Split a Column on Commas
Problem
I wrote a table-valued function in Microsoft SQL Server 2008 that takes a comma-delimited column in a database and returns a separate row for each value.
Ex: "one,two,three,four" would return a new table with only one column containing the following values:
one
two
three
four
Does this code look error prone to you guys? When I test it with
SELECT * FROM utvf_Split('one,two,three,four',',')
it just runs forever and never returns anything. This is getting really disheartening, especially since there are no built-in split functions in MSSQL Server (WHY WHY WHY?!) and all the similar functions I've found on the web are absolute trash or simply irrelevant to what I'm trying to do.
Here is the function:
USE *myDBname*
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER FUNCTION [dbo].[utvf_SPlit] (@String VARCHAR(MAX), @delimiter CHAR)
RETURNS @SplitValues TABLE
(
    Asset_ID VARCHAR(MAX) NOT NULL
)
AS
BEGIN
    DECLARE @FoundIndex INT
    DECLARE @ReturnValue VARCHAR(MAX)
    SET @FoundIndex = CHARINDEX(@delimiter, @String)
    WHILE (@FoundIndex <> 0)
    BEGIN
        DECLARE @NextFoundIndex INT
        SET @NextFoundIndex = CHARINDEX(@delimiter, @String, @FoundIndex+1)
        SET @ReturnValue = SUBSTRING(@String, @FoundIndex,@NextFoundIndex-@FoundIndex)
        SET @FoundIndex = CHARINDEX(@delimiter, @String)
        INSERT @SplitValues (Asset_ID) VALUES (@ReturnValue)
    END
    RETURN
END
Solution
I wouldn't do this with a loop; there are much better alternatives. By far the best, when you have to split, is CLR, and Adam Machanic's approach is the fastest I've tested.
Next best approach IMHO, if you can't implement CLR, is a numbers table:
SET NOCOUNT ON;
DECLARE @UpperLimit INT = 1000000;
WITH n AS
(
    SELECT
        x = ROW_NUMBER() OVER (ORDER BY s1.[object_id])
    FROM sys.all_objects AS s1
    CROSS JOIN sys.all_objects AS s2
    CROSS JOIN sys.all_objects AS s3
)
SELECT Number = x
    INTO dbo.Numbers
    FROM n
    WHERE x BETWEEN 1 AND @UpperLimit
    OPTION (MAXDOP 1); -- protecting from Paul White's observation
GO
CREATE UNIQUE CLUSTERED INDEX n ON dbo.Numbers(Number)
    --WITH (DATA_COMPRESSION = PAGE);
GO
... which allows this function:
CREATE FUNCTION dbo.SplitStrings_Numbers
(
    @List NVARCHAR(MAX),
    @Delimiter NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
    RETURN
    (
        SELECT Item = SUBSTRING(@List, Number,
            CHARINDEX(@Delimiter, @List + @Delimiter, Number) - Number)
        FROM dbo.Numbers
        WHERE Number <= CONVERT(INT, LEN(@List))
            AND SUBSTRING(@Delimiter + @List, Number, 1) = @Delimiter
    );
GO
I believe all of these will perform better than the function you have, when you get it working, especially since they are inline instead of multi-statement. I haven't investigated why yours isn't working, because I don't think it's worth it to get that function working.
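As a quick check, the numbers-table function can be called with the string from the question, and applied to a column with CROSS APPLY. This is only a sketch: it assumes dbo.Numbers and dbo.SplitStrings_Numbers exist exactly as created above, and the @Assets table variable and its column names are hypothetical, made up for illustration.

```sql
-- Split a literal: returns one row per element (one, two, three, four).
-- Row order is not guaranteed without an ORDER BY.
SELECT Item
FROM dbo.SplitStrings_Numbers(N'one,two,three,four', N',');

-- Split a comma-delimited column, one output row per element per source row.
-- @Assets, Owner_ID and Asset_List are hypothetical names for this sketch.
DECLARE @Assets TABLE
(
    Owner_ID   INT           NOT NULL,
    Asset_List NVARCHAR(MAX) NOT NULL
);

INSERT @Assets (Owner_ID, Asset_List)
VALUES (1, N'one,two'),
       (2, N'three,four');

SELECT a.Owner_ID, s.Item
FROM @Assets AS a
CROSS APPLY dbo.SplitStrings_Numbers(a.Asset_List, N',') AS s;
```

Because the function is an inline TVF, the optimizer can fold it into the calling query's plan, which is part of why it tends to beat a multi-statement loop.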
But that all said...
Since you are using SQL Server 2008, is there a reason you need to split in the first place? I would rather use a TVP for this:
CREATE TYPE dbo.strings AS TABLE
(
    string NVARCHAR(4000)
);
Now you can accept this as a parameter to your stored procedures, and use the contents just like you would use a TVF:
CREATE PROCEDURE dbo.foo
    @strings dbo.strings READONLY
AS
BEGIN
    SET NOCOUNT ON;
    SELECT Asset_ID = string FROM @strings;
    -- SELECT Asset_ID FROM dbo.utvf_split(@other_param, ',');
END
And you can pass a TVP directly from C# etc. as a DataTable. This will almost certainly outperform any of the solutions above, especially if you are building a comma-separated string in your app specifically so that your stored procedure can call a TVF to split it apart again. For a lot more info on TVPs see Erland Sommarskog's great article.
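The procedure can also be exercised from plain T-SQL, which is handy for testing before wiring up the application side. A minimal sketch, assuming dbo.strings and dbo.foo exist exactly as defined above; the variable name @list is my own.

```sql
-- Declare a variable of the table type and fill it with one row per value.
DECLARE @list dbo.strings;

INSERT @list (string)
VALUES (N'one'), (N'two'), (N'three'), (N'four');

-- Pass the table variable as the READONLY table-valued parameter.
EXEC dbo.foo @strings = @list;
```

No string is built and no splitting happens; the values arrive in the procedure already as rows.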
More recently, I've written a series on splitting strings:
- http://sqlperformance.com/2012/07/t-sql-queries/split-strings
- http://sqlperformance.com/2012/08/t-sql-queries/splitting-strings-follow-up
- http://sqlperformance.com/2012/08/t-sql-queries/splitting-strings-now-with-less-t-sql
And if you are using SQL Server 2016 or newer (or Azure SQL Database), there is a new STRING_SPLIT function, which I blogged about here:
- Performance Surprises and Assumptions : STRING_SPLIT()
- STRING_SPLIT() in SQL Server 2016 : Follow-Up #1
- STRING_SPLIT() in SQL Server 2016 : Follow-Up #2
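For reference, a minimal sketch of the built-in function applied to the string from the question. It assumes the database is at compatibility level 130 or higher, which STRING_SPLIT requires, and it uses a single-character separator, which is all the function accepts.

```sql
-- Returns a single column named [value], one row per element.
-- Output order is not guaranteed unless you add your own ordering logic.
SELECT value
FROM STRING_SPLIT(N'one,two,three,four', N',');
```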
Context
StackExchange Database Administrators Q#21078, answer score: 20