HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Separate letters from digits in alphanumeric string

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
digitsalphanumericseparatefromstringletters

Problem

I have an alphanumeric string as input and I want to get two results out of it:

-
a string where all the numerals are removed

and

-
an integer that is the sum of all the digits in the input string.

For instance, for this input:

GR35hc7vdH35


I want the following output:

| Col1.         |        Col2    |
----------------------------------
| GRhcvdH       |        23      |


How can do this?

Solution

SQL Server doesn't support replacing patterns of multiple characters - so to do it via REPLACE would take potentially 10 operations.

With that in mind one way of doing it would be a recursive CTE to process the digits 0-9 sequentially.

It does the replacement and then checks the length of the before and after strings to know how many characters of that number there were and what needs to be added on to the total.

DECLARE @Input VARCHAR(8000) = 'GR35hc7vdH35';

WITH R(Level,Input,Accumulator,StringLength)
     AS (SELECT 0,
                Input,
                0,
                DATALENGTH(Input)
         FROM   (SELECT REPLACE(@Input, '0', '')) D(Input)
         UNION ALL
         SELECT NewLevel,
                NewInput,
                Accumulator + NewLevel * ( StringLength - NewStringLength ),
                NewStringLength
         FROM   R
                CROSS APPLY (SELECT Level + 1) C(NewLevel)
                CROSS APPLY (SELECT REPLACE(Input, NewLevel, '')) C2(NewInput)
                CROSS APPLY (SELECT DATALENGTH(NewInput)) C3(NewStringLength)
         WHERE  NewLevel <= 9)
SELECT Input       AS Col1,
       Accumulator AS Col2
FROM   R
WHERE  Level = 9;


Or you could use CLR and regular expressions (SQL Server 2012 compatible version).

using System;
using System.Data.SqlTypes;
using System.Collections;
using System.Text.RegularExpressions;

public partial class UserDefinedFunctions
{
private static readonly Regex digitRegex = new Regex(@"[\d]", RegexOptions.Compiled);

[Microsoft.SqlServer.Server.SqlFunction(FillRowMethodName = "FillRow",
TableDefinition = @"Stripped NVARCHAR(MAX),
Total INT")]

public static IEnumerable ReplaceAndTotalise(SqlString input)
{
if (!input.IsNull)
{
int total = 0;
string stripped = digitRegex.Replace((string)input, match =>
{
total += int.Parse(match.Value);
return string.Empty;
});

yield return new Tuple(stripped, total);
}
}

public static void FillRow(object resultObject, out SqlString stripped, out SqlInt32 total)
{
var result = (Tuple)resultObject;
stripped = result.Item1;
total = result.Item2;

}
}


Example Usage

SELECT Stripped,
       Total
FROM   [dbo].[ReplaceAndTotalise]('GR35hc7vdH35')

Code Snippets

DECLARE @Input VARCHAR(8000) = 'GR35hc7vdH35';

WITH R(Level,Input,Accumulator,StringLength)
     AS (SELECT 0,
                Input,
                0,
                DATALENGTH(Input)
         FROM   (SELECT REPLACE(@Input, '0', '')) D(Input)
         UNION ALL
         SELECT NewLevel,
                NewInput,
                Accumulator + NewLevel * ( StringLength - NewStringLength ),
                NewStringLength
         FROM   R
                CROSS APPLY (SELECT Level + 1) C(NewLevel)
                CROSS APPLY (SELECT REPLACE(Input, NewLevel, '')) C2(NewInput)
                CROSS APPLY (SELECT DATALENGTH(NewInput)) C3(NewStringLength)
         WHERE  NewLevel <= 9)
SELECT Input       AS Col1,
       Accumulator AS Col2
FROM   R
WHERE  Level = 9;
SELECT Stripped,
       Total
FROM   [dbo].[ReplaceAndTotalise]('GR35hc7vdH35')

Context

StackExchange Database Administrators Q#137752, answer score: 8

Revisions (0)

No revisions yet.