patterncsharpModerate
Optimizing and improving a username regex
usernameoptimizingandregeximproving
Problem
I have created this regular expression to validate usernames which I need in my projects:
^(?=.{3,32}$)(?!.*[._-]{2})(?!.*[0-9]{5,})[a-z](…)[a-z0-9]$
It works just fine. But I'm wondering if there is any improvement and optimization for it, since I'm not exactly a regex-guy.
The regex and tests are available here.
Rules are:
- usernames should start with [a-z]
- usernames should end with [a-z0-9]
- usernames can have a length between 3 and 32
- usernames can contain any of [a-z0-9\._-]
- no more than 4 digits may appear in a row: p1234 is a match and p12345 is not
- each username can contain only one kind of [\._-]: it can contain '.' or '-' or '_', but not a mix of them
- each '.', '-', and '_' should be followed by an alphanumeric character: a '.' cannot be followed by another '.', and no two of these characters may be adjacent
Tests:
j1vad-amiry match
j1vad-ami-ry match
ja23d_am8ry match
ja_23d_am8ry match
jav5d2.am3y match
jav.ad.amiry match
jav.ad.ami.ry.2 match
ja3fd4 match
page2491 match
page24915 not match
jav-ad_amiry not match
javad_am-iry not match
jav.ami-ry not match
jav.ami_ry not match
jav.ami__ry not match
2jav not match
2jav_ad not match
2jav_ad3 not match
Solution
I would very strongly recommend against using a regular expression for this. There is no clear mapping between the list of requirements you posted and the code.
Imagine another developer looking at this. Are they able to deduce the list of requirements? Given the list of requirements, are they able to verify that the regular expression is correct? How long would it take to convince them that it's correct?
In fact, it's not correct. The character class \d is not the same as [0-9] (unless you specify ECMAScript-compliant behaviour), and your regexp matches the username j1vad-a٠miry (that's an ARABIC-INDIC DIGIT ZERO). (It doesn't match on the link you posted, but that's a Ruby, not .NET, regexp tester.)
Now suppose one of your requirements changes: this code is going to be tricky to maintain.
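To see the \d behaviour mentioned above in isolation, here is a minimal sketch. The class name DigitClassDemo and its method names are made up for illustration; the only thing it relies on is that .NET's \d matches any Unicode decimal digit by default and is restricted to [0-9] under RegexOptions.ECMAScript.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original answer.
public static class DigitClassDemo
{
    // U+0660 ARABIC-INDIC DIGIT ZERO, the character in the example username.
    public const string ArabicIndicZero = "\u0660";

    // Default options: \d matches any Unicode decimal digit.
    public static bool MatchesBackslashD(string s) => Regex.IsMatch(s, @"^\d$");

    // An explicit range matches only the ASCII digits 0-9.
    public static bool MatchesAsciiRange(string s) => Regex.IsMatch(s, "^[0-9]$");

    // With RegexOptions.ECMAScript, \d behaves like [0-9].
    public static bool MatchesBackslashDEcma(string s) =>
        Regex.IsMatch(s, @"^\d$", RegexOptions.ECMAScript);

    public static void Main()
    {
        Console.WriteLine(MatchesBackslashD(ArabicIndicZero));     // True
        Console.WriteLine(MatchesAsciiRange(ArabicIndicZero));     // False
        Console.WriteLine(MatchesBackslashDEcma(ArabicIndicZero)); // False
    }
}
```

So any pattern that is meant to accept only 0-9 should either spell out [0-9] or pass RegexOptions.ECMAScript.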
Instead, I would recommend writing a series of tests, each corresponding to one of your requirements. Here is the code I came up with. It still uses regular expressions where appropriate, but very simple ones.
// Requires: using System.Linq; using System.Text.RegularExpressions;
public static bool IsValidUsername(string username)
{
    if (username == null)
    {
        return false;
    }

    // Length must be between 3 and 32.
    var length = username.Length;
    if (length < 3 || length > 32)
    {
        return false;
    }

    // Must start with [a-z] and end with [a-z0-9].
    if (!IsLowerAlpha(username[0]))
    {
        return false;
    }
    if (!IsLowerAlphanumeric(username[length - 1]))
    {
        return false;
    }

    // Only [a-z0-9._-] is allowed.
    if (!Regex.IsMatch(username, "^[a-z0-9._-]*$"))
    {
        return false;
    }

    // No run of 5 or more digits.
    if (Regex.IsMatch(username, "[0-9]{5,}"))
    {
        return false;
    }

    // Each username can contain only one of '.', '_', '-'.
    var punctuation = new [] { '.', '_', '-' };
    if (punctuation.Count(c => username.Contains(c)) > 1)
    {
        return false;
    }

    // Each '.', '_', and '-' should be followed by an alpha-numeric.
    for (var i = 0; i < length - 1; i++)
    {
        if (punctuation.Contains(username[i]) && !IsLowerAlphanumeric(username[i + 1]))
        {
            return false;
        }
    }
    return true;
}

private static bool IsLowerAlpha(char c)
{
    return c >= 'a' && c <= 'z';
}

private static bool IsLowerAlphanumeric(char c)
{
    return IsLowerAlpha(c) || (c >= '0' && c <= '9');
}

Now, IsValidUsername is getting a bit long. We could split out each check into a separate static method, for instance:

private static bool IsNotNull(string username)
{
    return username != null;
}

private static bool IsInLengthRange(string username)
{
    var length = username.Length;
    return length >= 3 && length <= 32;
}

And then just check our rules like this:

private static readonly Predicate<string>[] Rules = new Predicate<string>[]
{
    IsNotNull,
    IsInLengthRange,
    StartsWithLowerAlpha,
    ...
};

public static bool IsValidUsername(string username)
{
    return Rules.All(rule => rule(username));
}

Code Snippets
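As a fuller sketch of the rule-list idea from the answer, the following shows one way the list could be filled out. This is an assumption about how the remaining rules might look, not the answer's own code: the class name UsernameRules is invented, the rules are written as lambdas instead of named methods, and the last rule uses a small regex in place of the answer's loop.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical class; only the rule contents follow the answer's checks.
public static class UsernameRules
{
    private static readonly char[] Punctuation = { '.', '_', '-' };

    private static bool IsLowerAlpha(char c) => c >= 'a' && c <= 'z';

    private static bool IsLowerAlphanumeric(char c) =>
        IsLowerAlpha(c) || (c >= '0' && c <= '9');

    // Order matters: All() stops at the first failing rule, so the null and
    // length checks protect the indexing done by the later rules.
    private static readonly Predicate<string>[] Rules =
    {
        u => u != null,                                       // not null
        u => u.Length >= 3 && u.Length <= 32,                 // length 3..32
        u => IsLowerAlpha(u[0]),                              // starts with [a-z]
        u => IsLowerAlphanumeric(u[u.Length - 1]),            // ends with [a-z0-9]
        u => Regex.IsMatch(u, "^[a-z0-9._-]*$"),              // allowed characters only
        u => !Regex.IsMatch(u, "[0-9]{5,}"),                  // no run of 5+ digits
        u => Punctuation.Count(p => u.IndexOf(p) >= 0) <= 1,  // one kind of punctuation
        u => !Regex.IsMatch(u, "[._-][^a-z0-9]"),             // punctuation followed by alphanumeric
    };

    public static bool IsValidUsername(string username) =>
        Rules.All(rule => rule(username));

    public static void Main()
    {
        // A few cases taken from the question's test list.
        var samples = new[] { "j1vad-amiry", "jav.ad.ami.ry.2", "page24915", "jav-ad_amiry", "2jav" };
        foreach (var name in samples)
        {
            Console.WriteLine($"{name}: {IsValidUsername(name)}");
        }
    }
}
```

Run against those five samples, the first two come out True and the last three False, which agrees with the question's expected results. Each requirement still maps to exactly one entry in the list, so changing a requirement means changing one line.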
public static bool IsValidUsername(string username)
{
    if (username == null)
    {
        return false;
    }
    var length = username.Length;
    if (length < 3 || length > 32)
    {
        return false;
    }
    if (!IsLowerAlpha(username[0]))
    {
        return false;
    }
    if (!IsLowerAlphanumeric(username[length - 1]))
    {
        return false;
    }
    if (!Regex.IsMatch(username, "^[a-z0-9._-]*$"))
    {
        return false;
    }
    if (Regex.IsMatch(username, "[0-9]{5,}"))
    {
        return false;
    }
    // Each username can contain only one of '.', '_', '-'.
    var punctuation = new [] { '.', '_', '-' };
    if (punctuation.Count(c => username.Contains(c)) > 1)
    {
        return false;
    }
    // Each '.', '_', and '-' should be followed by an alpha-numeric.
    for (var i = 0; i < length - 1; i++)
    {
        if (punctuation.Contains(username[i]) && !IsLowerAlphanumeric(username[i + 1]))
        {
            return false;
        }
    }
    return true;
}
private static bool IsLowerAlpha(char c)
{
    return c >= 'a' && c <= 'z';
}
private static bool IsLowerAlphanumeric(char c)
{
    return IsLowerAlpha(c) || (c >= '0' && c <= '9');
}

private static bool IsNotNull(string username)
{
    return username != null;
}
private static bool IsInLengthRange(string username)
{
    var length = username.Length;
    return length >= 3 && length <= 32;
}

private static readonly Predicate<string>[] Rules = new Predicate<string>[]
{
    IsNotNull,
    IsInLengthRange,
    StartsWithLowerAlpha,
    ...
};
public static bool IsValidUsername(string username)
{
    return Rules.All(rule => rule(username));
}
Context
StackExchange Code Review Q#55841, answer score: 18