patterncsharpMinor
Negative Lookbehind Regex
lookbehindregexnegative
Problem
I have the following code which attempts to match all strings like "*SOMESTRING" (which can include numeric values), but not "*SOMESTRING*". For this I am using a negative lookahead as follows:
string s = " if 'L,....' MDC = '13' Then " +
" if 'B,960.' SEX NOT = '2' AND *SEX NOT = '3' Then " +
" DRG = 960Z (UNGROUPABLE) " +
" GoTo MDC FldErr " +
"Else if 'B,N01.' SRG IN TABLE(*AN01ZORA) Then " +
" if '.,N01.' *PCCL* > 2 Then ";
Regex rr = new Regex(@"(?i)(?!\*\w+\*)\*\w+");
MatchCollection mc = rr.Matches(s);
foreach (Match m in mc)
    m.ToString().Dump();
Output:
*SEX
*AN01ZORA
This seems to produce the correct output, but feels nasty and not correct. Is this right, and what could I do to make the Regex better?
*SEX and *AN01ZORA should match; *PCCL* should not match.
Solution
Your regex is overly complicated, I must admit. The negative lookahead is going to do a lot of work to identify all the negative cases before even looking for (nearly) positive matches.
I think the trick you are missing is the word-boundary anchor. Consider the following regex:
\*\w+\b
This looks for an asterisk, followed by one or more word characters, and then a (zero-length) word boundary. Now, both *SOME and *SOME* match that, since the \b happens before the trailing asterisk. The negative lookahead would be useful after the word boundary. Consider the following:
\*\w+\b(?!\*)
Look for *SOME where the SOME is a complete word not followed by an asterisk.
Here's a little demonstration ....
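As a rough sketch of the difference between the two patterns above, the following program runs both against a shortened stand-in for the input string from the question. The class name StarTokenDemo, the helper FindTokens, and the shortened input are made up for illustration; only the two regular expressions come from the answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class StarTokenDemo
{
    // Hypothetical helper: returns every match of `pattern` in `input` as a plain string.
    public static string[] FindTokens(string pattern, string input)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        // Shortened stand-in for the input string in the question (an assumption).
        string s = "*SEX NOT = '3' TABLE(*AN01ZORA) *PCCL* > 2";

        // Word boundary only: the leading part of *PCCL* still matches.
        Console.WriteLine(string.Join(", ", FindTokens(@"\*\w+\b", s)));
        // prints: *SEX, *AN01ZORA, *PCCL

        // Word boundary plus negative lookahead: *PCCL* is rejected.
        Console.WriteLine(string.Join(", ", FindTokens(@"\*\w+\b(?!\*)", s)));
        // prints: *SEX, *AN01ZORA
    }
}
```

The \b matters here: without it, \*\w+(?!\*) could backtrack and match *PCC, because the character after *PCC is L rather than an asterisk.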
Edit: Note, there is no reason to add the case-insensitive switch ((?i)) because your regular expression contains no literal letters whose case could matter.
Code Snippets
\*\w+\b(?!\*)
Context
StackExchange Code Review Q#54795, answer score: 5