Finding an exact "phrase" with a given string (as typed/in order)
Problem
$input, 'FOUND' => 1,
'VALUEofFOUND' => $phrase[$i]];
} else {
$out[] = 'Not found';
}
$i++;
}
print_r($out);
} //end function
Calling the function like this:
$this->filterExactPhrase("this is a test, foobar", "foobar");
yields
Array ( [0] => Array ( [INPUT] => this is a test, foobar [FOUND] => 1 [VALUEofFOUND] => foobar ) )
Giving the function an $input that contains foo instead of foobar:
$this->filterExactPhrase("this is a test, foo", "foobar");
yields
Array ( [0] => Not found )
I thought this was quite interesting to find, as I was looking for a way to locate a very specific phrase containing spaces in an extremely long log file, so that I could remove it.
Solution
Basically, Alex has already made some good suggestions (especially the bit about using \b). However, your function will fail in certain cases like this one:
//looking for chars like *, +, ? and such
filterExactPhrase("some *markdown* string", "*markdown*");
The string could contain special regex chars (. \ + * ? [ ^ ] $ ( ) { } = ! | : -), or the delimiter you use:
//this is operator error (regex + markup don't mix)
filterExactPhrase('<h1>foobar</h1>', '</h1>');
Yes, it's evil, but people still seem hell-bent on using regexes to consume markup, so your code should either check for that and throw an exception, or defend against it. The example above will generate an error, because the string `</h1>` is concatenated into the regex raw, so you end up with this:
/\b(</h1>)\b/
/\b(</ -> faulty regex
h1>)\b/ -> unknown and invalid flags
Another thing to think of is that people, once they find out the function uses a regex, will start passing regular expressions instead of a string to it. Kind of like people entering SQL wildcards in search forms (stuff like foo%).
filterExactPhrase("some string with words and 123 numbers", "[\w\d]+");
So how do you go about this? Simple: preg_quote escapes whatever chars need escaping in the input, including the delimiter if you pass it as the second argument. Basically, all I'm trying to say is change this:
$numFound = preg_match_all("/\b(" . $phrase . ")\b/", $input);
to this:
$escaped = preg_quote($phrase, '/'); //second param is the delimiter
$numFound = preg_match_all("/\b(" . $escaped . ")\b/", $input);
Now chars like + or * are escaped properly, and so are the delimiters.
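To make the difference concrete, here is a minimal sketch comparing the raw and the escaped pattern. The helper names `countRaw` and `countEscaped` are hypothetical and only exist for this illustration; `preg_quote` and `preg_match_all` are the real PHP functions discussed above.

```php
<?php
// Hypothetical helper: builds the pattern from the raw phrase (the original approach).
function countRaw(string $input, string $phrase)
{
    // "@" only silences the compile warning so the return value can be inspected.
    return @preg_match_all('/\b(' . $phrase . ')\b/', $input);
}

// Hypothetical helper: escapes the phrase (and the "/" delimiter) before building the pattern.
function countEscaped(string $input, string $phrase)
{
    $escaped = preg_quote($phrase, '/');
    return preg_match_all('/\b(' . $escaped . ')\b/', $input);
}

var_dump(countRaw('a foo+bar b', 'foo+bar'));      // int(0): "+" is read as a quantifier
var_dump(countEscaped('a foo+bar b', 'foo+bar'));  // int(1): "+" is matched literally
var_dump(countRaw('yes and/or no', 'and/or'));     // bool(false): the "/" ends the pattern early
var_dump(countEscaped('yes and/or no', 'and/or')); // int(1): the delimiter is escaped too
```

One caveat worth keeping in mind: \b only matches between a word character and a non-word character, so a phrase that starts or ends with punctuation (such as `*markdown*` surrounded by spaces) will still not match after escaping. If such phrases matter, the boundary check itself would need a different approach.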
The other thing I'd suggest is to remove the `print_r` from your function/method. I realize that it's probably there for debugging purposes, but still: a function/method does one thing. In this case its job is to process a string and find exact matches of another string. Whether or not that data should be shown (displayed, echoed, printed or whatever) is not a call this method should make. It's not aware of output buffers or headers that might be set later on, so it shouldn't forcibly generate output.
For all you know, I might want to call this method, store the data somewhere, and send something entirely different to the output stream. In short: a function/method should return data, not print/echo it.
This dogma doesn't apply to methods in a renderer component or view class, of course. But it holds true for data-processing code units.
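As a rough sketch of what "return, don't print" could look like here: the function name `findExactPhrase` and the `MATCHES` key are assumptions made for this example, not taken from the original code, and the empty-phrase guard is an extra precaution I added.

```php
<?php
// Hypothetical rewrite: returns the result and leaves any output to the caller.
function findExactPhrase(string $input, string $phrase): array
{
    // Assumption: an empty phrase counts as "nothing to find"
    // (otherwise /\b()\b/ would match at every word boundary).
    if ($phrase === '') {
        return ['INPUT' => $input, 'FOUND' => 0, 'MATCHES' => []];
    }

    $pattern = '/\b(' . preg_quote($phrase, '/') . ')\b/';
    $numFound = preg_match_all($pattern, $input, $matches);

    if ($numFound === false) {
        throw new RuntimeException('Regex failed with code ' . preg_last_error());
    }

    return ['INPUT' => $input, 'FOUND' => $numFound, 'MATCHES' => $matches[1]];
}

// The caller decides what to do with the data: print it, log it, store it...
$result = findExactPhrase('this is a test, foobar', 'foobar');
print_r($result);
```

Debug output then lives at the call site, where it can be removed without touching the function itself.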
Code Snippets
//looking for chars like *, +, ? and such
filterExactPhrase("some *markdown* string", "*markdown*");

//this is operator error (regex + markup don't mix)
filterExactPhrase('<h1>foobar</h1>', '</h1>');

/\b(</h1>)\b/
/\b(</ -> faulty regex
h1>)\b/ -> unknown and invalid flags

filterExactPhrase("some string with words and 123 numbers", "[\w\d]+");

$numFound = preg_match_all("/\b(" . $phrase . ")\b/", $input);
Context
StackExchange Code Review Q#92939, answer score: 4