Capturing optional regex segment with PHP
Problem
I need to check the end of a URL for the possible existence of /news_archive or /news_archive/5 in PHP. The snippet below does exactly what I want, but I know that I could achieve this with one preg_match rather than two. How can I improve this code to treat the /5 as an optional segment and capture it if it exists?

```php
if (preg_match('~/[0-9A-Za-z_-]+_archive/[0-9]+$~', $_SERVER['HTTP_REFERER'], $matches)
    || preg_match('~/[0-9A-Za-z_-]+_archive$~', $_SERVER['HTTP_REFERER'], $matches)) {
    $page_info['parent_page']['page_label'] = ltrim($matches[0], '/');
}
```

Solution
Consider your first pattern:
```
~/[0-9A-Za-z_-]+_archive/[0-9]+$~
```

Let's break it down:

1. `/` a literal slash
2. `[0-9A-Za-z_-]+` one or more of `0-9`, `A-Z`, `a-z`, `_` or `-`
3. `_archive` a literal string `_archive`
4. `/` literal slash again
5. `[0-9]+` one or more digits
6. `$` the end of the string must follow the one or more digits
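As a quick check of that breakdown, here is a minimal sketch that runs the first pattern against the two URL endings from the question. The variable name `$pattern` is only for illustration; the sample strings are the ones the question mentions.

```php
<?php
// The first pattern from the question: it requires a trailing "/digits" segment.
$pattern = '~/[0-9A-Za-z_-]+_archive/[0-9]+$~';

// preg_match() returns 1 when the pattern matches and 0 when it does not.
var_dump(preg_match($pattern, '/news_archive/5')); // int(1)
var_dump(preg_match($pattern, '/news_archive'));   // int(0)
```

The second call fails only because pieces 4 and 5 are mandatory in this pattern.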
So basically you want to make #4 and #5 optional. To be more specific, you want either both 4 and 5, or neither 4 nor 5.
Consider this:
```
(a[b]+)?
```

This means that you have one a followed by one or more b, and that this grouped a/b entity is optional.

Letting a be #4 and b be digits like in #5, we're left with:

```
(/[0-9]+)?
```

Or, as the whole pattern:
```
~/[0-9A-Za-z_-]+_archive(/[0-9]+)?$~
```

This will capture the entire group though, slash included, like `/5`:

```
php -r "preg_match('~/[0-9A-Za-z_-]+_archive(/[0-9]+)?$~', '/news_archive/5', $m); var_dump($m);"
array(2) {
  [0] =>
  string(15) "/news_archive/5"
  [1] =>
  string(2) "/5"
}
```

You can just add another group to remedy that though:
```
~/[0-9A-Za-z_-]+_archive(/([0-9]+))?$~
```

Example:

```
php -r "preg_match('~/[0-9A-Za-z_-]+_archive(/([0-9]+))?$~', '/news_archive/44', $m); var_dump($m);"
array(3) {
  [0] =>
  string(16) "/news_archive/44"
  [1] =>
  string(3) "/44"
  [2] =>
  string(2) "44"
}
```

You could technically make the outside group a non-capturing group (like `(?:/([0-9]+))?`), but I don't think the added complication is worth not grabbing the / part too.

(By the way, sorry if you're familiar with regex and you found this excessive. I tend to take a very verbose approach to any regex-related question :).)
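To tie this back to the code in the question, here is a rough sketch of the two preg_match calls folded into one. The helper name `parse_archive_url`, its array keys, and the example.com URLs are hypothetical and not part of the original post; it also assumes PHP 7.1 or later for the nullable return type.

```php
<?php
// Hypothetical helper (not from the original post): returns the page label and
// the optional page number, or null when the URL does not end in an archive segment.
function parse_archive_url(string $url): ?array
{
    // Outer group makes "/digits" optional; inner group captures only the digits.
    $pattern = '~/[0-9A-Za-z_-]+_archive(/([0-9]+))?$~';

    if (preg_match($pattern, $url, $m) !== 1) {
        return null;
    }

    return [
        // Same value the question's code stores in page_label.
        'label' => ltrim($m[0], '/'),
        // Trailing groups that did not take part in the match are left out of $m,
        // so fall back to null when the optional segment is absent.
        'page'  => $m[2] ?? null,
    ];
}

var_dump(parse_archive_url('http://example.com/news_archive/5'));
var_dump(parse_archive_url('http://example.com/news_archive'));
var_dump(parse_archive_url('http://example.com/contact'));
```

In the question's context the argument would be something like `$_SERVER['HTTP_REFERER'] ?? ''`, since that header is not guaranteed to be set.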
Context
StackExchange Code Review Q#16230, answer score: 3