HiveBrain v1.2.0
Get Started
← Back to all entries
patternphpModerate

Regex to validate font names

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
validateregexfontnames

Problem

/((([\w -]+)|("[\w -]+"))( *, *)?)+/


http://refiddle.com/18ql

I'm trying to use a PHP regex to sanitize a user input for a list of fonts. The above one seems to work nicely, but also feels a bit long and redundant, having two [\w -]*s, and allowing a trailing comma, and possibly stuff I can't see. Can this be more efficient (smaller, less redundant, more secure)?

Examples

The regex should match these:

Roboto
"Roboto"
"Roboto Condensed"
Roboto Condensed
Roboto Condensed, Roboto
"Roboto Condensed", Roboto
Roboto Condensed, "Roboto"
"Roboto Condensed", "Roboto", sans-serif


or any string that matches the W3C specification for a CSS font-family property

It must be able to match these, but not any preceding (e.g. /*) or succeeding (e.g. ; DROP TABLE foo) string that might cause errors and open up exploits. So, it should match the font-family list (bold) in the following example, but not the surrounding potentially malicious characters.


/*"Roboto Slab", "Helvetica Neue", "Arial", sans-serif; DROP TABLE foo --

Similarly, I want it to also remove strings that someone might not know would produce broken code:


font-family:"Roboto Slab", "Helvetica Neue", "Arial", sans-serif; / My fonts /

My input-handling function:

$userFontName = sanitizeFontName($_GET['font'])

function sanitizeFontName($fontName)
{
$match = null;
preg_match('/((([\w -]+)|("[\w -]+"))( , )?)+/', $fontName, $match); // Only get a W3C font list. Proven here: http://refiddle.com/18ql
return $match[0]; // only one
}


Demo

Solution

Your approach is completely wrong.

There is a difference between escaping and validation

Validation is the act of making sure a piece of input conforms to certain rules (be it logical, or business rules). It should not be used for security. You can't catch all possible edge cases with a blacklist, don't try.

Escaping (or Sanitization) is the act of escaping characters with special meaning and replacing them with another character or a string that will have the meaning of the literal character in the target context (for example, replacing < with &lt; in HTML). Escaping can and should be used for making sure nothing with special meaning enters the target context (HTML, MySQL query, JavaScript, etc).

In your case, the context is a MySQL query, you're trying to prevent SQL injection. The solution to SQL injection is not validation. It's prepared statements. So:

$pdo = //Instantiate PDO connection.
$stmt = $pdo->prepare("INSERT INTO fonts(id) VALUES (:id)");
$stmt->bindValue(":id", "1; DROP TABLE fonts -- "); //Unsafe input
$stmt->execute(); //Will not drop the table. No results will be returned.


With an insert, the entire string 1; DROP TABLE fonts -- will be inserted.

Sometime in the future (when you try to fetch the font from the database), and before you output it into CSS, you'll want to escape the string for this new context. CSS escaping is something that hasn't a native PHP API, but there are probably libraries around.

Code Snippets

$pdo = //Instantiate PDO connection.
$stmt = $pdo->prepare("INSERT INTO fonts(id) VALUES (:id)");
$stmt->bindValue(":id", "1; DROP TABLE fonts -- "); //Unsafe input
$stmt->execute(); //Will not drop the table. No results will be returned.

Context

StackExchange Code Review Q#57528, answer score: 10

Revisions (0)

No revisions yet.