patternphpMinor
Filtering data from users
Viewed 0 times
fromdatausersfiltering
Problem
I have a small function for filtering data from users. I'd like a review on logical mistakes I've made and if I'm somewhat protected using it.
```
public static function filtru($str, $filtre = []){
if(is_array($str)){//verifica daca datele trimise spre curatare sant array
$str_curat = [];
foreach($str as $key => $value){//selectam fiecare valoare din array-ul trimis spre curatare
foreach($filtre as $fitru){//aplicam fiecare filtru, daca se folosesc mai multe
switch($fitru){
case 'trim':
$value = htmlentities(trim(strip_tags($value)), ENT_QUOTES, 'utf-8');
break;
case 'int':
$value = intval($value);
break;
case 'nu':
$text = $value;$striptags = true;
$search = ["40","41","58","65","66","67","68","69","70",
"71","72","73","74","75","76","77","78","79","80","81",
"82","83","84","85","86","87","88","89","90","97","98",
"99","100","101","102","103","104","105","106","107",
"108","109","110","111","112","113","114","115","116",
"117","118","119","120","121","122"
];
$replace = ["(",")",":","a","b","c","d","e","f","g","h",
"i","j","k","l","m","n","o","p","q","r","s","t","u",
"v","w","x","y","z","a","b","c","d","e","f","g","h",
"i","j","k","l","m","n","o","p","q","r","s","t","u",
"v","w","x","y","z"
];
$entities = count($search);
for ($i=0; $i ]+[/\"\'\s])(onmouseover|onmousedown|onmouseup|onmouseout|onmousemove|onclick|ondblclick|onfocus|onload|xmlns)[^>]
```
public static function filtru($str, $filtre = []){
if(is_array($str)){//verifica daca datele trimise spre curatare sant array
$str_curat = [];
foreach($str as $key => $value){//selectam fiecare valoare din array-ul trimis spre curatare
foreach($filtre as $fitru){//aplicam fiecare filtru, daca se folosesc mai multe
switch($fitru){
case 'trim':
$value = htmlentities(trim(strip_tags($value)), ENT_QUOTES, 'utf-8');
break;
case 'int':
$value = intval($value);
break;
case 'nu':
$text = $value;$striptags = true;
$search = ["40","41","58","65","66","67","68","69","70",
"71","72","73","74","75","76","77","78","79","80","81",
"82","83","84","85","86","87","88","89","90","97","98",
"99","100","101","102","103","104","105","106","107",
"108","109","110","111","112","113","114","115","116",
"117","118","119","120","121","122"
];
$replace = ["(",")",":","a","b","c","d","e","f","g","h",
"i","j","k","l","m","n","o","p","q","r","s","t","u",
"v","w","x","y","z","a","b","c","d","e","f","g","h",
"i","j","k","l","m","n","o","p","q","r","s","t","u",
"v","w","x","y","z"
];
$entities = count($search);
for ($i=0; $i ]+[/\"\'\s])(onmouseover|onmousedown|onmouseup|onmouseout|onmousemove|onclick|ondblclick|onfocus|onload|xmlns)[^>]
Solution
- Your function should be broken into many more functions. Each different type of escaping should be its own function.
- You should either use separate functions where the array version calls the non-array version, or youo should try to fold your array and non array logic into one
$wasArray = is_array($str); if (!$wasArray) { $str = array($str); } foreach ($str as $s) { ... } return ($wasArray) ? $str : array_shift($str);
- Your
trimmethod is a lie -- it doesn't just trim. What if you want to trim without html escaping?
- If you did leave them all in one method, consider using class constants instead of strings to control the behavior. That's a lot less error prone, and it makes it easier to trace usage through code (and if you rename something later, it will be a million times easier).
- What is your
nusanitation doing? It's like it's processing HTML, PHP and JavaScript?
- With HTML, you're typically better off with a whitelist than trying to sanitize, and then escaping everything else.
- With PHP, you're typically better off just not running code that you don't 100% trust
- JS falls into the HTML category. If HTML is whitelisted, it should be impossible for a user to get JS through.
You should consider if you really need all of this. In a lot of situations, it's sufficient to simply escape and not sanitize (though one could argue that escape is a form of sanitizing).
(There are of course situations where you really do have to process the hell out of user input though. Consider the StackExchange questions/answers/comments. They allow a very limited subset of HTML which means that their processor has to be aware of what to escape, what to leave alone.)
Context
StackExchange Code Review Q#18471, answer score: 3
Revisions (0)
No revisions yet.