HTML Compressor with regex
Problem
I would like to compress a Magento HTML page using some regex, and this is what I have written:
```
function html_compress($string){
global $idarray;
$idarray=array();
//Replace PRE and TEXTAREA tags
$search=array(
'@(]?)(>)([\s\S]?)()@', //Find PRE Tag
'@(]?)(>)([\s\S]?)()@' //Find TEXTAREA
);
$string=preg_replace_callback($search,
function($m){
$id='';
global $idarray;
$idarray[]=array($id,$m[0]);
return $id;
},
$string
);
//Remove blank useless space
$search = array(
'@( |\t|\f)+@', // Shorten multiple whitespace sequences
'@(^[\r\n]|[\r\n]+)[\s\t][\r\n]+@', //Remove blank lines
'@^(\s)+|( |\t|\0|\r\n)+$@' //Trim Lines
);
$replace = array(' ',"\\1",'');
$string = preg_replace($search, $replace, $string);
//Replace IE COMMENTS, SCRIPT, STYLE and CDATA tags
$search=array(
'@))*@', //Find IE Comments
'@(]?)(>)([\s\S]?)()@', //Find SCRIPT Tag
'@(]?)(>)([\s\S]?)()@', //Find STYLE Tag
'@(//)@', //Find commented CDATA
'@()@' //Find CDATA
);
$string=preg_replace_callback($search,
function($m){
$id='';
global $idarray;
$idarray[]=array($id,$m[0]);
return $id;
},
$string
);
//Remove blank useless space
$search =
```
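As a minimal sketch of the placeholder idea the listing is built around (set PRE and TEXTAREA content aside, collapse whitespace, then put the protected blocks back), something like the following could work. It is not the asker's code: the function name html_compress_sketch, the token format, and the patterns are all assumptions chosen for illustration, and a regex-based approach remains fragile on unusual markup.

```php
<?php
// Hypothetical sketch, not the original html_compress(): names and patterns are assumptions.
function html_compress_sketch(string $html): string
{
    $protected = [];

    // Swap every <pre>/<textarea> element for a unique token so that
    // the whitespace inside it survives the collapsing step below.
    $html = preg_replace_callback(
        '@<(pre|textarea)\b[^>]*>.*?</\1\s*>@is',
        function (array $m) use (&$protected): string {
            $token = '%%PROTECTED_' . count($protected) . '%%';
            $protected[$token] = $m[0];
            return $token;
        },
        $html
    ) ?? $html;

    // Collapse every run of whitespace (spaces, tabs, newlines) to one space.
    $html = trim(preg_replace('@\s+@', ' ', $html) ?? $html);

    // Put the protected blocks back, byte for byte.
    return $protected ? strtr($html, $protected) : $html;
}

$page = "<div>\n   <p>Hello    world</p>\n<pre>  keep\n   this  </pre>\n</div>";
echo html_compress_sketch($page), "\n";
```

Passing the array by reference into the closure avoids the global variable used in the question, which makes the function safe to call more than once per request.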
Solution
See https://stackoverflow.com/a/6225706/736079 for comments on enabling content compression (for example gzip) for your HTML pages. That alone is usually enough to reduce the payload by more than 50%.
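To get a feel for what content compression does to repetitive markup, here is a small sketch that gzips a made-up page in PHP and compares the sizes. The sample HTML is hypothetical, and the ratio on a real Magento page will differ. In production the compression is normally left to the web server or to PHP's ob_gzhandler output callback rather than done by hand like this.

```php
<?php
// Hypothetical sample page: repetitive markup, as generated HTML tends to be.
$html = '<ul>'
    . str_repeat('<li class="item"><a href="/product">Product</a></li>' . "\n", 200)
    . '</ul>';

// gzencode() comes from PHP's zlib extension; 6 is a mid-range compression level.
$compressed = gzencode($html, 6);

printf(
    "original: %d bytes, gzipped: %d bytes\n",
    strlen($html),
    strlen($compressed)
);
```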
You can also use output buffering and combine it with the Minify_HTML::minify function from the minify library:
See: https://github.com/mrclay/minify/blob/master/min/lib/Minify/HTML.php
This still uses regular expressions at its core, which is not ideal, but it has been tested by a larger audience and looks quite solid (test it against your own pages to make sure). Minification is also not limited to PHP: sometimes the web server itself has plugins that can help, such as mod_pagespeed for Apache. If you are hosting on IIS, you might be able to use a .NET HttpModule or an ISAPI filter as well.
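The mechanics of an output-buffer callback can be tried without installing the library. In the sketch below, collapse_whitespace is a hypothetical stand-in for a real minifier, and the outer buffer exists only so the result can be captured and inspected. The point to notice is that the callback has to return the transformed string, because whatever it returns is what PHP sends on.

```php
<?php
// Hypothetical stand-in for a real minifier such as Minify_HTML::minify().
function collapse_whitespace(string $buffer): string
{
    return trim(preg_replace('/\s+/', ' ', $buffer) ?? $buffer);
}

ob_start();                       // outer buffer, used here only to capture the result
ob_start('collapse_whitespace');  // inner buffer: its contents go through the callback

echo "<p>\n    Hello,\n    world\n</p>\n";

ob_end_flush();                   // runs the callback and passes its return value outward
$result = ob_get_clean();         // what would otherwise have been sent to the client

echo $result, "\n";
```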
Code Snippets
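```
<?php
// Output-buffer callback: PHP passes in the buffered page and sends whatever is returned.
// Assumes the Minify_HTML class (min/lib/Minify/HTML.php) has already been loaded.
function sanitize_output($content) {
    return Minify_HTML::minify($content);
}
ob_start("sanitize_output");
?>
```
Context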
StackExchange Code Review Q#74261, answer score: 3