
HTML Compressor with regex

Submitted by: @import:stackexchange-codereview

Problem

I would like to compress a Magento HTML page using some regex, and this is what I have written:

```
function html_compress($string){

global $idarray;
$idarray=array();

//Replace PRE and TEXTAREA tags
$search=array(
'@(<pre[^>]*?)(>)([\s\S]*?)(</pre>)@', //Find PRE Tag
'@(<textarea[^>]*?)(>)([\s\S]*?)(</textarea>)@' //Find TEXTAREA
);
$string=preg_replace_callback($search,
function($m){
$id='';
global $idarray;
$idarray[]=array($id,$m[0]);
return $id;
},
$string
);

//Remove blank useless space
$search = array(
'@( |\t|\f)+@', // Shorten multiple whitespace sequences
'@(^[\r\n]|[\r\n]+)[\s\t][\r\n]+@', //Remove blank lines
'@^(\s)+|( |\t|\0|\r\n)+$@' //Trim Lines
);
$replace = array(' ',"\\1",'');
$string = preg_replace($search, $replace, $string);

//Replace IE COMMENTS, SCRIPT, STYLE and CDATA tags
$search=array(
'@))*@', //Find IE Comments
'@(<script[^>]*?)(>)([\s\S]*?)(</script>)@', //Find SCRIPT Tag
'@(<style[^>]*?)(>)([\s\S]*?)(</style>)@', //Find STYLE Tag
'@(//)@', //Find commented CDATA
'@()@' //Find CDATA
);
$string=preg_replace_callback($search,
function($m){
$id='';
global $idarray;
$idarray[]=array($id,$m[0]);
return $id;
},
$string
);

//Remove blank useless space
$search =
```

Solution

See https://stackoverflow.com/a/6225706/736079 for comments on enabling content compression for your HTML pages. That alone is usually enough to reduce the payload by more than 50%.
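
To get a rough feel for that figure, the sketch below gzips a sample page in PHP and reports the compressed size as a share of the original. The helper name `gzip_ratio` and the sample markup are made up for illustration, and the code assumes the zlib extension is loaded. In production, compression would normally be switched on in the web server or through the `zlib.output_compression` setting rather than done by hand like this.

```php
<?php
// Hypothetical helper: compressed size divided by original size.
// Assumes the zlib extension (which provides gzencode) is available.
function gzip_ratio(string $html): float
{
    if ($html === '') {
        return 1.0; // nothing to compress
    }
    return strlen(gzencode($html, 6)) / strlen($html);
}

// Made-up sample: repetitive, indented markup like a generated listing page.
$row  = "    <li class=\"item\">\n        <a href=\"/product\">Product</a>\n    </li>\n";
$page = "<html><body><ul>\n" . str_repeat($row, 200) . "</ul></body></html>\n";

printf("gzip leaves %.1f%% of the original size\n", gzip_ratio($page) * 100);
```

This sample is far more repetitive than a real page, so the number it prints is optimistic. Running the same helper on actual rendered output gives a more honest estimate.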

You can also use output buffering and pass the buffered page through Minify_HTML::minify (the callback is shown under Code Snippets below).

See: https://github.com/mrclay/minify/blob/master/min/lib/Minify/HTML.php

This still uses regex at its core, which is not ideal, but it has been tested by a larger audience and looks quite solid (test it against your own pages to make sure). If you are hosting on IIS, you might be able to use a .NET HttpModule or an ISAPI filter as well. Minification isn't limited to PHP: sometimes the web server itself has plugins that can help, such as Apache's mod_pagespeed.
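
One way to act on the "test to make sure" advice is to run a page through an output-buffer callback and check that whitespace-sensitive blocks survive. The sketch below rests on assumptions: `collapse_whitespace` is a hypothetical, deliberately simple stand-in (not Minify_HTML's implementation) that hides `<pre>` and `<textarea>` blocks behind placeholders before collapsing whitespace, and `render_through` is a hypothetical helper that captures what the callback would send to the browser. The real minifier could be dropped in as the callback instead.

```php
<?php
// Hypothetical stand-in minifier, NOT Minify_HTML's actual implementation.
// Assumes the page never contains the literal placeholder text %%KEEPn%%.
function collapse_whitespace(string $html): string
{
    $kept = [];
    // Swap <pre>/<textarea> blocks for placeholders so they stay untouched.
    $html = preg_replace_callback(
        '@<(pre|textarea)\b[^>]*>.*?</\1>@is',
        function (array $m) use (&$kept): string {
            $kept[] = $m[0];
            return '%%KEEP' . (count($kept) - 1) . '%%';
        },
        $html
    );
    // Collapse every whitespace run, newlines included, into one space.
    $html = trim(preg_replace('@\s+@', ' ', $html));
    // Put the protected blocks back in place of their placeholders.
    return preg_replace_callback(
        '@%%KEEP(\d+)%%@',
        function (array $m) use (&$kept): string {
            return $kept[(int) $m[1]];
        },
        $html
    );
}

// Hypothetical helper: run $html through $callback the way ob_start() would.
function render_through(callable $callback, string $html): string
{
    ob_start();           // outer buffer captures the processed output
    ob_start($callback);  // inner buffer applies the callback when flushed
    echo $html;
    ob_end_flush();       // sends the inner buffer through $callback
    return ob_get_clean();
}

$page = "<div>\n    <p>Hello   world</p>\n    <pre>a\n   b</pre>\n</div>\n";
$out  = render_through('collapse_whitespace', $page);

var_dump(strpos($out, "<pre>a\n   b</pre>") !== false); // is the pre block intact?
var_dump(strlen($out) < strlen($page));                  // did the page shrink?
```

The nested buffers are only there so the check can read back what the callback produced. On a live page a single `ob_start()` call with the callback is enough.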

Code Snippets

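<?php
// Assumes Minify_HTML (from mrclay/minify) is already loaded, e.g. by an autoloader.
function sanitize_output($content) {
    // An output-buffer callback must return the string to send to the browser.
    return Minify_HTML::minify($content);
}

// Buffer all output and pass it through sanitize_output() when it is flushed.
ob_start("sanitize_output");
?>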

Context

StackExchange Code Review Q#74261, answer score: 3
