TalkPHP
 
 
05-26-2008, 08:00 PM   #1
Kalle
PHP Compressor

Hi Talk'ing PHP'ers :P

This is my first script for the script giveaway! It's called PHP Compressor and simply compresses PHP source code to a smaller file size by removing whitespace and comments. There's also an option to enable GZIP compression on it.

Source:
PHP Code:
<?php
    /**
     * PHP Compressor
     * ========================================================================
     *
     * @author    Kalle Sommer Nielsen <kalle@php.net>
     * @package    PHP_Compressor
     * @version    1.0
     * @license    http://www.php.net/license/ The PHP License v3.01
     * @copyright    2002+
     *
     * ========================================================================
     */


    /**
     * Standard compression with no options
     *
     * @var        integer
     */
    define('COMPRESS_STANDARD',        0);

    /**
     * Compression without comments
     *
     * @var        integer
     */
    define('COMPRESS_STRIP_COMMENTS',     1);

    /**
     * Compression with GZIP
     *
     * @var        integer
     */
    define('COMPRESS_GZIP',            2);

    /**
     * Compression with all options
     *
     * @var        integer
     */
    define('COMPRESS_ALL',            3);


    /**
     * Compress PHP code to a smaller size where possible
     *
     * Example of usage:
     * <code>
     * <?php
     *    // Include the compression function
     *    require_once './phpcompress.php';
     *
     *    echo htmlentities(php_compress(file_get_contents(__FILE__), COMPRESS_STRIP_COMMENTS));
     * ?>
     * </code>
     *
     * @param    string        Code to compress
     * @param    integer        Options bitfield
     * @return    string        Returns compressed string
     *
     * @see        COMPRESS_STANDARD
     * @see        COMPRESS_STRIP_COMMENTS
     * @see        COMPRESS_GZIP
     * @see        COMPRESS_ALL
     */
    function php_compress($code, $flags = COMPRESS_STANDARD)
    {
        static $magic_defines;

        $code           = (string) $code;
        $strip_comments = (boolean) ($flags & COMPRESS_STRIP_COMMENTS);

        if(empty($code))
        {
            return('');
        }

        $tokens = token_get_all($code);

        if(!sizeof($tokens))
        {
            return('');
        }

        /** Magic defines for older versions */
        if(!$magic_defines)
        {
            $magic_defines = Array();

            /** PHP 5.0 */
            if(!defined('T_DOC_COMMENT'))
            {
                $magic_defines['abstract']   = 345;
                $magic_defines['clone']      = 298;
                $magic_defines['const']      = 334;
                $magic_defines['final']      = 344;
                $magic_defines['implements'] = 355;
                $magic_defines['instanceof'] = 288;
                $magic_defines['interface']  = 353;
                $magic_defines['private']    = 343;
                $magic_defines['protected']  = 342;
                $magic_defines['public']     = 341;
                $magic_defines['throw']      = 338;
            }
        }

        $in_php        = false;
        $compiled_code = '';
        $last_token    = $tokens[0];

        foreach($tokens as $no => $token)
        {
            $is_char = !is_array($token);

            if($no)
            {
                $last_token = $tokens[($no - 1)];
            }

            if($in_php)
            {
                if(!$is_char)
                {
                    if($token[0] == T_STRING)
                    {
                        /**
                         * This provides compatibility for older versions of PHP, note
                         * that line numbers aren't currently 
                         */
                        if(array_key_exists(strtolower($token[1]), $magic_defines))
                        {
                            $token = Array(
                                    $magic_defines[strtolower($token[1])], 
                                    $token[1]
                                    );
                        }
                    }

                    if(!defined('T_DOC_COMMENT') && $token[0] == T_ML_COMMENT)
                    {
                        /**
                         * Cross version patch for multi line comments / document block
                         * comments
                         */
                        $token[0] = 366;
                    }

                    /**
                     * Note that numbers are used here where tokens aren't available from 
                     * PHP 4.0 in order to prevent defining them and breaking other scripts 
                     * that may rely on them being / not being defined
                     */
                    switch($token[0])
                    {
                        case(T_CLOSE_TAG):
                        {
                            $in_php         = false;
                            $compiled_code .= ' ' . $token[1];

                            continue 2;
                        }
                        break;
                        case(T_WHITESPACE):
                        {
                            /**
                             * We do not need to count whitespace tokens 
                             * in the last tokens array
                             */
                            continue 2;
                        }
                        break;
                        case(T_EXTENDS):
                        case(T_FUNCTION):
                        case(355):
                        case(288):
                        case(T_AS):
                        case(T_LOGICAL_OR):
                        {
                            /** 
                             * These need a space in front and behind to 
                             * prevent a parse error
                             */
                            $token[1] = ' ' . $token[1] . ' ';
                        }
                        break;
                        case(345):
                        case(T_CASE):
                        case(T_CLASS):
                        case(298):
                        case(334):
                        case(T_CONTINUE):
                        case(344):
                        case(T_GLOBAL):
                        case(353):
                        case(T_NEW):
                        case(343):
                        case(342):
                        case(341):
                        case(T_RETURN):
                        case(T_STATIC):
                        case(338):
                        {
                            /**
                             * All these just need a space behind them to 
                             * prevent a parse error (for T_CONTINUE, so that 
                             * "continue 2" does not collapse into "continue2")
                             */
                            $token[1] .= ' ';
                        }
                        break;
                        case(T_COMMENT):
                        case(366):
                        {
                            /**
                             * For comments
                             */
                            if($strip_comments)
                            {
                                continue 2;
                            }
                            elseif(!$strip_comments && $token[0] == T_COMMENT && (substr($token[1], 0, 2) == '//' || $token[1][0] == '#'))
                            {
                                /**
                                 * C++/Perl style comments need a new line after them
                                 */
                                $token[1] .= "\r\n";
                            }
                        }
                        break;
                    }
                }
            }

            if($in_php)
            {
                /**
                 * Optimization at its best: truncate the space added after return 
                 * to save one byte if the space isn't needed there
                 */
                if($last_token && $last_token[0] == T_RETURN && $is_char && $token == ';')
                {
                    $compiled_code = substr($compiled_code, 0, -1);
                }

                $compiled_code .= ($is_char ? $token : $token[1]);
            }

            if(!$in_php && (!$is_char && $token[0] != T_CLOSE_TAG))
            {
                $compiled_code .= trim($token[1]);

                if($token[0] != T_OPEN_TAG)
                {
                    continue;
                }

                $in_php         = true;
                $compiled_code .= ' ';
            }
        }

        /**
         * Compress if possible
         */
        if(($flags & COMPRESS_GZIP) && function_exists('gzdeflate'))
        {
            $compiled_code = '<?php ob_start(); ?>' . str_replace('<?', '&lt;?', gzdeflate('?>' . $compiled_code . '<?php ', 9)) . '<?php eval(gzinflate(str_replace(\'&lt;?\', \'<?\', ob_get_clean()))); ?>';
        }

        return($compiled_code);
    }

    
    /**
     * Compresses a PHP file to a smaller size where possible
     *
     * Example of usage:
     * <code>
     * <?php
     *    // Include the compression function
     *    require_once './phpcompress.php';
     *
     *    php_compress_file(__FILE__, COMPRESS_STRIP_COMMENTS) or die('Compression failed!');
     *
     *    echo htmlentities(file_get_contents(__FILE__));
     * ?>
     * </code>
     *
     * @param    string        PHP file to compress
     * @param    integer        Options bitfield
     * @return    boolean        True if all operations were successful, otherwise false
     *
     * @see        php_compress()
     */
    function php_compress_file($filename, $flags = COMPRESS_STANDARD)
    {
        $code = @file_get_contents($filename);

        if(!$code)
        {
            return(false);
        }

        return((boolean) @file_put_contents($filename, php_compress($code, $flags)));
    }
?>
All documentation is placed in the docblocks and should pass if you run it through a program like phpDocumentor.

An example of the output with whitespace and comments removed looks something like this:

PHP Code:
<?php define('COMPRESS_STANDARD',0);define('COMPRESS_STRIP_COMMENTS',1);define('COMPRESS_GZIP',2);define('COMPRESS_ALL',3); function php_compress($code,$flags=COMPRESS_STANDARD){static $magic_defines;$code=(string)$code;$strip_comments=(boolean)($flags&COMPRESS_STRIP_COMMENTS);if(empty($code)){return ('');}$tokens=token_get_all($code);if(!sizeof($tokens)){return ('');}if(!$magic_defines){$magic_defines=Array();if(!defined('T_DOC_COMMENT')){$magic_defines['abstract']=345;$magic_defines['clone']=298;$magic_defines['const']=334;$magic_defines['final']=344;$magic_defines['implements']=355;$magic_defines['instanceof']=288;$magic_defines['interface']=353;$magic_defines['private']=343;$magic_defines['protected']=342;$magic_defines['public']=341;$magic_defines['throw']=338;}}$in_php=false;$compiled_code='';$last_token=$tokens[0];foreach($tokens as $no=>$token){$is_char=!is_array($token);if($no){$last_token=$tokens[($no-1)];}if($in_php){if(!$is_char){if($token[0]==T_STRING){if(array_key_exists(strtolower($token[1]),$magic_defines)){$token=Array($magic_defines[strtolower($token[1])],$token[1]);}}if(!defined('T_DOC_COMMENT')&&$token[0]==T_ML_COMMENT){$token[0]=366;}switch($token[0]){case (T_CLOSE_TAG):{$in_php=false;$compiled_code.=' '.$token[1];continue 2;}break;case (T_WHITESPACE):{continue 2;}break;case (T_EXTENDS):case (T_FUNCTION):case (355):case (288):case (T_AS):case (T_LOGICAL_OR):{$token[1]=' '.$token[1].' ';}break;case (345):case (T_CASE):case (T_CLASS):case (298):case (334):case (T_CONTINUE):case (344):case (T_GLOBAL):case (353):case (T_NEW):case (343):case (342):case (341):case (T_RETURN):case (T_STATIC):case (338):{$token[1].=' ';}break;case (T_COMMENT):case (366):{if($strip_comments){continue 2;}elseif(!$strip_comments&&$token[0]==T_COMMENT&&(substr($token[1],0,2)=='//'||$token[1][0]=='#')){$token[1].="\r\n";}}break;}}}if($in_php){if($last_token&&$last_token[0]==T_RETURN&&$is_char&&$token==';'){$compiled_code=substr($compiled_code,0,-1);}$compiled_code.=($is_char?$token:$token[1]);}if(!$in_php&&(!$is_char&&$token[0]!=T_CLOSE_TAG)){$compiled_code.=trim($token[1]);if($token[0]!=T_OPEN_TAG){continue ;}$in_php=true;$compiled_code.=' ';}}if(($flags&COMPRESS_GZIP)&&function_exists('gzdeflate')){$compiled_code='<?php ob_start(); ?>'.str_replace('<?','&lt;?',gzdeflate('?>'.$compiled_code.'<?php ',9)).'<?php eval(gzinflate(str_replace(\'&lt;?\', \'<?\', ob_get_clean()))); ?>';}return ($compiled_code);} function php_compress_file($filename,$flags=COMPRESS_STANDARD){$code=@file_get_contents($filename);if(!$code){return (false);}return ((boolean)@file_put_contents($filename,php_compress($code,$flags)));} ?>
Usage:
Simply call php_compress() with a string of PHP code to compress as the first parameter. The string may contain HTML and jump in and out of the PHP tags; the compressor will only compress what's inside the PHP tags.
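To see why the compressor can tell the two apart, here is a minimal sketch of how PHP's tokenizer reports a mixed document. The input string is a made-up example and the snippet is not part of the script above; it only relies on token_get_all() and the T_INLINE_HTML token.

```php
<?php
// Hypothetical input: HTML with two embedded PHP blocks.
$source = "<h1>Title</h1>\n<?php echo 'a'; ?>\n<p>text</p>\n<?php echo 'b'; ?>";

$inline_html = array();

foreach (token_get_all($source) as $token) {
    // Single-character tokens come back as plain strings; skip them here.
    if (is_array($token) && $token[0] === T_INLINE_HTML) {
        $inline_html[] = $token[1];
    }
}

// Everything collected here lies outside the PHP tags.
print_r($inline_html);
```

Anything reported as T_INLINE_HTML sits outside the PHP tags, which is how a token-based tool can treat it differently from the code.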

You may pass a second parameter to php_compress() that tells the compressor what you want compressed. Currently there are two options. They are combined as a bitfield, and you can use the constants defined at the start of the script.

COMPRESS_STANDARD - The default; doesn't remove comments or use GZIP
COMPRESS_STRIP_COMMENTS - Strip comments
COMPRESS_GZIP - Compress using GZIP
COMPRESS_ALL - (Same as "COMPRESS_STRIP_COMMENTS | COMPRESS_GZIP")
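A small sketch of how such a bitfield behaves. The constant values are the ones from the script; the defined() guards and the variable names are only there so the snippet runs on its own.

```php
<?php
// Same values as in the script above; guarded so this runs standalone.
defined('COMPRESS_STANDARD')       or define('COMPRESS_STANDARD', 0);
defined('COMPRESS_STRIP_COMMENTS') or define('COMPRESS_STRIP_COMMENTS', 1);
defined('COMPRESS_GZIP')           or define('COMPRESS_GZIP', 2);
defined('COMPRESS_ALL')            or define('COMPRESS_ALL', 3);

// Combine options with bitwise OR...
$flags = COMPRESS_STRIP_COMMENTS | COMPRESS_GZIP;

// ...and test for a single option with bitwise AND, as php_compress() does.
$wants_comments_stripped = (bool) ($flags & COMPRESS_STRIP_COMMENTS);
$wants_gzip              = (bool) ($flags & COMPRESS_GZIP);

var_dump($flags === COMPRESS_ALL);                    // bool(true)
var_dump($wants_comments_stripped);                   // bool(true)
var_dump((bool) (COMPRESS_STANDARD & COMPRESS_GZIP)); // bool(false)
```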

There's also a second function, php_compress_file(), which lets you compress a file by passing the file name as the first parameter instead of the code. The second parameter may be passed with options just like in php_compress().
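Since php_compress_file() writes the result back to the same file it read, it may be safer to run it on a copy. A hedged sketch: the temp file names are hypothetical, and the function_exists() guard is only there so the snippet runs even when the script above has not been included.

```php
<?php
// Hypothetical paths, for illustration only.
$original = sys_get_temp_dir() . '/example_original.php';
$working  = sys_get_temp_dir() . '/example_compressed.php';

file_put_contents($original, "<?php\n// a comment\necho 'hi';\n");

// php_compress_file() overwrites the file it reads, so compress a copy.
copy($original, $working);

if (function_exists('php_compress_file')) {
    // Only available once the script from the first post has been included.
    php_compress_file($working, COMPRESS_STRIP_COMMENTS) or die('Compression failed!');
}

// The original is untouched either way.
echo file_get_contents($original);
```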



I did some testing on SimplePie, if anyone knows it: with comments and whitespace stripped I got the file size from 279 KB down to 193 KB, and with GZIP I got it down to 42 KB.

Of course, the smallest size comes with the lowest speed, because the GZIP variant has to inflate the binary data every time the file runs. This is around 20 times slower than the normal compression.
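The trade-off can be sketched with the same zlib functions the script uses. This assumes the zlib extension is loaded; the payload is invented, so the byte counts will differ from the SimplePie figures above.

```php
<?php
// Hypothetical payload: repetitive text compresses well, much like source code.
$code = str_repeat("\$total = \$total + \$value; // add the value\n", 200);

$deflated = gzdeflate($code, 9);  // level 9 = best compression, as in the script
$restored = gzinflate($deflated); // this step has to run on every request

printf("original: %d bytes, deflated: %d bytes\n", strlen($code), strlen($deflated));
var_dump($restored === $code); // bool(true)
```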

My small benchmark on my PC also indicated that the compressed version (comments/whitespace stripped) was about 0.0002 to 0.0003 times faster than the uncompressed one.


Note: I tried to implement a compatibility patch to make even PHP 4.0.0 tokenize PHP 5.0.0+ code properly, but it's not fully tested!

Another note: I know the generated GZIP'ed code isn't the best, but it was a better way than using base64 encoding for the binary data.
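A rough way to compare the two approaches, assuming the zlib extension is available. The payload and variable names are made up; the str_replace() call mirrors the escaping used in the script.

```php
<?php
// Hypothetical payload standing in for a compressed script.
$code     = str_repeat("echo 'Hello World!';\n", 500);
$deflated = gzdeflate($code, 9);

// Option 1: base64 is plain text, but always larger than its input.
$as_base64 = base64_encode($deflated);

// Option 2 (the script's approach): keep the raw bytes and only escape
// "<?" so the binary data cannot be mistaken for a PHP open tag.
$escaped = str_replace('<?', '&lt;?', $deflated);

printf("raw deflate: %d bytes\n", strlen($deflated));
printf("base64:      %d bytes\n", strlen($as_base64));
printf("escaped:     %d bytes\n", strlen($escaped));
```

base64 turns every 3 bytes into 4 characters, so it costs roughly a third more, while the escaped form only grows where the two-byte sequence actually occurs.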


Anyway, I hope this will be as useful to some of you as it may become to me ;)

Last edited by Kalle : 05-28-2008 at 12:13 AM.
The Following 5 Users Say Thank You to Kalle For This Useful Post:
ETbyrne (05-27-2008), Matt (05-27-2008), ReSpawN (05-26-2008), sketchMedia (05-27-2008), Wildhoney (05-27-2008)

05-27-2008, 03:35 PM   #2
ETbyrne

Looks cool, but would this also compress HTML? Example:
PHP Code:
<?php
echo "Hello World!";
?>
Would that become this?
PHP Code:
<?php echo"HelloWorld!";?>
__________________
Dingo Web Systems > http://www.dingocode.com
My Website > http://www.evanbot.com

05-27-2008, 03:55 PM   #3
Salathe

If you don't want to use this particular function, you can also use the built-in php_strip_whitespace(), which leaves a little bit more whitespace in the resulting code, but not much.

ETbyrne, no: the resulting code will be functionally identical to the original, and any functional whitespace (cf. presentational whitespace in the code) will be left alone.
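For comparison, a minimal sketch of the built-in. Note that php_strip_whitespace() takes a file name rather than a code string; the temp file and its contents here are invented for illustration.

```php
<?php
// Hypothetical temp file, for illustration only.
$file = tempnam(sys_get_temp_dir(), 'psw');

file_put_contents($file, "<?php\n// a comment\necho   'Hello World!';\n\n/* another */\n\$a  =  1;\n");

// Returns the source with comments and extra whitespace removed.
$stripped = php_strip_whitespace($file);

echo $stripped, "\n";
unlink($file);
```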

05-27-2008, 04:41 PM   #4
ETbyrne

Great! Can't wait to compress Kudos now...

05-27-2008, 05:13 PM   #5
delayedinsanity

Well that blows my little hack away... All I've done to date as far as compression goes was to add this to the top of my config file (included first on every page, of course):

PHP Code:
ob_start("compress");

function compress($buffer) {
    return str_replace(array("\r\n", "\r", "\n", "\t", '  ', '    ', '    '), '', $buffer);
}

Looks pretty caveman now...
-m

05-27-2008, 07:54 PM   #6
Kalle

Like Salathe said, it does not alter the functionality, meaning it will not edit anything like the values of your strings (T_CONSTANT_ENCAPSED_STRING).
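A stripped-down sketch of the idea, using ETbyrne's example as input. This is a simplification for illustration, not the php_compress() function itself: it only drops T_WHITESPACE tokens and copies everything else through unchanged.

```php
<?php
$source = "<?php\necho \"Hello World!\";\n?>";

$out = '';

foreach (token_get_all($source) as $token) {
    if (is_array($token)) {
        if ($token[0] === T_WHITESPACE) {
            continue; // whitespace between tokens is dropped
        }
        $out .= $token[1]; // string literals arrive as one token, spaces included
    } else {
        $out .= $token; // single-character token such as ';'
    }
}

echo $out, "\n";
```

The space inside "Hello World!" belongs to the string token, not to a whitespace token, so it survives.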

And thanks for all the thanks, they are greatly appreciated

05-27-2008, 07:56 PM   #7
Salathe

delayedinsanity, it looks like your compression is something different from Kalle's. It looks like you're removing whitespace from output (HTML) whereas his is removing whitespace from the PHP code itself.

05-27-2008, 10:09 PM   #8
delayedinsanity

I just meant similar in nature because they both wind up outputting a one liner of text, and mine does look like caveman PHP in comparison... no elegance.

BTW, line 222, there's a misspelled variable,

PHP Code:
...
elseif(!$stip_comments && ... 
I think this is a case of, Those who can, do, those who can't, debug.
-m
The Following User Says Thank You to delayedinsanity For This Useful Post:
Kalle (05-28-2008)

05-28-2008, 12:14 AM   #9
Kalle

Cheers delayedinsanity, I updated the source with that fix. I had also forgotten to change the source code comment to say that it handles Perl style comments (#) as well =)
All times are GMT. The time now is 07:27 PM.

 
     
