TalkPHP
 
 
05-29-2008, 12:55 AM   #41 (permalink)
delayedinsanity

Multiline regular expressions... is it best just to use the s modifier? For example, to remove all PHP-style comments from a file (no, I'm not trying to make a compression function like Kalle's, just playing around with different methods of templating), I'm currently using:

~/\*.*?\*/~s

What about without the s? I got this to work:

~/\*(.*?\r?\n?)+?\*/~

Can you think of a more efficient way?
-m

Last edited by delayedinsanity : 05-31-2008 at 02:32 PM.

05-31-2008, 11:24 AM   #42 (permalink)
Geert

Your second regex is a nice experiment; however, my guess is that it is horribly inefficient because of the combination of nested (lazy) quantifiers.

Note that there is nothing wrong at all with using the s modifier.
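
To make the effect of the s modifier concrete, here is a minimal sketch. The sample string and variable names are made up for illustration. Without s the dot stops at a newline, so a comment that spans lines is never matched; with s it is.

```php
<?php
// Made-up sample: one comment that spans two lines.
$source = "before /* one\ntwo */ after";

// Without the s modifier "." does not match "\n", so nothing is replaced.
$withoutS = preg_replace('~/\*.*?\*/~', '', $source);

// With the s modifier (PCRE_DOTALL) "." matches newlines as well.
$withS = preg_replace('~/\*.*?\*/~s', '', $source);

var_dump($withoutS === $source);      // bool(true)
var_dump($withS === 'before  after'); // bool(true)
```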

In Friedl's book you'll read about a more efficient way. I believe it also takes into account situations like /* echo '*/'; */.
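
The expression from Friedl's book is not reproduced in this thread, so the sketch below swaps in a different technique: PHP's built-in tokenizer (token_get_all) instead of a regex. It assumes the input is a complete PHP source string including the opening tag, and stripPhpComments is a hypothetical helper name. The point is the related problem of comment markers inside string literals, which a plain regex cannot tell apart from real comments.

```php
<?php
// Hypothetical helper: drop comments using PHP's own tokenizer, so comment
// markers that appear inside string literals are left untouched.
function stripPhpComments(string $source): string
{
    $out = '';
    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            [$id, $text] = $token;
            if ($id === T_COMMENT || $id === T_DOC_COMMENT) {
                continue; // skip comment tokens
            }
            $out .= $text;
        } else {
            $out .= $token; // single-character tokens such as ";"
        }
    }
    return $out;
}

// Made-up sample: a string literal that merely looks like a comment,
// followed by a real comment.
$code = "<?php\necho '/* not a comment */'; /* real comment */\n";

$regexResult = preg_replace('~/\*.*?\*/~s', '', $code);
$tokenResult = stripPhpComments($code);

var_dump($regexResult); // the string literal has been emptied as well
var_dump($tokenResult); // only the real comment is gone
```

The tokenizer is heavier than a single preg_replace call, so for templating experiments on trusted input the regex may still be the more practical choice.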

05-31-2008, 02:31 PM   #43 (permalink)
delayedinsanity

I reduced the laziness a bit by changing it to this: ~/\*(.|[\r\n])*?\*/~ so now we're down to just one lazy quantifier. I don't know much about lookahead and lookbehind assertions, which may be incorporated in the Friedl example if it's anything like his example for matching HTML-style tags. Maybe you could post it here so we could dissect it?
-m

05-31-2008, 03:14 PM   #44 (permalink)
delayedinsanity

How about this? It doesn't account for nested comments, which means it'll still break on your haystack, Geert, but it doesn't use any (I repeat, any!) lazy operators. It's based on an example I posted elsewhere yesterday, which I've updated: we're still looking for an opening and closing tag, but the in-between is a little smarter than before, when it just looked for anything and everything. It looks for any character except a star, or a run of stars that isn't followed by the closing slash, or a newline.

PHP Code:
$szTest = <<<EOF
/*** this
is*** the
test **/
/* test*/
booga!

/* more
new lines
in this
comment
*/
EOF;

// Replace every /* ... */ comment with the literal text 'booga!'.
$szTest = preg_replace("~/\*([^*]|\*+[^*/]|[\r\n])*\*+/~", 'booga!', $szTest);
//$szTest = preg_replace("~/\*(.|[\r\n])*?\*/~", 'booga!', $szTest);
echo $szTest;
  1. /\*([^*]|\*+[^*/]|[\r\n])*\*+/
  2. /\*(.|[\r\n])*?\*/
...but which one is more efficient? The second one has a lazy operator, but it matches the same text as the first one. I like the second one because it's more readable, but I'm prone to using things like the first one because it's less readable and people looking at my work may think I'm actually smarter than I really am.

Other than the ego, I also like the first one because its job description is much more defined. Despite accomplishing the same task, it's doing what it's meant to do, whereas the second, although accomplishing the same task for now, has more ability to let something slide through that it shouldn't in the future. Or will it... hmmmmmmmmmmmmm.
-m

edit: Here's a hackish version that matches newlines without mentioning newlines: /\*[\w\W]*?\*/
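
One way to settle the efficiency question is to measure it. The following is only a rough sketch: timePattern is a hypothetical helper, the iteration counts are arbitrary, hrtime() needs PHP 7.3 or later, and the timings will vary by machine and PCRE version, so no numbers are claimed here. It also checks that the variants from this thread produce identical output on the sample text.

```php
<?php
// The sample text from the post above, joined with "\n".
$sample = implode("\n", [
    '/*** this', 'is*** the', 'test **/', '/* test*/', 'booga!', '',
    '/* more', 'new lines', 'in this', 'comment', '*/',
]);

// The variants discussed in this thread.
$patterns = [
    'no lazy quantifier' => '~/\*([^*]|\*+[^*/]|[\r\n])*\*+/~',
    'lazy alternation'   => '~/\*(.|[\r\n])*?\*/~',
    'lazy [\w\W]'        => '~/\*[\w\W]*?\*/~',
    'lazy dot, s flag'   => '~/\*.*?\*/~s',
];

// Hypothetical helper: seconds spent running preg_replace $iterations times.
function timePattern(string $pattern, string $subject, int $iterations): float
{
    $start = hrtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        preg_replace($pattern, 'booga!', $subject);
    }
    return (hrtime(true) - $start) / 1e9;
}

$results = [];
$subject = str_repeat($sample . "\n", 50); // a larger input for timing
foreach ($patterns as $label => $pattern) {
    $results[$label] = preg_replace($pattern, 'booga!', $sample);
    printf("%-20s %.4f s\n", $label, timePattern($pattern, $subject, 200));
}

// All variants should agree on the sample text.
var_dump(count(array_unique($results)) === 1);
```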

Last edited by delayedinsanity : 05-31-2008 at 03:55 PM.

07-31-2008, 09:47 AM   #45 (permalink)
xcasio
Hey,

Just a small tip. A perfectly valid e-mail address (example@example.museum) would not validate in this way due to the 4-character limit on the top-level domain.

04-22-2009, 11:22 AM   #46 (permalink)
lazycoder
email validation is wrong
Hey, the email validation is still problematic: it will accept _lazycoder@_.com as valid.

Please try this one:

PHP Code:
$pattern = "/^[a-z]+[-|_|.]?[a-z0-9]+@[a-z0-9]+[-|.|_]?[a-z]+\.[a-z]{2,4}$/";
Hope it helps someone.
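
For anyone trying this pattern, a quick sketch of how it behaves on a few made-up addresses may help. It does reject _lazycoder@_.com, but the {2,4} limit still rejects the .museum address mentioned earlier, it allows only one separator in the local part, and because | is a literal character inside a character class, [-|_|.] also accepts a pipe.

```php
<?php
// Pattern copied from the post above.
$pattern = '/^[a-z]+[-|_|.]?[a-z0-9]+@[a-z0-9]+[-|.|_]?[a-z]+\.[a-z]{2,4}$/';

// Made-up sample addresses.
$addresses = [
    'lazycoder@example.com',
    '_lazycoder@_.com',
    'example@example.museum',
    'first.middle.last@example.com',
    'a|b@example.com',
];

$matches = [];
foreach ($addresses as $address) {
    $matches[$address] = preg_match($pattern, $address) === 1;
    printf("%-30s %s\n", $address, $matches[$address] ? 'accepted' : 'rejected');
}
```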

10-19-2009, 09:25 PM   #47 (permalink)
bucabay
For email address validation there are two libraries on Google Code.

1) Validating email syntax with regex
http://code.google.com/p/php-email-address-validation/

2) Validating emails via SMTP
http://code.google.com/p/php-smtp-email-validation/

11-01-2009, 12:46 PM   #48 (permalink)
nuweb
Isn't the opening post using regular expressions for many things where they're not needed?

filter_var works very well for me.

Validate Number
PHP Code:
// Sanitizes rather than validates: strips everything except digits, + and -.
$data = filter_var($data, FILTER_SANITIZE_NUMBER_INT);
Validate Email
PHP Code:
$data = filter_var($data, FILTER_VALIDATE_EMAIL);
Validate Url
PHP Code:
$data = filter_var($data, FILTER_VALIDATE_URL);
Validate Boolean
PHP Code:
$data = filter_var($data, FILTER_VALIDATE_BOOLEAN);
Validate String
PHP Code:
// Sanitizes rather than validates; deprecated as of PHP 8.1.
$data = filter_var($data, FILTER_SANITIZE_STRING);
Validate IP
PHP Code:
$data = filter_var($data, FILTER_VALIDATE_IP);
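
A short sketch of what these calls return may be useful, since the validate filters and the sanitize filters behave differently: a FILTER_VALIDATE_* filter returns the value on success and false on failure, while a FILTER_SANITIZE_* filter returns a cleaned-up string. The input values and variable names below are made up.

```php
<?php
// Validate filters: the value comes back on success, false on failure.
$email    = filter_var('user@example.com', FILTER_VALIDATE_EMAIL);
$badEmail = filter_var('not an email', FILTER_VALIDATE_EMAIL);
$ip       = filter_var('192.168.0.1', FILTER_VALIDATE_IP);
$badIp    = filter_var('999.1.1.1', FILTER_VALIDATE_IP);
$url      = filter_var('http://www.talkphp.com/', FILTER_VALIDATE_URL);

// The boolean filter converts recognised strings such as "yes" to true.
$bool = filter_var('yes', FILTER_VALIDATE_BOOLEAN);

// Sanitize filters never fail on a string; they strip disallowed characters.
$number = filter_var('abc123def', FILTER_SANITIZE_NUMBER_INT);

var_dump($email, $badEmail, $ip, $badIp, $url, $bool, $number);
```

Because failure is signalled with false, compare the result with === false rather than relying on truthiness.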

11-01-2009, 04:46 PM   #49 (permalink)
delayedinsanity
"The Opening Post, is regarding regular expressions for many aspects where its not needed?"

If you have a driver's license, that's great up until your car breaks down. At that point you're left on the side of the road. Maybe you have a toolkit, but what are you going to do with it when you don't know how to work a ratchet?

Not only is it beneficial to understand how these functions are working underneath the surface (afaik, filter_var is just a wrapper for nearly the same regular expressions), but it's good practice for building regular expressions you may need down the road. Same thing if you need a variation on a theme; what happens if you're looking for a specific type of email address?
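
As an illustration of the "variation on a theme" point, here is a minimal sketch that combines both tools: filter_var for the general syntax check and a small regular expression for the specific requirement. The function name isAddressAtDomain and the sample domains are made up for this example.

```php
<?php
// Hypothetical helper: accept only syntactically valid addresses whose
// domain part is exactly $domain.
function isAddressAtDomain(string $email, string $domain): bool
{
    if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
        return false;
    }
    // preg_quote() escapes the dots in the domain; \z anchors at the very end.
    $pattern = '/@' . preg_quote($domain, '/') . '\z/i';
    return preg_match($pattern, $email) === 1;
}

var_dump(isAddressAtDomain('bob@example.com', 'example.com'));    // bool(true)
var_dump(isAddressAtDomain('bob@example.org', 'example.com'));    // bool(false)
var_dump(isAddressAtDomain('bob@notexample.com', 'example.com')); // bool(false)
```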

Good information to add though, filter_var() can be a handy tool to know in addition to regular expressions!
LinkBack to this Thread: http://www.talkphp.com/advanced-php-programming/1612-8-practical-php-regular-expressions.html