TalkPHP

benton 07-20-2009 01:56 AM

Using regex
 
The following code, the way I understand it, is supposed to strip all punctuation characters from the string. From what I've found on the web, a space is not one of those characters. But the code below outputs "abstractmodern." Would someone please explain how I can strip all special characters except the spaces?
PHP Code:

$pattern = "([[:punct:]])";
$str = "abstract modern";
$str = ereg_replace($pattern, '', $str);
echo $str;


Village Idiot 07-20-2009 03:25 AM

Use this range [!-@]

This is because ranges go by ASCII value, where ! is the first punctuation (after space) and @ is the last.
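
A quick check of what the range [!-@] covers may help here. The sketch below is only an illustration: it uses preg_replace (ereg_* was removed in PHP 7.0), and the sample strings are made up for the test. Assuming plain ASCII input, the range spans 0x21 to 0x40, so it takes in the digits and the hyphen but stops before the punctuation that follows the letters.

```php
<?php
// Sketch: probe the edges of the byte range [!-@] (0x21 through 0x40).
$range = '/[!-@]/';

// Hypothetical sample strings, not from the thread.
$digits = preg_replace($range, '', 'room 101');      // digits 0-9 sit inside the range
$late   = preg_replace($range, '', 'a[b]c_d{e}~');   // [ ] _ { } ~ come after @, so they stay
$hyphen = preg_replace($range, '', 'modern-fields'); // the hyphen (0x2D) is inside the range

var_dump($digits); // string(5) "room "
var_dump($late);   // string(11) "a[b]c_d{e}~"
var_dump($hyphen); // string(12) "modernfields"
```

So the range is not the same set as [:punct:]: it removes digits and misses characters such as [ and ~.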

EGYG33K 07-20-2009 03:32 AM

PHP Code:

echo ereg_replace("[^[:alpha:],[:digit:],_,[:blank:]]","",'this is 123 $ *&^_ text test'); 


Enfernikus 07-20-2009 04:38 AM

As a side note, it's best to stop using the ereg_* functions and their POSIX engine, because as of PHP 5.3.0 they raise an E_DEPRECATED notice. Use the preg_* family instead.
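
For anyone making that switch, here is a minimal sketch of the original pattern under preg_replace. The helper name strip_punct is hypothetical, and the only real change from the ereg version is the pair of / delimiters around the pattern. POSIX classes such as [:punct:] are still accepted by PCRE inside a bracket expression.

```php
<?php
// Hypothetical helper name, used only for this illustration.
function strip_punct(string $s): string
{
    // [[:punct:]] is a POSIX class inside a bracket expression.
    // Fall back to the input if preg_replace reports an error (null).
    return preg_replace('/[[:punct:]]/', '', $s) ?? $s;
}

echo strip_punct('abstract modern'), "\n";           // abstract modern
echo strip_punct('#!$abstract modern-fields'), "\n"; // abstract modernfields
```

The space survives in both cases, since a space is not a punctuation character; the hyphen does not, because it is one.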

benton 07-20-2009 03:14 PM

Thanks for the suggestions but they don't seem to be working. I changed the input string to more properly illustrate the problem. So given this string
PHP Code:

$str = '#!$abstract modern-fields';

it should output
Quote:

abstract modern-fields
If I use this
PHP Code:

echo ereg_replace("[^[:alpha:],[:digit:],_,[:blank:]]","",$str).'<br>';

It displays
Quote:

abstract modernfields
And if I use this
PHP Code:

echo ereg_replace("[[!-@],[:punct:]]","",$str).'<br>';

It displays
Quote:

#!$abstract modern-fields
The #, ! and $ characters are all listed in :punct:, so those should be stripped. The first method does that, but it also removes the hyphen. The second doesn't remove the punctuation but handles the hyphen correctly. Would someone point out my mistake, please?

Salathe 07-20-2009 04:27 PM

Do you want to use a whitelist or a blacklist? The former would remove all characters except a specific set, e.g. allow only alphanumeric and hyphen characters. The latter, a blacklist, would remove only specific characters, e.g. !@#$%^&*().

It is also advised to use the PCRE functions (preg_replace) since the POSIX family of functions (ereg_*, split, etc.) is deprecated.

With regard to why ereg_replace("[[!-@],[:punct:]]","",$str) won't work, it's just how you've constructed the character set. There shouldn't be a set of square brackets around !-@: the first ] closes the set early, so the rest of the pattern is no longer part of it and nothing in your string matches. Note that the !-@ range includes the dash/hyphen character, so that will be removed.

Back to white- and blacklists, here are a few examples:
Only allow alphanumeric (case insensitive) and hyphen characters
PHP Code:

echo preg_replace('/[^a-z0-9-]/i', '', $str);

Only remove !@#$%^&*()+= characters
PHP Code:

echo preg_replace('/[!@#$%^&*()+=]/', '', $str);
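
To get the exact output asked for earlier in the thread (spaces and the hyphen kept, other punctuation removed), both approaches can be adapted. This is a sketch under the assumption that the input is plain ASCII; the variable names $white and $black are only illustrative.

```php
<?php
$str = '#!$abstract modern-fields';

// Whitelist: keep ASCII letters, digits, spaces and hyphens; drop everything else.
// The hyphen comes last in the set so it is taken literally.
$white = preg_replace('/[^a-z0-9 -]/i', '', $str);

// Blacklist: drop every punctuation character except the hyphen.
// The negative lookahead (?!-) refuses to match where the next character is a hyphen.
$black = preg_replace('/(?!-)[[:punct:]]/', '', $str);

echo $white, "\n"; // abstract modern-fields
echo $black, "\n"; // abstract modern-fields
```

The two differ on input outside the sample: the whitelist also drops tabs, underscores and non-ASCII letters, while the blacklist leaves them in place.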


EGYG33K 07-20-2009 07:52 PM

PHP Code:

echo ereg_replace("[^[:alpha:][:digit:]_[:blank:]-]","",$str).'<br>';
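
One caution about commas inside a bracket expression: they are not separators. Each comma is a literal member of the set, and a hyphen placed between two commas reads as a range from "," to ",", so the hyphen itself is not included. The sketch below shows the difference using preg_replace (an assumption on my part, since ereg_* is gone from current PHP); the variable names are illustrative.

```php
<?php
$str = '#!$abstract modern-fields';

// Comma-separated form: ",-," is parsed as a range from "," to ",",
// so the hyphen is not whitelisted and gets removed.
$withCommas = preg_replace('/[^[:alpha:],[:digit:],_,-,[:blank:]]/', '', $str);

// No separators, hyphen last so that it is taken literally.
$hyphenLast = preg_replace('/[^[:alpha:][:digit:]_[:blank:]-]/', '', $str);

echo $withCommas, "\n"; // abstract modernfields
echo $hyphenLast, "\n"; // abstract modern-fields
```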


benton 07-23-2009 05:48 PM

My thanks to everyone. The code is working as expected now and I have a better understanding of how to manipulate the string. I do appreciate it.

