Old 12-06-2007, 08:49 PM   #3 (permalink)
Salathe
Moderateur

If I've understood everything correctly, you'll want to take a simpler approach to this. We're using preg_split here in an unconventional manner; usually, one would expect to retrieve repeated items split by a certain character or group of characters:
php Code:
$aWords = preg_split('/\s+/', "This is    an\texample");
// $aWords = array('This', 'is', 'an', 'example');
 

Instead, we're going to take a different route and grab the delimiter itself (in the example above, that would be the whitespace). We do this with a slightly more complicated regular expression that matches every part of the string, which leaves the split strings (what went into $aWords above) empty. To return the delimiter matches we wrap the pattern in a capturing group and pass the flag PREG_SPLIT_DELIM_CAPTURE, and to drop the empty split strings we add PREG_SPLIT_NO_EMPTY.

To cut a long story short, I think the example below should work for what you need, Wildhoney:

php Code:
$szChem = 'H2OCe5CO5L';
// A limit of -1 means "no limit"; passing null here is deprecated as of PHP 8.1.
$aMatches = preg_split('/(\p{Lu}\P{Lu}*)/', $szChem, -1,
                       PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
print_r($aMatches);
/*
Array
(
    [0] => H2
    [1] => O
    [2] => Ce5
    [3] => C
    [4] => O5
    [5] => L
)
*/
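
To see what each flag contributes, here is a minimal sketch that runs the same pattern with different flag combinations. The variable names $szRegex, $aPlain, $aCaptured and $aNoEmpty are only illustrative:

```php
<?php
$szChem  = 'H2OCe5CO5L';
$szRegex = '/(\p{Lu}\P{Lu}*)/';

// No flags: the delimiters consume the whole string, so all that is
// returned are the empty strings between (and around) them.
$aPlain = preg_split($szRegex, $szChem);

// PREG_SPLIT_DELIM_CAPTURE only: the captured delimiters are
// interleaved with those empty strings.
$aCaptured = preg_split($szRegex, $szChem, -1, PREG_SPLIT_DELIM_CAPTURE);

// PREG_SPLIT_NO_EMPTY only: the empty strings are dropped and nothing is left.
$aNoEmpty = preg_split($szRegex, $szChem, -1, PREG_SPLIT_NO_EMPTY);

var_dump(count($aPlain));    // int(7)
var_dump(count($aCaptured)); // int(13)
var_dump(count($aNoEmpty));  // int(0)
```

Only the combination of both flags leaves just the six delimiter matches shown above.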

Where it looks like you've been going wrong is that you were only using the \P escape sequence. This means 'any character which is not' whatever follows: \P{Lu} matches any single character which is not an uppercase letter. Its counterpart, \p, does the inverse by matching 'any character which is' whatever follows: \p{Lu} matches a single uppercase letter.
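
A quick way to check the difference is to run both escapes over the same string with preg_match_all. This is only an illustrative sketch; the names $aUpper and $aOther are made up for the example:

```php
<?php
$szChem = 'H2OCe5CO5L';

// \p{Lu}: every single uppercase letter.
preg_match_all('/\p{Lu}/', $szChem, $aUpper);

// \P{Lu}: every single character that is not an uppercase letter.
preg_match_all('/\P{Lu}/', $szChem, $aOther);

echo implode(' ', $aUpper[0]), "\n"; // H O C C O L
echo implode(' ', $aOther[0]), "\n"; // 2 e 5 5
```

Putting the two together as \p{Lu}\P{Lu}* reads as 'one uppercase letter followed by any number of characters that are not uppercase letters', which is why each element symbol comes back with its count attached.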