03-21-2008, 03:58 PM
Moderator
Join Date: Apr 2007
Posts: 1,393
Thanks: 5
Why not use DOM in this instance? It provides a far more reliable way of grabbing the paragraph elements than delving into the intricacies of a suitable regular expression.
For example:
PHP Code:
<?php
/* Load the HTML document. It is a good idea to cache the remote document
   rather than load it from the remote server every time the script is called.
   loadHTMLFile() is called on an instance here because the static call
   throws an Error on PHP 8. */
$dom = new DOMDocument();
@$dom->loadHTMLFile('http://lipsum.com/feed/html');

/* Grab all paragraph elements in the document. $nodes is a DOMNodeList object */
$nodes = $dom->getElementsByTagName('p');

/* Quick debugging to see what we've got */
header('Content-Type: text/plain; charset=utf-8');
foreach ($nodes as $p) {
    // Could use $p->textContent if we only wanted
    // the text content (no HTML tags)
    var_dump($dom->saveXML($p));
}
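To make the difference between saveXML() and textContent concrete, here is a small offline sketch. The inline $html string and the $markup / $text arrays are made up for illustration and are not part of the lipsum.com feed; the idea is only to show what each approach gives back for the same paragraph nodes.

```php
<?php
// Hypothetical sample markup so the sketch runs without a network request.
$html = '<html><body>'
      . '<p>First <b>bold</b> paragraph.</p>'
      . '<div>Not a paragraph</div>'
      . '<p>Second paragraph.</p>'
      . '</body></html>';

$dom = new DOMDocument();
// Suppress parser warnings, as you would for sloppy real-world markup.
@$dom->loadHTML($html);

$markup = []; // each paragraph serialised with its tags
$text   = []; // each paragraph as plain text only

foreach ($dom->getElementsByTagName('p') as $p) {
    $markup[] = $dom->saveXML($p);
    $text[]   = $p->textContent;
}

print_r($markup);
print_r($text);
```

Only the two p elements are picked up; the div is ignored, and no regular expression is involved. Which of the two arrays you want depends on whether you need the inner tags preserved.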