TalkPHP
 
 
Old 11-28-2007, 10:43 PM   #1 (permalink)
The Contributor
 
Join Date: Nov 2007
Posts: 41
Thanks: 24
webosb is on a distinguished road
Need Help with RegEx

I'm trying to code something that parses the following content:

PHP Code:
<!-- Begin cached content "main" updated on Mon Nov 26 12:00:01 EST 2007 -->
<div id="maincontent">
<h2>Resources</h2>
<ul>
GRAB THIS CONTENT HERE
</ul>
</div>
<hr id="seperator"/>
<!-- End cached content "main" -->
I'm using this code:

PHP Code:
<?php
$url = "http://www.somewebsite.com";
$data = file_get_contents($url);
$regex = '<!-- Begin cached content (.+?) <!-- End cached content "main" -->';
preg_match($regex, $data, $match);
var_dump($match);
echo $match[1];
?>
Can someone help me fix the regex part so it will grab all the content above?
Old 11-28-2007, 11:01 PM   #2 (permalink)
La Vida es Sueño
Advanced Programmer Top Contributor 
 
Join Date: Sep 2007
Location: Oldham
Posts: 2,280
Thanks: 90
Wildhoney is on a distinguished road

Something like this, perhaps?

php Code:
preg_match('/(<!-- Begin cached content .* content "main" -->)/iUs', $szContents, $aMatches);

You can then find it in $aMatches[1].
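For anyone following along, here is a minimal sketch of that pattern run against the sample markup from the first post. The heredoc is only a stand-in for the fetched page, and `$iFound` is a name made up for this example. The modifiers do the heavy lifting: `i` ignores case, `U` makes `.*` lazy so the match stops at the first closing comment, and `s` lets `.` match newlines.

```php
<?php
// Stand-in for the downloaded page: the sample from the first post, wrapped in some extra markup.
$szContents = <<<HTML
<html><body>
<!-- Begin cached content "main" updated on Mon Nov 26 12:00:01 EST 2007 -->
<div id="maincontent">
<h2>Resources</h2>
<ul>
GRAB THIS CONTENT HERE
</ul>
</div>
<hr id="seperator"/>
<!-- End cached content "main" -->
</body></html>
HTML;

// i = case-insensitive, U = quantifiers lazy by default, s = "." also matches newlines.
$iFound = preg_match(
    '/(<!-- Begin cached content .* content "main" -->)/iUs',
    $szContents,
    $aMatches
);

// preg_match() returns 1 on a match, 0 on no match, false on error.
if ($iFound === 1) {
    echo $aMatches[1], "\n";
}
```

Without the `s` modifier the pattern would not match here at all, because the block spans several lines.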
__________________
The man who comes back through the Door in the Wall will never be quite the same as the man who went out.
Old 11-29-2007, 07:29 PM   #3 (permalink)
The Contributor
 
Join Date: Nov 2007
Posts: 41
Thanks: 24
webosb is on a distinguished road

That worked! Thanks a lot, Wildhoney!

*edit* Is this example faster than using cURL to get the website content?
Old 11-29-2007, 09:24 PM   #4 (permalink)
La Vida es Sueño
Advanced Programmer Top Contributor 
 
Join Date: Sep 2007
Location: Oldham
Posts: 2,280
Thanks: 90
Wildhoney is on a distinguished road

I'd say so, yes. Although file_get_contents isn't as in-depth as cURL, for something like that it's better to keep things simple than to call a group of functions that are somewhat overkill for what you require.
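To make the "one call versus a group of functions" point concrete, here is a rough side-by-side sketch. The function names `fetchWithFgc` and `fetchWithCurl` are made up for illustration, the timeout value is arbitrary, and it assumes `allow_url_fopen` is enabled for the first and the cURL extension is loaded for the second.

```php
<?php
// One call. Needs allow_url_fopen enabled when $szUrl is a remote address.
function fetchWithFgc(string $szUrl)
{
    return file_get_contents($szUrl); // string on success, false on failure
}

// Several calls, but more control over the request. Needs the cURL extension.
function fetchWithCurl(string $szUrl)
{
    $ch = curl_init($szUrl);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow HTTP redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // give up after 10 seconds (arbitrary choice)
    return curl_exec($ch);                          // string on success, false on failure
}

// Example usage (placeholder URL from the first post):
// $szContents = fetchWithFgc('http://www.somewebsite.com');
// $szContents = fetchWithCurl('http://www.somewebsite.com');
```

Either return value can be passed straight to the preg_match call as `$szContents`.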
Old 12-01-2007, 12:38 AM   #5 (permalink)
Super Moderator
Advanced Programmer 
 
Join Date: Sep 2007
Posts: 165
Thanks: 0
bluesaga is on a distinguished road

It will depend on the website you are trying to open, but generally speaking, while cURL may take a bit longer to implement, it's by FAR faster than file_get_contents in the long run.

Quote:
I have created a script that tests all 4 methods on 20 different websites 10 times each. The results are clear, here is one set of the results:

Curl time: 43.02 seconds
FGC time: 86.48 seconds
Fopen time: 86.34 seconds
Socket time: 44.91 seconds
Quoting: http://uk2.php.net/manual/en/ref.curl.php#75126
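The quoted script itself isn't reproduced in this thread, so the following is only a guess at how such a comparison could be timed, not the commenter's actual code. `timeFetcher` is a hypothetical helper: it takes any callable that fetches one URL, runs it over a list of URLs a given number of times, and returns the elapsed wall-clock seconds.

```php
<?php
/**
 * Time how long $fetch takes to retrieve every URL in $aUrls, $iRepeats times each.
 * Returns the elapsed wall-clock time in seconds.
 */
function timeFetcher(callable $fetch, array $aUrls, int $iRepeats): float
{
    $fStart = microtime(true);
    foreach ($aUrls as $szUrl) {
        for ($i = 0; $i < $iRepeats; $i++) {
            $fetch($szUrl); // result is discarded; only the duration matters here
        }
    }
    return microtime(true) - $fStart;
}

// Example usage (placeholder URL; a real run needs network access):
// $aUrls = ['http://www.somewebsite.com'];
// printf("FGC time: %.2f seconds\n", timeFetcher('file_get_contents', $aUrls, 10));
```

Network timings vary a lot from run to run and from server to server, so numbers from a harness like this are only a rough indication.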
Old 12-01-2007, 02:48 AM   #6 (permalink)
La Vida es Sueño
Advanced Programmer Top Contributor 
 
Join Date: Sep 2007
Location: Oldham
Posts: 2,280
Thanks: 90
Wildhoney is on a distinguished road

If you had asked me beforehand which I thought was faster, FGC or cURL, I would have been more inclined to say FGC. I was under the false impression that it was a right little lightweight in comparison to cURL! Seems cURL is actually the master of both speed and functionality.
Old 12-09-2007, 04:17 PM   #7 (permalink)
The Contributor
RegEx Guru 
 
Join Date: Dec 2007
Location: Belgium
Posts: 60
Thanks: 6
Geert is on a distinguished road

Quote:
Originally Posted by Wildhoney View Post
php Code:
preg_match('/(<!-- Begin cached content .* content "main" -->)/iUs', $szContents, $aMatches);

You can then find it in $aMatches[1].
Small tip. You can speed up the regex by removing the capturing parentheses. You don't need them because the full pattern match is already included in $aMatches[0].
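A quick sketch of the difference, using a shortened stand-in string rather than a real page (the variable names `$aWithGroup` and `$aNoGroup` are made up for this example):

```php
<?php
// Shortened stand-in for the page contents.
$szContents = "before\n"
    . "<!-- Begin cached content \"main\" updated -->\n"
    . "<ul>\nGRAB THIS CONTENT HERE\n</ul>\n"
    . "<!-- End cached content \"main\" -->\n"
    . "after";

// With the capturing parentheses: the match appears in both [0] and [1].
preg_match('/(<!-- Begin cached content .* content "main" -->)/iUs', $szContents, $aWithGroup);

// Without them: the match is only in [0], and there is no [1].
preg_match('/<!-- Begin cached content .* content "main" -->/iUs', $szContents, $aNoGroup);

var_dump($aWithGroup[1] === $aNoGroup[0]); // bool(true)
var_dump(count($aNoGroup));                // int(1)
```

Parentheses only become useful when you want part of the match, for example just the markup between the two comments.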