TalkPHP
 
 
Account Login
Latest Articles
» The basic usage of PHPTAL, a XML/XHTML template library for PHP
» Vulnerable methods and the areas they are commonly trusted in.
» Simple way to protect a form from bot
» The Basics On: How Session Stealing Works
» How to keep your forms from double posting data
IRC Channel
IRC Speech Bubble Join the friendly bunch on IRC...
(#TalkPHP on Freenode)

...Also available via a web interface.

See this thread for information on the TalkPHP Free Hugs Initiative™. Subject to availability.
Associates
Associates
CSS Tutorials
Reply
 
LinkBack Thread Tools Search this Thread Display Modes
Old 03-20-2008, 10:54 PM   #1 (permalink)
The Acquainted
 
Join Date: Sep 2007
Posts: 126
Thanks: 4
Sam Granger is on a distinguished road
Default Search spider?

How would I go about making a php script that does the following: You give various urls to check. Spider has to check all links on that domain & page besides a few links which are specified not to be searched. The spider checks for certain texts eg http://www.fixeddomain.com/********/*****.*** and saves them all.

How would I go round doing something like this? Thanks :)
Sam Granger is offline  
Reply With Quote
Old 03-20-2008, 11:02 PM   #2 (permalink)
The Acquainted
 
freenity's Avatar
 
Join Date: Feb 2008
Posts: 119
Thanks: 17
freenity is on a distinguished road
Default

well, I guess you should get the page contents, search for link probably using some regex, put those links to an array and recursively do the same...... (get a link, search for more links, etc...)

If you need to index pages, a posibility would be making 2 scripts: one for extracting links, and another for using those links to enter and index the page. They would probably need to use a database or a file to store/read those links.
__________________
http://feudal-times.net - My PBB Game
http://gwphp.feudal-times.net - My Blog "Gaming With PHP"
freenity is offline  
Reply With Quote
Old 03-28-2008, 02:20 PM   #3 (permalink)
The Contributor
 
flyingbuddha's Avatar
 
Join Date: Jan 2008
Location: Birmingham, UK
Posts: 60
Thanks: 10
flyingbuddha is on a distinguished road
Default

Sphider - a php spider and search engine
__________________
Pro. Geek
http://www.mikeholloway.co.uk
flyingbuddha is offline  
Reply With Quote
The Following User Says Thank You to flyingbuddha For This Useful Post:
freenity (03-28-2008)
Reply



Currently Active Users Viewing This Thread: 1 (0 members and 1 guests)
 
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


All times are GMT. The time now is 09:40 PM.

 
     

Powered by vBulletin® Version 3.6.8
Copyright ©2000 - 2013, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.1.0
Inactive Reminders By Icora Web Design