Old 05-08-2008, 06:59 PM   #6 (permalink)
drewbee

Yeah, very true... it's all about its personalization to the site. The spider initially came from someone telling me a PHP spider couldn't be made. Before MySQL crashed on me (lol), I indexed a forum-based site for testing, with well over a million rows (see my post about MySQL buckling after a million rows). This is something I need to fine-tune without a doubt, as it should be able to handle far more than that. MySQL was never meant to be the indexer anyway; it's just a staging database to temporarily hold the data while it is being indexed. Before it crashed, the script had only been running for 6-7 hours. Not bad I guess, but I can fine-tune it to be better. It was a vBulletin site too, so the pages were rather bloated.

The spider is a very rough script, but the basis of it does work. Obviously, if this ever goes anywhere, I will probably need a more stable OS-based program, but who knows.

I have heard of Sphinx as well... I was actually thinking of using it as the actual indexer in this system (why reinvent the wheel for something that works well?); the hardest part is the data gathering, heh.