05-08-2008, 06:59 PM
Yeah, very true... it's all about tailoring it to the site. The spider initially came from someone telling me a PHP spider couldn't be made. Before MySQL crashed on me (lol), I indexed a forum-based site for testing, with well over a million rows (see my post about MySQL buckling after a million rows). This is something I need to fine-tune without a doubt, as it should be able to handle far more than that. MySQL was never meant to be the indexer anyway; it's just a staging database to temporarily hold the data while it is being indexed. Before it crashed, the script had only been running for 6-7 hours. Not bad I guess, but I can fine-tune it to do better. It was a vBulletin site too, so the pages were rather bloated.
The spider is a very rough script, but the basis of it does work. Obviously, if this ever goes anywhere, I will probably need a more stable OS-based program, but who knows.
I have heard of Sphinx as well... I was actually thinking of using it as the actual indexer in this system (why reinvent the wheel for something that works well?); the hardest part is the data gathering, heh.
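
To make the staging idea concrete, here is a minimal sketch of a table that only holds pages until the indexer has picked them up. It is not the actual spider: it uses Python with the standard-library sqlite3 module standing in for PHP and MySQL, and the table name staged_pages, the function names, and the index_fn callback are all hypothetical.

```python
import sqlite3


def open_staging(path=":memory:"):
    """Create the (hypothetical) staging table if needed; return the connection."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS staged_pages ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " url TEXT NOT NULL UNIQUE,"
        " body TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def stage_page(conn, url, body):
    """Spider side: park a fetched page. Re-fetching a URL replaces the old row."""
    conn.execute(
        "INSERT OR REPLACE INTO staged_pages (url, body) VALUES (?, ?)",
        (url, body),
    )
    conn.commit()


def drain_batch(conn, index_fn, batch_size=500):
    """Indexer side: pass one batch to index_fn, then delete those rows.

    Returns the number of rows handed over, so the caller can loop until 0.
    """
    rows = conn.execute(
        "SELECT id, url, body FROM staged_pages ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    for _row_id, url, body in rows:
        index_fn(url, body)
    conn.executemany(
        "DELETE FROM staged_pages WHERE id = ?",
        [(row[0],) for row in rows],
    )
    conn.commit()
    return len(rows)
```

Because rows are deleted as soon as a batch has been handed over, the table only grows when the spider outruns the indexer. In a setup like the one described, index_fn would be whatever feeds the real indexer (Sphinx, for example); here it is just a callback.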