I'm wondering if there's any good tuts out there on building a localized search within PHP/MySQL, and I'm also wondering if it's worth my time to actually build one for a system that's already MySQL based...
There's two main fields in my DB that I'd want to search on, 'summary', and 'description'. Unless there's some built in MySQL trickery that I can use with LIKE or WHERE, that it might be worth my time to build a script which will spider my database for keywords and write another table which would store the keywords, where they were found, and how often they were found for each page.
My dilemna at this point is whether or not I should do a row for row database (for each row in the first, there would be a corresponding row in the second which would just include keywords), or if I should do a keyword based table that would include a list of rows in the first db that they were found in.
Further to that, I need to figure out how to prioritize my results. I'm searching Hotscripts to look for similar things I can dissect, but so far everything I've turned up is based on spidering your site, and doing a site search. I guess the theory remains the same though, I'm just spidering a database table and doing a localized return for that table...
I'd like to build this ground up though, for the learning experience I'll get, and my primary question is are there any good tutorials or blogs for tips and tricks I might try and utilize in my endeavours. Secondary to that, do any of you have any tips or ideas from experience on how I should go about building the database for this?
Does full text automatically eliminate common words from it's index? I've been playing around with various queries to see what kind of results were returned, and so far the outcome hasn't been great - I'm going to assume more due to my lack of knowledge, not due to MySQL at this point, but I'm curious...
I did the following query, because the word 'you' turns up multiple times in multiple rows:
SELECT * ,
MATCH (author, summary, description)
AGAINST ('zend') AS Relevance
HAVING Relevance > 0.2
ORDER BY Relevance DESC
LIMIT 0 , 30
No results returned, despite knowing the word is in there. So I removed the relevance portion, now all rows with the word in it are returned, but all have a relevance of 0. So I'm going to hope this is simply because MySQL's fulltext has the ability to ignore common words? If so, this seems like something that would be better off left in the control of ... well me, than MySQL, but okay.
Next, I tried searching on 'zend' because my table includes sample data copy and pasted from these forums, and I know it's in there. No results. Wait, if I change it to Zend, I get a result. So apparently fulltext is also case sensitive....
Okay, wait again, as I'm going through the manual I found the list of stopwords, and yes, 'you' is definitely one. So bad me, I should read first, write later.
Still can't find anything about searches being case sensitive though. Seems to be a nice feature, but it may be in my best interest at this point to use it as a means to an end - I think I may continue to look into developing a seperate table based search, but using fulltext may help build the spider and the final query still.
ye its not perfect but its pretty damn good. Try full text in mssql I couldnt get shit to work. I haven't noticed any case sensitivity. The only bad thing I have found with it is that a search must be more then 3 characters so you cant simply search for 'php'.
over at roscripts they have a pretty good php tutorial that you might want to check out if this full text search isnt enough. He walks through the steps of making a good search engine so you can see how it progresses a good read regardless of if you use it or not.
I'll take a look at that, thanks. Good to have some practice with fulltext, but it just dawned on me that I eventually want to add database abstraction and using a MySQL specific search engine would become a huge speed bump when that happens.
Are there benchmarks out there for PHPs array handling? I'm wondering if I was to pull an entire database out into an array and then work on it from there to create the index, or if I should pull it out in chunks... Hmm. I just need a solid base to start with, so I guess I'll just go play around till something fits.